The meta-bug, curse of the bootstrap

 

This post is rather technical, but it is important to be aware that a bootstrap does not always progress smoothly. One can become locked in a situation where the very mistake that caused it prevents us from finding or correcting it. We are like the short-sighted man who cannot find his spectacles because they are not on his nose.
The initial version of a program usually includes many bugs. Luckily, most of them are easy to find, because they often occur in parts that have just been modified, and the first use of the program clearly shows that there is a mistake: one only has to look at the latest modifications.

Unfortunately, this is not the case in a bootstrap, especially when one uses declarative knowledge to use declarative knowledge. There are two kinds of knowledge: knowledge of a particular domain, such as mathematics, chess or architecture, and meta-knowledge, which indicates how to use knowledge. We have a reflective system; reflectivity is at the heart both of the interest of this approach and of our difficulties: meta-knowledge is given in a declarative form, so it is also used for using itself, for instance for translating itself into procedures.

The meta-bug is a bug in the meta-knowledge, so that all kinds of knowledge will be misused, including meta-knowledge itself. In particular, when meta-knowledge is used for writing a program, it will create bugs in this program. The difficulty is that the knowledge from which this program is created is correct, but the meta-knowledge that creates the program is incorrect. Therefore a bug appears in a program created from knowledge that is perfectly correct. Finding the bug is difficult because it can appear anywhere, including in parts that had always run faultlessly.

Some meta-bugs are obvious and easy to correct. For instance, suppose that the rule indicating to put a semicolon at the end of a C statement has been deleted. Programs will be generated with as many bugs as statements, but these bugs are obvious: restoring a rule that writes the semicolon easily corrects this meta-bug. However, most errors are not so easy to find.
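The semicolon example can be illustrated with a minimal sketch. It assumes a toy generator in which the meta-knowledge is a list of rule functions applied to every statement; the names `rule_assignment`, `rule_semicolon` and `generate` are hypothetical and are not taken from the actual system.

```python
# Toy illustration only: all names here are hypothetical, not the real system.

def rule_assignment(stmt):
    """Domain knowledge: describe an assignment as C text (no terminator)."""
    return f"{stmt['target']} = {stmt['value']}"

def rule_semicolon(text):
    """Meta-rule: every C statement ends with a semicolon."""
    return text + ";"

def generate(statements, meta_rules):
    """Translate correct domain knowledge into C using the given meta-rules."""
    lines = []
    for stmt in statements:
        text = rule_assignment(stmt)
        for rule in meta_rules:
            text = rule(text)
        lines.append(text)
    return lines

def count_missing_semicolons(lines):
    """Count generated statements that lack the final semicolon."""
    return sum(1 for line in lines if not line.endswith(";"))

# The domain knowledge is identical, and correct, in both runs.
statements = [
    {"target": "x", "value": "1"},
    {"target": "y", "value": "x + 2"},
]

healthy = generate(statements, [rule_semicolon])
broken = generate(statements, [])  # the semicolon rule has been deleted
```

In this sketch the `statements` list never changes, yet every line of `broken` is faulty: the bug lives in the meta-rules, not in the knowledge being translated.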

In general, we have to solve two problems: finding the bug, then finding the meta-bug that leads to this error. As already said, the bug may be difficult to find because it can occur in any part of the program, including parts that have not been modified for a long time and that had always given satisfactory results. The knowledge is correct; it is not the cause of the bug. It is its use that is incorrect, and this can happen anywhere.

Once the bug has been found, we have to find the meta-bug that leads to it. This may be easier, since it likely lies in the pieces of meta-knowledge that were modified most recently. However, searching for the bug and the meta-bug may be made more difficult because the meta-bug can also disrupt the tools that help with debugging!

Finally, at worst, it is often difficult to correct the meta-bug once it has been found, because the meta-bug can prevent its own correction: the system may be blocked and no longer operational. We are in the same situation as when we have dropped the key of the letterbox into the letterbox: if we had the key, it would be easy to open the box and retrieve it.
Two methods can be used when the system is blocked. We can restart from an old backup made before the meta-bug appeared, but we then have to redo all the modifications made since this backup, naturally except the wrong one. We can also repair all the incorrect programs by hand so that the system becomes operational again; this is feasible if not too many modifications are needed. Once this has been done and the system is running again, we can make all the changes in the knowledge that are necessary to remove the meta-bug for good.
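The first recovery method, restarting from a backup and redoing every modification except the wrong one, might be sketched as follows. This is only an illustration under strong assumptions: the knowledge base is modelled as a plain dictionary, each modification as a record with a hypothetical `id`, `key` and `value`, and the faulty change is assumed to be already identified.

```python
# Hypothetical sketch: the knowledge base is a dict, each change a small record.

def restore_and_replay(backup, changes, faulty_ids):
    """Rebuild the knowledge base from a backup, skipping known-bad changes."""
    state = dict(backup)  # copy, so the backup itself is never mutated
    skipped = []
    for change in changes:
        if change["id"] in faulty_ids:
            skipped.append(change["id"])  # the wrong modification is not redone
            continue
        state[change["key"]] = change["value"]
    return state, skipped

backup = {"semicolon_rule": "append ';'", "indent_rule": "4 spaces"}
changes = [
    {"id": 1, "key": "brace_rule", "value": "braces on same line"},
    {"id": 2, "key": "semicolon_rule", "value": ""},  # introduced the meta-bug
    {"id": 3, "key": "indent_rule", "value": "2 spaces"},
]

state, skipped = restore_and_replay(backup, changes, faulty_ids={2})
```

The sketch hides the expensive part: in practice the modifications made since the backup may not be recorded as a clean list, and identifying which one is wrong is precisely the hard problem described above.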

Finding and correcting a meta-bug takes a lot of time. I find and correct several ordinary bugs each day, whereas removing a single meta-bug takes at least one day.

Unfortunately, the dreadful meta-meta-bug may also happen: it creates meta-bugs that will later create bugs. In this case, it is very difficult to find the meta-bug, since it may occur anywhere; moreover, we have to find three bugs one after the other. In almost 30 years, this has happened only twice, but each time I lost a week restoring the system. Theoretically, this ascent has no limit, but I have never had to go beyond the third level.

One thought on “The meta-bug, curse of the bootstrap”

  1. The system must be robust enough against errors to remain viable even if it is partly « broken », in particular when its designer makes a faulty change.
    In my experience, a system must include much expertise to detect errors, wherever they come from. How much of such knowledge is necessary? One third of my system was devoted to that. That sounds like a lot! Maybe it was because the system was in its infancy, and also because the designer had not yet learnt good practices.

    Another aspect, which I did not experiment with, is to have redundant knowledge, at least for actions that could be lethal to the system. For example, old and new versions could both be kept and potentially active.

    The last aspect, the most mysterious one today, is to understand how such a system can work despite errors in its behavior. In my experience this does happen: there is an error, and the system may even notice it; yet the system manages to « fall back on its feet » after a number of steps, e.g. by rerunning previous bootstrapping steps, even though some of them still involve faulty components.

    So meta-bugs can sometimes be dreadful, but they are also often forgiving.
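The redundancy idea raised in the comment above, keeping old and new versions both potentially active, could look like the following minimal sketch. It is an assumption-laden illustration, not something the commenter reports having built: `run_with_fallback`, `emit_new` and `emit_old` are hypothetical names, and the check used to detect an error is deliberately trivial.

```python
# Hypothetical sketch of redundant versions with an error check.

def run_with_fallback(versions, check, *args):
    """Try each version in order; return the first result that passes check."""
    errors = []
    for name, func in versions:
        try:
            result = func(*args)
        except Exception as exc:
            errors.append((name, repr(exc)))
            continue
        if check(result):
            return name, result, errors
        errors.append((name, "result failed check"))
    raise RuntimeError(f"all versions failed: {errors}")

def emit_new(text):
    """Newer version, faulty in this sketch: it forgets the semicolon."""
    return text

def emit_old(text):
    """Older version, kept alongside the new one."""
    return text + ";"

name, result, errors = run_with_fallback(
    [("new", emit_new), ("old", emit_old)],
    lambda s: s.endswith(";"),
    "x = 1",
)
```

Such a scheme only helps if the check itself is not affected by the meta-bug, which is the difficulty the post describes for debugging tools.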
