Algorithmic information theory

Algorithmicity and programmability in natural computing with the Game of Life as in silico case study

Posted in Algorithmic information theory, Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, New Ideas on September 4th, 2014 by Hector Zenil – Be the first to comment

In a previous article, I suggested a method for testing the algorithmicity of a natural/physical process using the concept of Levin’s universal distribution. In this new paper published by the Journal of Experimental & Theoretical Artificial Intelligence, I explain this method in the context of the problem formulated by Floridi concerning the testability of pancomputationalism. Then, I introduce a behavioural battery of programmability tests for natural computation, as an example of a computational philosophy approach, that is, of tackling a simplified version of a complex philosophical question with a computer experiment. I go on to demonstrate the application of this novel approach in a case study featuring Conway’s Game of Life. I also briefly discuss another question raised by Floridi, concerning a grand unified theory of information, which I think is deeply connected to the grand unification of physics. You can access the paper for free here.

How Humans perceive the world is biased by how patterns are distributed in Nature and their intrinsic complexity

Posted in Algorithmic information theory, Complexity, Computer Science, General, New Ideas on August 27th, 2014 by Hector Zenil – Be the first to comment

A new paper of mine, written with my colleagues and Algorithmic Nature Lab members Nicolas Gauvrit and Fernando Soler-Toscano, has just come out.

Using previously generated and new experimental data, together with new methods for calculating the algorithmic complexity of 2-dimensional objects, we found that when humans assess the complexity of an image (a small 4×4 pattern), their ratings correlate with the algorithmic complexity of the image, mediated by the probability that such a pattern appears in real-world scenes. In other words, human perception is biased both towards the patterns that occur in the world and towards algorithmic complexity, and these two are themselves correlated. This strengthens my claim for an algorithmic world, where patterns can be accounted for by an algorithmic production process.

The journal (Visual Cognition) allows 50 free electronic copies of the paper to be downloaded. Should you be interested in the paper and be unable to access it otherwise, you can obtain a free copy using the following e-print link.

Announcing the Online Algorithmic Complexity Calculator

Posted in Algorithmic information theory, Complexity, General on March 23rd, 2013 by Hector Zenil – Be the first to comment

We have made available a basic beta version of an Online Algorithmic Complexity Calculator implementing the methods we have developed in recent papers at the Algorithmic Nature lab.

The OACC provides a comprehensive framework of universal mathematical measures of algorithmic complexity for researchers and professionals. It retrieves objective numerical measures of randomness for potential applications in a very wide range of disciplines, from bioinformatics to psychometrics, from linguistics to economics.

It is based on several years of research devoted to new methods for evaluating information content and algorithmic complexity. The Coding Theorem method for dealing with short strings is described in this paper. The calculator currently retrieves numerical approximations to Kolmogorov complexity and Levin’s universal distribution (or Solomonoff algorithmic probability) for short binary strings, for which lossless compression algorithms fail as a method of approximating program-size complexity, hence providing a complementary and useful alternative to compression algorithms. More algorithmic information measures, more data and more techniques will be incorporated gradually in the future, covering a wider range of objects such as longer binary strings, non-binary strings and n-dimensional objects (such as images).

It also includes a Short String Complexity Explorer (it may take some time to load), a tool developed in Mathematica that runs online with a free player. It provides a comparison of the estimated Kolmogorov complexity (K), the algorithmic probability (m), Shannon’s entropy (E) and the compressibility (using Deflate) of a string. It also gives an indication of the string’s relative complexity, and its rank in the distribution, among all the strings for which K has been approximated.
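
For illustration, two of the four quantities the explorer reports, Shannon entropy and Deflate-based compressed length, can be reproduced directly in Mathematica (K and m require the precomputed distributions behind the OACC and are not reproduced here; the helper names below are mine, not the tool’s):

entropy[s_String] := With[{p = N[Tally[Characters[s]][[All, 2]]/StringLength[s]]}, -Total[p Log[2, p]]]  (* Shannon entropy of the symbol distribution, in bits *)
compressedLength[s_String] := StringLength[Compress[s]]  (* Compress is Deflate-based *)
{entropy["0101"], compressedLength["0101"]}  (* 1 bit of entropy; a compressed length of about 30 characters *)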

Calculating a Universal Distribution to Approximate Kolmogorov-Chaitin Complexity

Posted in Algorithmic information theory, Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, New Ideas on December 12th, 2012 by Hector Zenil – Be the first to comment

Computing the incomputable has always been a challenge. One example is finding the busiest Turing machines (Radó) with a given number of symbols and states, whimsically called busy beavers. This means either finding the Turing machines that, starting from an empty input, produce more non-blank symbols on their output tapes before halting than any other Turing machine of the same size, or the Turing machines that, also starting from an empty input, run for more steps before halting than any other Turing machine of the same size. Both problems are ultimately undecidable because of the Turing-completeness of Turing machines, that is, the capability of some Turing machines to simulate any other Turing machine, as proven by Alan Turing himself.
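
In the usual notation (the notation is mine; the post states the definitions in words), the two busy beaver quantities for n-state, 2-symbol machines are:

\Sigma(n) = \max \{ \text{non-blank symbols left on the tape by } M : M \in H_n \}
S(n) = \max \{ \text{steps taken before halting by } M : M \in H_n \}

where H_n is the set of n-state, 2-symbol Turing machines that halt when started on a blank tape. Both functions grow faster than any computable function, which is another way of seeing why the problem is undecidable.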

In this new paper we describe how we have managed to calculate an approximation of a so-called Universal Distribution (aka Levin’s semi-measure), which connects the frequency of production of a string to its Kolmogorov complexity (K). The chief advantage of calculating an approximation of the Universal Distribution is that it is an incremental process over an average of a large number of Turing machines. One doesn’t get rid of the additive constant from the invariance theorem of algorithmic information theory (which appears, for example, when Kolmogorov complexity is measured using two different universal Turing machines), yet one seems to have to make fewer arbitrary decisions.
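
For reference, the standard formulation of Levin’s semi-measure and of the coding theorem that connects it to K (these are textbook definitions, not new results of the paper) reads:

m(s) = \sum_{p \,:\, U(p) = s} 2^{-|p|}
K(s) = -\log_2 m(s) + O(1)

where U is a fixed (prefix-free) universal Turing machine and the sum runs over the programs p that halt on U with output s. The method approximates m(s) by the output frequency of s over a large enumeration of small Turing machines, and then applies the second equation to estimate K(s).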

One of the main advantages is that one can better deal with strings of very short lengths. Think about it! If one wished to approximate K for a single bit by using compression algorithms, the lossless compression algorithm would not be able to compress the single bit any further. And this not only happens for a single bit but for all strings up to a certain minimal length for which lossless compression algorithms are simply unsuitable (recall that a compression algorithm includes the decompression instructions together with the data in the new compressed object in order to make it self-decompressible).

The usefulness of lossless compression algorithms as a method for approximating K derives from the fact that compression is a sufficient test of non-randomness. K, though uncomputable, is upper semi-computable, meaning that one can estimate upper bounds on it: the lossless compressed length of an object s (e.g. a string) is an upper bound on K(s). The usefulness of the Coding Theorem Method (the method presented in this paper) will ultimately come down to whether it is useful in applications, which is the main motivation behind its conception, given the failure of compression algorithms to provide information about the possible K(s) values for strings s that are too short (shorter, say, than the typical length of the instructions that a lossless compression algorithm has to add to the compressed object).

The state of the art of the Coding Theorem Method can be gleaned from this paper, recently made public by my colleagues and me, announcing the release of the calculation of a universal distribution based on (5,2), that is, all Turing machines with 5 states and 2 symbols: Calculating Kolmogorov Complexity from the Output Frequency Distributions of Small Turing Machines

It represents a major improvement over this previous calculation of mine, one that required new and more powerful techniques to deal with a semi-computable distribution. It improves our previous work both in accuracy and in the number of short strings covered, and it validates previous calculations of universal distributions, showing the incremental nature of the method to be fully compatible with the universal distributions calculated from smaller samples of small Turing machines (for which the known Busy Beaver values could be used).

In this other paper we explain why our approximations of K are real-number values, and show that strict integer-valued program-size evaluations follow our Coding Theorem Method, so that ours constitutes a more fine-grained measure. It is also shown that logical depth departs from both strict program-size and Coding Theorem Method evaluations, in agreement with what the theory predicts for these three measures. The paper is available online at: Correspondence and Independence of Numerical Evaluations of Algorithmic Information Measures

In the next post I will be announcing and also briefly explaining the results from another paper showing that not only can our Coding Theorem Method deal with short strings, but that we have found a way to validate the method by lossless compressibility. Moreover, we have found that in the transition period, where the Coding Theorem Method starts to be too expensive to be of practical use whereas the compression method starts to provide some results, the two methods are in great agreement with each other. Like an expensive microscope of great power (e.g. the LHC at CERN), our Coding Theorem Method requires an incredible amount of calculation. The trick is to know when to use a microscope–to know when a microscope is required rather than a telescope. We believe we are providing the tools to deal with the information content of the smallest entities in the computational universe.

An alternative method (to compression) for approximating the algorithmic complexity of strings

Posted in Algorithmic information theory, Foundations of Computation, New Ideas on July 20th, 2011 by Hector Zenil – Be the first to comment

The method introduced in my doctoral dissertation was featured in Pour La Science (the French edition of Scientific American) in its July 2011 issue, No. 405, under the title Le défi des faibles complexités.

Jean-Paul Delahaye points out that:

Comme les très petites durées ou longueurs, les faibles complexités sont délicates à évaluer. Paradoxalement, les méthodes d’évaluation demandent des calculs colossaux.
(Like very short durations or lengths, low complexities are tricky to evaluate. Paradoxically, the evaluation methods require colossal calculations.)

and he continues:

Pour les petites séquences, cette mesure est stable et conforme à notre idée de la complexité; pour les grandes, elle est, d’après le théorème mentionné [the invariance theorem, my comment], conforme à la meilleure mesure de complexité unanimement admise, la complexité de Kolmogorov. Que demander de plus?
(For short strings, this measure is stable and conforms to our idea of complexity; for long strings, according to the aforementioned theorem [the invariance theorem– my comment], it conforms to the best and universally accepted measure of complexity, Kolmogorov complexity. What more can one ask for?)



Imagine you are given 2 short strings and are asked to tell which one looks more random and which more structured. The strings are 0101 and 1011. Applying the idea of algorithmic complexity, the shorter the description the less random-looking. It would seem that the first string has a repeating pattern that can be taken advantage of in describing it. In plain English one may therefore say that the first string can be described as “zero and one twice” while the second one would require a longer description. In fact, there seem to be fewer descriptions for the first string than for the second (and also notice some ambiguity in plain English). The first may also allow the description “A zero followed by a one followed by a zero followed by a one” or “zero and one followed by the same” and perhaps other variations. Descriptions of the second one can include “one and zero followed by two ones” or “one zero one one,” just as the first one could simply be “zero one zero one,” which doesn’t look like a compressed version of the string but rather a straight translation into an expanded form of plain English.

All this by way of asking whether either of the two strings is without a doubt simpler than the other, or whether the apparent repetition in the first string makes us think that the string has a pattern despite the pattern being repeated only twice. Perhaps when one looks at such a string one gets the impression that it may belong to a larger string comprising alternations of 01, and one concludes that it is simpler than the second one. To leave the subjective realm, one would need to evaluate the algorithmic complexity of the strings and compare their respective values. The algorithmic complexity of a string is the length (in bits) of the shortest program producing the string in question when run on a universal Turing machine. Inconveniently, there is no algorithm that, given a string, gives you the length of the shortest program that produces it; this follows by reduction from the halting problem. This means one cannot really measure the algorithmic complexity of a string with absolute certainty, because it is uncomputable. It doesn’t mean, however, that one cannot approach it; one can often do so in an effective and useful way.
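
Written out (a standard formulation of the definition just given in words):

K(s) = \min \{ |p| : U(p) = s \}

that is, the length in bits of the shortest program p that outputs s when run on a fixed universal Turing machine U. No algorithm can return K(s) for arbitrary s, but upper bounds can be obtained, for instance from lossless compression.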

Traditionally, the way to approach the algorithmic complexity of a string has been to use lossless compression algorithms. Lossless means that one can recover the original string from the compressed version. The use of lossless algorithms is preferable because there may be inconsistent ways of shortening strings that give the impression of compression without corresponding to a measure of complexity (e.g. noise deletion is an effective way to shorten a string, but noise may be algorithmically random, and the aim of algorithmic complexity is precisely to measure how random something is, not merely to gauge whether it looks random). The result of a lossless compression algorithm is an upper bound on the string’s algorithmic complexity. While one can never tell when a string is incompressible, if one succeeds at somehow shortening a string one can tell that its algorithmic complexity cannot be larger than the compressed length.

One does not want to cheat and claim that one can compress any string into a bit if the decompression algorithm interprets that bit into the desired string. A fair compression algorithm can be defined as one that transforms a string into two pieces: one is the compressed version and the other the instructions to decompress the string, together accounting for the final length of the compressed version. In other words, it would seem as if you were adding the decompression algorithm to the compressed string so that the compressed string comes with its own decompression instructions. In the long run, there is a theorem (invariance) that guarantees that complexity values converge.

But for short strings (which are often the ones useful for practical applications), adding the decompression instructions to the compressed version makes the compressed string often, if not always, longer than the string itself. If the string is, for example, shorter than the size of the decompression algorithm, there will be no way to compress the string into something shorter still, simply because the decompression instructions are at least of the length of the original string. Moreover, the result is so dependent on the size of the decompression algorithm (because it is the greatest contributor to the overall length) that the final value of the compressed length is too unstable under different lossless compression/decompression algorithms.

For example, if one tries to compress a short string using Mathematica, one gets the following results:
StringLength@Compress["0101"] = 30

Looking at the beginning of the compression line when plotting the lengths of strings (x axis) against their compressed lengths (y axis) one observes that it does not start at y=0 even when x=0.

This means that compressing the 4-bit string 0101 took 30 characters (and even more bits). In Mathematica, strings only begin to be effectively compressed at about length 30. This is not a malfunction of Mathematica; it is the result of what I have explained. The Mathematica Compress function is actually based on the Deflate lossless compression algorithm, a combination of the LZ77 algorithm and Huffman coding, among the most popular lossless compression algorithms available, used in formats like ZIP, GZIP, GIF and PNG (these last two are therefore lossless compression image formats).

Zooming in on the origin of the plot, one can see that the beginning looks unstable. The string in question is a repetition of n 1s, with n lying on the x axis (which ranges over string lengths rather than individual strings, to avoid repetition and a step-like curve).
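
A plot of this kind can be reproduced in a single line of Mathematica (a sketch assuming Compress, i.e. Deflate, as the compressor; the exact lengths depend on the Mathematica version):

ListLinePlot[Table[StringLength[Compress[StringJoin[ConstantArray["1", n]]]], {n, 1, 200}], AxesLabel -> {"length of the string of 1s", "compressed length (characters)"}]  (* the curve starts well above zero because of the fixed overhead *)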

If one tries to compress 1011 one gets none other than the same value:
StringLength@Compress["1011"] = 30

The instructions obviously take up some space in the final compressed length and they cannot be compressed themselves (if they were, they would in any case be the same for all strings, taking us back to the same situation). There is a limit for compression algorithms to compress short strings. So if one wished to tell which of these two strings were objectively more or less randomly complex by approximating their algorithmic complexity using a compression algorithm, it turns out that there is no way to do so.

On the other hand, given that the definition of algorithmic complexity based on compressibility says that the less compressible the more randomly complex a string, one could immediately say that a single bit, 0 or 1, is random for certain, i.e. has maximal algorithmic complexity, given that there is no way to further compress a single bit. In other words, there is no program shorter than 1 bit that can produce 0 or 1. The shortest descriptions of 0 and 1 are therefore 0 and 1 themselves. Hence 0 and 1, according to the compressibility approach, are random strings. It may seem to make no sense to characterize a single bit as random.

On the one hand, a single bit does not carry any information and on these grounds one may think of it as somehow random. If one thinks about whether one would have been able to predict 0 or 1 as the outcome of a process, given that there is no context because they occur alone, one may also conclude that their occurrence is somehow random. In other words, if you see a string like 010101 you may bet that the next bit is 0, but if you are provided with nothing there is no way you can favor any position on whether the bit to come is 0 or 1. So much for justifying that a single bit is random.

It is hard, however, to justify how 0 could look more random than, say, any other possible string. If 0 is random how is it relatively more complex than 00? Or 01? Intuition tells us that short strings shouldn’t be that random (more random than, for example, longer random-looking strings), so if a single bit is the most random among all finite strings, how could it be that there is such a phase transition from maximal random complexity to very low complexity of, say, strings of length 2, 3 or 5 bits long?

Since intuition tells us that something random should also be uncommon and rare, what if one asks how common 0 or 1 are as results of a program? There is a measure that gives the probability of a program’s producing a given string running on a universal Turing machine. This is a measure we used to present a new method to evaluate the complexity of strings, as an alternative to the traditional use of compression algorithms. The new method aims particularly to solve the problem of the evaluation of the complexity of short strings, as we’ve discussed. It is based on Levin-Solomonoff’s algorithmic probability and is connected back to algorithmic (Kolmogorov-Chaitin) complexity by way of Chaitin-Levin’s coding theorem.

Algorithmic probability says that it is not the case that a single bit is the most complex random string, but that it is actually the most structured possible one and, more importantly, that the complexity transition is smooth, more in accordance with intuition.

It may be that it makes sense that a single bit can be regarded as both the most simple and the most complex of strings from different perspectives, and the advantage of the algorithmic probability approach is that it provides not only a different notion of the complexity of a single bit, one that is in keeping with intuition, but also that it generates a different outcome from the compressibility approach, even though the two measures are intimately related and asymptotically produce the same results in the long term (for longer strings). I think the two views reflect different aspects of what a single bit represents.

The novel method for evaluating the algorithmic complexity of short strings was first proposed and sketched in Greg Chaitin’s 60th-anniversary festschrift edited by Cris Calude (J-P. Delahaye & H. Zenil, “On the Kolmogorov-Chaitin complexity for short sequences,” in C.S. Calude (ed.), Randomness and Complexity: From Leibniz to Chaitin, World Scientific, 2007). The method uses an exhaustive and systematic search of Turing machines inspired by Wolfram’s NKS dictum, from which a frequency distribution of the outputs of the halting machines is built and the Levin-Chaitin coding theorem applied to evaluate the algorithmic complexity of binary strings.
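
In outline, once the expensive part is done (running the exhaustive enumeration and collecting the outputs of the halting machines), the bookkeeping is short; the sketch below assumes a list `outputs` holding those output strings:

freq = N[Counts[outputs]/Length[outputs]];  (* empirical output frequency distribution D(s) *)
ctm[s_String] := -Log2[freq[s]]             (* coding-theorem estimate of the complexity of s *)
ctm["0101"]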

Chaitin pointed out (regarding our method) that:

…the dreaded theoretical hole in the foundations of algorithmic complexity turns out, in practice, not to be as serious as was previously assumed.

The full technical article is available on arXiv: Numerical Evaluation of Algorithmic Complexity for Short Strings: A Glance into the Innermost Structure of Randomness.

You can also look at the slides of the presentation I delivered at the Alan Turing amphitheater at the Computer Science Department of the University of Lille 1:

“The World is Either Algorithmic or Mostly Random” awarded a 3rd Place Prize in this year’s FQXi contest

Posted in Algorithmic information theory, Complexity, Foundations of Computation, Foundations of Physics, General, New Ideas on June 10th, 2011 by Hector Zenil – 2 Comments

Based on the combined ratings of the contest community and the panel of expert reviewers appointed by the FQXi, which included members of the institute, I was awarded a 3rd Place Prize for my work The World is Either Algorithmic or Mostly Random in this year’s FQXi contest on the topic Is Reality Digital or Analog?, sponsored by the Foundational Questions Institute. The winners were announced at this year’s World Science Festival in New York City.

My work can be summarized in one line as an explanation of the complexification process of the world, the process whereby we have evolved from a more primitive (random) state to the current organized state of the universe.

The essay is a summary of the philosophical branch of my current scientific research on finite algorithmic information theory. This philosophical branch is concerned with the exploration of the possible connections between algorithmic complexity and the physical world (or what happens in it). I propose the notion that the universe is likely digital, not as a claim about what the universe is ultimately made of but rather about the way it unfolds. Central to the argument are concepts of symmetry breaking and algorithmic probability, which are used as tools to compare the way patterns are distributed in our world to the way patterns are distributed in a simulated digital one. These concepts provide a framework for a discussion of the informational nature of reality. I argue that if the universe were analog, then the world would likely look more random, making it largely incomprehensible. The digital model has, however, an inherent beauty in its imposition of an upper limit and in the convergence in computational power to a maximal level of sophistication. That the world is digital, even if deterministic, doesn’t necessarily mean that the world is trivial or predictable, but rather that it is built up from operations that are simple at the lowest scale but that at a higher scale look complex and even random, though in appearance only.

How have we come from the early state of the universe (left) to the structures we find today (right)?

The arguments supporting my views are partially based on the findings of my research, epitomized by our most recent paper Numerical Evaluation of Algorithmic Complexity for Short Strings: A Glance into the Innermost Structure of Randomness available in ArXiv in which my co-author and I describe a method that combines several theoretical and experimental results to numerically approximate the algorithmic (Kolmogorov-Chaitin) complexity of bitstrings by using the concept of algorithmic probability, which is connected to algorithmic complexity by way of the (Levin-Chaitin) coding theorem.

An extended (and detailed) version of The World is Either Algorithmic or Mostly Random is forthcoming and will be eventually posted.

Compression-based Investigation of Cellular Automata, A Phase Transition Coefficient and a Conjecture Related to Universal Computation

Posted in Algorithmic information theory, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, New Ideas on August 22nd, 2010 by Hector Zenil – 3 Comments

In my new article Compression-based investigation of the dynamical properties of cellular automata and other systems, published in the Journal of Complex Systems (19:1), pages 1-28, I present a method for studying the qualitative behavior of cellular automata and other abstract computing machines based on the approximation of their program-size complexity using a general lossless compression algorithm. I show that the compression-based approach classifies cellular automata (CA) into clusters according to their heuristic behavior, with these clusters showing a correspondence with Wolfram’s main classes of systemic behavior. I also present a Gray code-based numbering scheme for initial conditions optimal for this kind of investigation, and a compression based method for estimating a characteristic exponent in the form of a phase transition coefficient measuring the resiliency or sensitivity of a system to its initial conditions. And I conjecture that universal systems have large transition coefficients.

I think this constitutes a novel framework for investigating the dynamical properties of cellular automata and other systems. Here I will discuss some of the main ideas and implications of the paper. A pdf version of the paper is available online on arXiv.

Algorithmic complexity: classification into Wolfram’s four classes

In A New Kind of Science and in several papers dating from the mid-1980s, Stephen Wolfram defined four classes into which cellular automata (CA) and several other systems evolve, each capturing a different qualitative behavior which evolves from the same ordered initial configuration (the simplest black cell).

  • Class 1 consists of CA whose dynamics reach a steady state regardless of the initial conditions.
  • Class 2 consists of CA whose long-term evolution produces periodic or nested structures.

These first two classes are simple, in the sense that their long-term evolution can be deduced from running the system for a small number of steps.

  • Class 3 CA produce structures that seem random.
  • Class 4 CA produce localized structures on a random looking background, and hence are complex looking.

Wolfram’s classification is heuristic, and the assignment of CA to the four classes is somewhat subjective. To the best of my knowledge, there is, to date, no universally agreed upon classification scheme for CA. I think, however, that there is no better approach than a pure program-size complexity measure, which is the approach I follow in this paper. I propose to continue the investigation later, using the same measure to discover other interesting properties and possible hierarchies.
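
As a minimal illustration of the compression-based idea (not the full classification procedure of the paper), one can compare the compressed sizes of the evolutions of a few representative elementary rules started from a single black cell:

evolutionSize[r_] := StringLength[Compress[CellularAutomaton[r, {{1}, 0}, 200]]]  (* compressed size of 200 steps of rule r from a single black cell *)
evolutionSize /@ {0, 90, 30, 110}  (* rule 0: uniform, 90: nested, 30: random-looking, 110: complex; simpler behavior compresses into a smaller description *)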

An interesting question related to Wolfram’s classification concerns its dependence on the initial condition–chiefly because the classification was originally meant to be constructed by visual inspection over the evolution of a CA, and as we know, the evolution of a CA depends on its initial condition. This has been a major critique (Eppstein) of Wolfram’s classification, because the expectation is that the classification should be based on the evolution from an unordered (random) configuration.

Nevertheless, the classification is actually based on the potential of a CA to evolve into any of the possible behaviors from at least one initial configuration (the question is of course not answerable in finite time, since there is an infinite number of possible initial configurations). Wolfram’s classification may therefore be seen as being dependent on the initial condition of a CA.

It is not a surprise that one can, for example, construct a CA belonging to more than one of Wolfram’s four classes when starting from different initial configurations. Rule 110 belongs to class 4 because it is capable of universal computation–one can set up an initial configuration to ‘program’ rule 110 to carry out any computation (this being the very basic concept of a programmable computer).

For every CA rule there is a definite (but in general undecidable) answer to the question whether or not it is capable of universal computation (or in reachability terms, whether a CA will evolve into a certain configuration). The question only makes sense if the evolution of a CA depends on its initial configuration. No rule can be universal that fixes the initial configuration once and for all (there would be no way to input an instruction and carry out an arbitrary computation).

On the other hand, some rules, such as rule 0, produce the same behavior regardless of the initial configuration. No matter how you change the initial condition, there is no way to make such a rule compute something other than what it computes for every other initial configuration.

A possible objection (made by David Eppstein) is that there are CAs that can be made to look as if they belonged to all classes by modifying their initial conditions. Which is true: a CA may belong to a certain class until, given another initial configuration, it is made to behave as if it belonged to another class.

My compression-based approach shows that Wolfram’s heuristic classification can actually be quantified by a measure which is clearly dependent on the initial conditions, while also being capable of detecting sensitivity to initial configurations, and hence of replacing the visual inspection. This hierarchical classification is well defined and is therefore a good candidate for a complexity measure.

The second part of my investigation actually takes advantage of the ability of CAs to behave differently in order to undertake a closer inspection and a novel classification, taking into account the average capability of a system to behave in different ways.

Differentiation from a priori approaches

The approach is different from others in that it is an a posteriori technique, unlike, for example, Langton’s lambda, a measure of the density of a CA rule. It is an a posteriori technique because unlike this lambda number, the compression-based approach requires the CA to evolve before saying anything about it, whereas Langton’s lambda is computed from the rules of the CA.

Langton’s lambda is simply the fraction of rules in which the new state of the cell is non-zero. For example, the rule table of elementary cellular automaton number 1 in Wolfram’s enumeration scheme is simple: all eight 3-tuples are sent to 0 except the last one, (0,0,0), which is sent to 1. Therefore Langton’s lambda is 1/8.
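
For elementary cellular automata, lambda can be read off directly from the rule number (a sketch; the general definition applies to arbitrary rule tables):

lambda[r_] := Total[IntegerDigits[r, 2, 8]]/8  (* fraction of the 8 neighborhood outputs that are non-zero *)
lambda /@ {0, 1, 110}  (* 0, 1/8 and 5/8 respectively *)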

The lambda parameter of a CA is a number between 0 and 1. For example, if lambda is 0 (e.g. for ECA rule number 0), the evolution develops into a trivial state. Langton found that CA rules with lambda close to zero evolve into trivial states and CA rules close to 1 evolve into random-looking behavior, with complex behavior somewhere in between. It is near this transition that the most interesting CAs will lie, the ones that manifest the most complex behavior.

Unfortunately, classifying CAs with lambda values is more complicated than that, as one quickly faces undecidability. If it were possible to decide once and for all whether a CA is complex by computing its lambda value, without having to run the CA for a single step, one could solve all kinds of undecidable questions simply by looking at a CA’s rules.

The critical value for lambda is not a universal constant. Nor, for that matter, is my phase transition coefficient. But the main difference, as noted before, is that the compression-based approach actually looks at the evolution of the system rather than trying to figure everything out from the description of the CA.

The compression-based method represents a formal approach to Wolfram’s classification process, replacing the need for a visual inspection with a technique and a coefficient to determine to what class a CA belongs. The approach is compatible with what Stephen Wolfram himself has proposed in his NKS book, without contradicting any computability result or Wolfram’s own Principle of Computational Irreducibility, which says that while some computations may admit shortcuts that allow them to be performed more rapidly, others don’t, so that one cannot really tell what the evolution of a CA will be, except by running it.

Initial configuration numbering scheme

Ideally, one should feed a system with a natural sequence of initial configurations of gradually increasing complexity. Doing so ensures that qualitative changes in the evolution of the system are not attributable to discontinuities in its set of initial conditions.

What I propose is an initial configuration numbering scheme where two successive values differ by only one bit. To explore the qualitative behavior of a cellular automaton when starting from different initial configurations, the optimal method, because of its smoothness, is to follow this Gray encoding enumeration, in order to avoid any undesirable “jumps” attributable to the system’s having been fed with discontinuous initial configurations. By following the Gray code, an optimal numbering scheme was devised so that two consecutive initial conditions differed only in the simplest degree (by one bit). This simple convention will allow us to continue the investigation of the dynamical behavior of a system without being concerned about arbitrarily introducing artificial complexity when moving from one initial configuration to the next.
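
A sketch of such an enumeration in Mathematica (the standard binary-to-Gray conversion; the width k of the initial configuration is an illustrative choice):

grayInit[n_, k_] := IntegerDigits[BitXor[n, BitShiftRight[n, 1]], 2, k]  (* n-th Gray-coded initial configuration of width k *)
Table[grayInit[n, 4], {n, 0, 7}]  (* consecutive rows differ in exactly one bit *)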

Phase transition detection and coefficient

I defined a measure based on the change of the asymptotic direction of the size of the compressed evolutions of a system for different initial configurations (following the proposed Gray-code enumeration of initial configurations). My phase transition coefficient yields an interesting classification: it measures the resiliency or sensitivity of a system to its initial conditions. So rules such as 30 and 0 appear close to each other. Odd as this may seem, it is because each of them behaves in more or less the same way no matter how its initial condition is changed. In other words, there is no change in the qualitative behavior of these CAs when feeding them with different inputs, regardless of how different the inputs may be.

In this phase transition classification, for example, rules such as 122 and 89 appear next to each other, because, as the investigation proves, they are both CAs with relatively high phase transition coefficients, meaning that they are very sensitive to initial conditions, dramatically changing their qualitative behavior when starting from one rather than another initial configuration.
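
A rough proxy for the idea, not the coefficient as defined in the paper (which is based on the change in the asymptotic direction of the compressed sizes), is simply to measure how much the compressed evolution changes from one Gray-coded initial configuration to the next:

evolutionSize[r_, init_] := StringLength[Compress[CellularAutomaton[r, {init, 0}, 100]]]
variability[r_] := N[Mean[Abs[Differences[Table[evolutionSize[r, grayInit[n, 8]], {n, 0, 63}]]]]]  (* grayInit as in the sketch above *)
variability /@ {0, 30, 110}  (* the paper finds rules 0 and 30 insensitive to their input, and rule 110 comparatively sensitive *)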

Phase transition and predictability

An obvious feature of universal systems is that they need to be capable of carrying information, by reflecting changes made to the input in the output. In attempting to determine whether a system is capable of universal computation, one may therefore first ask whether it is capable of reacting to its input at all, and how efficiently it does so. And this is what the phase transition coefficient actually measures, because it tells how well a system manages to respond to an input. Obviously, a system such as rule 0 or rule 255, which does not change regardless of the input, is trivially decidable. But a universal system should be capable of some reaction to external manipulation (the input to the system). The converse, however, need not hold, because having a large transition coefficient by no means implies that the system will behave with the freedom required of a universal system if it is to emulate any possible computation (a case in point may be rule 22, which, despite having the largest transition coefficient, may not be versatile enough for universal computation).

The phase transition measure also implies that one may be able to predict the behavior of a system for an initial configuration with a degree of certainty based on the previous variability of the system in relation to an initial segment of initial configurations. It is an open question whether this is a lower bound. In other words, it is unclear whether looking at the behavior of a system for a certain length of time and for certain configurations will tell you anything about its behavior for other initial configurations or if it were allowed to evolve for a longer period of time. Experience says one would do well to predict future behavior on the basis of past behavior, and this may also be related to Israeli and Goldenfeld‘s very interesting findings. In 2004 they showed that some computationally irreducible elementary cellular automata have properties that are predictable at a coarse-grained level. They did so following a renormalization group (RG) technique, which refers to a mathematical apparatus that allows one to investigate the changes in a physical system as one views it at different distance scales.

They were able to provide a hierarchy of complexity in agreement with their apparent complexity. Israeli and Goldenfeld’s classification is also an a posteriori investigation, but it too is bedeviled by the unavoidable (and ultimately undecidable) statistical question, namely, whether one can keep on predicting for all initial conditions and for any number of steps, without having to run the system  forever and for all possible initial conditions. But unlike mine, their classification is partial, in the sense that one cannot always say whether the complexity of one CA is greater than that of another. Based on the comparison of their approximated compressed sizes, however, I could come up with a total order in full agreement with their apparent complexity as well as with Wolfram’s four classes (one that even yields the same number of classes).

An open question directed to me by Nigel Goldenfeld– in trying to understand the similarities and differences between their RG and my algorithmic approach–concerns how their hierarchy relates to mine. What they experimentally suggest is that the larger the scale of the transformations used, the more highly structured the objects and therefore the greater their algorithmic complexity.

For example, rule 110 is one rule about which my own phase transition classification says that, despite showing some sensitivity, it also shows some stability. Which means that one can say with some degree of certainty how it will look (and behave) for certain steps and certain initial configurations, unlike those at the top. This turns out to be predictable according to Israeli and Goldenfeld as well, at a coarse-grained level, after a scale transformation.

A sketch of a conjecture on the transition coefficient of Turing-universal systems

Based on two empirical facts, I conjecture that universal CAs should have a large transition coefficient, as 110 does. Rules such as 110 and 54 also turn out to be next to each other in the phase transition classification, both having large values (they are in the top 24, and both the top 24 and the bottom 22 are in the original paper: see the last 2 tables).

So I base this conjecture on the following empirical observations:

1. The only elementary cellular automaton known to be universal (rule 110) figures at the top of this classification, as do candidates for universality such as rule 54, which appears right next to rule 110.

2. Universality seems to imply that a system should be capable of being controlled by the inputs, which our classification suggests those at the bottom are not, as all of them look the same no matter what the input, and may not be capable of carrying information through the system toward the output.

Other rules that some people think may be capable of sophisticated computation (see the paper in the Journal of Complex Systems by Paul-Jean Letourneau) also have large transition coefficients, such as rule 146, with a transition coefficient of 0.124, ranking 39th out of the 256 elementary cellular automata.

As noted before, however, the transition coefficient is not a universal measure. In this case, coefficients were calculated for 600 steps, in blocks of 50, for the first 500 initial conditions, which means that some rules may be capable of greater transitions but are somehow ‘slow’ at the selected number of steps and initial conditions. That is, they may take more than 500 initial conditions (in Gray-code numbering) to show larger changes, or else larger changes are being missed because of the jumps in blocks of 50 steps, though this latter possibility is less likely.

The conjecture also seems to be in agreement with Wolfram’s claim (made at some of his oral presentations) that rule 30 (as a class 3 elementary cellular automaton) may be, according to his Principle of Computational Equivalence (PCE), computationally universal. But it may turn out that it is too difficult (perhaps impossible) to control in order to perform a computation, because it behaves too randomly.

Classifying objects by complexity

Posted in Algorithmic information theory, Complexity, Computer Science, New Ideas on June 2nd, 2010 by Hector Zenil – Be the first to comment

Together with Jean-Paul Delahaye and Cedric Gaucherel, I have coauthored a new paper, made available today on arXiv, entitled Image information content characterization and classification by physical complexity. In the paper we present a method for estimating the complexity of an image based on the concept of Bennett’s logical depth. Unlike the application of the concept of algorithmic complexity by itself, the addition of the concept of logical depth results in a characterization of objects by their organizational (physical) complexity. We use this measure to classify images by their information content. The method provides a means for evaluating and classifying objects by way of their visual representations.

The method described in the paper ranks images by their decompression times, and the resulting classification corresponds to the intuitive ranking produced by visual inspection, with microprocessors, human faces, cities, engines and fractals figuring at the top as the most complex objects. Random-looking images, which rank high by algorithmic complexity, were ranked low by the logical depth expectation, landing next to trivial images such as uniformly colored ones, which illustrates the characteristic feature of the logical depth measure. A gradation of complexities was found in the groups in between, increasing gradually from bottom to top.


Complexity classification of images, from more complex to less complex (group descriptions on the right are followed by the average decompression times as approximations to Bennett's logical depth)

In the paper we show that:

  • The concept of logical depth can be implemented as a feasible and applicable method to approach a real-world problem.
  • After studying several cases and testing several compression algorithms, the method described in this paper has been shown to work and to be useful for identifying and classifying images by their apparent physical complexity.
  • The procedure described constitutes an unsupervised method for evaluating the information content of an image by physical complexity.
  • As the theory predicted, logical depth yields a reasonable measure of complexity that is different from the measure obtained by considering algorithmic complexity alone, while being in accordance with one’s intuitive expectations of greater and lesser complexity.
  • The paper is available here.
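
A minimal sketch of the idea behind the measure, under simplifying assumptions (the paper works with image data, several compression algorithms and averaged, repeated timings; here Compress/Uncompress and a cellular automaton evolution stand in for both):

decompressionTime[expr_] := Module[{c = Compress[expr]}, First[AbsoluteTiming[Uncompress[c]]]]  (* seconds needed to reconstruct the object from its compressed form, a crude stand-in for Bennett's logical depth *)
decompressionTime[CellularAutomaton[30, {{1}, 0}, 400]]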

    On the Algorithmic Nature of the World

    Posted in Algorithmic information theory, Complexity, New Ideas on April 21st, 2010 by Hector Zenil – Be the first to comment

    In a new paper I’ve coauthored with Jean-Paul Delahaye, we propose a test based on the theory of algorithmic complexity and an experimental evaluation of Levin’s universal distribution to identify evidence in support of or in contravention of the claim that the world is algorithmic in nature.

    To this end we have undertaken a statistical comparison of the frequency distributions of data from physical sources on the one hand–repositories of information such as images, data stored in a hard drive, computer programs and DNA sequences–and the frequency distributions generated by purely algorithmic means on the other–by running abstract computing devices such as Turing machines, cellular automata and Post Tag systems. Statistical correlations were found and their significance measured.

    The paper is forthcoming, by invitation of Gordana Dodig-Crnkovic, as a book chapter in Gordana Dodig-Crnkovic and Mark Burgin (eds.), Information and Computation, World Scientific, 2010.

    The paper is available online on arXiv.

    If the subject is of interest to you I invite you to regularly visit our research project main webpage: www.algorithmicnature.org/, where we are publishing results and updates.

    Evaluating the complexity of a living organism by its algorithmic complexity

    Posted in Algorithmic information theory, Complexity, Foundations of Biology, General, New Ideas on September 26th, 2009 by Hector Zenil – 6 Comments

    One of the greatest scientific achievements of the last century was the understanding of life in terms of information. We know today that the information for synthesizing the molecules that allow organisms to survive and replicate is encoded in the DNA. In the cell, DNA is copied to messenger RNA, and triplet codons in the messenger RNA are decoded in the process of translation to synthesize polymers of the natural 20 amino acids.

    Humans have been intrigued by the origin and mechanisms underlying complexity in nature coming from information contained in repositories such as the DNA. Darwin’s theory of evolution suggests that this complexity could evolve by natural selection acting successively on numerous small, heritable modifications.

    Darwin’s theory represents a great leap forward in our understanding of the fundamental processes behind life. However, there is a tendency to assume that evolution is the sole factor in designing nature, while it may not actually be the main driving force behind the complexity of living organisms [If you wish to know more about the theory of evolution by means of natural selection, three respectable British institutions have set up special websites in celebration of Darwin’s 200th anniversary: the University of Cambridge (with the original scanned text and even an audio version in mp3 format), the Open University and the BBC].

    Nature seems to use a specific toolkit of body features rather than totally random shapes. Like units of Lego, Nature assembles its forms from a limited set of elements. For example, despite the variety of living forms on the Earth, they do all seem to have a front-to-back line down the center of the body, and extremities (if any) on the sides, from flies who have a head at one end and a tail at the other, to worms, snakes and humans. Despite the randomness that may undermine any shared regularity among all animals in combinatoric terms, on a certain level, from a certain perspective, we are all similar in shape and features. Why didn’t evolution attempt other, completely different forms? And if it did, why were so few of them successful? Given the improbability of  several other shapes having been put into circulation without any of them winning out save the ones we all know, we could conclude that evolution never did attempt such a path, instead keeping to a small pool of tried and tested basic units whose survival has never been in jeopardy. There are some symmetries and general features that many animals share (more than can be explained by inheritance) that are not so easily explained in purely evolutionist terms. A remarkable example is the resemblance of all animals in their embryonic phase.

    Two teams of biologists (Walter Jakob Gehring and colleagues at the University of Basel, Switzerland, and Matthew Scott and Amy Weiner working with Thomas Kaufman at Indiana University, Bloomington) seem to have independently discovered a toolkit that Nature appears to use, which they have called homeobox-containing genes.

    This discovery indicates that organisms use a set of very simple rules passed along to them (thus reducing the amount of randomness involved) to build a wide variety of forms from just a few basic possible body parts. To oversimplify somewhat, one can for instance imagine being able to copy/paste a code segment (the homeobox) and cause a leg to grow in the place where an antenna would normally be in an ant.

    This begins to sound much more like the footprint of computation rather than a special feature characterizing life, since it turns out that a few simple rules are responsible for the assembly of complex parts. Moreover, this is consonant with what in Wolfram’s scheme of things is said to be life’s guiding force, viz. computation, and with what Chaitin has proposed as an algorithmic approach to life and evolution, as well as with my own research, which is an attempt to discover Nature’s basic hidden algorithmic nature. All the operations involved in the replication process of organisms–replacing, copying, appending, joining, splitting–would seem to suggest the algorithmic nature of the process itself. A computational process.

    Based on my own research interests it is my strong belief that though by no means wrong, Darwin’s theory of evolution belongs within a larger theory of information and computation, according to which life has managed to speed up its rate of change by channeling information efficiently between generations, together with a rich exchange of information with the outside by a process that while seemingly random, is in fact the consequence of interaction with other algorithmic processes.

    Think a bit further about it. Evolution seems deeply connected to biology on Earth, but as part of a larger theory of computation it might apply anywhere in the universe, just as the laws of physics do. Evolution may be formulated and explained as a problem of information transmission and channeling, pure communication between two points in time. If the goal is to gather and transmit information efficiently, biological evolution may turn out to be not the cause but the consequence.

    The theory of algorithmic information (or simply AIT) on the other hand does not require a random initial configuration (unfortunately perhaps, nor any divine intervention) to have a program, when run, produce complicated output. This is in keeping with Wolfram’s finding that all over the computational universe there are simple programs with simple inputs generating complex output, what in NKS terms is called ‘intrinsic randomness’, yet is purely deterministic. Nor does AIT require the introduction of randomness during the computation itself. In other words, it seems that randomness plays no necessary role in producing complex organisms. Evolution seems to underlie change, its pace and direction, but it does not seem to constitute the driving force behind life.

    Evolution seems to be taking advantage of the algorithmic properties of living systems to fabricate new forms of life. To facilitate understanding of these body patterns the University of Utah has set up an illustrative website. Incidentally, this genetic toolkit based on the homeobox concept is surprisingly well captured in the Spore video game.

    In a recent article Greg Chaitin has proposed (Speculations on biology, information and complexity) that some of the properties of DNA and the accumulation of information in DNA may be better explained from a software perspective, as a computer program in constant development. When writing software, subroutines are used here and there all the time, and one usually creates an extra module or patch rather than rewrite a subroutine from scratch. This may correspond to what we see in DNA as redundant sections and ‘unused’ sections.

    In Chaitin’s opinion, DNA is essentially a programming language for building an organism and then running that organism. One may therefore be able to characterize the complexity of an organism by measuring the program-size complexity of its DNA. This seems to work well for the length of DNA, since the longest known sequence of DNA belongs to what is certainly the most sophisticated organism on this planet, i.e. homo sapiens.
    Chaitin proposes the following analogy:

    program -> COMPUTER -> output
    DNA -> DEVELOPMENT/PREGNANCY -> organism

    However, we encounter problems when attempting to view the process of animal replication in the same algorithmic terms. If, as the sophistication of homo sapiens would suggest, human DNA is the most complex repository of information, and given that DNA represents the shortest encoding capable of reproducing the organism itself, we would expect the replication runtime of human DNA to be correspondingly the longest relative to other animals’ replication times. But this is not the case. A gestation period table is available here. So what are we to make of the fact that the right complexity measure for living beings (the logical depth of an object as the actual measure of the organizational complexity of a living organism) does not produce the expected gestation times? One would expect the human gestation period to be the longest, but it is not.

    Charles Bennett defined the logical depth of an object as the time required by a universal computer to produce the object from its shortest description, i.e. the decompression time taken by the DNA from the fertilized egg of an animal (seen as a universal computer) to produce another organism of the same type. There seems to be more at stake, however, when trying to apply the concept to Chaitin’s replication analogy– issues ranging from when to determine the end of the replication (the gestation period?), to better times to give birth, to gestation times inherited from ancestral species, to the average size of organisms (elephants and giraffes seem to have the longest periods). Some hypotheses on period differences can be found here for example.

    If living organisms can be characterized in algorithmic terms as we think they can, we should be able to introduce all these variables and still get the expected values for the complexity measurement of an organism– seen as a computer program–reproducing another organism from its shortest encoding (the DNA being an approximation of it). A complete model encompassing the theory of evolution has yet to emerge. It seems to be on the horizon of AIT, as another application to biology, one that provides a mathematical explanation of life.


    In summary:

    • So far, what we know is that DNA is where the information for replicating an animal is to be found. What is being proposed above is that the information content of DNA can actually be approximated, by means of its program-size complexity and logical depth, to define a measure of the complexity of an organism. If one can quantify these values, one could, for example, quantify an evolutionary step in mathematical terms. This would represent a first step toward encompassing Darwin’s theory of evolution within an algorithmic mathematical theory of life. Evolution is not usually seen as part of a computational theory, but as a special feature of life. The above suggests otherwise.
    • Randomness has hitherto been thought to play a major role in the evolution of species, since it is mutation that drives the evolutionary process. I suggest that this is not the case: what appears to be random is actually part of a deterministic computation, so randomness plays no significant part in the process, while computation does.
    • Finally, evolution has hitherto been thought of as a process that advances by very small steps, rather than one capable of quickly building on top of existing blocks of code, as may actually be the case. This favors the computational view I am putting forward here, because it rests on what in software engineering is the practice of a subroutine-oriented programming paradigm: code reuse.

    The Shortest Universal Turing Machine Implementation Contest

    Posted in Algorithmic information theory, Complexity, Computability, Universality and Unsolvability, Foundations of Computation, New Ideas, Recreation on December 22nd, 2008 by Hector Zenil – Be the first to comment

    The Shortest Universal Turing Machine Implementation Contest

    ANNOUNCEMENT

    23 December 2008

    http://www.mathrix.org/experimentalAIT/TuringMachine.html

    Contest Overview

    In the spirit of the busy beaver competition, but concerned with program-size complexity, we are pleased to announce the “Shortest Universal Turing Machine Implementation Contest”.

    The contest is open-ended and open to anyone. To enter, a competitor must submit a universal machine implementation, written in the language specified on the contest website (C++), that is smaller than the latest record published on the web page.
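
    Purely to give a flavour of what even a toy machine simulator involves (actual entries must be universal machines written in C++, as specified on the contest site), here is a minimal, non-universal Turing machine stepper in Mathematica; the transition table is an arbitrary invented example:

    (* {state, symbol} -> {new state, symbol to write, head move: +1 right, -1 left} *)
    rules = <|{1, 0} -> {2, 1, 1}, {1, 1} -> {1, 1, 1},
              {2, 0} -> {1, 1, -1}, {2, 1} -> {2, 0, 1}|>;

    (* one step: read the cell under the head (blank = 0), apply the rule, write and move *)
    step[{state_, tape_, pos_}] :=
      Module[{t = rules[{state, Lookup[tape, pos, 0]}]},
        {t[[1]], Append[tape, pos -> t[[2]]], pos + t[[3]]}]

    (* run 10 steps from state 1 on a blank tape, stored as an association of written cells *)
    NestList[step, {1, <||>, 0}, 10]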

    In order to take part in the competition it is necessary to submit the source code, to be compiled using the compiler program and version specified on the contest website. It is important that you provide documentation of your code, either in an attached file or as comments in the source code file.

    Each submitter must agree to be bound by and abide by the rules. Submissions remain the sole property of the submitter(s), but should be released under the GNU General Public License (GPL) so that we are permitted to make them available on this website for downloading and executing.

    Rules

    http://www.mathrix.org/experimentalAIT/TuringMachine.html (General Rules section)

    Team composition

    Players may enter alone or as teams of any size. Anyone is eligible to enter.

    Subscribe to the Newsletter

    A mailing list will be used to keep participants informed of news about the contest. You can subscribe to this mailing list at any time:

    Subscribe here.
    Organizers

    Hector Zenil (IHPST and LIFL, Paris 1 University and Lille 1 University)
    Jean-Paul Delahaye (LIFL, Lille 1 University)

    Leibniz medallion comes to life after 300 years in celebration of Greg Chaitin’s career

    Posted in Algorithmic information theory, General on November 3rd, 2007 by Hector Zenil – Be the first to comment

    Stephen Wolfram and I designed a medal to celebrate Gregory Chaitin’s 60th birthday and his contributions to mathematics. Chaitin is one of the key founders of algorithmic information theory (AIT), which combines, among other elements, Shannon’s information theory and Turing’s theory of computability. He founded it independently of Andrei Kolmogorov and Ray Solomonoff while still a teenager, in the mid 1960s.

    Among Chaitin’s contributions are the definition of a random sequence via algorithmic incompressibility, his information-theoretic approach to Gödel’s incompleteness theorem, and his halting probability, epitomised by his Omega number. His work on Hilbert’s 10th problem has led him to believe that, in a sense, there is randomness even in elementary arithmetic.
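
    For reference, in standard notation the halting probability of a prefix-free universal Turing machine $U$ is

    $$\Omega_U = \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|},$$

    where $|p|$ is the length of program $p$ in bits; prefix-freeness guarantees, via the Kraft inequality, that the sum converges to a real number strictly between 0 and 1.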

    The idea of the medal was to somehow replicate the Gottfried Leibniz medallion, an image of which appears at the bottom of Greg’s home page.

    Leibniz Medal Medallion

    Chaitin has spent his career working on foundational questions in mathematics and computation, and in some ways he has been a modernizer of Leibnizian ideas. Leibniz may have been the first computer scientist and information theorist. Early in his life he developed binary arithmetic.

    On January 2nd, 1697, Leibniz wrote a letter to Rudolf August, Duke of Braunschweig-Wolfenbüttel, in which he detailed the design of a commemorative coin or medallion which he suggested could be minted in silver. The design he described posited an analogy between “the creation of all from nothing through the omnipotence of God” and the fact that “all numbers [could] be created from zeros and ones”.

    So the medal does not commemorate Leibniz’s discovery of binary arithmetic. Rather, his description suggests a medal in which binary arithmetic glorifies God, and the duke (he proposed that the obverse of the coin bear the Duke’s “face or monogram”). Leibniz’s religious ideas were, however, anything but simple. Newton used to mock him about them, but Leibniz’s idea of God was far more rational than one would expect from his constant references to God.

    More on the history of Leibniz’ binary language, the letter and the medallion can be found here (pp. 31-36):

    [“The binary medallion apparently was never struck*. Numerous writers have based a contrary assumption, in the last analysis, upon having seen some version of its design. The Duke was already 70 years old when he received the medallion proposal in 1697.” (p. 35)

    “After a thorough search of the catalogs of applicable coin collections, including all known special Brunswickian collections, Dr. W. Jesse of the Städtisches Museum Braunschweig reported in his letter of November 2, 1965 that, in his opinion, the proposed medallion had never been struck.” (p. 51)

    “What actually survives are illustrations in later printings of the letter (two versions of Leibniz’s design of the binary medallion). They are facsimiles of the ones appearing on the respective title pages of Johann Bernard Wiedeburg’s Dissertatio mathematica de praestantia arithmeticae binaria prae decimali (Jena: Krebs, 1718) and Rudolf August Nolte’s Leibniz Mathematischer Beweis der Erschaffung und Ordnung der Welt in einem Medallion (Langenheim, 1734). (See pp. 34, 36, 56 for images of the proposed coin, including the obverse side.)”]

    During the Summer a small group of people from Wolfram Research led by Stephen Wolfram worked together on the design for Chaitin’s 60th birthday medallion. Stephen and I were keen to incorporate representations of the most definitive elements of Chaitin’s influential career as founder of AIT. It was pretty obvious that Chaitin’s medallion had to include the letter Omega representing his Omega number (Chaitin’s Omega gives the halting probability of a (prefix-free) universal Turing machine). We also wanted to show some digits of an Omega number calculated by Cristian Calude, since even though the Omega number is non-computable, Calude managed to calculate an initial segment by using the binary version of Chaitin’s formula following Chaitin’s construction with register machine programs (of course the digits are dependent on the universal Turing machine chosen). The halting and non-halting results for the register machine programs in question were represented by arrows and lines below the letter Omega. Here is the link to Calude’s paper in which he computed the first digits of Chaitin’s Omega number. It includes a section that we used in determining the placement of the arrows in our design:

    Cristian S. Calude, Michael J. Dinneen, and Chi-Kou Shu. “Computing a Glimpse of Randomness,” Experimental Mathematics, Vol. 11 (2002), No. 3.

    The first 64 bits of Chaitin’s Omega from the paper are:
    0000001000000100000110001000011010001111110010111011101000010000
    However, we decided to use the 40 digits from the standard binary formula version (Chaitin’s original formulation), also calculated by Calude in the same seminal paper:
    0001000000010000101001110111000011111010

    The upper background of the medallion is a binary circular array conceived by Michael Schreiber and generated with the following code in Mathematica:
    (* binary circular array: ring q is split into 2^q equal sectors with alternating gray levels *)
    Manipulate[
     Graphics[
      {Black, Disk[{0, 0}, p + 2],
       Table[
        Table[{GrayLevel[Mod[a, 2]],
          Disk[{0, 0}, q + 1, {2 Pi (a - 1)/(2^q), 2 Pi a/(2^q)}]},
         {a, 1, 2^q, 1}],
        {q, p, 1, -1}],
       White, Disk[]}],
     {{p, 3, "bits"}, 1, 8, 1}]

    Like Leibniz, we wanted an inscription in timeless Latin, so we began looking for a text to inscribe on Greg’s medallion, one that was related to his seminal work.

    One year previously, when I met Chaitin at his office at IBM’s Thomas J. Watson Research Center in Yorktown Heights, New York, he invited me to his home and kindly gave me some of his published books (I already had a couple of them, but he completed my collection). In return I sent him a very rare limited edition of a book by Jorge Luis Borges and Alfonso Reyes entitled “La máquina de pensar” (“The Thinking Machine”). Needless to say, I kept a copy for myself! Borges is, as everybody knows, a famous Argentine writer, and Chaitin himself is Argentine too (as well as American). Reyes is a Mexican writer whom Borges credited as an important influence; indeed, their styles show a degree of similarity. In any case, it turned out that, like me, Chaitin liked Borges a lot, but he had never heard of Reyes, whom I happen to like as much as Borges. He told me he had enjoyed the book very much, so some of the first inscriptions proposed for the medal were quotes from Borges’s Library of Babel. But we soon decided that one of the Leibniz quotations appearing on Chaitin’s webpage would be more appropriate:

    *Dieu a choisi celuy qui est… le plus simple en hypotheses et le plus riche en phenomenes.
    [God has chosen that which is the most simple in hypotheses and the most rich in phenomena.]
    *Mais quand une regle est fort composée, ce qui luy est conforme, passe pour irrégulier.
    [But when a rule is extremely complex, that which conforms to it passes for random.]

    Greg has suggested that these quotes from Leibniz, among others, are early anticipations of AIT.

    But after further discussions with Stephen, we agreed on two of Chaitin’s own most often quoted statements, encapsulating his most seminal contributions: “Everything can be summarized in one thing, but that thing cannot be reached” (in other words, all computable facts can be summarized in Chaitin’s Omega number, but that number is not itself computable); and “Mathematical facts are true for no reason” (or by accident, as Chaitin likes to say).

    Stephen decided to consult a world expert: a friend of his from high school, Armand D’Angour, who is now a Classics professor at Oxford. In 2004 he was commissioned by the International Olympic Committee to compose a Pindaric Ode to Athens, which was recited at the Olympic Games. The first thing he pointed out was that Leibniz’s inscription (‘omnibus ex nihilo ducendis sufficit unum’) is a hexameter. D’Angour quickly came up with a pentameter for Greg as well, in his words a “perfect classical one-liner” of the kind that kings in antiquity used to reward poets for. Thus we had a full elegiac couplet, the first line of which read as follows:

    Everything can be summarized in one thing, but the thing itself cannot be reached
    OMNE UNO IMPLICITUR QUOD NON ATTINGITUR IPSUM.
    D’Angour suggested that we replace the “o” in “uno” with an Omega letter (‘Everything can be summarised in one Omega, which itself cannot be attained’).
    He added that Latin verse aficionados would enjoy the way the first three words ran into each other, thus demonstrating what the phrase connoted.

    The second line which at first read:
    Mathematical facts are true by chance
    MATHEMATICAE PRINCIPIA FORTUITO VERA

    was later turned into the pentametric
    FORTUITA EVENIUNT VERA MATHEMATICAE.
    The truths of mathematics turn out to be fortuitous.

    And beneath this the medal read:
    Celebrating the work(s) of Gregory Chaitin MMVII:
    AD LAUDEM GC MMVII (where the Leibniz version has IMAGO CREATIONIS INVEN GGL).

    D’Angour claims that if he were Greg Chaitin, he would be happy to have all this inscribed on his tombstone. If he were Maecenas, he would consider rewarding the poet with a Sabine Farm.

    The Latin inscription on Leibniz’s medallion can be rendered thus: “To make all things from nothing unity suffices” (i.e. You can represent every number using just the digit 1). The inscription on Chaitin’s medallion says: “Everything can be summarized in one [Omega], which cannot itself be attained/The truths of mathematics turn out to be fortuitous”.

     

    Chaitin medallion

    Once we had finalized the design, we wondered about the obverse of the medallion. We realized that this was the chance to finally cast Leibniz’ medallion after almost 300 years! So I went about reconstructing it, noting every single detail. I wrote some Mathematica code incorporating all these details which could be used for an electronic design to be finally struck. Here is the Mathematica notebook.

    Stephen Wolfram presented the medallion to Chaitin during the NKS Science Conference on the 15th of July, 2007, at the University of Vermont, Burlington, U.S. The original solid silver medallion was delivered to him on November 2nd of the same year. Nine more copies were made of Merlin gold, one of which belongs to me (pictures below). The others were given to Chaitin’s relatives, and to Armand D’Angour, Cristian Calude, Jeremy Davis and Stephen Wolfram. Two were retained by WRI’s design department for the archive.

     

    Chaitin medallion face Leibniz medallion face

    On the Kolmogorov-Chaitin complexity for short sequences

    Posted in Algorithmic information theory, Computer Science, Foundations of Computation, New Ideas on October 31st, 2007 by Hector Zenil – Be the first to comment

    My paper On the Kolmogorov-Chaitin complexity for short sequences, coauthored with my PhD thesis advisor Jean-Paul Delahaye, has been published as a book chapter in Randomness and Complexity: From Leibniz to Chaitin, edited by Cristian S. Calude (University of Auckland, New Zealand) and published by World Scientific.

    Chaitin festschrift cover: Randomness and Complexity, From Leibniz to Chaitin, edited by Cristian Calude
    An extended draft version of this paper can be found on arXiv here, and the webpage we have set up for our research on what we call Experimental Algorithmic Theory can be accessed here. The results of our ongoing experiments will be published regularly on that site. The book is a collection of papers contributed by eminent authors from around the world in honor of Gregory Chaitin’s birthday; it is a unique volume, including technical contributions, philosophical papers and essays.

    I presented our paper at the NKS Science Conference 2007 held at the University of Vermont, Burlington, U.S. The conference blog has an entry describing my participation.

    From left to right: Hector Zenil, Stephen Wolfram, Paul Davies, Ugo Pagallo, Gregory Chaitin, Cristian Calude, Karl Svozil, Gordana Dodig-Crnkovic and John Casti.