Archive for the 'Complexity' Category

Algorithmicity and programmability in natural computing with the Game of Life as in silico case study

Thursday, September 4th, 2014

In a previous article, I suggested a method for testing the algorithmicity of a natural/physical process using the concept of Levin’s universal distribution. In this new paper published by the Journal of Experimental & Theoretical Artificial Intelligence, I explain this method in the context of the problem formulated by Floridi concerning the testability of pancomputationalism. Then, I introduce a behavioural battery of programmability tests for natural computation, as an example of a computational philosophy approach. That is to tackle a simplified version of a complex philosophical question with a computer experiment. I go on to demonstrate the application of this novel approach in a case study featuring Conway’s Game of Life. I also briefly discuss another question raised by Floridi, concerning a grand unified theory of information, which I think is deeply connected to the grand unification of physics. You can access the paper for free here.

How Humans perceive the world is biased by how patterns are distributed in Nature and their intrinsic complexity

Wednesday, August 27th, 2014

A new paper of mine with my colleagues, and Algorithmic Nature Lab members, Nicolas Gauvrit and Fernando Soler-Toscano just came out.

Using previously generated and new experimental data together with new methods to calculate the algorithmic complexity of 2-dimensional objects, we were able to find that when humans assess the complexity of an image (a small 4×4 pattern), their rating is correlated to the algorithmic complexity of the image mediated by the probability that such pattern appears in real world scenes. In other words, humans are biased both towards patterns in the world and algorithmic complexity, but also patterns in the world are correlated to algorithmic complexity. This strengthens my claim for an algorithmic world, where patterns can be accounted for by an algorithmic production process.

The journal (Visual Cognition) allows 50 free electronic copies of the paper to be downloaded. Should you be interested in this paper and can’t access it otherwise, you can have a free copy, using the following e-print link.

Announcing the Online Algorithmic Complexity Calculator

Saturday, March 23rd, 2013

We have made available a basic beta version of an Online Algorithmic Complexity Calculator implementing the methods we have developed in recent papers at the Algorithmic Nature lab.

The OACC provides a comprehensive framework of universal mathematical measures of algorithmic complexity for researchers and professionals. It retrieves objective numerical measures of randomness for potential applications in a very wide range of disciplines, from bioinformatics to psychometrics, from linguistics to economics.

It is based on several years of research devoted to new methods for evaluating the information content and algorithmic complexity. The description of the Coding Theorem method to deal with short strings is described in this paper. It currently retrieves numerical approximations to Kolmogorv complexity and Levin’s universal distribution (or Solomonoff algorithmic probability) for binary strings of short length, for which lossless compression algorithms fail as a method for approximation to program-size complexity, hence providing a complementary and useful alternative to compression algorithms. More algorithmic information measures, more data and more techniques will be incorporated gradually in the future, covering a wider range of objects such as longer binary strings, non-binary strings and n-dimensional objects (such as images).

It also includes a Short String Complexity Explorer (it may take some time to run if you open the link) tool developed in Mathematica, it runs oinline with a free player. It provides a comparison of the estimated Kolmogorov complexity (K), the algorithmic probability (m), Shannon’s Entropy (E) and compressibility (using Deflate) of a string. It also provides an indication of the relative complexity among all other strings for which K has been approximated and its distribution rank.

Calculating a Universal Distribution to Approximate Kolmogorov-Chaitin Complexity

Wednesday, December 12th, 2012

Computing the incomputable has always been a challenge. For example, in finding the busiest Turing machines (Rado) given a number of symbols and states (whimsically called busy beavers). This means either finding Turing machines that, starting from an empty input, produce more non-blank symbols in their output tapes before halting than any other Turing machine of the same size, or Turing machines that, also starting from an empty input, have the greatest runtime before halting than any other Turing machine of the same size. Both problems are ultimately undecidable because of the Turing-complete capabilities of Turing machines, as proven by Alan Turing himself (that is, the capability of some Turing machines to simulate any other Turing machine).

In this new paper we describe how we have managed to calculate an approximation of a so-called Universal Distribution (aka Levin’s semi-measure) which connects the frequency of production of a string to its Kolmogorov complexity (K). The chief advantage of calculating an approximation of the Universal Distribution is that it is an incremental process over an average of a large number of Turing machines. One doesn’t get rid of the constant from the invariance theorem in the theory of algorithmic information theory (for example when Kolmogorov complexity is measured using 2 different universal Turing machines), yet one seems to have to make fewer arbitrary decisions.

One of the main advantages is that one can better deal with strings of very short lengths. Think about it! If one wished to approximate K for a single bit by using compression algorithms, the lossless compression algorithm would not be able to compress the single bit any further. And this not only happens for a single bit but for all strings up to a certain minimal length for which lossless compression algorithms are simply unsuitable (recall that a compression algorithm includes the decompression instructions together with the data in the new compressed object in order to make it self-decompressible).

The usefulness of lossless compression algorithms as a method for approximating K derives from the fact that compression is a sufficient test of non-randomness. This is because K is, more precisely than an uncomputable function, upper semi-computable, meaning that one can estimate upper bounds. The lossless compressed length of an object s (e.g. a string) is therefore an upper bound on K(s). The usefulness of the Coding Theorem Method (the theorem presented in this paper) will ultimately come down to whether it is useful in applications, which is the main motivation behind its conception, given the failure of compression algorithms to provide information about the possible K(s) values for s that are too short (shorter, say, than the typical size of the length of the instructions that a lossless compression algorithm has to add to the compressed object).

The state of the art of the Coding Theorem Method can be gleaned from this paper, recently made public by my colleagues and I, and announcing the release of the calculation of a universal distribution based on (5,2), that is, all Turing Machines with 5 states and 2 symbols: Calculating Kolmogorov Complexity from the Output Frequency Distributions of Small Turing Machines

It represents a major improvement over this previous calculation of mine, that required new and more powerful techniques to deal with a semi-computable distribution. It improves our previous work in terms both of accuracy and coverage of number of short strings and validates previous calculations of universal distributions, showing the incremental nature of the method to be fully compatible with the other calculated universal distributions with smaller samples of small Turing machines (and for which the known Busy Beaver values could be used).

In this other paper we explain why our approximations of K are real-number values, showing that strict integer-value program size follows our Coding Theorem Method, and thus that ours constitutes a more fine-grained measure. It is also shown that logical depth departs from both strict program-size and the Coding Theorem Method evaluations, being in agreement with the theory for all these 3 measures. The paper is available online at: Correspondence and Independence of Numerical Evaluations of Algorithmic Information Measures

In the next post I will be announcing and also briefly explaining the results from another paper showing that not only can our Coding Theorem deal with short strings, but that we have found a way to validate the method by lossless compressibility. Moreover, we have found that in the transition period, where the Coding Theorem Method starts to be too expensive to be of practical use whereas the compression method starts to provide some results, the 2 methods are in great agreement with each other. Like an expensive microscope of great power (e.g. the LHC at CERN), our Coding Theorem Method requires an incredible amount of calculation. The trick is to know when to use a microscope–to know when a microscope is required rather than a telescope. We believe we are providing the tools to deal with the information content of the smallest entities in the computational universe.

Conjectures concerning Busy Beavers, Dynamic Behavior and Turing Universality

Friday, June 1st, 2012

In a recent paper I have advanced some conjectures using a coefficient that renders aspects of the qualitative behavior of complex systems in quantitative terms. It measures the sensitivity of a system to external stimuli, its apparent ability to (efficiently) transfer information from the input through the output.

In a previous paper, and in a quite different context, I proposed the conjecture that systems capable of (efficient) Turing universal computation had large coefficients (greater than 1), meaning that they are sensitive enough to external stimuli.

It is evident that a system that doesn’t react to its input is unlikely to be (efficiently) programmable, which is what Turing universality is about. Recall that Turing universality is the notion of a machine capable of simulating any other machine of the same type. More formally, if U is a universal Turing machine, anda Turing machine with input s, one can run U with input, that is>, and U will then behave like M with input s, where M and s are any Turing machine and any input.

In that paper I used this coefficient as a phase transition detector, and it so happened that the way in which it was defined made it eminently capable of classifying the behavior of cellular automata and other systems, even yielding Wolfram’s well-known classes of behavior. The paper is available online here.

Now I have advanced another set of conjectures, this time relating a set of Turing machines defined by the way they behave: Busy Beaver Turing machines. These are machines that do a lot of work before halting. More formally, they print more non-blank symbols and have the greatest runtime among machines of the same size (number of states and symbols). I have written a program showing all known and candidate Busy Beaver machines (in Mathematica, runnable on your browser by downloading the Mathematica CDF player):

Busy Beaver from the Wolfram Demonstrations Project by Hector Zenil

Let bb(n) be the Busy Beaver with n states (and 2 symbols). The conjectures say that:

1. (strong version): For all n > 2, bb(n) is capable of universal computation.
2. (sparse version): For some n, bb(n) is capable of universal computation.
3. (weak version): For all n > 2, bb(n) is capable of (weak) universal
4. (weakest version): For some n, bb(n) is capable of (weak) universal

They are based on the fact that Busy Beaver machines are complex, in Bennett’s sense. Bennett’s notion of “logical depth” is the idea of capturing the complexity of a string s in terms of the unfolding time of the shortest (or set of nearly shortest) programs producing the said string. The outputs of Busy Beaver Turing machines are therefore, by definition, the ones with the greatest logical depth. If they print more symbols than any other machine and take the greatest time to do so, they have maximal logical depth. Busy Beaver machines are also machines that produce their outputs with the minimum resources (if there were shorter machines producing the same outputs those would be the Busy Beaver machines), hence they are the shortest programs producing the strings. Busy Beaver machines are therefore by definition very deep (in Bennett’s definition of complexity), meaning they are complex in this sense (logical depth is also profoundly related to Kolmogorov complexity, given that the definition involves the shortest(s) programs).

Besides being deep, Busy Beavers have another characteristic: they halt. So they don’t just do their stuff without worrying about whether to stop. One can easily encode a “Busy Beaver” that loops and produces any arbitrary number of non-blank symbols say for example, an infinite loop printing 1s), but Busy Beavers don’t. They save a state to halt at the end. Machines that are not able to halt (the majority) cannot be Turing universal, because they wouldn’t ever produce any output, simply because by definition something is an output if the machine has halted. But machines that always halt would be decidable, and therefore not universal either.

So among the things to attempt to amount to evidence in favor of the positive answer of the conjectures of Turing universality for Busy Beaver machines, is to encode some input such that they don’t stop (and to prove so), as a first step towards a prove for universality. If Busy Beavers are capable of halting for some inputs (e.g. empty input, as this is entailed in the definition of a Busy Beaver) but are also capable of not halting for others, this would strengthen the conjectures. Busy Beavers may, however, be very different among them, as they are defined by behavior, rather than their specific inner workings, but if the behavior of a Busy Beaver in any way determines something that captures a property of all them, one could use the same argument in favor or against them towards proving or disproving these conjectures.

There seems no easy way to prove that all Busy Beavers are capable of universal computation, but such a proof, if it exists, would be the first to characterize a non-trivial set of well defined universal Turing machines. Non-trivial, because of course one can build a universal Turing machine and keep adding resources and proving that the resulting machines are all universal, just as happens with Turing-complete programming languages. But Busy Beaver machines would be characterized by nothing but a qualitative behavior. This makes the assertions of the conjectures more interesting.

It could be the case that Busy Beavers are not capable of Turing universality for blank tapes. In fact, Michel Pascal, a world expert in the field of Busy Beavers, thinks that Busy Beavers may be capable of –weak– universal computation (personal communication), that is, starting from non-blank tape configurations (e.g. a periodic or bi-periodic tape configuration). This accords with another conjecture in Computer Science, the Collatz sequence. The conjecture says that whichever number you start with, if n is even, divide it by 2 to get n / 2. If n is odd, multiply it by 3 and add 1 to obtain 3n + 1; you will end always end up reaching 1. I have also written 2 small computer programs (also in Mathematica) illustrating this conjecture:

Collatz Sequence Paths from the Wolfram Demonstrations Project by Hector Zenil

Collatz Sequence Maximums from the Wolfram Demonstrations Project by Hector Zenil

According to Michel, Busy Beavers would be related to Collatz-type functions (see his paper “Small Turing machines and generalized busy beaver competition“), with conjectures that are related but more general (in the context of small Turing machines) to the ones I’m talking about.

Today, we know that Turing universality requires very little. For example, Wolfram conjectured that a 2,3 Turing machine was universal, on the basis of the aspects ot its behavior that he studied. And as proven by Alex Smith, this 2,3 TM turns out to be capable of universal computation. The machine is surprisingly small, and indeed is the smallest it can possibly be, given that we know that no 2,2 Turing machine can be capable of Turing universality (proof recently provided by Maurice Margensten in a Complex Systems paper). Today we also know many minimialistic systems capable of universal computation: Conway’s game of Life, Wolfram’s Rule 110, many Post-tag systems and a variation called Cyclic tag systems, Wang tiles. And at least 2 important researchers in the field have recently argued for the ubiquity of Turing universality (Martin Davis and Maurice Margenstern myself, along the lines anticipated by Wolfram’s Principle of Computational Equivalence).

Finally, I advanced a last conjecture relating Busy Beavers and my coefficient C in this paper, establishing that:

C(bb(n)) > 0

That is, that Busy Beavers have a positive coefficient C, i.e. are quite sensitive to input stimuli, which, if the very first conjecture is true (that Turing universal machines have a large coefficient) makes this last coefficient a good place to start to make the case for the universality of Busy Beavers.

The paper “On the Dynamic Qualitative Behavior of Universal Computation” at ArXiv and has been recently published in the Journal of Complex Systems (vol. 20-3) also available online here.

Turing’s Deep Field: Visualizing the Computational Universe

Thursday, January 5th, 2012

I generated this image in the course of an investigation of the distribution of runtimes of programs in relation to the lengths of mathematical proofs, the results of which are being published in my paper bearing the title “Computer Runtimes and the Length of Proofs with an Algorithmic Probabilistic Application to Optimal Waiting Times in Automatic Theorem Proving” Volume 7160 of the Lecture Notes in Computer Science series (LNCS), a festschrift for Cristian Calude.

The preprint is now available online in the ArXiv here. The paper shows that as theoretically predicted, most machines either stop quickly or never halt. It also suggests how a theoretical result for halting times may be used to predict the number of random theorems (dis)proven in a random axiom system –see the section I’ve called “Gödel meets Turing in the computational universe”.

Turing View of The Computational Universe

The plot was the winning image in this year’s Kroto Institute Image Competition, in the Computational Imagery category titled “Runtime Space in a Peano Curve”, it shows the calculation of the distribution of runtimes from simulating (4n+2)^(4n) = 10000 Turing machines with 2 symbols and n=2 states (of a total of more than 10^12 simulated Turing machines with up to n=4 states) following a quasi-lexicographical order in a Peano curve preserving–as far as possible–the distance between 2 machines arranged in a 2-dimensional array from a 1-dimensional enumeration of Turing machines.

In the image each point or cluster of points represents a Turing machine or a group of Turing machines, and the color is determined by a spectrum encoding their halting runtimes–the lighter the square the sooner the machine entered the halting state. White cells represent machines that are proven to halt in infinite time. Red cells show the Turing machines that take longer to halt (popularly called Busy Beavers). Knowing the values of the Busy Beaver functions allows us to identify the machines that never halt (depicted in white). The image is noncomputable, meaning that the process cannot be arbitrarily extended because of the undecidability of the halting problem (i.e. there is no procedure for ascertaining the color of the following pixels to zoom out the picture and cover a larger fraction of the computational universe). Put it in the words of crystilogic,

What you’re looking at is a subset of descriptions of all possible things, and some impossible things. This is possibility and reality compressed into an image.

Turing machines with an arbitrary number of states can encode any possible mathematical problem and are therefore perfect compressors of the known, the yet to be known, and even the unknowable (due to the undecidability of the halting problem).

This is how the image looks like as displayed in the stairway of the Kroto Research Institute in the UK:

Some postcards with the winning image were printed and they can be sent to scholar or enthusiasts upon request sending an email to hectorz[at]

I want to dedicate this prize to my former thesis advisor Jean-Paul Delahaye who suggested me the Peano arrangement as a packing for my visual results of halting times. And also to Cristian Calude who co-advised me all along my PhD thesis and who encouraged me to publish this paper and what better place to do so than for his festschrift.

I’m releasing the images under an open licence in honour of the Turing Year, so that it may be used for any artistic illustration of some of Turing’s main results (the halting problem) or for any other purpose in any medium.

Postcards of the winning image are also available upon request. Just send an email requesting 1 or more postcards to hectorz [at] or to let him know (if you can) that you will be using any of the images (or if you need better resolution versions).

“The World is Either Algorithmic or Mostly Random” awarded a 3rd Place Prize in this year’s FQXi contest

Friday, June 10th, 2011

Based on the combined ratings of the contest community and the panel of expert reviewers appointed by the FXQi, which included the members of the institute, I was awarded a 3rd Place Prize for my work The World is Either Algorithmic or Mostly Random in this year’s FQXi contest on the topic Is Reality Digital or Analog? sponsored by the Foundational Questions Institute. The winners were announced at this year’s World Science Festival in New York City.

My work can be summarized in one line as an explanation of the complexification process of the world, the process whereby we have evolved from a more primitive (random) state to the current organized state of the universe.

The essay is a summary of the philosophical branch of my current scientific research on finite algorithmic information theory. This philosophical branch is concerned with the exploration of the possible connections between algorithmic complexity and the physical world (or what happens in it). I propose the notion that the universe is likely digital, not as a claim about what the universe is ultimately made of but rather about the way it unfolds. Central to the argument are concepts of symmetry breaking and algorithmic probability, which are used as tools to compare the way patterns are distributed in our world to the way patterns are distributed in a simulated digital one. These concepts provide a framework for a discussion of the informational nature of reality. I argue that if the universe were analog, then the world would likely look more random, making it largely incomprehensible. The digital model has, however, an inherent beauty in its imposition of an upper limit and in the convergence in computational power to a maximal level of sophistication. Even if deterministic, that the world is digital doesn’t necessarily mean that the world is trivial or predictable, but rather that it is built up from operations that at the lowest scale are simple but that at a higher scale look complex and even random–though in appearance only.

How have we come from the early state of the universe (left) to the structures we find today (right)?

The arguments supporting my views are partially based on the findings of my research, epitomized by our most recent paper Numerical Evaluation of Algorithmic Complexity for Short Strings: A Glance into the Innermost Structure of Randomness available in ArXiv in which my co-author and I describe a method that combines several theoretical and experimental results to numerically approximate the algorithmic (Kolmogorov-Chaitin) complexity of bitstrings by using the concept of algorithmic probability, which is connected to algorithmic complexity by way of the (Levin-Chaitin) coding theorem.

An extended (and detailed) version of The World is Either Algorithmic or Mostly Random is forthcoming and will be eventually posted.

Classifying objects by complexity

Wednesday, June 2nd, 2010

I have coauthored, with Jean-Paul Delahaye and Cedric Gaucherel, and made available today on arXiv a new paper entitled Image information content characterization and classification by physical complexity. In the paper we present a method for estimating the complexity of an image based on the concept of Bennett’s logical depth. Unlike the application of the concept of algorithmic complexity by itself, the addition of the concept of logical depth results in a characterization of objects by organizational (physical) complexity. We use this measure to classify images by their information content. The method provides a means for evaluating and classifying objects by way of their visual representations.

The method described in the paper ranks images based on their decompression times and the classification corresponds to the intuitive ranking resulting from a visual inspection, with things like microprocessors, human faces, cities, engines and fractals figuring at the top as the most complex objects; and random-looking images, which ranked high by algorithmic complexity, were ranked low according to the logical depth expectation, classified next to  trivial images such as the uniformly colored, indicating the characteristic feature of the measure of logical depth. A gradation of different complexities were found in the groups between, gradually increasing in complexity from bottom to top.

significant different groups

Complexity classification of images, from more complex to less complex(group descriptions on the right are followed by the average decompression times as approximations to Bennett's logical depth)

Along the paper we show that:

  • The concept of logical depth can be implemented as a feasible and applicable method to approach a real-world problem.
  • After studying several cases and tested several compression algorithms, the method described in this paper has shown to work and to be of use for identifying and classifying images by their apparent physical complexity.
  • The procedure described constitutes an unsupervised method for evaluating the information content of an image by physical complexity.
  • As the theory predicted, logical depth yields a reasonable measure of complexity that is different from the measure obtained by considering algorithmic complexity alone, while being in accordance with one’s intuitive expectations of greater and lesser complexity.
  • The paper is available here.

    Stephen Hawking: A brief examination of the recent warning over alien civilizations

    Tuesday, May 4th, 2010

    Stephen Hawking asserts that while aliens almost certainly exist, humans should avoid making contact.

    The original story published by BBC News can be found here.

    He claims: “We only have to look at ourselves to see how intelligent life might develop into something we wouldn’t want to meet.”

    Stephen Hawking recent assertion looks like an interesting time to bring up the question of intelligent life, meaning and purpose.

    Let’s examine what Hawking’s argument implies, assuming that intelligent life, other than human, exists elsewhere in the universe:

    1. Aliens come from an Earth-like planet.
    2. The Earth has something the aliens do not.
    3. The Earth has something aliens need or want.

    Each point may be not necessarily independent of each other, each may be chained or added to the other. Point 2 may imply Point 1. However, Earth-like planets are likely to be quite common, as the current research on exoplanets seems to suggest. While Point 1 is chauvinistic, Point 3–the notion that Earth possesses something special– is egocentric. Concerning Point 2 again: as a sort of example, many cities on Earth are in need of potable water. Does that mean they can just go looking for it in another city?

    Destructed castle on the coast way of Ireland

    Originally uploaded by hzenilc.

    If we think that aliens are looking for a planet with an atmosphere like ours, we are already making a very strong assumption that requires further support. To assume that their planet has the same atmosphere as ours and that they would consequently need a planet exactly like it, is to make a strong claim–and a highly implausible one at that. We may think that water is the most precious element in the universe, but it might just be lethal to an alien civilization. If it is true that we can only imagine them in the basis of the kind of life and it diversity on Earth, it is also true that we know nothing about them, if they really do exist.

    If humans find life elsewhere, and if it is not related to Earth’s life, it is likely to be unrecognizable. After all we have experience of a very large range of life forms here on Earth, and we find some of them already alien enough. “Extremophiles” is the name we give to Earth species that can survive in places that would quickly kill humans and other “normal” life-forms.

    As for needing a work force, it doesn’t make sense that aliens would seek it here. Humans are highly unproductive. And aliens with the technological capabilities to travel to Earth are likely to have had long experience building the kinds of machines that humans have only recently got around to building and improving. Advanced intelligent aliens most likely make extensive use of robots, if they are not some kind of cybernetic beings themselves.

    Though assuming that aliens and humans are alike in their biochemical constitution may be chauvinistic and misguided, the supposition that our intelligences are similar has a more solid basis, especially since we also assume that the alien civilization in question has built spaceships, travels in space and is looking for other civilizations. According to Stephen Wolfram’s Principle of Computational Equivalence (PCE), there is a good chance that if non-trivial life forms exist they will have or develop the same degree of sophistication (e.g. intelligence) than ours. Wolfram’s PCE would also suggest that once primitive life has developed it will eventually achieve its maximal degree of sophistication, which would be the maximal possible degree of sophistication. In other words, if life is around, intelligent life is as likely as life.

    We used to say that aliens are likely to be much more advanced than humans. Why so? Because the universe has a 15 billion year history. Characterizing a civilization that has been around for about 2 million years and has only begun developing technology in the last few hundred as ‘advanced’ is beyond naive, it is statistically implausible. As Carl Sagan pointed out, every intelligent civilization seems to reach a period when its own technological power is capable of destroying itself. This is the stage which human civilization reached about 70 years ago (and still is), with the invention of atomic bombs. The longevity of said aliens’ civilization means that they have managed to make good use of the resources of their host planet. Even assuming they have reached the point where they have exhausted their resources, we have to concede that they have probably been good, ecologically responsible stewards of their planet. Of course it is still possible, despite all this, that these aliens may wish to colonize a new planet in order to exploit its resources rather than simply seizing what they find and taking it away with them.

    Water is one of the most common elements in the universe, or it is quiet easy to synthesize it (see the chemical recipe here). If they just need a rock revolving around a star at a distance such that life can be sustained, there are certainly many more possibilities than coming to Earth. Think about it. The human civilization is much closer to achieving the terraformation of Mars (reengineering Mars soil and atmosphere to make it Earth-life friendly), given its current and near future capabilities than traveling abroad to conquer, if not cohabit with another civilization.

    Now, from a practical standpoint, the notion that we could hide away in a corner of the universe is nonsense, to say the least, if we are not somehow already hidden due to the size of the universe itself. The Earth has been broadcasting to space since radio signals were invented. Messages have been kept symbolical on purpose. So in the worst case scenario, assuming Hawking is right and our signals are likely to be picked up by an alien civilization willing to take us on, hiding is just rubbish because we can’t. As Seth Shostak says, the first thing to do would be to shut down the BBC, NBC, CBS and the radars at all airports. Not unless we install a kind of Dyson sphere around the Earth or stop broadcasting altogether. Now, the only way to make sense out of Hawking particular comment in this regard, is taking into consideration time scales. If it is true that Earth has been broadcasting to the space during the last 50 or more years, it is also true that someday in the future it may reach a technological state that allows it to avoid, purposely or not, doing so. So it may be the case that civilizations decide to hide themselves after a short period of broadcasting history.

    The advice to hide from aliens implies of course that they are destructive, or that the contact between our civilizations may be destructive, and this brings us back to the question of their longevity. If they were indeed destructive they would have annihilated themselves. As Hawking does right, consider the human case.

    But if Hawking is right, we would probably have nothing to be scared of. If an alien civilization wants us dead we will either barely notice it when that happens or that won’t ever happen if it hasn’t happened already. The chances of a destructive civilization to exist seem lower than the chances of a pacific civilization to extend their existence time period in the universe history. The former would have likely already destroyed itself while the latter may have better chances. Caution would not hurt though. We ought to keep an eye on ourselves and how we develop, and that means using our resources more intelligently and not necessarily manufacturing more bombs. But being scared definitely says much more about us than anyone else because we simply know nothing about them, nor we can pretend to know what are their intentions, if any.

    But of course in the matter of encounters between civilizations in space, every possibility is time-scale dependent. Imagine for an instant that two alien civilizations are at war. We would certainly be well advised to keep away from the conflict. We may be justified in thinking that if we were in the way when either of the parties ran short of resources to prosecute their war, they would most likely help themselves to anything we had that could be of use. But think a little further. The odds of encountering two civilizations actually engaged in warfare are rather small.

    Civilizations making war would either annihilate each other, or if they don’t, then one party would end up achieving hegemonic status. It would seem logical that periods of peace are much longer than periods of war. In the first place civilizations that contemplate war must be both smart enough and hostile enough, and reaching these thresholds of achievement and antagonism takes time–peacetime.

    As Hawking claims, many of the life forms in the universe are probably just microbes, but unlike Hawking I believe that if civilizations do exist they’d have little time to make war, even assuming they wanted to. Once again, one has only to think of the Earth to reach this conclusion. If the Cold War had not remained cold, we either wouldn’t be here or else we’d all now be ruled by a single country–as has been pretty much the case (despite there being no actual war) over the last few decades, with the emergence of a current single superpower (with a partial sharing of the global power, and other powers emerging). But when superpowers emerge these days, they are more interested in keeping the peace because they have reached a stage where they depend on each other, chiefly because trade and commerce have become globalized. It turns out that the human world has become much more cooperative than we might have expected. But this is not by chance; it seems to be a common path. Not that one can be absolutely certain, though. Only by witnessing other civilizations could we safely make generalizations. And yet logic dictates that the opposite path–the path of belligerence–would be unlikely.

    What I think is that if civilizations were to find each other they are more likely to be interested in each other’s culture and knowledge, than in each other’s natural resources–either because natural resources can be found in many places, or even created by civilizations that are sufficiently advanced, or else because they are simply not needed, curiosity about the universe being the sole motive behind exploration. Which would mean that civilizations would preserve ‘alien’ civilizations to enrich themselves, just as anthropologists and ethnologists do on Earth. To make my point in terms of Hawking, aliens are more likely to be like modern scientists than like the barbaric Europeans who colonized the Americas.

    On the Algorithmic Nature of the World

    Wednesday, April 21st, 2010

    In a new paper I’ve coauthored with Jean-Paul Delahaye, we propose a test based on the theory of algorithmic complexity and an experimental evaluation of Levin’s universal distribution to identify evidence in support of or in contravention of the claim that the world is algorithmic in nature.

    To this end we have undertaken a statistical comparison of the frequency distributions of data from physical sources on the one hand–repositories of information such as images, data stored in a hard drive, computer programs and DNA sequences–and the frequency distributions generated by purely algorithmic means on the other–by running abstract computing devices such as Turing machines, cellular automata and Post Tag systems. Statistical correlations were found and their significance measured.

    The paper is forthcoming as a book chapter by invitation of Gordana Dodig-Crnkovic, in Gordana Dodig-Crnkovic and Mark Burgin (eds.) Information and Computation by World Scientific, 2010.

    The paper is available online on arXiv.

    If the subject is of interest to you I invite you to regularly visit our research project main webpage:, where we are publishing results and updates.

    Evaluating the complexity of a living organism by its algorithmic complexity

    Saturday, September 26th, 2009

    One of the greatest scientific achievements of the last century was the understanding of life in terms of information. We know today that the information for synthesizing the molecules that allow organisms to survive and replicate is encoded in the DNA. In the cell, DNA is copied to messenger RNA, and triplet codons in the messenger RNA are decoded in the process of translation to synthesize polymers of the natural 20 amino acids.

    Humans have been intrigued by the origin and mechanisms underlying complexity in nature coming from information contained in repositories such as the DNA. Darwin’s theory of evolution suggests that this complexity could evolve by natural selection acting successively on numerous small, heritable modifications.

    Darwin’s theory represents a great leap forward in our understanding of the fundamental processes behind life. However, there is a tendency to assume that evolution os the sole factor in designing nature while it may not actually be the main driving force behind the complexity of living organisms [If you wish to know more about the theory of evolution by means of natural selection, three respectable British institutions have set up special websites in celebration of Darwin’s 200th. anniversary: the University of Cambridge (with the original scanned text and even an audio version in mp3 format), the Open University and the BBC].

    Nature seems to use a specific toolkit of body features rather than totally random shapes. Like units of Lego, Nature assembles its forms from a limited set of elements. For example, despite the variety of living forms on the Earth, they do all seem to have a front-to-back line down the center of the body, and extremities (if any) on the sides, from flies who have a head at one end and a tail at the other, to worms, snakes and humans. Despite the randomness that may undermine any shared regularity among all animals in combinatoric terms, on a certain level, from a certain perspective, we are all similar in shape and features. Why didn’t evolution attempt other, completely different forms? And if it did, why were so few of them successful? Given the improbability of  several other shapes having been put into circulation without any of them winning out save the ones we all know, we could conclude that evolution never did attempt such a path, instead keeping to a small pool of tried and tested basic units whose survival has never been in jeopardy. There are some symmetries and general features that many animals share (more than can be explained by inheritance) that are not so easily explained in purely evolutionist terms. A remarkable example is the resemblance of all animals in their embryonic phase.

    Two teams of biologists (Walter Jakob Gehring and colleagues at the University of Basel, Switzerland, and Matthew Scott and Amy Weiner working with Thomas Kaufman at Indiana University, Bloomington) seem to have independently discovered toolkits that Nature appears to use that they have called homeobox containing genes.

    This discovery indicates that organisms use a set of very simple rules passed along to them (thus reducing the amount of randomness involved) to build a wide variety of forms from just a few basic possible body parts. To oversimplify somewhat, one can for instance imagine being able to copy/paste a code segment (the homeobox) and cause a leg to grow in the place where an antenna would normally be in an ant.

    This begins to sound much more like the footprint of computation rather than a special feature characterizing life, since it turns out that a few simple rules are responsible for the assembly of complex parts. Moreoever, this is consonant with what in Wolfram’s scheme of things life’s guiding force is said to be, viz. computation. And with what Chaitin has proposed as an algorithmic approach to life and evolution, as well as with my own research, which is an attempt to discover Nature’s basic hidden algorithmic nature.  All the operations involved in the replication process of organisms– replacing, copying, appending, joining, splitting–would seem to suggest the algorithmic nature of the process itself. A computational process.

    Based on my own research interests it is my strong belief that though by no means wrong, Darwin’s theory of evolution belongs within a larger theory of information and computation, according to which life has managed to speed up its rate of change by channeling information efficiently between generations, together with a rich exchange of information with the outside by a process that while seemingly random, is in fact the consequence of interaction with other algorithmic processes.

    Think a bit further about it. Evolution seems deeply connected to biology on Earth, but as part of a larger computation theory it might be applied anywhere in the universe just as the laws of physics do. Evolution may be formulated and explained as a problem of information transmission and channeling, pure communication between 2 points in time. If you want to efficiently gather and transmit information it may turn out that biological evolution may be not the cause but the consequence.

    The theory of algorithmic information (or simply AIT) on the other hand does not require a random initial configuration (unfortunately perhaps, nor any divine intervention) to have a program, when run, produce complicated output. This is in keeping with Wolfram’s finding that all over the computational universe there are simple programs with simple inputs generating complex output, what in NKS terms is called ‘intrinsic randomness’, yet is purely deterministic. Nor does AIT require the introduction of randomness during the computation itself. In other words, it seems that randomness plays no necessary role in producing complex organisms. Evolution seems to underlie change, its pace and direction, but it does not seem to constitute the driving force behind life.

    Evolution seems to be taking advantage of the algorithmic properties of living systems to fabricate new forms of life. To facilitate understanding of these body patterns the University of Utah has set up an illustrative website. Incidentally, this genetic toolkit based on the homeobox concept is surprisingly well captured in the Spore video game.

    In a recent article Greg Chaitin has proposed (Speculations on biology, information and complexity) that some of the properties of DNA and the accumulation of information in DNA may be better explained from a software perspective, as a computer program in constant development. When writing software, subroutines are used here and there all the time, and one usually creates an extra module or patch rather than rewrite a subroutine from scratch. This may correspond to what we see in DNA as redundant sections and ‘unused’ sections.

    In Chaitin’s opinion, DNA is essentially a programming language for building an organism and then running that organism. One may therefore be able to characterize the complexity of an organism by measuring the program-size complexity of its DNA. This seems to work well for the length of DNA, since the longest known sequence of DNA belongs to what is certainly the most sophisticated organism on this planet, i.e. homo sapiens.
    Chaitin proposes the following analogy:

    program -> COMPUTER -> output
    DNA ->

    However, we encounter problems when attempting to view the process of animal replication in the same algorithmic terms. If, as the sophistication of homo sapiens would suggest, human DNA is the most complex repository of information, and given that DNA represents the shortest encoding capable of reproducing the organism itself, we would expect the replication runtime of human DNA to be of the same order relative to other animals’ replication times. But this is not the case. A gestation period table is available here. So what are we to make of the fact that the right complexity measure for living beings (the logical depth of an object as the actual measure of the organizational complexity of a living organism) does not produce the expected gestation times? One would expect the human gestation period to be the longest, but it is not.

    Charles Bennett defined the logical depth of an object as the time required by a universal computer to produce the object from its shortest description, i.e. the decompression time taken by the DNA from the fertilized egg of an animal (seen as a universal computer) to produce another organism of the same type. There seems to be more at stake, however, when trying to apply the concept to Chaitin’s replication analogy– issues ranging from when to determine the end of the replication (the gestation period?), to better times to give birth, to gestation times inherited from ancestral species, to the average size of organisms (elephants and giraffes seem to have the longest periods). Some hypotheses on period differences can be found here for example.

    If living organisms can be characterized in algorithmic terms as we think they can, we should be able to introduce all these variables and still get the expected values for the complexity measurement of an organism– seen as a computer program–reproducing another organism from its shortest encoding (the DNA being an approximation of it). A complete model encompassing the theory of evolution has yet to emerge. It seems to be on the horizon of AIT, as another application to biology, one that provides a mathematical explanation of life.

    In summary:
    So far, what we know is that DNA is the place where the information for replicating an animal is to be found. What’s being proposed above is that the information content in the DNA can be actually measured and effectively approximated as a distance measure of the complexity of an organism. If one can quantify these values one could, for instance, actually quantify an evolutionary step in mathematical terms.
    Also, evolution is not usually seen as part of a computational theory, but as an special feature of life. I think otherwise.
    Randomness has hitherto been thought to play a major role in evolution as it is mutation that drives the evolutionary process. But I suggest that this is not the case. It is just another part of the deterministic computation, as algorithmic information theory suggests.
    Finally, evolution has been thought of in terms of very small steps rather than building blocks and building over them as other scientists have found (which would explain why the theory of evolution has been bedeviled by questions which have not thus far been satisfactorily answered). This favors my computational view of the process of life, because it is based on what in software technology is seen as a subroutine orientation programming paradigm.

    In summary:

    • So far, what we know is that the DNA is the place where the information for replicating an animal is to be found. What’s being proposed above is that the information content in the DNA can be actually effectively approximated by means of its program-size complexity and logical depth to define a measure of the complexity of an organism. If one can quantify these values one could, for example, actually quantify an evolutionary step in mathematical terms. This would represent a first step toward encompassing Darwin’s theory of evolution within an algorithmic mathematical theory of life. Evolution is not usually seen as part of a computational theory, but as a special feature of life. The above suggests otherwise.
    • Randomness has hitherto been thought to play a major role in the evolution of species, as it is mutation that drives the evolutionary process. But I suggest that this is not the case. Rather I suggest that what appears to be random is actually part of a deterministic computation, which means that randomness plays no significant part in the process, while computation does.
    • Finally, evolution has hitherto been thought of as a process that advances by very small steps, rather than one that is capable of quickly building over blocks of code, as it might be actually the case. This new understanding favors the computational view I am putting forward here as playing a main role in the process of life, because it is based on what in software technology is the practice of a subroutine orientation programming paradigm: code reuse.

    The Shortest Universal Turing Machine Implementation Contest

    Monday, December 22nd, 2008

    The Shortest Universal Turing Machine Implementation Contest


    23 Dec – 2008

    Contest Overview

    In the spirit of the busy beaver competition though related to program-size complexity, we are pleased to announce the “Shortest Universal Turing Machine Implementation Contest”.

    The contest is open-ended and open to anyone. To enter, a competitor must submit a universal machine implementation written in the language specified in the contest web site (C++) with smaller size values than the latest  record published on the web page.

    In order to take part in this competition it is necessary to submit the source code, to be compiled using the compiler program and version specified in the contest web site. It is important that you provide documentation of your code, either in an attached file or as commented text in the source code file.

    Each submitter must agree to be bound and abide by the rules. Submissions remain the sole property of the submitter(s), but should be released under the GNU General Public License (GPL)  so we may be permitted to make them available on  this web site for downloading and executing.

    Rules (General Rules section)

    Team composition

    Players may enter alone or as teams of any size. Anyone is eligible to enter.

    Hector Zenil (IHPST and LIFL, Paris 1 University and Lille 1 University)
    Jean-Paul Delahaye (LIFL, Lille 1 University)

    Swarm Games

    Wednesday, December 5th, 2007

    Carlos Gershenson, a friend of mine, has developed a suite of games with NetLogo for entertainment at parties. The games have to do with patterns that emerge as a result of the iterative application of  very simple rules by humans or other mobile agents.

    Individuals are provided with a single, simple rule at the outset. The outcomes are sometimes independent of the initial conditions and sometimes sensitive to them, but nobody can anticipate them  (except perhaps Carlos and other complexity researchers).Some of the rules are as follows:
    – “Approach one”: Each player chooses another player and approaches them one step at a time.  [ Some people ended up in the center of the room while others were  grouped in clusters.]
    – “Retreat from one”: Each player chooses another player and then runs away from them. [Everybody ended up on the periphery of the room.]
    – “Step between two”: Each player chooses two players, and tries to step  between them. [I had no idea what would happen. As it turned out, everybody ended up in a single tight cluster in the center of the room.]If different rules are issued to different individuals, interesting patterns emerge.

    Recently, the New York Times  published an interesting article entitled “From Ants to People, an Instinct to Swarm” with graphs of ants that strikingly resemble  Carlos’ simulations.

    Swarm NYT
    As the article points out, people in the U.S. spend 3.7 billion hours a year in congested traffic, but you will never see ants stuck in gridlock. Carlos has himself  worked on improving traffic lights using auto-organization techniques. He recently earned his PhD with a thesis on the subject. Titled 
    Design and Control of Self-organizing Systems
    , it has been published online as as ebook under a CopyLeft licence. It is an enjoyable work.References:
    Gershenson, Carlos. Design and Control of Self-organizing Systems. CopIt ArXives, Mexico, 2007. TS0002ENFrom Ants to People, an Instinct to Swarm. New York Times, 2007.

    Carlos Gershenson’s suite of games in NetLogo.

    On the Foundations of Quantum Mechanics, The Netherlands

    Thursday, November 15th, 2007

    Originally uploaded by hzenilc.

    Models and Simulations 2
    11 – 13 October 2007
    Tilburg University, The Netherlands

    I attended this conference one month ago. Among several interesting talks, one in particular caught my attention. It was given by Michael Seevinck from the Institute for History and Foundations of Science at Utrecht, The Netherlands. His talk was about the foundations of Quantum Mechanics, and there were many NKS related topics that it brought  to mind. He talked about reconstructing Quantum Mechanics (QM) from scratch by exploring several restricted models in order to solve the so-called measurement problem, to deal with the nonlocality of quantum correlations, and with its alleged non-classicality, there being  no consensus on  the meaning of Quantum Mechanics  (Niels Bohr said once: “If you think you have understood quantum mechanics, then you have not understood quantum mechanics.”—More quotes of this sort on QM here).  The restrictons chosen in order to reconstruct the theory must be physical principles and not  theoretical assumptions. In other words, one approaches the problem contrariwise than is traditional, taking the least possible restrictions and exploring the theories that can be built thereon. The speaker characterized  this approach  as the “study [of]  a system from the outside” in order to “reconstruct the model”. It is basically a pure NKS approach: “Start from a general class of possible models and try to constrain it using some physical principles so as to arrive at the model in question (in this case QM).”

    One can then proceed to ask such questions as how one might identify QM uniquely, what it is that makes QM quantum, what set of axioms in the model is to be used, and which of them are necessary and sufficient? The question of meaning, previously asked of the formalism, is removed, and bears, if at all, only on the selection and justification of  first principles. Seevinck came up with the following interesting statement: “The partially ordered set of all questions in QM is isomorphic to the partially ordered set of all closed subspaces of a separable Hilbert space” (one of Mackey’s axioms in his axiomatisation of 1957). He added: “They (the principles)have solely an epistemic status. The personal motives for adopting certain first principles should be bracketed. One should be ontologically agnostic. The principles should be free of ontological commitment.” And further: “…axioms are neutral towards philosophical positions: they can be adopted by a realist, instrumentalist, or subjectivist.” He cited Clifton, Bub and Halverson who provided the following quantum information constraints used to derive quantum theory:

    1. No superluminal information transfer via measurement.

    2. No broadcasting

    3. No secure bit commitment

    Seevinck’s methodology in further detail is: Start with a general reconstruction model with a very weak formalism. Gradually see what (quantum) features are consequences of what added physical principles, and also see which features are connected and which features are a consequence of adding which principle. One thereby learns which principle is responsible for which element in the (quantum) theoretical structure.

    One can generate further foundational questions over the whole space of restricted models, e.g.  how many of them:

    – forbid superluminal signalling?

    – allow nonlocality, and to what extent?

    – solve NP-complete problems in polynomial time?

    An important question which arises concerns whether intrinsic randomness would be of a different nature in different models or whether all of them would yield to deterministic randomness.

    His talk slides are available online. Highly recommended.

    Among other interesting people I met was Rafaela Hillebrand, of  the Institute for The Human Future at Oxford University. The Institute’s director, Nick Bostrom, has proposed an interesting theory concerning the likelihood that our reality is actually  a computer simulation. I have myself approached the  question in my work on experimental algorithmic complexity, in particular in my work on  the testability and the skepticism content of the simulation hypothesis. I will post on that subject later. The subject of thought experiments–in which I have an interest– was one that came up frequently.

    On the simplest and smallest universal Turing machine

    Tuesday, October 30th, 2007

    Alex Smith has recently been able to prove that a Turing machine conjectured to be capable of universal computation by Wolfram was actually universal (Wolfram 2,3 Turing machine Research Prize).

    Part of the challenge was to find an encoding not doing by itself the universal computation that would make the Turing machine universal. Smith succeeded providing an encoding providing a background that while nonperiodic is sufficiently regular to be generated by infinite word written on the tape can be generated by a omega-automaton (not itself universal).

    This relaxation of the tape content has been regular in the cellular automaton world, but also in the field of what is called weak-universality with the only difference that the Turing machines are allowed to start with other than blank tapes (blank tapes are periodic tapes with period 1).

    An objection might be that such a coupled system could turn a nonuniversal machine into a universal one, but Smith showed that his encoding was capable of restarting the computation itself, something that coupled systems usually are not capable of. In other words, the preparation of the tape in Smith’s case is done in advance and the omega-automaton do not longer interact in any further time, while to have nonuniversal machines to become universal usually requires an automaton intervening at every step (or every certain steps) of a computation.

    One may also be surprised by the need of an omega-automaton for infinite strings. But the difference to the traditional blank tape is subtle. Think of how machines operate in the real world. Machines do not run on blank tapes, they usually do so over memory (the cache, RAM or a HD) with all kind of information (that is considered garbage if it is not instantiated by a running program). You may think that this garbage is not infinite, but it is not so a blank tape for a Turing machine, so instead of thinking of providing a Turing machine with blank tape as need it, one can think of providing the Turing machine with a non-blank tape. Now, what is on the tape in the case of Smith is not exactly “garbage” because it plays a role in “helping” the Turing machine to perform its computation, in a kind of support on which the Turing machine that is capable of universal computation actually achieves the computation with the help of a defined background (that doesn’t mean that the machine cannot perform universal with other backgrounds) yet the background is not powerful enough to perform the hardest part of the computation. In the physical world, if processes are seen as computations, computations are performed on a support, on the background of the physical world. So these kinds of relaxation may be closer to actual physical situations than abstract situations in which a blank tape for a computation is assumed or required.

    The discussion opened up by Wolfram’s work and motivated by Smith’s answer has generated a fruitful and renovated discussion of universality and complexity of small Turing machines. This is why I think this research s relevant to modern computer science, and not only as an amusing mathematical puzzle:

    • New techniques for proving universality are found (Alex Smith’s novel approach for unbounded computations from arbitrary lengths and non-periodic initial configurations).
    • New universal models of computation have been discovered (cyclic tag- systems, bi-tag systems).
    • Such research provides a better understanding of universality,  its limits, its  underlying principles and its necessary and sufficient conditions.
    • It is a base for actually building universal devices when only a few elements can be used, e.g. in nanotechnology or molecular computation.
    • Simple/small machines may be more easily/effectively embedded in other systems.
    • The old discovery/invention duality question comes to the fore: It sheds light on how simple universality is, how frequently it occurs, whether  it is engineered or not, whether  one builds universal computation or finds it in the universe.
    • It could shed light on the relative feasibility of  universal Turing machines based on different tape configurations (e.g. blank characters, repetitive words, non-repetitive with computationally simple backgrounds) as actual physical systems.  At present it is not at all clear why one ought to  favor blank characters over other possible real-world backgrounds, such as “noise.”
    • Questions of size and complexity  arise: It would be interesting, for instance, to find out whether there is a polynomial (or exponential) trade-off between program size and and the concept of simulating a process.
    • Some questions  on algorithmic complexity arise: Will the encoding always be more complex if the machine is simpler? All theorems in algorithmic information theory depend on additive constants, which depend on the sizes of typical universal Turing machines. What is the impact of different generalizations of universality on algorithmic complexity and what is the role of  encoding in such a measure?
    • Some questions arise on the relation between several variants of universality definitions: Is there an effective and efficient encoding for each non-periodic encoding preserving universality? If so, how does this impact their complexity? Is there a non-periodic encoding with blank characters for each periodic blank word encoding, and what would the impact of such  an encoding be on the size/complexity of the Turing machine in question?

    The field is active and still an important area of research. Several computer science conferences include talks on small computational systems such as Computability in Europe (CiE) and Machines, Computations and Universality (MCU) included such talks this year, focusing in particular on reversible cellular automata and universal Turing machines.

    Here are some references from the small Turing machine community:

    [1] Manfred Kudlek. Small deterministic Turing machines. Theoretical Computer Science, 168(2):241-255, November 1996.
    [2] Manfred Kudlek and Yurii Rogozhin. A universal Turing machine with 3 states and 9 symbols. In Werner Kuich, Grzegorz Rozenberg, and Arto Salomaa, editors, Developments in Language Theory (DLT) 2001, vol. 2295 of LNCS, pp. 311-318, Vienna, May 2002. Springer.
    [3] Maurice Margenstern and Liudmila Pavlotskaya. On the optimal number of instructions for universality of Turing machines connected with a finite automaton. International Journal of Algebra and Computation, 13(2):133-202, April 2003.
    [4] Claudio Baiocchi. Three small universal Turing machines. In Maurice Margenstern and Yurii Rogozhin, editors, Machines, Computations, and Universality (MCU), volume 2055 of LNCS, pp. 1-10, Chisinau Moldavia, May 2001. Springer.
    [5] Turlough Neary and Damien Woods. Four small universal Turing machines. Machines, Computations, and Universality (MCU), volume 4664 of LNCS, pp. 242-254, Orleans, France, September 2007. Springer.
    [6] Yurii Rogozhin. Small universal Turing machines. Theoretical Computer Science, 168(2):215-240, November 1996.
    [7] Shigeru Watanabe. 5-symbol 8-state and 5-symbol 6-state universal Turing machines. Journal of the ACM, 8(4):476-483, October 1961.
    [8] Shigeru Watanabe. 4-symbol 5-state universal Turing machines. Journal of Information Processing Society of Japan, 13(9):588-592, 1972.
    [9] Stephen Wolfram. A New Kind of Science. Wolfram Media, 2002.

    Meaning is in the word net: cyclic self-referential definitions, dictionaries and found in translation

    Saturday, December 2nd, 2006

    In the “Period 1 Cycle English Dictionary” published by “No way, Inc.” (it’s been said to be the most accurate dictionary ever) one can read:

    dog (Noun) : a dog is a dog.

    The lazy creators of this dictionary appear to have forgotten what is broadly accepted by common sense. A definition would not be a definition if it were cyclical and self-referential, simply because one would, theoretically, fall into an infinite loop trying to define an unknown word that has as part of its definition the word itself.

    Let’s leave aside the fictional Period 1 Cycle English Dictionary, which I just made up, and consider how things work in a regular dictionary. Assume you want to know the definition of a word. Looking it up, you’ll encounter a set of ordered words in the language in question(if the dictionary is not a bilingual one). Let’s prove that in the end, any definition of a word w in a closed finite dictionary is cyclic with period p_w:

    Proof sketch:

    We may hypothesize that the dictionary (D) is closed under the operation “the definition of” (i.e. all words in the definition of a word have a definition in the dictionary). One can also safely assume that all words in D have a non-empty definition (given the definition of “dictionary” itself!) and that the dictionary is finite. Then if w is a word in D, the orbit of the successive definitions d_1,d_2,…d_n, … for w (listed as sets of words themselves) will end up eventually crossing the path d_w={d_1,d_2,…d_n, …} because d_w cannot be an infinite sequence without repeating elements in D– the finite nature of D would necessitate  coming back to the original word itself after a certain period p depending on the word, making every word w in the dictionary cyclic with period p_w. 

    If all the definitions in a dictionary are then cyclic and enclosed in the dictionary itself, where does the analytic knowledge that a dictionary seems to provide come from? When one consults a dictionary in a given language, one already knows some words in that language, but assuming one doesn’t, would a dictionary be completely useless, as the issue of ultimately cyclic self-referential definitions seems to suggest?

    One could conceive of a dictionary as a whole as a syntactical knowledge container with no actual knowledge in it– to someone who does not bring to it any knowledge of the relevant language. One may wonder how something so perfectly self-referential could actually be of use, as dictionaries seem to be. Is it indeed because you always know some, or indeed many, words of a language already when you consult a dictionary? For since every word is defined by other words in the same language, looking up the meaning of a given word would lead you to other words and yet others and eventually back to the first word, the word whose meaning you set out to learn. This would be the case even if  the dictionary were bilingual, and the meaning of the word you wished to check was given in a second language. Thus all dictionaries are perfectly circular, closed, self-referential sources.

    However, the analytical knowledge in a dictionary does not come from the definitions as words, but from the word net underneath, where the words are connected in some, perhaps unique fashion(modulo exact synonyms) to each other. That’s what quite successful projects such as WordNet and functions like WordData in Mathematica are about. The power of being able to analyze the language as a net in a computable way is priceless for the progress of computational linguistics and linguistics in general.


    2-level depth wordnet for the word "chair"

    2-level depth wordnet for the word "chair"



    For example, if “chair” is connected to “table”, “office”, “dining room”, etc. it should be easy to map it to its equivalent in any other language. Unless the target language doesn’t have the concept “chair” as referring to a seat placed at a table in a dining room (which was perhaps the case in some Asian countries before colonialism), together with an explicit cognitive representation of it.

    Of course problems arise when considering the mappings between words having the same written form but different senses. The word “chair,” for example, is a seat, but may also mean the officer who presides at a meeting. Also posing a problem would be cases of words being mapped to or from words belonging to different parts of speech, such as “personne” in French, which  maps onto two completely different words in Spanish: “persona” and “nadie”,  a noun and an adjective respectively, with completely different connections and different supporting nets. So even when it seems that most relations might be surjective, the general case is certainly not bijective, and that applies to homonyms too, which often creates ambiguities in translation. However, the supporting network  would be able to uncover this fact and solve a possible ambiguity based on  context by extending the word network to encompass the ambiguity. In other words, if a subnet cannot be uniquely mapped, extending it should eventually solve the ambiguity. What one would need  is a corpus big enough to build such a network once and for all and then simply make comparisons at the network level. This could work even for completely new or unknown languages, either dead, living or artificial, assuming that they share a part of our actual reality and hence some part of our mental representations  (In a sense this is what Champollion did when he deciphered the Rosetta stone– he discovered a partial mapping of a subnetwork of words from an unknown language – Egyptian – to a subnetwork of a known one – Greek). In the final analysis, each language has a single unique network (changing slightly through time but remaining well connected and strong enough to make  it unique and recognizable while being isomorphic with that of any other language).  All languages could be identified by their fingerprints -their word net. This kind of analysis would identify the lack of a word net structure in hoax languages, such as perhaps the Voynich manuscript.

    Having  established that, what about mining the world of all possible meanings, the world of all possible translations, and the world of all possible ideas? We wouldn’t have the problem of distinguishing between a coherent idea and a non-coherent one since the network would provide some minimal coherence. Thus the net-into-the -net approach would give us a way of translating from word to word and from phrase to phrase and from idea to idea as faithfully as possible in most cases, since in the end all of us as human beings share a single reality, though we perhaps approach it from different points of view.

    Again, the analytical knowledge in a dictionary comes from the net connecting the words, so even if someone does not know English at all, I would say that he would be able, albeit with considerable difficulty, to learn English just by deducing the net connecting objects, in other words, by mapping his own mental  representations of objects onto words in the  English dictionary. In the process he could encounter some ambiguities, but the further he goes, the more of these he would be able to resolve. On the other hand,  speakers of those languages in which “chair” does not exist, both in the language itself and as a real object in the culture,   would be able to deduce what  a chair is by tracking its  relations with the objects they know and for which they do have mental representations and the phonemes to externalize them. So the problem of translation, which began with the mapping of word onto word and then phrase onto phrase  with statistical tools,  becomes, with this approach, a matter of  mapping net to net.  Indeed this seems to be the approach adopted  by Meaningful Machines.  

    These ideas could be carried to the limit by taking the sum total of human languages and enquiring into the mapping between such a network and our cognitive representations. Such a move would provide grounds  for rebutting the Chinese room argument, since in the end it does not matter whether someone inside the room has no knowledge at all of a language; he would be able to map what he is mechanically translating onto his own mental representations, generating what, according to the argument, could not be generated: understanding. Because Searle’s  idea was, as I recall, to build up a case against A.I. in terms of the meaningless of the Turing test and true AI in general.

    One may actually use a dictionary without knowing a single word in it! That is because there is a mapping between the word net in the dictionary and one’s own language, or even better, a mapping (not necessarily injective or surjective) between the word net of the dictionary and your cognitive personal word net.

    Oversimplifying, translation might be reduced to the search for the homomorphism between algebraic groups, with each group G={W,d} being a language dictionary, with W the set of words in that language and d the closed operation “definition of”. One can then see each definition as an oversimplification of a directed, ordered graph g={w,s}, with w the set of vertex words involved in the definition and s the ordered (since definitions may be not commutative) set of edges for an ideally formal- enough language dictionary.

    Bad practices of cyclic definitions in a dictionary should rather be expressed as the practice of period 1 cycles, i.e. words that have in their  immediate definition the word itself.

    This post is  related to a previous post titled “Meaning against A.I.

    Seth Lloyd’s quantum universe view

    Wednesday, November 22nd, 2006


    In an exchange of emails, Seth Lloyd and I discussed the topic I wrote about some posts ago. Here is some of it.

    According to Lloyd, there is a perfectly good definition of a quantum Turing machine (basically, a Turing machine with qubits and extra instructions to put those qubits in superposition, as above). A universal quantum computer is a physical system that can be programmed (i.e., whose state can be prepared) to simulate any quantum Turing machine. The laws of physics support universal quantum computation in a straightforward way, which is why my colleagues and I can build quantum computers. So the universe is at least as powerful as a universal quantum computer. Conversely, he says, a number of years ago he proved that quantum computers could simulate any quantum system precisely, including one such as the universe that abides by the standard model. Accordingly, the universe is no more computationally powerful than a quantum computer.

    The chain of reasoning, to jump to the quantum computer universe view, seems to be 1 and 2 implies 3 where 1, 2 premises and the conclusion 3 are:

    1 the universe is completely describable by quantum mechanics
    2 standard quantum computing completely captures quantum mechanics
    3 therefore the universe is a quantum computer.

    Seth Lloyd claims to have proved the connection between 1 and 2, which probably puts the standard (or some standard) theory of quantum mechanics and the standard quantum computing model in an isomorphic relation with each other.

    Lloyd’s thesis adds to the conception of the Universe as a Turing computer an important and remarkable claim (albeit one that depends on the conception of the quantum computer), viz. that the Universe is not only Turing computable, but because it is constituted by quantum particles which behave according to quantum mechanics, it is a quantum computer.

    In the end, the rigid definition of qubit together with the versatility of possible interpretations of quantum mechanics allows, makes difficult to establish the boundaries of the claim that the universe is a quantum computer. If one does assume that it is a standard quantum computer in the sense of the definition of a qubit then a description of the universe in these terms assumes that quantum particles encode only a finite amount of information as it does the qubit, and that the qubit can be used for a full description of the world.

    Quantum computation may have, however, another property that may make it more powerful than Turing machines as Cristian Calude et al. have suggested. That is the production of indeterministic randomness for free. Nevertheless, no interpretation of quantum mechanics rules out the possibility of deterministic randomness even at the quantum level. Some colleagues, however, have some interesting results establishing that hidden variables theories may require many more resources in memory to keep up with known quantum phenomena. In other words hidden variable theories are more expensive to assume, and memory needed to simulate what happens in the quantum world grows as bad as it could be for certain deterministic machines. But still, that does not rule out other possibilities, not even the hidden variables theories, even if not efficient in traditional terms.

    This is important because this means one does not actually need ‘true’ randomness, the kind of randomness assumed in quantum mechanics. So one does not really need quantum mechanics to explain the complexity of the world or to underly reality to explain it, one does require, however, computation, at least in this informational worldview. Unlike Lloyd and Deutsch, it is information that we think may explain some quantum phenomena and not quantum mechanics what explains computation (neither the structures in the world and how it seems to algorithmically unfold), so we put computation at the lowest level underlying physical reality.

    Lloyd’s thesis adds to the conception of the Universe as a Turing computer an important and remarkable claim (albeit one that depends on the conception of the quantum computer), viz.  that the Universe is not only Turing computable, but because it is constituted by quantum particles which behave according to quantum mechanics, it is a quantum computer computing its future state from its current one. The better we understand and master such theories, the better prepared we would be to hack the universe in order to perform the kind of computations–quantum computations–we would like to perform.

    I would agree with Rudy Rucker too as to why Seth Lloyd assigns such an important role to quantum mechanics in this story. Rudy Rucker basically says that being a subscriber to quantum mechanics, Lloyd doesn’t give enough consideration to the possibility of deterministic computations. Lloyd writes, “Without the laws of quantum mechanics, the universe would still be featureless and bare.” However, though I am one among many (including Stephen Wolfram) who agree  that it is unlikely that the universe is a cellular automaton, simply because cellular automata are unable to reproduce quantum behavior from empirical data (but note that Petri and Wolfram himself attempt explanations of quantum processes based on nets), there’s  absolutely no need to rush headlong into quantum mechanics. If you look at computer simulations of physical systems, they don’t use quantum mechanics as a randomizer, and they seem to be able to produce enough variations to feed a computational universe. Non-deterministic randomness is not neccesary; pseudorandomness or unpredictable computation seem to be enough.

    Universality on Real Computation

    Thursday, October 12th, 2006

    A paper of mine in French on this subject is already in arXiv: “Universality on Real Computation”. Chris Moore called my attention to his paper entitled “Recursion Theory on the Reals and Continuous-time Computation” ,  which arrives at similar results using a different approach. A paper I’m writing in English on this subject, and which I have been discussing with several people, is forthcoming.  A preliminary powerpoint presentation that I used when presenting on the topic to Gregory Chaitin is available here:Universality on Real Computation In computability theory, the theory of real computation deals with hypothetical abstract machines capable of infinite precision. They operate on the set of real numbers, most of which are non-computable by a Turing machine. These hypothetical computing machines can be viewed as idealised analog computers or differential machines. The question I am interested in concerns the consequences of computing on the set of real numbers for current foundational concepts in traditional computation,  for example universality. I have found that an infinite number -possibly denumerable- of  universal real computers of varying power allows what I have called “intrinsic universality” (there is another definition of intrinsic universality related to dynamical systems and universal computation), since real numbers are not bounded in complexity and there is therefore one for each class of complexity. However, taking the complete set of real numbers, what I claim is that a single and ultimate universal machine for real computation is not possible, since for each proposed universal real computer A –with pre-fixed real numbers R=r_1,r_2,…,r_n — it suffices to calculate the maximal Turing degree of the set of the real numbers involved, let’s say O_n, to conceive a new real computer B able to compute numbers of higher Turing degree, let’s say O_m with m>n, which the given universal real machine A won’t be able to emulate– which contradicts the definition of a universal abstract real computer. Therefore A won’t be able to emulate (in the way conceived for defining universality in the traditional Turing model)  any other real machine B which belongs to the same set –the real numbers– (which evidently can be seen as their class of languages) unless it allows non-recursive functions going through all the levels of the arithmetical and hyper-arithmetical hierarchy. In other words, real computation does not allow universality, which is the very foundation of computer science unless we take into consideration non-recursive functions of arbitrary power, which adds an aditional problem to the conception of a universal device and the convergence of real computing models, unlike the traditional model based on recursive functions theory and Turing machines. However, in real computation there is a place for relative universality, which is very rich in ideas and theories and gives rise to a hierarchical Church-Turing type thesis for each level of the arithmetical and hyper-arithmetical hierarchy. So talking about the universal abstract device E_n for example–because of  the arithmetical level E_n– for each n natural number would make sense, unlike a device E able to behave like any other E_n for any n natural number, which would need at least one function or the composite of several functions able to attain any arbitrary degree of computability. In other words, E wouldn’t be a field since it is not closed under the set of operators of the defined E.

    Here is a related paper entitled on “The complexity of real recursive functions” by Manuel Lameiras Campagnolo. In France, “M. Olivier Bournez” has a particular interest in real computation, in particular analog and hybrid systems.Other important authors in the field include : Karp, da Costa, Blum, Cucker, Shub and Smale. Felix da Costa has let me know that  he is working on something related to my ideas on universality in real computation and relative universality.It is important to say that being interested in real computation is not the same thing as being an “hyper-computationalist”. In fact, some if not most  of the above authors think that real computation would at best be only as powerful as the digital models.  Their interest in the field is mainly operational, in the sense that these types of  machines perform operations commonly related to fields like real anaylisis.

    International Conference in Complex Systems, NECSI

    Monday, August 7th, 2006

    NECSI/ICCS Conference Report, Quincy, Greater Boston, USA, July 2006.

    First lesson: For every complex problem there is a simple, neat, wrong solution.

    I attended talks given by Ed Fredkin on Finite Nature, Lazlo Barabasi on Complex Networks, Christoph Teuscher on Biology and Computation and John Nash on his research upon Game Theory.
    * Ed Fredkin presented a table salt 3-D cellular automata model that fulfils all physical constraints on symmetry and energy conservation. The model is surprisely Turing-universal. As a reminder, the Zuse-Fredkin Thesis is a Church-type thesis which claims additionally that the universe being a Cellular Automaton can be understood in terms of the evolution of  Cellular Automata. An interesting paper entitled “Church-Turing Thesis is Almost Equivalent to Zuse-Fredkin Thesis” by Plamen Petrov is available here>
    And much more information on the interesting ideas of Ed Fredkin is available in and on  Ed Fredkin homepage at MIT, which  includes some remarks on why this thesis is not in agreement with  Stephen Wolfram’s, since Wolfram does not intend to imply that the universe is a classical cellular automaton but rather conceives of   a discrete universe based on systems performing simple rules and producing all the complex behavior we find in nature. Such systems are comparable to Turing machines, tag systems, axioms or any other equivalent system.
    * In his lecture, Lazlo Barabasi claimed that behind all complex systems there are complex networks, in particular Free-Scale Networks (with a few nodes very well connected with others, and many nodes weakly connected). These nets are efficient and robust (even minus random nodes they remains connected)except under attack (provided the best connected nodes are targeted).
    * Christoph Teuscher gave a lecture on computation inspired by biology. However, as he himself admitted, it was not his area of expertise.
    * John Nash presented his research on Game Theory using Mathematica as an engine.
    * I met Hava Siegelmann and one of her fellows, Yariv, who is working on Artificial Intelligence. Siegelmann worked some years ago (while completing  her PhD dissertation) upon a model of computation called Analog Recurrent Neural Network or just ARNN which, under certain circumstances,  is able to compute more functions than the set of recursive functions , which are those computed by the Turing machines. She is now working on topics related to Molecular Biology. I asked her about the forceful  critiques delivered by traditional logicians like Martin Davis, who wrote a paper entitled “The Myth of HyperComputation.”    The target of most of these critiques is a certain circular argument—to compute more than Turing machines it is necessary to use non-Turing computable weights previously coded into the neural network. It has been known for a long time that the set of all real numbers is able to encode any arbitrary language, even if it is non-Turing computable. What is remarkable from my point of view is her result relating to complexity classes, since weights with lower complexity are able to compute a higher class when  used in those networks. Aditionally she argued that even with a stochastic function her networks are able to solve non-computable functions. Recently I discussed the fact with Stephen Wolfram and we agreed that she is assuming strong randomness. I would say however that Siegelmann’s work is much more beatiful from a complexity point of view than from a “hypercomputational” viewpoint. In her book published under the title “Neural Networks and Analog Computation: Beyond the Turing Limit ” she proves that: a) there exists an universal neural network with only 9 neurons, and b) that  p(r) suffices to compute a non-computable function, where r is an arbitrary complete real number but p(r) represents the first p digits of the expansion of r–which means that linear precision suffices to achieve “super-Turing” capabilities, assuming that the neural network can have access to any possible real number before the computation. In other words this seems to be true only if all possible real numbers are allowed a priori (just as in the Turing machines an unbounded tape is neccesary to carry out all recursive languages, and neural networks with rational numbers as weights do not compute the same set of functions as neural networks with whole numbers as weights, the first having been  proven by Klenee to compute the same set as Turing machines and the second  to compute the same set of languages as finite automata, those called regular languages).
    I exchanged ideas with some other interesting people, like an engineer from the Space and Airbone Systems Department of Raytheon. And I met John Nash Jr. during the gala dinner  at which he was presented with an award  for his contributions to  Complexity Theory, mainly honoring his work relating to Game Theory.