Computer Science

Calculating a Universal Distribution to Approximate Kolmogorov-Chaitin Complexity

Posted in Algorithmic information theory, Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, New Ideas on December 12th, 2012 by Hector Zenil

Computing the incomputable has always been a challenge. Take, for example, the problem of finding the busiest Turing machines (Radó) for a given number of symbols and states (whimsically called busy beavers). This means either finding the Turing machines that, starting from an empty input, produce more non-blank symbols on their output tapes before halting than any other Turing machine of the same size, or the Turing machines that, also starting from an empty input, run longer before halting than any other Turing machine of the same size. Both problems are ultimately undecidable because of the Turing-complete capabilities of Turing machines, that is, the capability of some Turing machines to simulate any other Turing machine, as proven by Alan Turing himself.
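
To make the definitions concrete, here is a minimal sketch in Wolfram Language (not the procedure used in the paper): it runs a 2-symbol Turing machine given as an explicit transition table, with a step cap, and reports whether it halted, its runtime and how many non-blank symbols it left on the tape. The transition rules in the example are hypothetical, chosen only for illustration; state 0 plays the role of the halting state and the blank symbol is 0.

runTM[rules_Association, maxSteps_Integer] :=
  Module[{state = 1, pos = 0, tape = <||>, steps = 0, sym, new},
    While[state != 0 && steps < maxSteps,
      sym = Lookup[tape, pos, 0];              (* read the current cell, blank = 0 *)
      new = Lookup[rules, Key[{state, sym}]];  (* {new state, symbol to write, move} *)
      tape[pos] = new[[2]];
      pos += new[[3]];
      state = new[[1]];
      steps++];
    <|"halted" -> (state == 0), "steps" -> steps,
      "nonBlank" -> Count[Values[tape], Except[0]]|>]

(* a toy 2-state, 2-symbol machine with hypothetical rules *)
runTM[<|{1, 0} -> {2, 1, 1}, {1, 1} -> {0, 1, -1},
        {2, 0} -> {1, 1, -1}, {2, 1} -> {2, 1, 1}|>, 1000]

Ranking all machines of a given size by "steps" (or by "nonBlank") over the ones that halt within the cap is the kind of exhaustive search the busy beaver problem calls for, except that no finite cap can ever settle the question for the machines that have not yet halted.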

In this new paper we describe how we have managed to calculate an approximation of a so-called Universal Distribution (aka Levin’s semi-measure), which connects the frequency of production of a string to its Kolmogorov complexity (K). The chief advantage of calculating an approximation of the Universal Distribution is that it is an incremental process over an average of a large number of Turing machines. One doesn’t get rid of the constant from the invariance theorem of algorithmic information theory (which appears, for example, when Kolmogorov complexity is measured using 2 different universal Turing machines), yet one seems to have to make fewer arbitrary decisions.

One of the main advantages is that one can better deal with strings of very short lengths. Think about it! If one wished to approximate K for a single bit by using compression algorithms, a lossless compression algorithm would not be able to compress the single bit any further. And this happens not only for a single bit but for all strings up to a certain minimal length, for which lossless compression algorithms are simply unsuitable (recall that a compression algorithm includes the decompression instructions together with the data in the new compressed object, in order to make it self-decompressible).
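
As a quick illustration of this point (a simple check in Wolfram Language, not taken from the paper): the losslessly compressed representation of a very short string is much longer than the string itself, because of the fixed overhead the compressor has to include, so its length says nothing useful about K.

StringLength[Compress["0"]]            (* much greater than 1 *)
StringLength[Compress["0101010101"]]   (* still dominated by the compressor's overhead *)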

The usefulness of lossless compression algorithms as a method for approximating K derives from the fact that compression is a sufficient test of non-randomness. This is because K is, more precisely than an uncomputable function, upper semi-computable, meaning that one can estimate upper bounds on it. The length of the losslessly compressed version of an object s (e.g. a string) is therefore an upper bound on K(s). The usefulness of the Coding Theorem Method (the method presented in this paper) will ultimately come down to whether it is useful in applications, which is the main motivation behind its conception, given the failure of compression algorithms to provide information about the possible values of K(s) for s that are too short (shorter, say, than the typical length of the instructions that a lossless compression algorithm has to add to the compressed object).
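
The idea behind the Coding Theorem Method can be sketched in a few lines of Wolfram Language. This is only an illustration of the principle, with made-up counts, not the calculation reported in the paper: given the output frequency of each string among the halting machines of a small enumeration, estimate K(s) as -log2 of that frequency, following the coding theorem relation K(s) ≈ -log2 m(s).

ctmEstimate[counts_Association] :=
  With[{total = Total[counts]}, N[-Log2[#/total]] & /@ counts]

(* purely illustrative output counts for four strings *)
ctmEstimate[<|"0" -> 500, "1" -> 500, "01" -> 120, "0101" -> 3|>]
(* the more frequently produced (simpler) strings get the lower K estimates *)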

The state of the art of the Coding Theorem Method can be gleaned from this paper, recently made public by my colleagues and me, announcing the release of the calculation of a universal distribution based on (5,2), that is, all Turing machines with 5 states and 2 symbols: Calculating Kolmogorov Complexity from the Output Frequency Distributions of Small Turing Machines

It represents a major improvement over this previous calculation of mine, which required new and more powerful techniques to deal with a semi-computable distribution. It improves our previous work in terms both of accuracy and of coverage of the number of short strings, and it validates previous calculations of universal distributions, showing the incremental nature of the method to be fully compatible with the other universal distributions calculated from smaller samples of small Turing machines (and for which the known Busy Beaver values could be used).

In this other paper we explain why our approximations of K are real-number values, showing that strict integer-valued program size is in agreement with our Coding Theorem Method, and thus that ours constitutes a more fine-grained measure. It is also shown that logical depth departs from both strict program-size and Coding Theorem Method evaluations, in agreement with the theory for all 3 measures. The paper is available online at: Correspondence and Independence of Numerical Evaluations of Algorithmic Information Measures

In the next post I will be announcing and also briefly explaining the results from another paper showing that not only can our Coding Theorem Method deal with short strings, but that we have found a way to validate the method by lossless compressibility. Moreover, we have found that in the transition period, where the Coding Theorem Method starts to be too expensive to be of practical use whereas the compression method starts to provide some results, the 2 methods are in great agreement with each other. Like an expensive instrument of great power (e.g. the LHC at CERN), our Coding Theorem Method requires an incredible amount of calculation. The trick is to know when to use a microscope, that is, to know when a microscope is required rather than a telescope. We believe we are providing the tools to deal with the information content of the smallest entities in the computational universe.

Conjectures concerning Busy Beavers, Dynamic Behavior and Turing Universality

Posted in Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation on June 1st, 2012 by Hector Zenil

In a recent paper I have advanced some conjectures using a coefficient that renders aspects of the qualitative behavior of complex systems in quantitative terms. It measures the sensitivity of a system to external stimuli, its apparent ability to (efficiently) transfer information from the input to the output.

In a previous paper, and in a quite different context, I proposed the conjecture that systems capable of (efficient) Turing universal computation have large coefficients (greater than 1), meaning that they are sensitive enough to external stimuli.

It is evident that a system that doesn’t react to its input is unlikely to be (efficiently) programmable, which is what Turing universality is about. Recall that Turing universality is the notion of a machine capable of simulating any other machine of the same type. More formally, if U is a universal Turing machine and <M, s> is an encoding of a Turing machine M together with its input s, one can run U on the input <M, s>, and U will then behave like M on input s, where M and s are any Turing machine and any input.

In that paper I used this coefficient as a phase transition detector, and it so happened that the way in which it was defined made it eminently capable of classifying the behavior of cellular automata and other systems, even yielding Wolfram’s well-known classes of behavior. The paper is available online here.

Now I have advanced another set of conjectures, this time concerning a set of Turing machines defined by the way they behave: Busy Beaver Turing machines. These are machines that do a lot of work before halting. More formally, they print more non-blank symbols or run longer before halting than any other machine of the same size (number of states and symbols). I have written a program showing all known and candidate Busy Beaver machines (in Mathematica, runnable in your browser by downloading the Mathematica CDF player).

Let bb(n) be the Busy Beaver with n states (and 2 symbols). The conjectures say that:

1. (strong version): For all n > 2, bb(n) is capable of universal computation.
2. (sparse version): For some n, bb(n) is capable of universal computation.
3. (weak version): For all n > 2, bb(n) is capable of (weak) universal computation.
4. (weakest version): For some n, bb(n) is capable of (weak) universal computation.

They are based on the fact that Busy Beaver machines are complex in Bennett’s sense. Bennett’s notion of “logical depth” captures the complexity of a string s in terms of the unfolding (running) time of the shortest (or nearly shortest) programs producing that string. The outputs of Busy Beaver Turing machines are therefore, by definition, the ones with the greatest logical depth: if they print more symbols than any other machine and take the greatest time to do so, they have maximal logical depth. Busy Beaver machines are also machines that produce their outputs with the minimum resources (if there were shorter machines producing the same outputs, those would be the Busy Beaver machines), hence they are the shortest programs producing those strings. Busy Beaver machines are therefore by definition very deep in Bennett’s sense, meaning they are complex in this sense (logical depth is also profoundly related to Kolmogorov complexity, given that its definition involves the shortest programs).

Besides being deep, Busy Beavers have another characteristic: they halt. So they don’t just do their stuff without worrying about whether to stop. One can easily encode a “Busy Beaver” that loops and produces any arbitrary number of non-blank symbols (say, for example, an infinite loop printing 1s), but Busy Beavers don’t; they save a state to halt at the end. Machines that are not able to halt (the majority) cannot be Turing universal, because they would never produce any output, simply because by definition something is an output only if the machine has halted. But machines that always halt would be decidable, and therefore not universal either.

So among the things one might attempt in order to amass evidence in favor of a positive answer to the conjectures of Turing universality for Busy Beaver machines is to encode some input on which they do not stop (and to prove as much), as a first step towards a proof of universality. If Busy Beavers are capable of halting for some inputs (e.g. the empty input, as this is entailed by the definition of a Busy Beaver) but are also capable of not halting for others, this would strengthen the conjectures. Busy Beavers may, however, be very different from one another, as they are defined by their behavior rather than by their specific inner workings; but if the behavior of a Busy Beaver in any way determines something that captures a property of all of them, one could use the same argument for or against all of them towards proving or disproving these conjectures.

There seems no easy way to prove that all Busy Beavers are capable of universal computation, but such a proof, if it exists, would be the first to characterize a non-trivial set of well defined universal Turing machines. Non-trivial, because of course one can build a universal Turing machine and keep adding resources and proving that the resulting machines are all universal, just as happens with Turing-complete programming languages. But Busy Beaver machines would be characterized by nothing but a qualitative behavior. This makes the assertions of the conjectures more interesting.

It could be the case that Busy Beavers are not capable of Turing universality when starting from blank tapes. In fact, Pascal Michel, a world expert in the field of Busy Beavers, thinks that Busy Beavers may be capable of –weak– universal computation (personal communication), that is, universal computation starting from non-blank tape configurations (e.g. a periodic or bi-periodic tape configuration). This connects with another well-known conjecture in computer science, the Collatz conjecture. The conjecture says that whichever positive integer you start with, if n is even you divide it by 2 to get n/2, and if n is odd you multiply it by 3 and add 1 to obtain 3n + 1; repeating the process, you will always end up reaching 1. I have also written 2 small computer programs (also in Mathematica) illustrating this conjecture.
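
In the same spirit as those programs (a sketch only, not the original CDF demonstrations), the Collatz iteration can be written in a couple of lines of Mathematica:

collatzStep[n_Integer] := If[EvenQ[n], n/2, 3 n + 1]

(* the full trajectory of a starting value down to 1 *)
collatzSequence[n_Integer] := NestWhileList[collatzStep, n, # != 1 &]

collatzSequence[27]                                  (* reaches 1 after 111 steps *)
Table[Length[collatzSequence[n]] - 1, {n, 1, 20}]    (* stopping times *)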

According to Michel, Busy Beavers are related to Collatz-type functions (see his paper “Small Turing machines and generalized busy beaver competition“), with conjectures that are related to the ones I’m talking about here but more general (in the context of small Turing machines).

Today, we know that Turing universality requires very little. For example, Wolfram conjectured that a 2,3 Turing machine was universal, on the basis of the aspects of its behavior that he studied. And as proven by Alex Smith, this 2,3 TM turns out to be capable of universal computation. The machine is surprisingly small, and indeed is the smallest it can possibly be, given that we know that no 2,2 Turing machine can be capable of Turing universality (a proof was recently provided by Maurice Margenstern in a Complex Systems paper). Today we also know many minimalistic systems capable of universal computation: Conway’s Game of Life, Wolfram’s Rule 110, many Post tag systems and a variation called cyclic tag systems, and Wang tiles. And at least 2 important researchers in the field have recently argued for the ubiquity of Turing universality (Martin Davis and Maurice Margenstern), along the lines anticipated by Wolfram’s Principle of Computational Equivalence.

Finally, I advanced a last conjecture relating Busy Beavers and my coefficient C in this paper, establishing that:

C(bb(n)) > 0

That is, Busy Beavers have a positive coefficient C, i.e. they are quite sensitive to input stimuli, which, if the very first conjecture is true (that Turing universal machines have a large coefficient), makes this coefficient a good place to start in making the case for the universality of Busy Beavers.

The paper “On the Dynamic Qualitative Behavior of Universal Computation” is available at ArXiv and has recently been published in the Journal of Complex Systems (vol. 20-3), also available online here.

Turing’s Deep Field: Visualizing the Computational Universe

Posted in Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, Foundations of Math, Mathematical Logic on January 5th, 2012 by Hector Zenil

I generated this image in the course of an investigation of the distribution of runtimes of programs in relation to the lengths of mathematical proofs, the results of which are published in my paper “Computer Runtimes and the Length of Proofs with an Algorithmic Probabilistic Application to Optimal Waiting Times in Automatic Theorem Proving”, appearing in Volume 7160 of the Lecture Notes in Computer Science series (LNCS), a festschrift for Cristian Calude.

The preprint is now available online in the ArXiv here. The paper shows that as theoretically predicted, most machines either stop quickly or never halt. It also suggests how a theoretical result for halting times may be used to predict the number of random theorems (dis)proven in a random axiom system –see the section I’ve called “Gödel meets Turing in the computational universe”.

Turing View of The Computational Universe

The plot, titled “Runtime Space in a Peano Curve”, was the winning image in this year’s Kroto Institute Image Competition, in the Computational Imagery category. It shows the calculation of the distribution of runtimes from simulating the (4n+2)^(2n) = 10000 Turing machines with 2 symbols and n=2 states (out of a total of more than 10^12 simulated Turing machines with up to n=4 states), following a quasi-lexicographical order along a Peano curve, which preserves, as far as possible, the distance between 2 machines when the 1-dimensional enumeration of Turing machines is arranged in a 2-dimensional array.

In the image each point or cluster of points represents a Turing machine or a group of Turing machines, and the color is determined by a spectrum encoding their halting runtimes: the lighter the square, the sooner the machine entered the halting state. White cells represent machines that are proven never to halt. Red cells show the Turing machines that take longest to halt (popularly called Busy Beavers). Knowing the values of the Busy Beaver functions is what allows us to identify the machines that never halt (depicted in white). The image is noncomputable, meaning that the process cannot be arbitrarily extended, because of the undecidability of the halting problem (i.e. there is no procedure for ascertaining the color of the next pixels in order to zoom out the picture and cover a larger fraction of the computational universe). To put it in the words of crystilogic,

What you’re looking at is a subset of descriptions of all possible things, and some impossible things. This is possibility and reality compressed into an image.

Turing machines with an arbitrary number of states can encode any possible mathematical problem and are therefore perfect compressors of the known, the yet to be known, and even the unknowable (due to the undecidability of the halting problem).

This is how the image looks as displayed in the stairway of the Kroto Research Institute in the UK:

Some postcards with the winning image were printed, and they can be sent to scholars or enthusiasts upon request by emailing hectorz[at] labores.eu.

I want to dedicate this prize to my former thesis advisor Jean-Paul Delahaye, who suggested to me the Peano arrangement as a packing for my visual results of halting times, and also to Cristian Calude, who co-advised me all along my PhD thesis and who encouraged me to publish this paper; and what better place to do so than in his festschrift?

I’m releasing the images under an open licence in honour of the Turing Year, so that it may be used for any artistic illustration of some of Turing’s main results (the halting problem) or for any other purpose in any medium.

Postcards of the winning image are also available upon request. Just send an email requesting 1 or more postcards to hectorz [at] labores.eu, or to let me know (if you can) that you will be using any of the images (or if you need higher resolution versions).

Compression-based Investigation of Cellular Automata, A Phase Transition Coefficient and a Conjecture Related to Universal Computation

Posted in Algorithmic information theory, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, New Ideas on August 22nd, 2010 by Hector Zenil

In my new article Compression-based investigation of the dynamical properties of cellular automata and other systems, published in the Journal of Complex Systems (19:1), pages 1-28, I present a method for studying the qualitative behavior of cellular automata and other abstract computing machines based on the approximation of their program-size complexity using a general lossless compression algorithm. I show that the compression-based approach classifies cellular automata (CA) into clusters according to their heuristic behavior, with these clusters showing a correspondence with Wolfram’s main classes of systemic behavior. I also present a Gray code-based numbering scheme for initial conditions optimal for this kind of investigation, and a compression based method for estimating a characteristic exponent in the form of a phase transition coefficient measuring the resiliency or sensitivity of a system to its initial conditions. And I conjecture that universal systems have large transition coefficients.

I think this constitutes a novel framework for investigating the dynamical properties of cellular automata and other systems. Here I will discuss some of the main ideas and implications of the paper. A pdf version of the paper is available online on ArXiv.
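
A minimal sketch of the core idea, in Wolfram Language (not the paper’s exact procedure): approximate the program-size complexity of a CA evolution by the length of its losslessly compressed evolution, and use that size to compare rules.

compressedSize[rule_, steps_] :=
  StringLength[Compress[CellularAutomaton[rule, {{1}, 0}, steps]]]

(* a few elementary CA rules from different Wolfram classes, same initial condition *)
Table[{rule, compressedSize[rule, 200]}, {rule, {0, 90, 30, 110}}]

One should expect trivial rules such as 0 to compress to almost nothing, nested rules such as 90 to compress very well, and random-looking rules such as 30 to yield the largest compressed sizes; it is this ordering that the classification exploits.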

Algorithmic complexity: classification into Wolfram’s four classes

In A New Kind of Science, and in several papers dating from the mid-1980s, Stephen Wolfram defined four classes into which cellular automata (CA) and several other systems evolve, each capturing a different qualitative behavior evolving from the same ordered initial configuration (a single black cell).

  • Class 1 consists of CA whose dynamics reach a steady state regardless of the initial conditions.
  • Class 2 consists of CA whose long-term evolution produces periodic or nested structures.

These first two classes are simple, in the sense that their long-term evolution can be deduced from running the system for a small number of steps.

  • Class 3 CA produce structures that seem random.
  • Class 4 CA produce localized structures on a random looking background, and hence are complex looking.

Wolfram’s classification is heuristic, and the assignment of CA to the four classes is somewhat subjective. To the best of my knowledge, there is, to date, no universally agreed upon classification scheme for CA. I think, however, that there is no better approach than a pure program-size complexity measure, which is the approach I follow in this paper. I propose to continue the investigation later, using the same measure to discover other interesting properties and possible hierarchies.

An interesting question related to Wolfram’s classification concerns its dependence on the initial condition–chiefly because the classification was originally meant to be constructed by visual inspection over the evolution of a CA, and as we know, the evolution of a CA depends on its initial condition. This has been a major critique (Eppstein) of Wolfram’s classification, because the expectation is that the classification should be based on the evolution from an unordered (random) configuration.

Nevertheless, the classification is actually based on the potential of a CA to evolve into any of the possible behaviors from at least one initial configuration (the question is of course not answerable in finite time, since there is an infinite number of possible initial configurations). Wolfram’s classification may therefore be seen as being dependent on the initial condition of a CA.

It is not a surprise that one can, for example, construct a CA belonging to more than one of Wolfram’s four classes when starting from different initial configurations. Rule 110 belongs to class 4 because it is capable of universal computation–one can set up an initial configuration to ‘program’ rule 110 to carry out any computation (this being the very basic concept of a programmable computer).

For every CA rule there is a definite (but in general undecidable) answer to the question whether or not it is capable of universal computation (or in reachability terms, whether a CA will evolve into a certain configuration). The question only makes sense if the evolution of a CA depends on its initial configuration. No rule can be universal that fixes the initial configuration once and for all (there would be no way to input an instruction and carry out an arbitrary computation).

On the other hand, some rules, such as rule 0, don’t produce different configurations when fed different initial configurations. No matter how you change the initial condition, there is no way to make it compute something other than what it computes for every other initial configuration.

A possible objection (made by David Eppstein) is that there are CAs that can be made to look as if they belonged to all classes by modifying their initial conditions. Which is true: a CA may belong to a certain class until, given another initial configuration, it is made to behave as if it belonged to another class.

My compression-based approach shows that Wolfram’s heuristic classification can actually be quantified by a measure which is clearly dependent on the initial conditions, while also being capable of detecting sensitivity to initial configurations, and hence of replacing the visual inspection. This hierarchical classification is well defined and is therefore a good candidate for a complexity measure.

The second part of my investigation actually takes advantage of the ability of CAs to behave differently in order to undertake a closer inspection and a novel classification, taking into account the average capability of a system to behave in different ways.

Differentiation from a priori approaches

The approach is different from others in that it is an a posteriori technique, unlike, for example, Langton’s lambda, a measure of the density of a CA rule. It is an a posteriori technique because, unlike the lambda number, the compression-based approach requires the CA to be run before saying anything about it, whereas Langton’s lambda is computed directly from the rule table of the CA.

Langton’s lambda is simply the fraction of rule entries in which the new state of the cell is non-zero. For example, the rule table of the elementary cellular automaton number 1 in Wolfram’s enumeration scheme is simple: all 3-tuples are sent to 0 except one, the last one (the neighborhood 0,0,0, which is sent to 1). Therefore Langton’s lambda is 1/8.
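
This is a one-liner in Wolfram Language (a sketch based on Wolfram’s standard encoding of elementary CA rules, where the binary digits of the rule number give the new state for each of the 8 neighborhoods):

langtonLambda[rule_Integer] := Count[IntegerDigits[rule, 2, 8], 1]/8

langtonLambda[1]     (* 1/8 *)
langtonLambda[110]   (* 5/8 *)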

The lambda parameter of a CA is a number between 0 and 1. For example, if lambda is 0 (e.g. for ECA rule number 0), the evolution develops into a trivial state. Langton found that CA rules with lambda close to 0 evolve into trivial states and CA rules with lambda close to 1 evolve into random-looking behavior, with complex behavior somewhere in between. It is near this transition that the most interesting CAs lie, the ones that manifest the most complex behavior.

Unfortunately, classifying CAs with lambda values is more complicated than that, as one quickly faces undecidability. If it were possible to decide once and for all whether a CA is complex by computing its lambda value, without having to run the CA for a single step, one could solve all kinds of undecidable questions simply by looking at a CA’s rules.

The critical value for lambda is not a universal constant. Nor, for that matter, is my phase transition coefficient. But the main difference, as noted before, is that the compression-based approach actually looks at the evolution of the system rather than trying to figure everything out from the description of the CA.

The compression-based method represents a formal approach to Wolfram’s classification process, replacing the need for a visual inspection with a technique and a coefficient to determine to what class a CA belongs. The approach is compatible with what Stephen Wolfram himself has proposed in his NKS book, without contradicting any computability result or Wolfram’s own Principle of Computational Irreducibility, which says that while some computations may admit shortcuts that allow them to be performed more rapidly, others don’t, so that one cannot really tell what the evolution of a CA will be, except by running it.

Initial configuration numbering scheme

Ideally, one should feed a system with a natural sequence of initial configurations of gradually increasing complexity. Doing so ensures that qualitative changes in the evolution of the system are not attributable to discontinuities in its set of initial conditions.

What I propose is an initial configuration numbering scheme in which two successive values differ by only one bit. To explore the qualitative behavior of a cellular automaton when starting from different initial configurations, the optimal method, because of its smoothness, is to follow a Gray-code enumeration, in order to avoid any undesirable “jumps” attributable to the system’s having been fed with discontinuous initial configurations. Following the Gray code, two consecutive initial conditions differ in the simplest possible degree (by one bit). This simple convention allows us to continue the investigation of the dynamical behavior of a system without worrying about artificially introducing complexity when moving from one initial configuration to the next.
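
A minimal sketch of such an enumeration in Wolfram Language (the standard binary-reflected Gray code, which is one natural way to realize the scheme described above):

grayConfig[n_Integer, width_Integer] :=
  IntegerDigits[BitXor[n, BitShiftRight[n, 1]], 2, width]

Table[grayConfig[n, 4], {n, 0, 7}]
(* {0,0,0,0}, {0,0,0,1}, {0,0,1,1}, {0,0,1,0}, {0,1,1,0}, {0,1,1,1}, {0,1,0,1}, {0,1,0,0}:
   each configuration differs from the previous one in exactly one bit *)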

Phase transition detection and coefficient

I defined a measure based on the change of the asymptotic direction of the size of the compressed evolutions of a system for different initial configurations (following the proposed Gray-code enumeration of initial configurations). My phase transition coefficient yields an interesting classification: it measures the resiliency or sensitivity of a system to its initial conditions. So rules such as 30 and 0 appear close to each other. Odd as this may seem, this is because both, when their initial conditions are changed, behave in more or less the same way. In other words, there is no change in the qualitative behavior of these CAs when feeding them with different inputs, regardless of how different the inputs may be.

In this phase transition classification, for example, rules such as 122 and 89 appear next to each other, because, as the investigation proves, they are both CAs with relatively high phase transition coefficients, meaning that they are very sensitive to initial conditions, dramatically changing their qualitative behavior when starting from one rather than another initial configuration.
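
To make the idea concrete, here is a rough sketch in Wolfram Language of the kind of measurement involved; it is only an illustration, not the exact coefficient defined in the paper (which is based on the change in the asymptotic direction of the compressed sizes). It evolves a rule from successive Gray-coded initial configurations, compresses each evolution, and looks at how much the compressed size changes from one configuration to the next.

evolutionSize[rule_, init_List, steps_] :=
  StringLength[Compress[CellularAutomaton[rule, {init, 0}, steps]]]

roughTransition[rule_, width_, steps_, m_] :=
  Module[{inits, sizes},
    (* Gray-coded initial configurations, as in the enumeration above *)
    inits = Table[IntegerDigits[BitXor[n, BitShiftRight[n, 1]], 2, width], {n, 0, m}];
    sizes = evolutionSize[rule, #, steps] & /@ inits;
    (* average change in compressed size between consecutive initial conditions *)
    N[Mean[Abs[Differences[sizes]]]]]

(* rules that do not react to their input (e.g. 0 or 255) should give values near 0,
   while rules whose behavior changes with the input should give larger values *)
Table[{rule, roughTransition[rule, 8, 100, 32]}, {rule, {0, 255, 110, 122}}]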

Phase transition and predictability

An obvious feature of universal systems is that they need to be capable of carrying information by reflecting changes made to the input in the output. In attempting to determine whether a system is capable of reaching universal computation, one may ask whether a system is capable of this in the first place, and how efficiently it does so. And this is what the phase transition actually measures, because it tells how well a system manages to respond to an input. Obviously, a system such as rule 0 or rule 255, which does not change regardless of the input, is trivially decidable. But a universal system should be capable of some reaction to external manipulation (the input to the system). The inverse, however, should not hold, because having a large transition coefficient by no means implies that the system will behave with the freedom required of a universal system if it is to emulate any possible computation (a case in point may be rule 22, which, despite having the largest transition coefficient, may not be versatile enough for universal computation).

The phase transition measure also implies that one may be able to predict the behavior of a system for an initial configuration with a degree of certainty based on the previous variability of the system in relation to an initial segment of initial configurations. It is an open question whether this is a lower bound. In other words, it is unclear whether looking at the behavior of a system for a certain length of time and for certain configurations will tell you anything about its behavior for other initial configurations or if it were allowed to evolve for a longer period of time. Experience says one would do well to predict future behavior on the basis of past behavior, and this may also be related to Israeli and Goldenfeld‘s very interesting findings. In 2004 they showed that some computationally irreducible elementary cellular automata have properties that are predictable at a coarse-grained level. They did so following a renormalization group (RG) technique, which refers to a mathematical apparatus that allows one to investigate the changes in a physical system as one views it at different distance scales.

They were able to provide a hierarchy of complexity in agreement with their apparent complexity. Israeli and Goldenfeld’s classification is also an a posteriori investigation, but it too is bedeviled by the unavoidable (and ultimately undecidable) statistical question, namely, whether one can keep on predicting for all initial conditions and for any number of steps, without having to run the system  forever and for all possible initial conditions. But unlike mine, their classification is partial, in the sense that one cannot always say whether the complexity of one CA is greater than that of another. Based on the comparison of their approximated compressed sizes, however, I could come up with a total order in full agreement with their apparent complexity as well as with Wolfram’s four classes (one that even yields the same number of classes).

An open question directed to me by Nigel Goldenfeld– in trying to understand the similarities and differences between their RG and my algorithmic approach–concerns how their hierarchy relates to mine. What they experimentally suggest is that the larger the scale of the transformations used, the more highly structured the objects and therefore the greater their algorithmic complexity.

For example, rule 110 is one rule about which my own phase transition classification says that, despite showing some sensitivity, it also shows some stability. Which means that one can say with some degree of certainty how it will look (and behave) for certain steps and certain initial configurations, unlike those at the top. This turns out to be predictable according to Israeli and Goldenfeld as well, at a coarse-grained level, after a scale transformation.

A sketch on a conjecture on the transition coefficient of Turing-universal systems

Based on two empirical facts, I conjecture that universal CAs should have a large transition coefficient, as 110 does. Rules such as 110 and 54 also turn out to be next to each other in the phase transition classification, both having large values (they are in the top 24, and both the top 24 and the bottom 22 are in the original paper: see the last 2 tables).

So I base this conjecture on the following empirical observations:

1. The only known universal elementary cellular automata figure at the top of this classification, and so do candidates such as rule 54, which appears right next to rule 110.

2. Universality seems to imply that a system should be capable of being controlled by the inputs, which our classification suggests those at the bottom are not, as all of them look the same no matter what the input, and may not be capable of carrying information through the system toward the output.

Other rules that some people think may be capable of some sophisticated computation (See paper in the Journal of Complex Systems by Paul-Jean Letourneau) also have large transition coefficients, such as rule 146, with a transition coefficient 0.124, ranking 39 out of the 256 elementary cellular automata.

As noted before, however, the transition coefficient is not a universal measure. In this case, coefficients were calculated for 600 steps in blocks of 50 for the first 500 initial conditions, which means that some rules may be capable of greater transitions but are somehow ‘slow’ at the selected number of steps and number of initial conditions. i.e. they take more than 500 initial conditions–in Gray code numbering–to show larger changes, or else larger changes are being missed because of the jumps in blocks of 50 steps, though this latter possibility is less likely.

The conjecture also seems to be in agreement with Wolfram’s claim (made at some of his oral presentations) that rule 30 (as a class 3 elementary cellular automaton) may be, according to his Principle of Computational Equivalence (PCE), computationally universal. But it may turn out that it is too difficult (perhaps impossible) to control in order to perform a computation, because it behaves too randomly.

Classifying objects by complexity

Posted in Algorithmic information theory, Complexity, Computer Science, New Ideas on June 2nd, 2010 by Hector Zenil

I have coauthored, with Jean-Paul Delahaye and Cedric Gaucherel, and made available today on arXiv a new paper entitled Image information content characterization and classification by physical complexity. In the paper we present a method for estimating the complexity of an image based on the concept of Bennett’s logical depth. Unlike the application of the concept of algorithmic complexity by itself, the addition of the concept of logical depth results in a characterization of objects by organizational (physical) complexity. We use this measure to classify images by their information content. The method provides a means for evaluating and classifying objects by way of their visual representations.

The method described in the paper ranks images based on their decompression times, and the classification corresponds to the intuitive ranking resulting from a visual inspection, with things like microprocessors, human faces, cities, engines and fractals figuring at the top as the most complex objects, while random-looking images, which rank high by algorithmic complexity, were ranked low according to the logical depth expectation, classified next to trivial images such as uniformly colored ones, indicating the characteristic feature of the measure of logical depth. A gradation of different complexities was found in the groups in between, gradually increasing in complexity from bottom to top.
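
As a minimal sketch of the kind of estimate involved (Wolfram Language; the procedure in the paper is more careful, e.g. several compression algorithms were tested and timing noise has to be controlled): approximate the logical depth of a piece of data by the time it takes to decompress its losslessly compressed version, averaged over a few runs.

decompressionTime[data_, runs_Integer : 5] :=
  Module[{c = Compress[data]},
    Mean[Table[First[AbsoluteTiming[Uncompress[c]]], {runs}]]]

(* images can be ranked by this proxy after converting them to raw data, e.g. with ImageData *)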


Complexity classification of images, from more complex to less complex (group descriptions on the right are followed by the average decompression times as approximations to Bennett's logical depth)

In the paper we show that:

  • The concept of logical depth can be implemented as a feasible and applicable method to approach a real-world problem.
  • After studying several cases and testing several compression algorithms, the method described in this paper has been shown to work and to be of use for identifying and classifying images by their apparent physical complexity.
  • The procedure described constitutes an unsupervised method for evaluating the information content of an image by physical complexity.
  • As the theory predicted, logical depth yields a reasonable measure of complexity that is different from the measure obtained by considering algorithmic complexity alone, while being in accordance with one’s intuitive expectations of greater and lesser complexity.
  • The paper is available here.

    Comments on Turing’s very first Universal machine, approaching Turing’s 100th birthday anniversary

    Posted in Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, Minds and Machines on May 18th, 2010 by Hector Zenil

    The idea that a machine could perform the tasks of any other machine is the description of a Universal (Turing) machine. Its invention is considered by many to have been one of the major landmarks giving rise to the field of computer science. ‘Universal’ means that one can ‘program’ a general-purpose machine to perform the tasks of any specific-purpose machine. Turing machines are to this day the central object of study in the theory of computation.

    In an attempt to understand how the very first universal machine was described in Turing’s original 1936 paper “On Computable Numbers, with an Application to the Entscheidungsproblem,” I spent half a day re-reading the paper (and its corrected version by Donald Davies, published in The Essential Turing by Jack Copeland), trying to decode it, only to find that it is written in something like an anticipation of a (Turing-complete) subroutine-oriented programming language which is impossible to rewrite in a traditional transition table. So if one really tries hard, one ends up encoding an arbitrary universal Turing machine, not Alan Turing’s first universal Turing machine.

    Although the paper has all the primary elements of a traditional description of a Turing machine (Turing’s ‘a-machine’ description), the fact that it used multiple conventions for describing increasingly complex machines was Emile Post’s strongest critique. In a letter to Church, Turing replied to Post’s appreciation, arguing that his use of more than one convention when building his universal machine did not affect the final result, though he did admit it made it hard to decipher.

    The result is that not only would it be a giant machine in terms of states and symbols, but the number of actual colors that it may need seems to be unknown. It is only known to assume 18 states, and to simulate Turing’s second a-machine example with 23 instructions (here the product of states and colors does not necessarily lead to the number of instructions in Turing’s formalism, because his transitions are not total functions).

    In the traditional 5-tuple form, Turing’s original universal machine could be written as (Mathematica notation):

    {q1, “blank”} -> {q2, P0, R}
    {DA, D} -> {DAA, DC, R}
    {q2, “blank”} -> {q3, E, R}
    {DAA, D} -> {DAAA, D, R}
    {q3, “blank”} -> {q4, P1, R}
    {DAAA, D} -> {DAAAA, DCC, R}
    {q4, “blank”} -> {q1, E, R}
    {DAAAA, D} -> {DA, D, R}

    But notice that no state starting with q leads to a state starting with D, because the D’s (states are called ‘m-configurations’ in Turing’s original jargon) are really prefix subroutines defined in Turing’s paper, while the q’s are traditional Turing machine states. In Turing’s paper E is, for example, an erasing subroutine. Some other ‘m-configurations’ require scanning the whole tape several times (which is what one would do if asked to emulate another Turing machine), and so on. So most of the behavioral description of the machine is encoded as strange strings of letters.

    Nevertheless, Turing’s choice is somewhat clever from a programmer’s perspective: he proceeded the way one would today when designing and implementing a universal computer. One would hardly do so by describing the basic elements, but rather by constructing higher-level subroutines, each describing a machine function and itself based on one or more other subroutines, down to the level of states and symbols. Think of programming at the level of machine language versus programming in an intermediate-level language. Writing a universal Turing machine in full detail in terms of states and symbols from the beginning would lead to complete confusion and misunderstanding, just as it would if one pursued the implementation of a complex piece of software by writing binary code or even pure assembly language. Turing’s description provides a better understanding, though not a trivial one, of what a universal Turing machine does to carry out the computation of any other Turing machine, by sketching and intuitively grouping the machine operations into these program subroutines.

    Unlike today, when one can simply ask Wolfram|Alpha to run any Turing machine simulation, such as a random 2-state 5-color Turing machine or the 4-state 2-color Busy Beaver (for a list of other Turing machine examples one can type into Wolfram|Alpha, click here), Turing had no computer on which to run and test his code, so it is no surprise that his universal machine code came with several glitches. It was quite amusing to see that the first program ever written for a digital computer was already bedeviled by bugs. And this program was the actual implementation of the very first universal Turing machine, by Alan Turing himself.

    If Turing had not died in 1954, at the age of only 41, next June 23 (2010) he would have turned 98 years old. As a tribute to his work I’ve set up the following webpage gathering most, if not all, of the public images known of him: visualturing.org

    For his 100th birthday anniversary a series of events is being organized. 2012 will not only be the year of the Olympic Games, which Turing, an enthusiastic long-distance runner, would have followed with particular interest in his own country (the UK), but also The Alan Turing Year, of which I’m honored to be part as a member of the advisory committee representing Wolfram’s Science Group.

    On the Kolmogorov-Chaitin complexity for short sequences

    Posted in Algorithmic information theory, Computer Science, Foundations of Computation, New Ideas on October 31st, 2007 by Hector Zenil

    My paper On the Kolmogorov-Chaitin complexity for short sequences, coauthored with my PhD thesis advisor Jean-Paul Delahaye, has been published as a book chapter in: RANDOMNESS AND COMPLEXITY, FROM LEIBNIZ TO CHAITIN, edited by Cristian S. Calude (University of Auckland, New Zealand) and published by World Scientific.

    Chaitin festschrift: Randomness and Complexity, from Leibniz to Chaitin, edited by Cristian Calude
    An extended draft version of this paper can be found in arXiv here, and the webpage we have set up for our research on what we call Experimental Algorithmic Theory can be accessed here. The results of our ongoing experiments will be published frequently on this site. The book is a collection of papers contributed by eminent authors from around the world in honor of Gregory Chaitin’s birthday. It is a unique volume including technical contributions, philosophical papers and essays.

    I presented our paper at the NKS Science Conference 2007 held at the University of Vermont, Burlington, U.S. The conference blog has an entry describing my participation.

    From left to right: Hector Zenil, Stephen Wolfram, Paul Davies, Ugo Pagallo, Gregory Chaitin, Cristian Calude, Karl Svozil, Gordana Dodig-Crnkovic and John Casti.

    On the simplest and smallest universal Turing machine

    Posted in Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, General on October 30th, 2007 by Hector Zenil

    Alex Smith has recently been able to prove that a Turing machine conjectured to be capable of universal computation by Wolfram was actually universal (Wolfram 2,3 Turing machine Research Prize).

    Part of the challenge was to find an encoding that did not by itself do the universal computation that would make the Turing machine universal. Smith succeeded in providing an encoding with a background that, while non-periodic, is sufficiently regular that the infinite word written on the tape can be generated by an omega-automaton (not itself universal).

    This relaxation of the tape content has been common in the cellular automaton world, and also in the field of what is called weak universality, with the only difference being that the Turing machines are allowed to start with tapes other than blank ones (blank tapes are periodic tapes with period 1).

    An objection might be that such a coupled system could turn a non-universal machine into a universal one, but Smith showed that his encoding was capable of restarting the computation itself, something that coupled systems usually are not capable of. In other words, the preparation of the tape in Smith’s case is done in advance and the omega-automaton no longer interacts at any later time, whereas making a non-universal machine universal usually requires an automaton intervening at every step (or every few steps) of a computation.

    One may also be surprised by the need for an omega-automaton producing infinite strings. But the difference from the traditional blank tape is subtle. Think of how machines operate in the real world. Machines do not run on blank tapes; they usually run over memory (the cache, RAM or a HD) holding all kinds of information (which is considered garbage if it is not instantiated by a running program). You may think that this garbage is not infinite, but neither is the blank tape of a Turing machine, so instead of thinking of providing a Turing machine with blank tape as needed, one can think of providing the Turing machine with a non-blank tape. Now, what is on the tape in Smith’s case is not exactly “garbage”, because it plays a role in “helping” the Turing machine to perform its computation: it is a kind of support on which the Turing machine that is capable of universal computation actually achieves the computation, with the help of a defined background (which doesn’t mean that the machine cannot be universal with other backgrounds), yet the background is not powerful enough to perform the hardest part of the computation. In the physical world, if processes are seen as computations, computations are performed on a support, on the background of the physical world. So these kinds of relaxations may be closer to actual physical situations than to abstract situations in which a blank tape for a computation is assumed or required.

    The discussion opened up by Wolfram’s work and motivated by Smith’s answer has generated a fruitful and renewed discussion of the universality and complexity of small Turing machines. This is why I think this research is relevant to modern computer science, and not only as an amusing mathematical puzzle:

    • New techniques for proving universality are found (Alex Smith’s novel approach for unbounded computations from arbitrary lengths and non-periodic initial configurations).
    • New universal models of computation have been discovered (cyclic tag systems, bi-tag systems).
    • Such research provides a better understanding of universality,  its limits, its  underlying principles and its necessary and sufficient conditions.
    • It is a base for actually building universal devices when only a few elements can be used, e.g. in nanotechnology or molecular computation.
    • Simple/small machines may be more easily/effectively embedded in other systems.
    • The old discovery/invention duality question comes to the fore: It sheds light on how simple universality is, how frequently it occurs, whether  it is engineered or not, whether  one builds universal computation or finds it in the universe.
    • It could shed light on the relative feasibility of  universal Turing machines based on different tape configurations (e.g. blank characters, repetitive words, non-repetitive with computationally simple backgrounds) as actual physical systems.  At present it is not at all clear why one ought to  favor blank characters over other possible real-world backgrounds, such as “noise.”
    • Questions of size and complexity arise: It would be interesting, for instance, to find out whether there is a polynomial (or exponential) trade-off between program size and the cost of simulating a process.
    • Some questions  on algorithmic complexity arise: Will the encoding always be more complex if the machine is simpler? All theorems in algorithmic information theory depend on additive constants, which depend on the sizes of typical universal Turing machines. What is the impact of different generalizations of universality on algorithmic complexity and what is the role of  encoding in such a measure?
    • Some questions arise on the relation between several variants of universality definitions: Is there an effective and efficient encoding for each non-periodic encoding preserving universality? If so, how does this impact their complexity? Is there a non-periodic encoding with blank characters for each periodic blank word encoding, and what would the impact of such  an encoding be on the size/complexity of the Turing machine in question?

    The field is active and still an important area of research. Several computer science conferences, such as Computability in Europe (CiE) and Machines, Computations and Universality (MCU), included talks on small computational systems this year, focusing in particular on reversible cellular automata and universal Turing machines.

    Here are some references from the small Turing machine community:

    [1] Manfred Kudlek. Small deterministic Turing machines. Theoretical Computer Science, 168(2):241-255, November 1996.
    [2] Manfred Kudlek and Yurii Rogozhin. A universal Turing machine with 3 states and 9 symbols. In Werner Kuich, Grzegorz Rozenberg, and Arto Salomaa, editors, Developments in Language Theory (DLT) 2001, vol. 2295 of LNCS, pp. 311-318, Vienna, May 2002. Springer.
    [3] Maurice Margenstern and Liudmila Pavlotskaya. On the optimal number of instructions for universality of Turing machines connected with a finite automaton. International Journal of Algebra and Computation, 13(2):133-202, April 2003.
    [4] Claudio Baiocchi. Three small universal Turing machines. In Maurice Margenstern and Yurii Rogozhin, editors, Machines, Computations, and Universality (MCU), volume 2055 of LNCS, pp. 1-10, Chisinau, Moldova, May 2001. Springer.
    [5] Turlough Neary and Damien Woods. Four small universal Turing machines. Machines, Computations, and Universality (MCU), volume 4664 of LNCS, pp. 242-254, Orleans, France, September 2007. Springer.
    [6] Yurii Rogozhin. Small universal Turing machines. Theoretical Computer Science, 168(2):215-240, November 1996.
    [7] Shigeru Watanabe. 5-symbol 8-state and 5-symbol 6-state universal Turing machines. Journal of the ACM, 8(4):476-483, October 1961.
    [8] Shigeru Watanabe. 4-symbol 5-state universal Turing machines. Journal of Information Processing Society of Japan, 13(9):588-592, 1972.
    [9] Stephen Wolfram. A New Kind of Science. Wolfram Media, 2002.

    Meaning against A.I.

    Posted in Computer Science, Foundations of Computation, Minds and Machines, Natural Language and Computational Linguistics on October 12th, 2006 by Hector Zenil

    A significant number of researchers believe that there are sentences with semantic value that could never be understood by a machine. These researchers believe that the mind has a semantic component, unlike machines. Theirs is a Chinese Room-type argument à la Searle. Consider Chomsky’s example of two books in a library with the same title, and two readers, each taking out one of the books. Do they get the same book? These researchers argue that machines would be unable to answer correctly on the basis of context, since the answer would depend on a cognitive understanding of the situation. My claim is that all meaningful components of a situation are based on a hierarchy that can be artificially represented by categories capturing the essence and functionality of human mental operations. The answer to Chomsky’s question would be “yes” if one is referring to the information content of the books, “no” if one is referring to the books as physical objects.

    Hacking Days at MIT and Wikimania at Harvard, August 2006.

    Posted in Computer Science, Conferences on August 7th, 2006 by Hector Zenil

    Hacking Days at MIT and Wikimania at the Harvard Law School came to a close yesterday. Here is a brief summary:

    Brion Vibber, Chief Technology Officer of Wikimedia, gave many talks. He discussed everything from why wiki projects are difficult to cache (since they are dynamic) to new features to come, like Semantic MediaWiki, a possible Xapian search engine, better Wikitags, a better parser, possible support for PDF documents and integration with the DjVu image file format among other media formats, and identity options like OpenID/YADIS/A-Select. There were some OLPC (One Laptop Per Child) computers outside, which are able to synchronize themselves, interconnecting in order to play music or share any type of information through a wireless network they build by themselves.

    Mark Bergsma talked about near-future  server technology of the Wikimedia projects, like 64bits servers. He provided information about the geographical sites of the Wikipedia clusters, mainly located in Florida and Amsterdam. He talked about some features of the caching architecture using squid and some new DNS technologies they are exploring, like PowerDNS and geographical balancing (e.g. BGP DNS) and object purging. He announced that they were already using the HTCP inter-cache protocol. He also announced a plan to make the switch with one core switch/router more reliable. Some of the participants proposed the use of  Planet Lab services (http://www.planet-lab.org), a collection of machines distributed over the globe running a common software package including a Linux-based OS, mechanisms for bootstrapping, distribution software and a set of node monitor tools. PlanetLab is mainly devoted to research as a testbed for overlay networks, providing groups the opportunity to experiment with planetary-scale services.

    A later talk was about enhancing Wiktionary to be used as a database for external applications requiring it. Right now Wiktionary can only be exploited by addressing a query directly to it. A new database structure is being developed in order to give Wiktionary semantic meaning (among other things, relating each word to its translation in all the other languages already in Wiktionary), which will eventually allow many new features and the generation of a full knowledge database rather than a bunch of words (about a million words at the moment, in all languages) with definitions.

    An interesting talk about managing and posting onto the discussion page also took place. In a nutshell, the idea was to treat each post as a wikipage in itself. Some questions were raised about performance impact on the whole system arising from the huge amount of new wikipages, and other security and control questions emerged too but the idea seemed to be very well accepted. Finally a nice proposal to include video and audio streaming into wikiprojects was presented.

    There were several talks about WiktionaryZ during Hacking Days and Wikimania, by Erik Moeller and others. WiktionaryZ is an initiative to create the Ultimate Universal Wiktionary (pretty humble, isn’t it?) by integrating semantic knowledge into Wiktionary. The project is based on defining meaning using a short, simple phrase that defines a word clearly and unequivocally and that could be exactly translated into all the languages of the Wiktionary. There is also a record of the relationships of the words to each other, thus making it possible to build a machine-readable repository of semantic knowledge.

    Following that, Andre Engels talked about pywikipediabot and the challenges of writing Wikipedia bots without resorting to screen scraping, which makes bot maintenance quite complicated since the bots have to be changed every time the format of the articles changes. He also spoke about the dangers of using bots for big tasks, because errors in bot programming can lead to hundreds of thousands of damaged pages.
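
    For reference, the framework's modern successor, pywikibot, hides the scraping problem behind the MediaWiki API. A minimal read-only sketch (it typically assumes a configured user-config.py, and the page title is just an example):

    import pywikibot

    site = pywikibot.Site("en", "wikipedia")                        # typically needs a user-config.py
    page = pywikibot.Page(site, "Algorithmic information theory")   # example title

    print(len(page.text), "characters of wikitext")
    for linked in list(page.linkedPages())[:10]:                    # first few outgoing links
        print(linked.title())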

    Other talks were about OpenID, an open-source authentication system similar to Microsoft Passport in that it integrates a user’s identities across several projects (Wikimedia projects, blogs, etc.) into a single ID. There are good plans to integrate this feature into Wikipedia soon.

    WYSIWYG for Wikipedia: One of the main problems when using Wikipedia is the difficulty of editing with wikitags. Although technologically savvy users can easily adapt to the wikitag system, most people just can’t get the hang of it. Hence the need for an easy-to-use, simple editor is evident, although the lack of a proper MediaWiki parser and the complexity of the wikitag language make such a thing hard to implement. In any case, Frederico Caldeira Knabben has created a very nice and useful WYSIWYG HTML editor called FCKeditor (www.FCKeditor.org) and is willing to join forces with MediaWiki to integrate it into Wikipedia.

    There was also a panel featuring Brion Vibber, Andre Engels and Erik Moeller which addressed the possibility of a MediaWiki API. Many of the attendees were enthusiastic, and some went into a separate room and discussed the specification with Brion and Andre. They came up with a preliminary agreement that may be available soon on the web. The day ended with an enjoyable tour of the MIT Media Lab.
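
    A MediaWiki API along these lines is nowadays available at api.php. A minimal sketch, in Python with the requests library, of one such call, a full-text search; the search string and parameters are purely illustrative.

    import requests

    API = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "list": "search",
        "srsearch": "Kolmogorov complexity",   # example search string
        "srlimit": 5,
        "format": "json",
        "formatversion": 2,
    }
    data = requests.get(API, params=params,
                        headers={"User-Agent": "mediawiki-api-demo/0.1"}).json()
    for hit in data["query"]["search"]:
        print(hit["title"])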

    Some topics were very boring. For instance, the Wikimedia discussion panel was mainly about internal politics and logistics, with nothing of interest to a broad audience.

    From the point of view of our discussions here, the most interesting topics at Wikimania related to the Semantic Web and the reliability of wiki projects, given that anybody can edit the entries. Jim Giles, the author of the Nature paper comparing Wikipedia and Britannica, talked about the strong and weak points of the entries and of the reviews. According to him, Britannica made the point that most of the entries evaluated were in the sciences, where Wikipedia is stronger, most of its contributors coming from those disciplines; the argument was that this kind of comparison could not serve as an adequate measure of the entire corpus of knowledge covered by Britannica. Britannica also argued that the entries (50 in all) were badly reviewed, which would account for Wikipedia earning a rating very close to its own (3.9 errors on average per article for Wikipedia against 2.9 for Britannica). However, Giles argued that the reviewers were the same for both sides, so on average they would have committed the same number of errors in both directions.

    As for the session in which a consensus on improving Wikipedia’s content reliability was expected, none was reached. According to Martin Walker, a professor of chemistry, things gradually coalesced during discussions over the weekend. Both the German and the English-language communities seem to have come to a similar consensus:
    1. Set up stable (uneditable) versions of articles (static, unvandalized versions). The Germans expect to be testing this idea within a month.
    2. Then set up a validation system, possibly using review teams of trusted people from wiki projects, to check every fact and sign off, giving the reference source. The fact checking would be made available to any user who wanted to take the time. This validated version would also be uneditable.
    3. On the English Wikipedia we thought there ought to be an outside expert (with a PhD and a reputation) to sign off on the validated version, so we could say: “This page has been approved by Professor XXX from University of YYY”.

    The discussion page set up on this issue is at:

    http://en.wikipedia.org/wiki/Wikipedia_talk:Pushing_to_validation

    and

    http://wikimania2006.wikimedia.org/wiki/Proceedings_talk:MW2

    Concerning the Semantic Web, there is already a working wiki project at www.ontoworld.org. The basic idea is to use tags for all the pieces of information contained in the articles in order to relate them to other data.
    A typical example is:
    ”’Africa”’ is a continent, south of [[Europe]] and southwest of [[Asia]].

    where the tags for Europe and Asia signal a typed relation between the Africa article and the articles carrying the same tags.

    In the end what we have is a big relational database underlying the system, which allows queries in an SQL-like query language called SPARQL, whose specification is at:
    http://www.sparql.org/query.html

    I conducted some tests using the following entry:
    http://wiki.ontoworld.org/index.php?title=Germany&action=edit
    and, using the Africa article, I searched www.ontoworld.org and created a new article ("Africa Population by Countries") from a query looking for each country in Africa sorted by population. The query was:

    == List of African countries ==
    [[Category:Country||Persons]]
    [[located in::Africa]]
    [[Population:=*]]
    [[Area:=*]]

    [[Category:Continent]]
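
    For comparison, roughly the same question can be phrased in SPARQL. Here is a sketch using the SPARQLWrapper Python library; the endpoint URL and the property names are hypothetical stand-ins, since they depend on how a given wiki exports its annotations.

    from SPARQLWrapper import SPARQLWrapper, JSON

    endpoint = SPARQLWrapper("http://example.org/sparql")   # hypothetical endpoint
    endpoint.setQuery("""
        PREFIX prop: <http://example.org/property/>
        PREFIX res:  <http://example.org/resource/>
        SELECT ?country ?population ?area
        WHERE {
            ?country prop:locatedIn res:Africa ;
                     prop:population ?population ;
                     prop:area ?area .
        }
        ORDER BY DESC(?population)
    """)
    endpoint.setReturnFormat(JSON)
    results = endpoint.query().convert()
    for row in results["results"]["bindings"]:
        print(row["country"]["value"], row["population"]["value"])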

    The underlying technology is RDF (the Resource Description Framework). More on that at:

    http://wiki.ontoworld.org/index.php/Special:ExportRDF

    and there are libraries in several languages for working with RDF:
    RDFLib, available from rdflib.net
    the SMW library
    RAP, available from www.wiwiss.fu-berlin.de/suhl/bizer/rdfapi
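
    A minimal sketch, in Python with RDFLib, of loading such an export and iterating over its triples; the file name is a placeholder for an RDF/XML file saved from Special:ExportRDF.

    from rdflib import Graph

    g = Graph()
    g.parse("Africa.rdf", format="xml")   # placeholder: an RDF/XML export saved locally

    # every statement in the export is a (subject, predicate, object) triple
    for subject, predicate, obj in g:
        print(subject, predicate, obj)

    # RDFLib can also run SPARQL directly over the parsed graph
    for row in g.query("SELECT ?p ?o WHERE { ?s ?p ?o } LIMIT 5"):
        print(row.p, row.o)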

    and the description of the MediaWiki extension project for the Semantic Web is at:

    http://ontoworld.org/index.php/Semantic_MediaWiki

    We also explored some browsers being developed for the Semantic Web. They are at:

    http://brownsauce.sourceforge.net/

    http://simile.mit.edu

    The workshop on “Using Wikipedia’s Knowledge in Your Applications”
    (http://wikimania2006.wikimedia.org/wiki/Proceedings:MK1) was very interesting. There I met the speakers (Markus Krötzsch and Denny Vrandecic), with whom I exchanged information and agreed to keep in contact to discuss my concerns about addressing queries to the URL and transforming unstructured data into semantically usable information. There was some discussion about translating natural language into semantic information by taking cue words directly from the Wikipedia articles and introducing tags and relations:

    http://wikimania2006.wikimedia.org/wiki/Proceedings:DV2

    http://wikimania2006.wikimedia.org/wiki/Proceedings:DV1

    and

    http://wikimania2006.wikimedia.org/wiki/Proceedings:MN1

    I met Tim Berners-Lee, who is also very interested in the Semantic Web and in the idea of creating a browser exploiting these features. I also met Betsy Megas, who is working on a Semantic Web project called wiktionaryZ.org, which is like wiktionary.org but semantic. We had a long discussion about the usefulness of having a “categories” scheme in the Semantic Web. My point was that in most cases the categories could be built dynamically: they would be present in an abstract form without there being any need to define them explicitly. A completely relational model strikes me as more interesting. People have tried to categorize everything since the first encyclopedias, but the number of possible categories can be much larger than the number of elements to categorize, simply because categories can be so arbitrary that their final number can approach the number of subsets of the elements (2^n for n elements), which is not useful from my point of view.

    A couple of plenary sessions were held in the main auditorium of Harvard Law School. One of them featured a lecture by Lawrence Lessig, a lawyer who has been involved in cases such as the antitrust (monopoly) case against Microsoft. He is the founder of the Creative Commons licence, often described as a form of copyleft, under which some rights are reserved while others are yielded to the public so that people can be creative when using resources. He talked about the Read-Only (RO) Society, which is how he characterized the last century, and the new Read-Write (RW) Society, which is moving toward the Creative Commons licence, open software, open hardware, free and open encyclopedias like Wikipedia, freedom of work (freelancers and people organizing free foundations), free access to human knowledge and communications (web access, Skype) and free and open broadcasting (podcasting), among other things.

    Most of the sessions were recorded and are available at:

    http://wikimania2006.wikimedia.org/wiki/Archives

    And the abstracts and proceedings are at:

    http://wikimania2006.wikimedia.org/wiki/Proceedings:Index

    I also found an interesting site for exploring the degrees of separation between two different topics through Wikipedia (sometimes it works):

    http://tools.wikimedia.de/sixdeg/index.jsp?from=cuneiform&to=Semantic_Web
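
    The idea behind such a tool is essentially a breadth-first search over Wikipedia's link graph. Here is a toy sketch, in Python with the requests library, that looks only at the first batch of outgoing links per page, so it is far cruder (and slower) than the sixdeg tool and can miss paths.

    from collections import deque
    import requests

    API = "https://en.wikipedia.org/w/api.php"
    HEADERS = {"User-Agent": "six-degrees-demo/0.1"}

    def links(title):
        """Return the first batch of outgoing article links of a page (continuation ignored)."""
        params = {"action": "query", "prop": "links", "titles": title,
                  "pllimit": "max", "plnamespace": 0,
                  "format": "json", "formatversion": 2}
        page = requests.get(API, params=params, headers=HEADERS).json()["query"]["pages"][0]
        return [l["title"] for l in page.get("links", [])]

    def degrees(start, goal, max_depth=3):
        """Breadth-first search over article links; returns the number of hops, or None."""
        queue, seen = deque([(start, 0)]), {start}
        while queue:
            title, depth = queue.popleft()
            if title == goal:
                return depth
            if depth >= max_depth:
                continue
            for nxt in links(title):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
        return None

    print(degrees("Cuneiform", "Semantic Web"))   # slow: one API call per visited page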

    Computability in Europe Conference (CiE) Report, Wales UK

    Posted in Computability, Universality and Unsolvability, Computer Science, Conferences, Foundations of Computation on August 7th, 2006 by Hector Zenil – Be the first to comment

    This is a report on the Computability in Europe Conference (CiE), held at the University of Swansea, Wales in the United Kingdom in July 2006.

    I attended a mini-course on quantum computing given by Julia Kempe, a lecture on the Church-Turing thesis by Martin Davis (who defended it against proposed models of hypercomputation) and a lecture on proof theory. Another very interesting lecture was on Gödel's and Turing's remarks on the human mind (the dichotomy argument from Gödel and the mechanistic vision from Turing). Among other noteworthy lectures were Samuel Buss' on the complexity of proofs, John Dawson's on Gödel in computability and Wilfried Sieg's on the concept of mechanical procedure in Gödel and Turing, as well as many presentations on hypercomputation and computing over the reals.

    I met people whom I had previously known only through email exchanges, such as Felix Da Costa from the Technological Institute of Lisbon, Robert Meyer, professor emeritus at the Australian National University, and Julia Kempe from France, a renowned researcher in quantum computing, with whom I shared some doubts I had about where the restrictions lie that constrain the power of quantum computing to the set of recursive functions. I also met people from SUNY who are doing interesting research on Turing computation, studying isomorphisms between oracle machines and the connection with Tennenbaum's theorem on the uniqueness of the recursive model of PA (Peano Arithmetic). Many lectures were given on computing over infinite time and space and on computing at the limits of general relativity. The conference was intended to take the pulse of the field of hypercomputation in Europe and worldwide.

    International Conference in Complex Systems, NECSI

    Posted in Complexity, Computability, Universality and Unsolvability, Computer Science, Conferences, Foundations of Computation on August 7th, 2006 by Hector Zenil – Be the first to comment

    NECSI/ICCS Conference Report, Quincy, Greater Boston, USA, July 2006.

    First lesson: For every complex problem there is a simple, neat, wrong solution.

    I attended talks given by Ed Fredkin on Finite Nature, László Barabási on Complex Networks, Christoph Teuscher on Biology and Computation, and John Nash on his research in Game Theory.
    * Ed Fredkin presented a 3-D cellular automaton model of table salt that fulfils all the physical constraints on symmetry and energy conservation. The model is, surprisingly, Turing-universal. As a reminder, the Zuse-Fredkin thesis is a Church-type thesis which additionally claims that the universe is a cellular automaton and can be understood in terms of the evolution of cellular automata. An interesting paper entitled “Church-Turing Thesis is Almost Equivalent to Zuse-Fredkin Thesis” by Plamen Petrov is available here:

    http://digitalphysics.org/~ppetrov/

    Much more information on Ed Fredkin's interesting ideas is available at
    http://www.math.usf.edu/~eclark/ANKOS_zuse_fredkin_thesis.html and on Ed Fredkin's homepage at MIT, which includes some remarks on why this thesis is not in agreement with Stephen Wolfram's, since Wolfram does not claim that the universe is a classical cellular automaton but rather conceives of a discrete universe based on systems that follow simple rules and produce all the complex behavior we find in nature, systems comparable to Turing machines, tag systems, axiom systems or any other equivalent system.
    * In his lecture, László Barabási claimed that behind all complex systems there are complex networks, in particular scale-free networks (with a few very well-connected nodes and many weakly connected ones). These networks are efficient and robust (they remain connected even after the random removal of nodes), except under attack, when the best-connected nodes are targeted; a small code sketch of this robustness experiment appears at the end of this report.
    * Christoph Teuscher gave a lecture on computation inspired by biology. However, as he himself admitted, it was not his area of expertise.
    * John Nash presented his research on Game Theory using Mathematica as an engine.
    * I met Hava Siegelmann and one of her colleagues, Yariv, who is working on artificial intelligence. Siegelmann worked some years ago (while completing her PhD dissertation) on a model of computation called the Analog Recurrent Neural Network (ARNN), which under certain circumstances is able to compute more functions than the set of recursive functions, i.e. those computed by Turing machines. She is now working on topics related to molecular biology. I asked her about the forceful critiques delivered by traditional logicians like Martin Davis, who wrote a paper entitled “The Myth of Hypercomputation.” The target of most of these critiques is a certain circular argument: to compute more than Turing machines it is necessary to use non-Turing-computable weights previously coded into the neural network, and it has been known for a long time that the set of all real numbers can encode any arbitrary language, even a non-Turing-computable one. What is remarkable from my point of view is her result concerning complexity classes, since weights of lower complexity are able to compute a higher class when used in those networks. Additionally, she argued that even with a stochastic function her networks are able to compute non-computable functions. I recently discussed this with Stephen Wolfram and we agreed that she is assuming strong randomness. I would say, however, that Siegelmann's work is much more beautiful from a complexity point of view than from a “hypercomputational” one. In her book “Neural Networks and Analog Computation: Beyond the Turing Limit” she proves that: a) there exists a universal neural network with only 9 neurons, and b) p(r) suffices to compute a non-computable function, where r is an arbitrary (infinite-precision) real number and p(r) denotes the first p digits of its expansion, which means that linear precision suffices to achieve “super-Turing” capabilities, assuming that the neural network can have access to any possible real number before the computation. In other words, this seems to be true only if all possible real numbers are allowed a priori, just as an unbounded tape is necessary for Turing machines to recognize all recursive languages; moreover, neural networks with rational weights do not compute the same set of functions as networks with integer weights, the former having been proven to compute the same set as Turing machines and the latter (as shown by Kleene) the same set of languages as finite automata, namely the regular languages.
    I exchanged ideas with some other interesting people, such as an engineer from the Space and Airborne Systems Department of Raytheon. And I met John Nash Jr. during the gala dinner at which he was presented with an award for his contributions to complexity theory, mainly honoring his work in game theory.
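
    As mentioned in connection with Barabási's lecture above, the robustness claim is easy to probe numerically. Here is a small sketch, in Python with the networkx library, that grows a scale-free graph by preferential attachment and compares random node failures with a targeted attack on the hubs; the sizes and the 10% removal fraction are arbitrary choices.

    import random
    import networkx as nx

    # scale-free network via preferential attachment (Barabási-Albert model)
    G = nx.barabasi_albert_graph(n=2000, m=2, seed=1)

    def giant_component_fraction(graph):
        """Fraction of the original 2000 nodes left in the largest connected component."""
        if graph.number_of_nodes() == 0:
            return 0.0
        return max(len(c) for c in nx.connected_components(graph)) / 2000

    # random failures: remove 10% of the nodes uniformly at random
    G_rand = G.copy()
    G_rand.remove_nodes_from(random.sample(list(G_rand.nodes()), k=200))

    # targeted attack: remove the 10% best-connected nodes (the hubs)
    G_att = G.copy()
    hubs = sorted(G_att.degree, key=lambda nd: nd[1], reverse=True)[:200]
    G_att.remove_nodes_from([node for node, _ in hubs])

    print("after random failures:", giant_component_fraction(G_rand))
    print("after targeted attack:", giant_component_fraction(G_att))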

    Turing’s approach to Biology

    Posted in Computer Science, Foundations of Biology on August 7th, 2006 by Hector Zenil – Be the first to comment

    Where do the spots on animals come from?
    Turing’s answer to this question, as well as much more on Turing and modern Fibonacci phyllotaxis, is presented and analysed by Jonathan Swinton on his Deodands blog

    http://www.swintons.net/deodands/archives/000091.html
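
    Turing's answer, in a sentence, is that two chemicals reacting and diffusing at different rates can spontaneously settle into spots or stripes. Here is a small numerical sketch of that mechanism, in Python with NumPy, using the Gray-Scott model as a stand-in rather than Turing's original 1952 equations; the parameter values are typical textbook choices.

    import numpy as np

    n, steps = 128, 5000
    Du, Dv, F, k = 0.16, 0.08, 0.035, 0.065   # diffusion rates, feed and kill rates

    U = np.ones((n, n))                        # chemical U everywhere
    V = np.zeros((n, n))                       # chemical V only in a small seed region
    U[n//2-5:n//2+5, n//2-5:n//2+5] = 0.50
    V[n//2-5:n//2+5, n//2-5:n//2+5] = 0.25

    def laplacian(Z):
        """Discrete Laplacian with periodic boundary conditions."""
        return (np.roll(Z, 1, 0) + np.roll(Z, -1, 0) +
                np.roll(Z, 1, 1) + np.roll(Z, -1, 1) - 4 * Z)

    for _ in range(steps):
        reaction = U * V * V
        U += Du * laplacian(U) - reaction + F * (1 - U)
        V += Dv * laplacian(V) + reaction - (F + k) * V

    # V now holds a spotted, animal-coat-like pattern; for instance:
    # import matplotlib.pyplot as plt; plt.imshow(V); plt.savefig("spots.png")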

    The Web as a Graph, Adaptive Crawlers and more…

    Posted in Computer Science on August 1st, 2006 by Hector Zenil – Be the first to comment

    The Web as a graph, adaptive crawlers, and mining the Web from unstructured data?

    http://www.almaden.ibm.com/webfountain/resources/TheWebasaGraph.pdf

    To be discussed soon…

    How To Criticize Computer Scientists

    Posted in Computer Science, General on August 1st, 2006 by Hector Zenil – Be the first to comment

    How To Criticize Computer Scientists or Avoiding Ineffective Deprecation And Making Insults More Pointed, from http://www.cs.purdue.edu/homes/dec/essay.criticize.html

    In recent exchanges, members of the faculty have tried in vain to attack other Computer Scientists and disparage their work. Quite frankly, I find the results embarrassing: instead of cutting the opponent down, many of the remarks have been laughably innocuous. Something must be done about it because any outsider who hears such blather will think less of our department: no group can hold the respect of others unless its members can deal a devastating verbal blow at will. This short essay is an effort to help faculty make their remarks more pointed, and help avoid wimpy vindictives. It explains how to insult CS research, shows where to find the Achilles’ heel in any project, and illustrates how one can attack a researcher.

    The Two Basic Types Of Research: Most lousy insults arise from a simple misimpression that all researchers agree on the overall aims of CS research. They do not. In particular, CS has inherited two quite opposite approaches from roots in mathematics and engineering. Researchers who follow the mathematical paradigm are called theorists, and include anyone working in an area that has the terms “analysis”, “evaluation”, “algorithms”, or “theory” in the title. Researchers who follow the engineering paradigm are called experimentalists, and include most people working in areas that have the terms “experimental”, “systems”, “compiler”, “network”, or “database” in the title.

    Complex Theory And Simple Systems: Knowing the tradition from which a researcher comes provides the basis for a well-aimed insult.

    Theorists Favor Sophistication. Like mathematicians, theorists in Computer Science take the greatest pride in knowing and using the most sophisticated mathematics to solve problems. For example, theorists will light up when telling you that they have discovered how an obscure theorem from geometry can be used in the analysis of a computer algorithm. Theorists focus on mathematical analysis and the asymptotic behavior of computation; they take pride in the beauty of equations and don’t worry about constants. Although they usually imply that their results are relevant to real computers, they secretly dream about impressing mathematicians.

    Experimentalists Favor Simplicity. Like engineers, systems researchers take pride in being able to invent the simplest system that offers a given level of functionality. For example, systems researchers will light up when telling you that they have constructed a system that is twice as fast, half as small, and more powerful than its predecessor. Experimentalists focus on the performance of real computer systems; they take pride in the beauty of their code and worry about constants. Although they usually imply that their results can extend beyond real computers, they secretly dream of filing patents that apply to extant hardware.

    The Insult: Knowing that CS can be divided into two basic groups helps immensely when criticizing someone. There are two basic rules: identify the type of the researcher and issue an insult for that type. Avoid saying anything that inadvertently compliments them. If performed well, an insult will not only stun the researcher (who will be shocked to learn that not everyone agrees with his or her basic value system), but will also intimidate others in the audience.

    Identifying A Type: Identifying the type of a researcher is usually easy and does not require a strong technical background or real thinking. It can be done using keyword matching according to the following lists.

    Detecting Theory: You can tell someone is a theorist because they slip one or more of the following keywords and phrases into lectures and technical conversations: “theorem”, “lemma”, “proof”, “axiom”, “polynomial time”, “logarithmic”, “semantics”, “numerical”, “complexity”, “nondeterministic” or “nondeterminism”, and “for large enough N”. They write lots of equations, brag about knocking off the “extra log factor”, and often end their lecture with an uppercase “O” followed by a mathematical expression enclosed in parentheses. You can also recognize a theorist because they take forever to prove something that may seem quite obvious. (I once sat through an hour lecture where someone proved that after a computer executed an assignment statement that put the integer 1 into variable x, the value in x was 1.)

    Detecting Systems: An experimentalist will slip one or more of the following keywords and phrases into lectures and technical conversations: “architecture”, “memory”, “cpu” (sometimes abbreviated “CISC” or “RISC”), “I/O” or “bus”, “network”, “interface”, “virtual”, “compile” or “compiler”, “OS” or “system”, “distributed”, “program” or “code”, and “binary”. They talk about building programs and running the resulting system on real computer systems. They refer to companies and products, and use acronyms liberally. Their lectures often end with a graph or chart of measured system performance. You can also recognize an experimentalist because they describe in excruciating detail how they set up an experiment to measure a certain value even if the measurement produced exactly the expected results. (I once sat through an hour lecture where someone carefully explained how they used three computer systems to measure network traffic, when their whole point was simply to show that the network was not the cause of the problem they were investigating.)

    Forming An Insult: The key to a good insult lies in attacking whatever the researcher holds most dear and avoiding whatever the researcher does not care about. Thus, an insult lobbed at a theorist should focus on lack of sophisticated mathematics, such as the following:

    Despite all the equations, it seems to me that your work didn’t require any real mathematical sophistication. Did I miss something? (This is an especially good ploy if you observe others struggling to understand the talk because they will not want to admit to that after you imply it was easy.)

    Isn’t this just a straightforward extension of an old result by Hartmanis? (Not even Hartmanis remembers all the theorems Hartmanis proved, but everyone else will assume you remember something they have forgotten.)

    Am I missing something here? Can you identify any deep mathematical content in this work? (Once again, audience members who found the talk difficult to understand will be unwilling to admit it.)

    In contrast, an insult lobbed at an experimentalist should imply that the techniques were used in previous systems or that the work isn’t practical, such as:

    Wasn’t all this done years ago at Xerox PARC? (No one remembers what was really done at PARC, but everyone else will assume you remember something they don’t.)

    Have you tested this on the chip Intel got running last week in their lab? (No one knows what chip Intel got running last week, but everyone will assume you do.)

    Am I missing something? Isn’t it obvious that there’s a bottleneck in the system that prevents scaling to arbitrary size? (This is safe because there’s a bottleneck in every system that prevents arbitrary scaling.)

    How To Avoid Having An Insult Backfire On You: A misplaced insult can backfire, turning into an embarrassment for the attacker and a victory for the intended attackee. To avoid such occurrences, remember the following:

    Never attempt to attack theoretical work as not considering constants, as unrelated to real computer systems, or as requiring too much sophisticated mathematics. (The intended victim is likely to smile and thank you for the flattery.)

    Never attempt to attack a system as too small, too simple, or as lacking sophisticated mathematics. (Again, the intended victim is likely to smile and thank you for the flattery.)

    Never attempt to attack systems work simply by saying that it’s so simple and obvious that you could have done it. (For years, people said that about UNIX and the TCP/IP protocols.) In fact, this is merely an extension of a ploy used by children on a playground: “Oh yeah? I could have done that if I wanted to.” Don’t try using it or someone will tell you to grow up.

    Attacking Crossover Work: Although rare, a few researchers include both theoretical and experimental work in the same project. Insulting such combinations can be tricky because a researcher can escape unscathed by pointing to one part of their work or the other as the answer. You can try to attack both parts simultaneously:

    I note that the systems aspect of this project seems quite complex. Do you think the cause of the convoluted implementation can be attributed to the more-or-less “simplistic” mathematical analysis you used?

    However, a clever insult can avoid talking about the work by suggesting sinister reasons for the paradigm shift:

    I notice that you did something unusual by combining both theory and experiment. Did you decide to try a second approach because you had insufficient results from the first?

    You seem to have a little theory and a little experimental work combined into one project. Isn’t it true that if you had a sufficiently strong contribution in one or the other you would have lectured about them separately?

    A Final Plea: I certainly hope faculty will take this essay to heart and sharpen their insult skills. In the future please make all your thrusts count.