How Humans perceive the world is biased by how patterns are distributed in Nature and their intrinsic complexity

Posted in Algorithmic information theory, Complexity, Computer Science, General, New Ideas on August 27th, 2014 by Hector Zenil – Be the first to comment

A new paper of mine with my colleagues and Algorithmic Nature Lab members Nicolas Gauvrit and Fernando Soler-Toscano has just come out.

Using previously generated and new experimental data, together with new methods to calculate the algorithmic complexity of 2-dimensional objects, we found that when humans assess the complexity of an image (a small 4×4 pattern), their rating is correlated with the algorithmic complexity of the image, mediated by the probability that such a pattern appears in real-world scenes. In other words, humans are biased both towards patterns in the world and towards algorithmic complexity, but patterns in the world are in turn correlated with algorithmic complexity. This strengthens my claim for an algorithmic world, where patterns can be accounted for by an algorithmic production process.
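
As a toy illustration of what "complexity of a small pattern" can mean, here is a minimal Python sketch. It is emphatically not the algorithmic-probability (BDM) measure used in the paper, since compression and simple counts behave poorly on 16-bit objects; it only conveys the intuition that more distinct local structure tends to be rated as more complex.

    # Crude structural proxy for the complexity of a 4x4 binary pattern.
    # NOT the measure used in the paper; a hypothetical stand-in for illustration.
    from itertools import product

    def block_diversity(pattern):
        """Count distinct 2x2 sub-blocks of a 4x4 binary pattern (list of 4 rows)."""
        blocks = set()
        for i, j in product(range(3), range(3)):  # all 3x3 overlapping positions
            blocks.add((pattern[i][j], pattern[i][j + 1],
                        pattern[i + 1][j], pattern[i + 1][j + 1]))
        return len(blocks)

    flat = [[0, 0, 0, 0]] * 4
    checker = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
    noisy = [[0, 1, 1, 0], [1, 0, 0, 1], [1, 1, 0, 0], [0, 0, 1, 1]]

    for name, p in [("flat", flat), ("checkerboard", checker), ("noisy", noisy)]:
        print(name, block_diversity(p))  # prints 1, 2 and 7 respectively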

The journal (Visual Cognition) allows 50 free electronic copies of the paper to be downloaded. Should you be interested in the paper and unable to access it otherwise, you can obtain a free copy using the following e-print link.

The Turing Test pass fiasco

Posted in General on June 10th, 2014 by Hector Zenil – Be the first to comment

The Turing test was proposed by Turing in 1950 as a way to evaluate the apparent intelligence of a machine. The interrogator is in a room separated from the other person and the machine. The object of the game is for the interrogator to determine which of the other two is the human and which is the machine. Taken seriously, this test should set a standard for the language skills of chatbots performing as intelligent interlocutors.

I would have loved to hear the news that the Turing Test had been passed for the right reasons, if there were genuine reason to believe so. Unfortunately this is not the case for the recent claim (http://www.reading.ac.uk/news-and-events/releases/PR583836.aspx) and the bold reports published in its wake, without any critical comment, by the major newspapers and magazines. While it is difficult to fathom the exact motivations that drove Turing to come up with what he called "the imitation game", it is clear that the chatbot which is claimed to have passed the Turing test is no different from any other chatbot tested before, judging by the types of conversations it has undertaken, except for the deliberate attempt by its creators to underscore its limitations by characterizing it as a 13-year-old non-native English speaker. If the rules can be bent in this way, I could, taking things to the limit, easily write a script to pass the Turing test that simulated a 2-month-old baby, or an alien being writing gibberish, or a drunkard for that matter, one that forgets even the last question that was asked.

Taken seriously, the Turing Test should not be a test of the ability to deceive the judges in this way; on display should be the language skills of a typical interlocutor working in their native language and at full capability (which would rule out, for instance, a simulation of a drunkard or of an intellectually handicapped person). A milestone in AI in the context of the Turing test will be a chatbot that is genuinely able to simulate the entire range of language skills of a normal person working at full capability, a chatbot that does not reply with questions or forget what was said at the beginning of a conversation (or one question before, for that matter), a chatbot that does not need a lookup table of about the same size as the number of questions it can answer, and yet is able to answer in about the same time as a human being.

The claim that the Turing Test has been passed does nothing but harm to the field of artificial intelligence, because anyone probing beyond what the newspapers and magazines have picked up from the original press release and repeated word for word (shame on them all, not only for this but for so many other atrocious errors disseminated by them, such as mistaking a script for a supercomputer!) will judge it a fiasco, to the detriment of true successes in the field, past and future. This supposed success has done a disservice to the field and to the possibly honest creators of the chatbot, whose open admission that they had given it the character of a 13-year-old foreign kid may have been meant to lower expectations of what it could achieve.

The mistake of claiming that their winner passed the true Turing test, as they called it, and even calling it a milestone, is hard to excuse, especially in view of the damage it could do to the field, and indeed to the organizers themselves and to other Turing test events that already had a hard time distancing themselves from merely entertaining activities. In summary, if a Turing test was deemed to have been passed 2 days ago, it was passed for all the wrong reasons.

For people interested in the subject, I recently gave an invited talk on Cognition, Information and Subjective Computation at the special session on Representation of Reality: Humans, Animals and Machines at the convention of the AISB (The Society for the Study of Artificial Intelligence and Simulation of Behaviour) at Goldsmiths, University of London. In the talk I surveyed some of the latest approaches to intelligence, consciousness and other generalized Turing tests for questions such as what life is. The slides are available at SlideShare here.

A more interesting discussion in the context of the Turing Test and common sense is what has happened with CAPTCHA, where images get ever more complicated in the battle to tell bots apart from humans. Computers have become better than humans at tasks we recently thought were decades away, such as text and face recognition, and speech recognition is getting there. There is no need for a poorly performed *restricted* Turing Test to realize that computers can perform and excel at tasks that previously required human intelligence. Just as Turing said, I have no problem these days attributing intelligence to computers, even if it is a different kind of intelligence. And this was the purpose of the Turing Test from my point of view: how intelligence is achieved is not important, just as an airplane does not need to fly like a bird in order to fly.

Update (June 22): Prof. Warwick has written in The Independent, responding to the critics in his defense:

“…the judges were not told he was a teenager and Turing never suggested that artificial intelligence would have to pose as an adult – just that it fooled people to thinking it was human. Judges were free to ask the subjects any questions they liked in unrestricted conversations – and Eugene was clearly capable of holding its own.”

Really… I stand by my proposal, then, to pass a truly unrestricted Turing Test in this spirit by writing a chatbot that emulates a 2-month-old baby. How little common sense for a test that should have been performed spotlessly and that was claimed to have very high standards. I also learnt that children were allowed among the judges, so again, why not put babies in as judges? Turing never said anything against it; the only requirement he made explicit was that the judges not be computer experts. Were children counted among the 30% of judges who were fooled? A question for the organizers, who are withholding information for the sake of an academic paper.

One of the Eugene chatbot’s developers, Vladimir Veselov, has also spoken, in The Calvert Journal, saying that

“I’m not sure that Eugene is an achievement in computer science, although, yes, we’ve shown that it’s possible to pass the Turing Test.”

(citing the way in which the organizers and the University of Reading announced the event as a milestone).

Really… I can help you: no, it is not a milestone whatsoever, and you have the answer yourself; just a few sentences above you wrote:

“…There are two classes of chatbots, a self-learning system like a Cleverbot and the second, like Eugene, that comes with predetermined answers. In some instances, he asks questions and in others he uses information from the user to answer questions.”

and immediately after, when asked by the journalist “Does that mean his ability to have a natural conversation is down to chance?”, Veselov answers:

“He’s not like a smart system, so yes. He has knowledge of certain facts and some basic information about different countries, different inventors and so on. He uses an encyclopaedia to answer questions and can do very simple arithmetic operations. He can also control conversation so that if he doesn’t know the answer, he’ll turn the conversation around by asking a question.”

Is that any different from what was done in the 60s with chatbots such as Eliza? Certainly there has been progress, if only because of how fast computers have become. Indeed, it is all the more surprising how poorly chatbots perform despite the incredible increase in computer power and resources. Yet in 2014 it is claimed that the Turing Test has been passed, with an Eliza-type chatbot incrementally improved over the last five decades.
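
To make the comparison concrete, here is a minimal sketch of the Eliza-style recipe Veselov describes: canned answers keyed on keywords, plus a deflecting question when nothing matches. This is purely illustrative Python; Eugene's actual code was never published, and the keywords and replies below are invented.

    # Toy Eliza/Eugene-style responder: predetermined answers plus deflections.
    # Hypothetical illustration only; not the actual Eugene Goostman chatbot.
    import random

    CANNED = {
        "name":    "My name is Eugene. What's yours?",
        "country": "I live in Odessa. Have you ever been to Ukraine?",
        "age":     "I'm 13 years old, why do you ask?",
    }
    DEFLECTIONS = [
        "That's interesting. What do you do for fun?",
        "Hmm, I don't know. What do you think about it?",
    ]

    def reply(user_input):
        lowered = user_input.lower()
        for keyword, answer in CANNED.items():
            if keyword in lowered:
                return answer
        return random.choice(DEFLECTIONS)  # no match: turn the conversation around

    print(reply("What is your name?"))
    print(reply("Explain quantum mechanics to me."))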

Veselov makes the outrageous claim that “The test is about imitation of humans, it’s not about artificial intelligence”! Wrong! Turing designed the test as a replacement for the question of whether machines can think; it has everything to do with AI. In fact, Turing’s 1950 landmark paper is considered the starting point of the field of AI!

New paper solves and addresses some of Wolfram’s Open Problems in Cellular Automata

Posted in General on June 19th, 2013 by Hector Zenil – Be the first to comment

In a new paper entitled Asymptotic Behaviour and Ratios of Complexity in Cellular Automata Rule Spaces, to be published soon by IJBC and available online here, at least 3 open problems in Cellular Automata are solved or addressed. These are related to:

  1. The development of a formal classification of cellular automaton behaviours,
  2. The development of automated ways to find “interesting” cellular automata, and
  3. The estimation of the frequencies of different types of cellular automaton behaviour across rule spaces.

In Wolfram’s open problems document it reads:

“If one looks at all possible cellular automaton rules, can one make a quantitative estimate of what fraction will show repetitive, nested, etc. behavior? Is there a definite limiting fraction of rules that, say, show nested behavior.”

In the aforementioned paper, strong arguments and a numerical answer are given: the fraction of complex cellular automata in increasingly larger rule spaces grows asymptotically to 1. In other words, picking a longer computer program at random gives a greater chance of producing complex (algorithmically random) behaviour. When the program space is large enough the probability is arbitrarily close to 1. Notice that the converse is not true: small program spaces do not necessarily have a small fraction of complex behaviour. We show that, by defining a concept of asymptotic behaviour, complex behaviour dominates among the n possible stable behaviours of a single program.
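
Stated schematically (the notation here is mine, not the paper's): if R_n denotes the n-th rule space in a sequence of spaces with increasing resources (more symbols and/or larger neighbourhoods), the claim is that

    \lim_{n \to \infty} \frac{\left|\{\, r \in R_n : r \text{ shows complex behaviour} \,\}\right|}{|R_n|} \;=\; 1.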

In this paper, we studied the asymptotic behaviour of symbolic computing systems, notably one-dimensional cellular automata (CA), in order to ascertain whether, and at what rate, the number of complex versus simple rules dominates the rule space for increasing neighbourhood range and number of symbols (or colours), and how different behaviour is distributed in the spaces of different cellular automata formalisms (as applied to elementary and totalistic CA rules). In doing so we defined an enumeration of initial conditions and automatic methods for the discovery of interesting rules. Using two different measures, Shannon’s block entropy and Kolmogorov complexity, the latter approximated by two different methods (lossless compressibility and block decomposition based on algorithmic probability), we arrive at the same trend of larger fractions of complex behaviour. We also advance a notion of asymptotic and limit behaviour for individual rules, both over initial conditions and runtimes, and we provide a formalisation of Wolfram’s classification as a limit function in terms of Kolmogorov complexity.
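
A rough sketch of the lossless-compression side of this methodology, assuming zlib as the compressor (the paper's actual pipeline, normalisation and block-decomposition measure differ): evolve an elementary CA from a single black cell and use the compressed size of its space-time diagram as a crude stand-in for its complexity.

    # Illustrative only: compressed-size ratio of ECA space-time diagrams.
    # Lower ratios indicate more regular (more compressible) evolutions.
    import zlib

    def eca_step(row, rule):
        """One step of an elementary CA with periodic boundary conditions."""
        n = len(row)
        return [(rule >> (row[(i - 1) % n] * 4 + row[i] * 2 + row[(i + 1) % n])) & 1
                for i in range(n)]

    def compressed_ratio(rule, width=101, steps=200):
        row = [0] * width
        row[width // 2] = 1                  # single black cell initial condition
        diagram = []
        for _ in range(steps):
            diagram.extend(row)
            row = eca_step(row, rule)
        raw = bytes(diagram)
        return len(zlib.compress(raw, 9)) / len(raw)

    # uniform, nested, complex and chaotic-looking evolutions from a single cell
    for rule in (0, 90, 110, 30):
        print(rule, round(compressed_ratio(rule), 3))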

Among the results, we confirmed preliminary findings reporting a preponderance of the complex classes 3 and 4 as the number of symbols (colours) and neighbourhood ranges of these systems increase. We have shown that behaviour is for the most part stable for a given initial condition, but that complexity increases with the number of initial configurations and with rule space size. The rate at which it does so is fast before asymptotically slowing down when approaching compressibility ratio 1 (uncompressibility). Similar results were obtained using two different measures: one was a block entropy measure, normalised and applied to CA space-time diagrams, and the other was Kolmogorov complexity, approximated in 2 different ways (lossless compression and algorithmic probability). Using the latter, we found at least 2 ECA rule candidates for Class III or IV behaviour that were believed to belong to Class I or II (if you wonder which specific rules, have a look at the paper!).

While the entropy result may be less surprising, because the number of possible configurations with more symbols and larger neighbourhood ranges suggests an increase in entropy, the fact that the 3 approaches used to measure complexity in rule spaces (including estimations of a universal and objective measure such as Kolmogorov complexity) yielded the same results provides a robust answer to the question regarding the changing fractions of behaviour, and the rate of this change, in computer programs with increasing resources, where complexity is clearly seen to dominate asymptotically.

To know more about this:

Information Special Issue “Physics of Information”

Posted in General on May 28th, 2013 by Hector Zenil – Be the first to comment

Special Issue “Physics of Information”

A special issue of Information (ISSN 2078-2489).

Deadline for manuscript submissions: 20 November 2013

Special Issue Editors

Guest Editor
Dr. Hector Zenil
Unit of Computational Medicine, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
Website: http://www.mathrix.org/zenil/
E-Mail: hectorz@labores.eu
Interests: algorithmic information theory; computational biology and complex networks

Guest Editor
Dr. Gordana Dodig-Crnkovic
School of Innovation, Design and Engineering, Computer Science Laboratory, Mälardalen University, Sweden
Website: http://www.idt.mdh.se/~gdc
E-Mail: gordana.dodig-crnkovic@mdh.se
Interests: philosophy of information and computing

Special Issue Information

Dear Colleagues,

Information has enabled new interpretations of quantum phenomena that promise fresh theoretical and practical insights. It also lies at the heart of statistical mechanics. Longstanding paradoxes in thermodynamics have been solved (or raised) using information and computation. In cosmology, it has been suggested that the universe has an unexpectedly limited capacity to store and process information, perhaps indicating that space and time are not fundamental properties of reality. Paradoxically, physics seems to impose constraints on information, such as the speed at which it can travel from one region to another, or how much information we can extract from a physical event at the smallest scale. Focusing on information flow, it has been suggested, will also help us better understand how cells and complex biological organisms work. Indeed, these days molecular biology is mostly an information science.

But it is computer science that has placed information at the center of the modern debate. Digital information has dominated technology over the past half century, empowering and extending human capabilities to new frontiers. How unreasonable is the effectiveness of digital computation in the natural sciences? Is computation the obvious description of information processing? What are the connections between carrying information, programming artificial and natural systems, and computation? What is the nature of the interplay between information, entropy and other complexity measures?

This special issue is devoted to all these questions as approached through rigorous and unpublished technical work from areas such as cosmology, astrophysics, mathematics, computer science, complexity science, biology, and neuroscience. The central topic for authors to bear in mind is some foundational aspect of information and reality, or information processing in nature.

Dr. Hector Zenil
Dr. Gordana Dodig-Crnkovic
Guest Editors

 

Submission

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. Papers will be published continuously (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are refereed through a peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed Open Access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 300 CHF (Swiss Francs) but fee waivers are possible for authors in developing countries and invited authors.

 

Keywords

  • information paradoxes in physics
  • programming reality
  • digital and informational physics
  • it from bit and/or bit from it
  • bit-string physics
  • quantum information and computation
  • computational thermodynamics
  • mathematical/computational universe hypothesis
  • simulation and simulacra
  • theories of quantum gravity
  • algorithmic information theory
  • information, algorithms and automata
  • models of information processing
  • natural computing

Help science and behave randomly

Posted in General on May 9th, 2013 by Hector Zenil – Be the first to comment

Take part in a cutting-edge massive online experiment to assess how humans perceive and generate randomness

Human Randomness Perception and Generation Project:

http://complexitycalculator.com/hrng/

For obvious reasons I cannot say much about it yet.

This is an experiment undertaken by the Algorithmic Nature Group of LABoRES.

The results will be announced in this blog. Stay tuned.

New volume: A Computable Universe

Posted in General on March 23rd, 2013 by Hector Zenil – Be the first to comment

With a foreword by Sir Roger Penrose, my new volume A Computable Universe has been published by World Scientific and Imperial College Press, following my bestseller Randomness Through Computation in 2011. A Computable Universe: Understanding and Exploring Nature as Computation introduces the views of the pioneers of the computable-universe view of the world and of some of its strongest opponents (including Penrose himself).

 


In the context of the Alan Turing Year, the book discusses the foundations of computation in relation to nature.

  • What is computation?
  • How does nature compute?

The contributions discuss computation in the world from a variety of perspectives, ranging from foundational concepts to pragmatic models to ontological conceptions and philosophical implications.

The volume provides a state-of-the-art collection of technical papers and non-technical essays, representing a field that assumes information and computation to be key in understanding and explaining the basic structure underpinning physical reality. It also includes a new edition, prepared by Adrian German from Indiana University Bloomington and myself, of Konrad Zuse’s “Calculating Space” (the MIT translation), fully retyped in shiny LaTeX with a few corrections and improvements to the figures; a free copy is available here (trivia: Zuse’s name was used in the film Tron: Legacy (2010), surely inspired by his mind-blowing ideas on a digital universe).

The book also includes a transcription of a panel discussion on the volume’s topic, featuring worldwide experts in quantum mechanics, physics, cognition, computation and algorithmic complexity, including a Nobel laureate. In this section I asked the authors (and other authors asked questions as well) what I thought an informed reader would ask them (including myself), sometimes leading to quite heated exchanges.

The volume is dedicated to the memory of Alan M. Turing — the inventor of universal computation, on the 100th anniversary of his birth, and is part of the Turing Centenary celebrations.

Free sample chapter(s)
Foreword (533 KB)
Chapter 1: Introducing the Computable Universe (476 KB), by myself.
Table of Contents:

    • Foreword (R Penrose)
    • Preface
    • Acknowledgements
    • Introducing the Computable Universe (H Zenil)
  • Historical, Philosophical & Foundational Aspects of Computation:
    • Origins of Digital Computing: Alan Turing, Charles Babbage, & Ada Lovelace (D Swade)
    • Generating, Solving and the Mathematics of Homo Sapiens. E Post’s Views on Computation (L De Mol)
    • Machines (R Turner)
    • Effectiveness (N Dershowitz & E Falkovich)
    • Axioms for Computability: Do They Allow a Proof of Church’s Thesis? (W Sieg)
    • The Mathematician’s Bias — and the Return to Embodied Computation (S B Cooper)
    • Intuitionistic Mathematics and Realizability in the Physical World (A Bauer)
    • What is Computation? Actor Model versus Turing’s Model (C Hewitt)
  • Computation in Nature & the Real World:
    • Reaction Systems: A Natural Computing Approach to the Functioning of Living Cells (A Ehrenfeucht, J Kleijn, M Koutny & G Rozenberg)
    • Bacteria, Turing Machines and Hyperbolic Cellular Automata (M Margenstern)
    • Computation and Communication in Unorganized Systems (C Teuscher)
    • The Many Forms of Amorphous Computational Systems (J Wiedermann)
    • Computing on Rings (G J Martínez, A Adamatzky & H V McIntosh)
    • Life as Evolving Software (G J Chaitin)
    • Computability and Algorithmic Complexity in Economics (K V Velupillai & S Zambelli)
    • Blueprint for a Hypercomputer (F A Doria)
  • Computation & Physics & the Physics of Computation:
    • Information-Theoretic Teleodynamics in Natural and Artificial Systems (A F Beavers & C D Harrison)
    • Discrete Theoretical Processes (DTP) (E Fredkin)
    • The Fastest Way of Computing All Universes (J Schmidhuber)
    • The Subjective Computable Universe (M Hutter)
    • What Is Ultimately Possible in Physics? (S Wolfram)
    • Universality, Turing Incompleteness and Observers (K Sutner)
    • Algorithmic Causal Sets for a Computational Spacetime (T Bolognesi)
    • The Computable Universe Hypothesis (M P Szudzik)
    • The Universe is Lawless or “Pantôn chrêmatôn metron anthrôpon einai” (C S Calude, F W Meyerstein & A Salomaa)
    • Is Feasibility in Physics Limited by Fantasy Alone? (C S Calude & K Svozil)
  • The Quantum, Computation & Information:
    • What is Computation? (How) Does Nature Compute? (D Deutsch)
    • The Universe as Quantum Computer (S Lloyd)
    • Quantum Speedup and Temporal Inequalities for Sequential Actions (M Żukowski)
    • The Contextual Computer (A Cabello)
    • A Gödel-Turing Perspective on Quantum States Indistinguishable from Inside (T Breuer)
    • When Humans Do Compute Quantum (P Zizzi)
  • Open Discussion Section:
    • Open Discussion on A Computable Universe (A Bauer, T Bolognesi, A Cabello, C S Calude, L De Mol, F Doria, E Fredkin, C Hewitt, M Hutter, M Margenstern, K Svozil, M Szudzik, C Teuscher, S Wolfram & H Zenil)
  • Live Panel Discussion (transcription):
    • What is Computation? (How) Does Nature Compute? (C S Calude, G J Chaitin, E Fredkin, A J Leggett, R de Ruyter, T Toffoli & S Wolfram)
  • Zuse’s Calculating Space:
    • Calculating Space (Rechnender Raum) (K Zuse)
    • Afterword to Konrad Zuse’s Calculating Space (A German & H Zenil)

Available at Amazon and your favorite online retailers.

My new book: Irreducibility and Computational Equivalence

Posted in General on March 23rd, 2013 by Hector Zenil – Be the first to comment

Irreducibility and Computational Equivalence is a collection of contributions by outstanding authors in celebration of Stephen Wolfram’s A New Kind of Science, ten years after its publication. Published by Springer Verlag, it is already available in both paperback and hardcover, as well as an eBook and a CDF (computable document).

Here is a free preview and the book website from Springer.

It is clear that computation is playing an increasingly prominent role in the development of mathematics, as well as in the natural and social sciences. The work of Stephen Wolfram over the last several decades has been a salient part of this phenomenon, helping to found the field of Complex Systems, with many of the constructs and ideas in his book A New Kind of Science (NKS) becoming part of the scientific discourse and general academic knowledge, from the now established Elementary Cellular Automata to the unconventional concept of mining the Computational Universe, from today’s widespread use of Wolfram’s behavioural classification to his principles of Irreducibility and Computational Equivalence.

I found this volume fascinating in its efforts to flesh out the computational implications for biology more generally.
– Dr. Mark Changizi

I believe that this book will be an inspiration for future work in interdisciplinary research at the intersection of computer science, natural and social sciences.
– Prof. Ivan Zelinka

The volume, with a Foreword by Gregory Chaitin and an Afterword by Cris Calude, covers these and other topics related to or motivated by Wolfram’s seminal ideas, reporting on research undertaken in the decade following the publication of Wolfram’s NKS book. Featuring 39 authors, its 23 contributions are organized into seven parts.

It is now available from Amazon and any large online bookstore, and directly from Springer itself.

Table of Contents:

Foreword
Gregory Chaitin

Part I Mechanisms in Programs and Nature

1. Hyperbolic Cellular Automata
Maurice Margenstern
2. A Lyapunov View on the Stability of Cellular Automata
Jan M. Baetens & Bernard De Baets
3. On the Necessity of Complexity
Joost J. Joosten
4. Computational Technosphere and Cellular Engineering
Mark Burgin

Part II The World of Numbers & Simple Programs

5. Cellular Automata: Models of the Physical World
Herbert W. Franke
6. Symmetry and Complexity of Cellular Automata: Towards an Analytical Theory of Dynamical System
Klaus Mainzer
7. A New Kind of Science: Ten Years Later
David H. Bailey

Part III Everyday Systems

8. A New Kind of Finance
Philip Z. Maymin
9. The Relevance and Importance of Computation Universality in Economics
Kumaraswamy Velupillai
10. Exploring the Sources of and Nature of Computational Irreducibility
Brian Beckage, Stuart Kauffman, Louis Gross, Asim Zia, Gabor Vattay and Chris Koliba

Part IV Fundamental Physics

11. The Principle of a Finite Density of Information
Gilles Dowek and Pablo Arrighi
12. Artificial Cosmogenesis: A New Kind of Cosmology
Clément Vidal
13. Do Particles Evolve?
Tommaso Bolognesi

Part V The Behavior of Systems & the Notion of Computation

14. An Incompleteness Theorem for the Natural World
Rudy Rucker
15. Pervasiveness of Universalities of Cellular Automata: Fascinating Life-like Behaviours
Emmanuel Sapin
16. Wolfram’s Classification and Computation in Cellular Automata Classes III and IV
Genaro J. Martinez, Juan Carlos Seck Tuoh Mora and Hector Zenil

Part VI Irreducibility & Computational Equivalence

17. Exploring the Computational Limits of Haugeland’s Game as a Two-Dimensional Cellular Automaton
Drew Reisinger, Taylor Martin, Mason Blankenship, Christopher Harrison, Jesse Squires and Anthony Beavers
18. Irreducibility and Computational Equivalence
Hervé Zwirn and Jean-Paul Delahaye
19. Computational Equivalence and Classical Recursion Theory
Klaus Sutner

Part VII Deliberations and Philosophical Implications

20. Wolfram and the Computing Nature
Gordana Dodig-Crnkovic
21. A New Kind of Philosophy. Manifesto for a Digital Ontology
Jacopo Tagliabue
22. Free Will For Us, not For Robots
Selmer Bringsjord

Afterword
Cristian Calude

Available from Amazon and any large online bookstore, and directly from Springer itself.

Life as Thermodynamic Evidence of Algorithmic Structure in Natural Environments

Posted in General on March 23rd, 2013 by Hector Zenil – Be the first to comment

In evolutionary biology, attention to the relationship between stochastic organisms and their stochastic environments has leaned towards the adaptability and learning capabilities of the organisms rather than toward the properties of the environment. This article is devoted to the algorithmic aspects of the environment and its interaction with living organisms. We ask whether one may use the fact of the existence of life to establish how far nature is removed from algorithmic randomness. The paper uses a novel approach to behavioral evolutionary questions, using tools drawn from information theory, algorithmic complexity and the thermodynamics of computation to support an intuitive assumption about the near optimal structure of a physical environment that would prove conducive to the evolution and survival of organisms, and sketches the potential of these tools, at present alien to biology, that could be used in the future to address different and deeper questions. We contribute to the discussion of the algorithmic structure of natural environments and provide statistical and computational arguments for the intuitive claim that living systems would not be able to survive in completely unpredictable environments, even if adaptable and equipped with storage and learning capabilities by natural selection (brain memory or DNA).

This new paper has recently been published online and is freely available from MDPI here. The statistics show 355 readers in only 3 months.

Zenil, H.; Gershenson, C.; Marshall, J.A.R.; Rosenblueth, D.A. Life as Thermodynamic Evidence of Algorithmic Structure in Natural Environments. Entropy 2012, 14, 2173-2191.

Announcing the Online Algorithmic Complexity Calculator

Posted in Algorithmic information theory, Complexity, General on March 23rd, 2013 by Hector Zenil – Be the first to comment

We have made available a basic beta version of an Online Algorithmic Complexity Calculator implementing the methods we have developed in recent papers at the Algorithmic Nature lab.

The OACC provides a comprehensive framework of universal mathematical measures of algorithmic complexity for researchers and professionals. It retrieves objective numerical measures of randomness for potential applications in a very wide range of disciplines, from bioinformatics to psychometrics, from linguistics to economics.

It is based on several years of research devoted to new methods for evaluating the information content and algorithmic complexity of objects. The Coding Theorem method for dealing with short strings is described in this paper. The calculator currently retrieves numerical approximations to Kolmogorov complexity and Levin’s universal distribution (or Solomonoff algorithmic probability) for short binary strings, for which lossless compression algorithms fail as a method of approximating program-size complexity, hence providing a complementary and useful alternative to compression algorithms. More algorithmic information measures, more data and more techniques will be incorporated gradually in the future, covering a wider range of objects, such as longer binary strings, non-binary strings and n-dimensional objects (such as images).

It also includes a Short String Complexity Explorer (it may take some time to load if you open the link), a tool developed in Mathematica that runs online with a free player. It provides a comparison of the estimated Kolmogorov complexity (K), the algorithmic probability (m), Shannon’s entropy (E) and the compressibility (using Deflate) of a string. It also provides an indication of the string’s relative complexity among all other strings for which K has been approximated, and its rank in the distribution.
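
The two classical columns of that comparison, Shannon entropy over the characters and Deflate-compressed length, are easy to reproduce; here is a minimal Python sketch (the K and m columns come from the Coding Theorem Method and are not computed here).

    # Shannon entropy and Deflate length of short strings. Note that both
    # example strings get the same entropy (same character frequencies);
    # telling them apart is precisely what the K and m estimates are for.
    import math, zlib

    def shannon_entropy(s):
        probs = [s.count(c) / len(s) for c in set(s)]
        return -sum(p * math.log2(p) for p in probs)

    for s in ("0101010101", "0010111010"):
        print(s, round(shannon_entropy(s), 3), len(zlib.compress(s.encode(), 9)))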

Calculating a Universal Distribution to Approximate Kolmogorov-Chaitin Complexity

Posted in Algorithmic information theory, Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, New Ideas on December 12th, 2012 by Hector Zenil – Be the first to comment

Computing the incomputable has always been a challenge. Take, for example, the search for the busiest Turing machines (Rado) of a given number of symbols and states (whimsically called busy beavers). This means either finding Turing machines that, starting from an empty input, produce more non-blank symbols on their output tapes before halting than any other Turing machine of the same size, or Turing machines that, also starting from an empty input, have a greater runtime before halting than any other Turing machine of the same size. Both problems are ultimately undecidable because of the Turing-complete capabilities of Turing machines, as proven by Alan Turing himself (that is, the capability of some Turing machines to simulate any other Turing machine).

In this new paper we describe how we have managed to calculate an approximation of the so-called Universal Distribution (aka Levin’s semi-measure), which connects the frequency of production of a string to its Kolmogorov complexity (K). The chief advantage of calculating an approximation of the Universal Distribution is that it is an incremental process over an average of a large number of Turing machines. One doesn’t get rid of the constant from the invariance theorem of algorithmic information theory (which appears, for example, when Kolmogorov complexity is measured using 2 different universal Turing machines), yet one seems to have to make fewer arbitrary decisions.
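
For reference, the connection is Levin's Coding Theorem. In standard notation, with U a prefix universal Turing machine:

    m(s) \;=\; \sum_{p \,:\, U(p) = s} 2^{-|p|}, \qquad K(s) \;=\; -\log_2 m(s) + O(1).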

One of the main advantages is that one can better deal with strings of very short length. Think about it! If one wished to approximate K for a single bit by using compression algorithms, a lossless compression algorithm would not be able to compress the single bit any further. And this happens not only for a single bit but for all strings up to a certain minimal length, for which lossless compression algorithms are simply unsuitable (recall that a compression algorithm includes the decompression instructions together with the data in the new compressed object in order to make it self-decompressible).
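
The point is easy to see empirically. A minimal sketch using zlib (chosen here only as a familiar lossless compressor): for very short inputs the compressed object is longer than the input, because the format's own overhead dominates.

    # For short strings the "compressed" size exceeds the original length,
    # so compressed length tells us nothing useful about K for such strings.
    import zlib

    for s in (b"0", b"01010101", b"0" * 1000):
        print(len(s), "->", len(zlib.compress(s, 9)))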

The usefulness of lossless compression algorithms as a method for approximating K derives from the fact that compression is a sufficient test of non-randomness. This is because K is, more precisely than an uncomputable function, upper semi-computable, meaning that one can estimate upper bounds. The lossless compressed length of an object s (e.g. a string) is therefore an upper bound on K(s). The usefulness of the Coding Theorem Method (the theorem presented in this paper) will ultimately come down to whether it is useful in applications, which is the main motivation behind its conception, given the failure of compression algorithms to provide information about the possible K(s) values for s that are too short (shorter, say, than the typical size of the length of the instructions that a lossless compression algorithm has to add to the compressed object).

The state of the art of the Coding Theorem Method can be gleaned from this paper, recently made public by my colleagues and me, announcing the release of the calculation of a universal distribution based on (5,2), that is, all Turing machines with 5 states and 2 symbols: Calculating Kolmogorov Complexity from the Output Frequency Distributions of Small Turing Machines.

It represents a major improvement over this previous calculation of mine, and required new and more powerful techniques to deal with a semi-computable distribution. It improves our previous work in terms of both accuracy and coverage (number of short strings), and validates previous calculations of universal distributions, showing the incremental nature of the method to be fully compatible with the universal distributions previously calculated from smaller samples of small Turing machines (for which the known Busy Beaver values could be used).

In this other paper we explain why our approximations of K are real-number values, showing that strict integer-valued program size follows our Coding Theorem Method, and thus that ours constitutes a more fine-grained measure. It is also shown that logical depth departs from both strict program-size and Coding Theorem Method evaluations, in agreement with the theory for all three measures. The paper is available online at: Correspondence and Independence of Numerical Evaluations of Algorithmic Information Measures.

In the next post I will be announcing, and also briefly explaining, the results from another paper showing not only that our Coding Theorem Method can deal with short strings, but that we have found a way to validate the method by lossless compressibility. Moreover, we have found that in the transition period, where the Coding Theorem Method starts to be too expensive to be of practical use whereas the compression method starts to provide some results, the 2 methods are in great agreement with each other. Like an expensive microscope of great power (e.g. the LHC at CERN), our Coding Theorem Method requires an incredible amount of calculation. The trick is to know when to use a microscope, that is, to know when a microscope is required rather than a telescope. We believe we are providing the tools to deal with the information content of the smallest entities in the computational universe.

Paper published: Life as Thermodynamic Evidence of Algorithmic Structure in Natural Environments

Posted in General on November 6th, 2012 by Hector Zenil – Be the first to comment

Life as Thermodynamic Evidence of Algorithmic Structure in Natural Environments

In evolutionary biology, attention to the relationship between stochastic organisms and their stochastic environments has leaned towards the adaptability and learning capabilities of the organisms rather than toward the properties of the environment. This article is devoted to the algorithmic aspects of the environment and its interaction with living organisms. We ask whether one may use the fact of the existence of life to establish how far nature is removed from algorithmic randomness. The paper uses a novel approach to behavioral evolutionary questions, using tools drawn from information theory, algorithmic complexity and the thermodynamics of computation to support an intuitive assumption about the near optimal structure of a physical environment that would prove conducive to the evolution and survival of organisms, and sketches the potential of these tools, at present alien to biology, that could be used in the future to address different and deeper questions. We contribute to the discussion of the algorithmic structure of natural environments and provide statistical and computational arguments for the intuitive claim that living systems would not be able to survive in completely unpredictable environments, even if adaptable and equipped with storage and learning capabilities by natural selection (brain memory or DNA).

Zenil, Hector; Gershenson, Carlos; Marshall, James A.R.; Rosenblueth, David A. 2012. “Life as Thermodynamic Evidence of Algorithmic Structure in Natural Environments.” Entropy 14, no. 11: 2173-2191.

http://www.mdpi.com/1099-4300/14/11/2173

Hypercomputation in A Computable Universe

Posted in Computability, Universality and Unsolvability, Foundations of Computation, General, Recreation on September 21st, 2012 by Hector Zenil – Be the first to comment

This is the full version of my answer to a question put to me by Francisco A. Doria, included in the Discussion Section of A Computable Universe, my edited volume published by World Scientific and Imperial College Press and coming out next month (already available in Asia), concerning whether I think hypercomputation is possible:

I was once myself a hypercomputation enthusiast (my Master’s thesis, in French, was on “Hyper calcul” and focused on the feasibility of hypercomputational models). Paradoxically, it wasn’t until I had a good knowledge of the subject that I started to better appreciate, and to increasingly enjoy, the beauty of the digital (Turing) model. On the one hand, hypercomputational models do not converge in computational power. There is a plethora of possible models of hypercomputation, while there is only one digital model (in terms of computational power). I find it astonishing how little it takes to reach the full power of (Turing) universality, and I don’t see the power of universal machines as being subject to any limitation, quite the opposite. I happen to think the question has a truth value and is ultimately susceptible to a physical answer. However, I don’t think the answer will emerge in the foreseeable future, if it ever does.

Still, I found Doria’s contribution to the subject interesting, as he points out a particular (“ideal”) model of hypercomputation that I think gets around some of Martin Davis’ objections (The Myth of Hypercomputation), though it doesn’t fully address the problem of verification. Unlike a Turing computation, which can in principle be verified by carrying it out by hand, step by step, the inner workings of a hypercomputer can only be followed by another hypercomputer. The caricature version of the problem is the whimsical answer given by the (hyper?)computer Deep Thought in “The Hitchhiker’s Guide to the Galaxy” by Douglas Adams, which proposed “42” as the “Ultimate (uncomputable?) answer to the Ultimate Question of Life, The Universe, and Everything” [parentheses added]. The only way to verify such an answer would be to build another, more powerful and even less understandable computer. This makes me wonder whether we ought not to favour meaningful computation over what could potentially be hypercomputation, even if hypercomputation were possible.

There is a strong analogy to the concept of proof in math. Mathematical proofs seem to fall into two types. They either serve to convince us of the truth of a statement we weren’t certain was true, or else to provide logical evidence that a statement intuitively believed to be true was in fact so (e.g. the normality of the mathematical constant pi). But in the latter case, why would one bother to provide a proof of a statement that nobody would argue to be false? It is because ideally math proofs should provide insight into why a statement is true, and not simply establish whether or not it is so. There are a few exceptions of course. Some come from mathematical practice, for example Wiles’ proof of Fermat’s last theorem. It is not clear whether Wiles’ proof provides any insight into the original question of the truth value of Fermat’s theorem (it does contribute to understanding the connections among different powerful mathematical theories). Some other cases, especially among computer-automated proofs, are of the same kind, often neglecting the fact that a proof is also about explanation (for humans). In that sense I think we should also favour meaningful mathematical proofs (of meaningful mathematical questions!), just as we should better appreciate meaningful (digital) computation.

The study of infinite objects has given us great insight into profound and legitimate questions of math and computation, such as the nature of what a number is or what a computation is. And it has been immensely useful to focus on the limits of computation in order to better understand it. It is at the boundary between decidability and undecidability that one seems best positioned to answer the question of what computation means. And there are some examples where the study of a hypercomputational model is valuable not as a model of hypercomputation but for its ancillary results. Unlike most people, I think the contribution of, for example, Siegelmann’s ARNN model has much more to say about computational complexity (the ARNN model relativises P = NP and P != NP in a novel fashion), and therefore about classical computation, than about hypercomputation! (Siegelmann’s ARNN model has come to be strangely revered among hypercomputation enthusiasts.)

While I may agree that the problem of verification of a computation is not necessarily an argument against hypercomputation (because it can also be used against digital computation in practice), the answer lies only in the realm of physics and not in paper models. As Davis points out, it is no surprise that encoding non-computable numbers as weights in an artificial neural network leads to non-computable computations! So, assuming real numbers exist, the consequence is straightforward and the justification of such a hypercomputational model is simply circular.

Other models of hypercomputation are just fundamentally wrong, such as Peter Wegner’s model of “interactive computation”, which is claimed to break Turing’s barrier. For a recent commentary pointing out the flaws of yet another hypercomputational claim, you can consult this entertaining blog post by Scott Aaronson on his Toaster-Enhanced Turing Machine.

Meaningful Math Proofs and “Math is not Calculation”

Posted in Foundations of Math, General, New Ideas, Recreation on August 5th, 2012 by Hector Zenil – Be the first to comment

Mathematicians are generally thought to be very good at calculation, and they sometimes are, but this is not because math is about calculation, just as astronomy is not about telescopes and computer science is not about computers (paraphrasing Edsger Dijkstra).

But if it is not preeminently about calculation, then what is mathematics about? The common answer, given by mathematicians themselves, is that it is about solving problems, learning to do so, and learning to think in a logical fashion. I think this may also be misleading. Let’s take as an example Fermat’s last theorem (which may itself be a misleading example, as math is not only about historically unresolved questions with little impact; in some sense, Fermat’s last theorem is not much more than a curiosity).

As is well known, Fermat’s last theorem was solved by Andrew Wiles in 1994. The solution wasn’t simple at all, and it required a lot of convoluted math (Wiles himself got it wrong first and fixed it a few years later). The solution required far more machinery than was needed to formulate Fermat’s original statement; it is like going after a fly with a bazooka. For some mathematicians concerned with the foundations of math, Wiles’ proof may not be as satisfactory as they would have wished, even if it is a remarkable piece of mathematical work connecting several areas of advanced math.

Wiles’ exact solution couldn’t have been arrived at earlier, as it draws upon some of the most sophisticated areas of modern math: it transfers a pure problem of arithmetic into a very powerful framework founded on algebraic geometry and analysis (elliptic curves, modular forms). If Fermat had a solution, it wouldn’t have been at all similar to Wiles’. In fact, even before Wiles’ proof, no mathematician seriously thought that Fermat’s last theorem was false. The consensus was that the theorem was true, so the long expected proof was supposed to shed light on the reason behind this truth rather than simply confirm it. Wiles’ great achievement only confirmed that with a very sophisticated set of tools one could prove Fermat’s last theorem, but it didn’t shed any light on why it is true (and if it does, it is still hard to justify a tradeoff in which shedding some light on arithmetic requires the use of set theory).

Wiles’ solution doesn’t answer the question of whether Fermat himself had a more basic (but no less valid) proof. The most likely case is that Fermat’s last theorem is actually independent of number theory, and requires the full power of set theory. Hence Fermat very likely had no such solution, contrary to what he claimed.

Fermat’s theorem is, however, a rare case of the ultimate great challenge, and not part of the common practice of the “average” mathematician. The work of an “average” mathematician depends very much on their field. If the field is concerned with foundations, it may involve a lot of set theory and logic, which in large part involves finding sets of formulae to prove other sets of formulae and looking for relativisations. In algebra or group theory, or even topology, the work may have to do with all sorts of symmetries, or with ways to characterise things differently in order to establish new and unanticipated connections, shedding more light on one area in terms of another, from a different point of view (and employing a different language). Such a connection is in fact the greatest achievement of Wiles’ proof of Fermat’s last theorem, as it connects several subdisciplines of modern mathematics, building sophisticated intuitions about true statements ultimately connected to Fermat’s last theorem.

I think math can be better described as a discipline of patterns (e.g. mappings, homeomorphisms, isomorphisms). Patterns are integrally related to the discipline I value most: algorithmic information theory, which is the ultimate field of the study of patterns, patterns in strings and objects, leading to applications such as compression algorithms.

One of the consequences of this change of mindset is that it will allow students access to computers. Computers and humans together can achieve an impressive number of things; they do so every day in all sorts of fields, from car design to the development of new chips for the next generation of computers. In fact, airplanes today are mostly flown by computers; no pilot or set of pilots alone would be able to fly a modern airplane these days, and more than 60 to 70% (in some markets even 90%) of the operations in any top stock market are made by computers, with no human intervention. Today’s industry wouldn’t be conceivable without the combined forces of human creativity and motivation, and the computer’s speed, accuracy and productivity.

There is an important new trend spearheaded by the UK initiative Computer-Based Math promoting these very ideas. Conrad Wolfram has recently explained in a blog post how computer programming could not only help to teach math, but how the two could merge into a single discipline. I may have a chance to talk about all this at the House of Lords soon, where the education shift would need to take official form.

Conjectures concerning Busy Beavers, Dynamic Behavior and Turing Universality

Posted in Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation on June 1st, 2012 by Hector Zenil – Be the first to comment

In a recent paper I have advanced some conjectures using a coefficient that renders aspects of the qualitative behavior of complex systems in quantitative terms. It measures the sensitivity of a system to external stimuli, that is, its apparent ability to (efficiently) transfer information from the input to the output.

In a previous paper, and in a quite different context, I proposed the conjecture that systems capable of (efficient) Turing universal computation had large coefficients (greater than 1), meaning that they are sensitive enough to external stimuli.

It is evident that a system that doesn’t react to its input is unlikely to be (efficiently) programmable, which is what Turing universality is about. Recall that Turing universality is the notion of a machine capable of simulating any other machine of the same type. More formally, if U is a universal Turing machine and <M> the encoding of a Turing machine M to be run with input s, one can run U with input <M, s>, and U will then behave like M with input s, where M and s are any Turing machine and any input.

In that paper I used this coefficient as a phase transition detector, and it so happened that the way in which it was defined made it eminently capable of classifying the behavior of cellular automata and other systems, even yielding Wolfram’s well-known classes of behavior. The paper is available online here.
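
As a toy illustration only (this is not the coefficient C defined in the papers, whose definition is more involved): one crude way to probe sensitivity to external stimuli is to flip a single cell of an elementary CA's initial condition and measure how much of the final row changes.

    # Toy "damage spreading" probe, NOT the author's coefficient C.
    # Insensitive rules barely react to a one-cell perturbation; rules with
    # complex behavior let the perturbation spread across the output.
    def eca_step(row, rule):
        n = len(row)
        return [(rule >> (row[(i - 1) % n] * 4 + row[i] * 2 + row[(i + 1) % n])) & 1
                for i in range(n)]

    def final_row(rule, row, steps=100):
        for _ in range(steps):
            row = eca_step(row, rule)
        return row

    def sensitivity(rule, width=101, steps=100):
        base = [0] * width
        base[width // 2] = 1
        perturbed = list(base)
        perturbed[width // 2 + 1] = 1       # flip one extra cell of the input
        a = final_row(rule, base, steps)
        b = final_row(rule, perturbed, steps)
        return sum(x != y for x, y in zip(a, b)) / width

    for rule in (0, 4, 90, 110, 30):
        print(rule, round(sensitivity(rule), 3))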

Now I have advanced another set of conjectures, this time concerning a set of Turing machines defined by the way they behave: Busy Beaver Turing machines. These are machines that do a lot of work before halting. More formally, they print more non-blank symbols and have the greatest runtime among machines of the same size (number of states and symbols). I have written a program showing all known and candidate Busy Beaver machines (in Mathematica, runnable in your browser by downloading the Mathematica CDF player):
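
(The program referred to above is an interactive Mathematica demonstration. As a rough, independent stand-in, here is a minimal Python simulator running what is known to be the 2-state, 2-symbol Busy Beaver champion, which writes 4 ones in 6 steps.)

    # Minimal Turing-machine simulator; runs the 2-state, 2-symbol Busy Beaver.
    def run(delta, state="A", halt_state="H", max_steps=10_000):
        tape, head, steps = {}, 0, 0
        while state != halt_state and steps < max_steps:
            write, move, state = delta[(state, tape.get(head, 0))]
            tape[head] = write
            head += 1 if move == "R" else -1
            steps += 1
        return steps, sum(tape.values())

    # (state, symbol read) -> (symbol written, head move, next state)
    bb2 = {
        ("A", 0): (1, "R", "B"), ("A", 1): (1, "L", "B"),
        ("B", 0): (1, "L", "A"), ("B", 1): (1, "R", "H"),
    }

    print(run(bb2))  # expected: (6, 4) -- 6 steps, 4 non-blank symbols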

Let bb(n) be the Busy Beaver with n states (and 2 symbols). The conjectures say that:

1. (strong version): For all n > 2, bb(n) is capable of universal computation.
2. (sparse version): For some n, bb(n) is capable of universal computation.
3. (weak version): For all n > 2, bb(n) is capable of (weak) universal computation.
4. (weakest version): For some n, bb(n) is capable of (weak) universal computation.

They are based on the fact that Busy Beaver machines are complex, in Bennett’s sense. Bennett’s notion of “logical depth” captures the complexity of a string s in terms of the unfolding time of the shortest (or set of nearly shortest) programs producing said string. The outputs of Busy Beaver Turing machines are therefore, by definition, the ones with the greatest logical depth. If they print more symbols than any other machine and take the greatest time to do so, they have maximal logical depth. Busy Beaver machines also produce their outputs with minimal resources (if there were shorter machines producing the same outputs, those would be the Busy Beaver machines), hence they are the shortest programs producing those strings. Busy Beaver machines are therefore by definition very deep (in Bennett’s sense), meaning they are complex in this sense (logical depth is also profoundly related to Kolmogorov complexity, given that the definition involves the shortest programs).
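
In a simplified form (my notation; Bennett's full definition also involves a significance parameter that this sketch ignores), the notion reads:

    \mathrm{depth}(s) \;=\; \min\{\, \mathrm{time}(p) \;:\; U(p) = s \ \text{and}\ |p| = K(s) \,\}.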

Besides being deep, Busy Beavers have another characteristic: they halt. So they don’t just do their stuff without worrying about whether to stop. One can easily encode a “Busy Beaver” that loops and produces any arbitrary number of non-blank symbols (say, for example, an infinite loop printing 1s), but Busy Beavers don’t. They save a state with which to halt at the end. Machines that are not able to halt (the majority) cannot be Turing universal, because they would never produce any output, simply because by definition something is an output only if the machine has halted. But machines that always halt would be decidable, and therefore not universal either.

So among the things one might attempt in order to amass evidence in favor of a positive answer to the conjectures of Turing universality for Busy Beaver machines is to encode some input on which they don’t stop (and to prove as much), as a first step towards a proof of universality. If Busy Beavers are capable of halting for some inputs (e.g. the empty input, as this is entailed in the definition of a Busy Beaver) but are also capable of not halting for others, this would strengthen the conjectures. Busy Beavers may, however, be very different from one another, as they are defined by behavior rather than by their specific inner workings, but if the behavior of a Busy Beaver in any way determines something that captures a property of all of them, one could use the same argument for or against all of them towards proving or disproving these conjectures.

There seems to be no easy way to prove that all Busy Beavers are capable of universal computation, but such a proof, if it exists, would be the first to characterize a non-trivial set of well defined universal Turing machines. Non-trivial, because of course one can build a universal Turing machine and keep adding resources, proving that the resulting machines are all universal, just as happens with Turing-complete programming languages. But Busy Beaver machines would be characterized by nothing but a qualitative behavior. This makes the assertions of the conjectures more interesting.

It could be the case that Busy Beavers are not capable of Turing universality when started on blank tapes. In fact, Pascal Michel, a world expert in the field of Busy Beavers, thinks that Busy Beavers may be capable of (weak) universal computation (personal communication), that is, universal computation starting from non-blank tape configurations (e.g. a periodic or bi-periodic tape configuration). This accords with another famous open problem in computer science, the Collatz conjecture. The conjecture says that whatever positive integer you start with, if n is even you divide it by 2 to get n/2, and if n is odd you multiply it by 3 and add 1 to obtain 3n + 1; iterating this process, you will always end up reaching 1. I have also written 2 small computer programs (also in Mathematica) illustrating this conjecture:
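Here is a minimal sketch of such a program (my own short version, not necessarily the code behind the demonstrations mentioned above):

(* One step of the Collatz map: halve n if even, send n to 3n + 1 if odd;
   NestWhileList iterates until the value 1 is reached. *)
collatz[n_Integer?Positive] := NestWhileList[If[EvenQ[#], #/2, 3 # + 1] &, n, # != 1 &]

collatz[27]  (* a famously long trajectory that, as conjectured, ends in 1 *)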

According to Michel, Busy Beavers would be related to Collatz-type functions (see his paper "Small Turing machines and generalized busy beaver competition"), via conjectures that are related to, though more general than (in the context of small Turing machines), the ones I am discussing here.

Today, we know that Turing universality requires very little. For example, Wolfram conjectured that a particular 2-state, 3-symbol Turing machine was universal, on the basis of the aspects of its behavior that he studied. And as proven by Alex Smith, this 2,3 Turing machine turns out to be capable of universal computation. The machine is surprisingly small, and indeed as small as it can possibly be, given that we know that no 2,2 Turing machine can be Turing universal (a proof was recently provided by Maurice Margenstern in a Complex Systems paper). Today we also know many minimalistic systems capable of universal computation: Conway's Game of Life, Wolfram's Rule 110, many Post tag systems and a variation called cyclic tag systems, and Wang tiles. And at least two important researchers in the field, Martin Davis and Maurice Margenstern, have recently argued for the ubiquity of Turing universality, as I have myself, along the lines anticipated by Wolfram's Principle of Computational Equivalence.

Finally, I advanced a last conjecture relating Busy Beavers and my coefficient C in this paper, establishing that:

C(bb(n)) > 0

That is, Busy Beavers have a positive coefficient C, i.e. they are quite sensitive to input stimuli. If the very first conjecture is true (that Turing universal machines have a large coefficient), this makes the coefficient a good place to start in making the case for the universality of Busy Beavers.

The paper, "On the Dynamic Qualitative Behavior of Universal Computation", is available at ArXiv and has recently been published in the journal Complex Systems (vol. 20-3), also available online here.

Turing’s Deep Field: Visualizing the Computational Universe

Posted in Complexity, Computability, Universality and Unsolvability, Computer Science, Foundations of Computation, Foundations of Math, Mathematical Logic on January 5th, 2012 by Hector Zenil – Be the first to comment

I generated this image in the course of an investigation of the distribution of runtimes of programs in relation to the lengths of mathematical proofs, the results of which are published in my paper "Computer Runtimes and the Length of Proofs with an Algorithmic Probabilistic Application to Optimal Waiting Times in Automatic Theorem Proving", in Volume 7160 of the Lecture Notes in Computer Science (LNCS) series, a festschrift for Cristian Calude.

The preprint is now available online in the ArXiv here. The paper shows that as theoretically predicted, most machines either stop quickly or never halt. It also suggests how a theoretical result for halting times may be used to predict the number of random theorems (dis)proven in a random axiom system –see the section I’ve called “Gödel meets Turing in the computational universe”.

Turing View of The Computational Universe

The plot was the winning image in this year's Kroto Institute Image Competition, in the Computational Imagery category, under the title "Runtime Space in a Peano Curve". It shows the distribution of runtimes obtained by simulating the (4n+2)^(2n) = 10000 Turing machines with 2 symbols and n=2 states (out of a total of more than 10^12 simulated Turing machines with up to n=4 states), following a quasi-lexicographical order along a Peano curve, which preserves, as far as possible, the distance between 2 machines when a 1-dimensional enumeration of Turing machines is arranged into a 2-dimensional array.
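To give an idea of the arrangement (a hedged sketch only, not the code behind the winning image, and using random placeholder data instead of actual runtimes), one can fold a 1-dimensional enumeration into a 2-dimensional array with a simple boustrophedon ("snake") ordering, a crude stand-in for the Peano-curve packing described above:

side = 100;
runtimes = RandomInteger[{1, 107}, side^2];   (* placeholder runtime data *)
rows = Partition[runtimes, side];             (* cut the 1-D enumeration into rows *)
snake = MapIndexed[If[EvenQ[First[#2]], Reverse[#1], #1] &, rows];  (* reverse every second row *)
ArrayPlot[snake, ColorFunction -> "TemperatureMap"]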

In the image each point or cluster of points represents a Turing machine or a group of Turing machines, and the color is determined by a spectrum encoding their halting runtimes: the lighter the square, the sooner the machine entered the halting state. White cells represent machines that are proven never to halt. Red cells show the Turing machines that take the longest to halt (popularly called Busy Beavers). Knowing the values of the Busy Beaver functions is what allows us to identify the machines that never halt (depicted in white). The image is noncomputable, meaning that the process cannot be arbitrarily extended, because of the undecidability of the halting problem (i.e. there is no procedure for ascertaining the color of the next pixels in order to zoom out the picture and cover a larger fraction of the computational universe). To put it in the words of crystilogic,

What you’re looking at is a subset of descriptions of all possible things, and some impossible things. This is possibility and reality compressed into an image.

Turing machines with an arbitrary number of states can encode any possible mathematical problem and are therefore perfect compressors of the known, the yet to be known, and even the unknowable (due to the undecidability of the halting problem).

This is how the image looks as displayed in the stairway of the Kroto Research Institute in the UK:

Some postcards with the winning image were printed; they can be sent to scholars or enthusiasts upon request by emailing hectorz[at] labores.eu.

I want to dedicate this prize to my former thesis advisor Jean-Paul Delahaye, who suggested the Peano arrangement to me as a packing for my visual results of halting times, and also to Cristian Calude, who co-advised me throughout my PhD and encouraged me to publish this paper; what better place to do so than his festschrift?

I’m releasing the images under an open licence in honour of the Turing Year, so that it may be used for any artistic illustration of some of Turing’s main results (the halting problem) or for any other purpose in any medium.

Postcards of the winning image are also available upon request. Just send an email requesting 1 or more postcards to hectorz [at] labores.eu, or use it to let me know (if you can) that you will be using any of the images (or if you need higher-resolution versions).

Towards a Web of One…

Posted in General on November 25th, 2011 by Hector Zenil – 1 Comment

The author is a little more critical of personalization than I'd be inclined to be, and in my opinion he gives too much credit to human editors (I think they can be as bad as or worse than algorithmic editors; witness the hacking scandals in the UK, a phenomenon not exclusive to Great Britain).

Eli Pariser's words resonate powerfully with the way I see the web and how it has evolved into its current form, with its relentless emphasis on social networking. My main concern is not personalization but trivialisation, owing not only to the inane content posted by individual users but also to that pushed by the companies controlling the web.

Social networking has been driven by an insidious knack for making clever people spout nonsense, and those who succumb make the place invasive in every sense. Not to mention the frantic race to acquire more friends (there have been reports of periods in which this phenomenon reverses, perhaps when people realise how meaningless these connections are and start unfriending their superfluous contacts, trimming their friend lists). Google+ and Facebook eventually came up with the idea of Acquaintances as distinct from Friends, in large part to address this.

The sad thing is that in many respects the Internet has brought out pretty much the worst in us rather than making us better. Here is a little meaningful piece of content though, to make us think about it:

David Hilbert and Deutsch in den Wissenschaften

Posted in General on November 13th, 2011 by Hector Zenil – Be the first to comment

David Hilbert was probably one of the most prominent German mathematicians, if not the most prominent. His interests ranged from quantum mechanics to topology, and notably from the foundations to the history of science and mathematics. Among his main interests was geometry, at which he excelled by formalizing the more than two-millennia-old (Euclidean) geometry, synthesizing it into a few (consistent and independent) sets of axioms and making the entire theory as mathematically precise as possible (in the tradition of Euclid himself, but carried to its culmination), to the level of precision at which any proof could from then on be carried out by a mechanical procedure manipulating symbols with no semantics attached.

His formalization of geometry certainly led him to think that mathematics could one day be entirely formalized in the same way, with any true theorem of a given mathematical theory being provable within it. In his famous 1900 lecture at the Sorbonne in Paris, Hilbert set much of the mathematical agenda for the 20th century. Among the problems he would pose was what he called the Entscheidungsproblem (the decision problem), formulated precisely in 1928 with Wilhelm Ackermann: the question of whether a mechanical treatment of mathematics would eventually make it possible to prove theorems automatically.

In his address to the Society of German Scientists and Physicians, in Königsberg on September 8, 1930, Hilbert closed his speech with a resounding "Wir müssen wissen — wir werden wissen!" (We must know — we will know!) (the radio broadcast was recorded and is available here, with an English transcription here). The statement captured his cast of mind so well that it was used as the epitaph on his tombstone in Göttingen.

Hilbert's ideas eventually led two mathematicians to make two of the greatest discoveries of the last century, settling Hilbert's open questions in the negative, with remarkable outcomes. First there was Kurt Gödel, who in 1931 found that in any (powerful enough) mathematical theory there are always (true) statements that the theory can never prove to be true (or false), calling them undecidable propositions. Then Alan Turing, in 1936, proved the undecidability of what is now called the halting problem for any general-purpose mechanical machine, providing along the way the first description of the modern digital computer and the foundations of a new science: computer science.

Hilbert’s “Wir müssen wissen — wir werden wissen!” made all this possible, and its impact is yet to be measured in the centuries to come.

My entry to the Deutsch in den Wissenschaften Competition (German in the Sciences) organized by the Goethe Institut.

A bit of cryptography, passwords strength, and sysadmins of savings banks

Posted in General, Natural Language and Computational Linguistics on August 14th, 2011 by Hector Zenil – Be the first to comment

This recent webcomic from xkcd, Password Strength, relates to what I’ve been investigating lately–the question of whether there need be a tradeoff between password strength and ease of recall, using, as usual, my knowledge of finite algorithmic information theory. While my conclusions will be presented in a paper in the next few months/years, the webcomic made me realize how defective even sophisticated things sometimes turn out to be. The webcomic draws attention to an unnecessary, even silly requirement for password generation.

Password strength is related to two main parameters: length and the likelihood of being guessed. Length is a simple parameter, the longer the better, but the likelihood of something being guessed is a bit trickier. What makes a string like 0000000 more likely to be cracked than say 0110101? Classical probability tells us that both have the same likelihood of occurring, thanks to (uniform) chance. But there is a good reason not to choose a word in any language as a password, viz. that someone trying to guess a password will assume that people wish to remember their password and therefore will most likely use meaningful strings, such as words, anniversary dates, and the like. Hackers use a technique called ‘dictionary attacks’ where every entry in a dictionary is presumed to be a potential password and used as such. The results are often spectacular. Despite the general awareness that words are not good passwords, when hackers perform these attacks they usually find at least a few people with weak, word-based passwords. The only reason not to use words is dictionary attack, so sysadmins took it upon themselves to dissuade people from choosing words as passwords, and the way they set about this was to ask users, and now quite often to force them, to:

1. Observe fixed length requirements;
2. Sometimes require a mixture of upper- and lowercase letters;
3. Sometimes require numbers and other symbols;
4. In the case of some sites, require frequent password changes.

While requirement number 1 makes some sense (people may argue, with reason, that a short but good password is superior to an easy-to-guess long one), requirements 2 to 4 are quite arbitrary, if not completely nonsensical.

If people didn't pick word-based passwords, and every character could be assumed to be equally likely, only the length of the password would matter. If a password is generated randomly, the probability of hitting a dictionary word by chance is very low, simply because words make up a tiny fraction of all possible character sequences. The problem is that words are not random: they are sequences of correlated letters, distributed according to the statistics of a language.

But under the circumstances, the way sysadmins decided to address this issue was to force people not to choose single words, by arbitrarily enforcing requirements 1 to 3. The result is that people frequently either no longer remember their passwords or cannot maintain different passwords for different sites (remember that if a password is disclosed at one site, it is effectively disclosed for every other site where it is used). To fix this, sysadmins introduced yet another measure, forcing people to change their passwords periodically, so that users can no longer synchronize their passwords among all the sites that need one. This leads them to write down their nonsensical passwords because they cannot recall them, to store them where they are not secure, or to ask the system to retrieve the password every time they want to log in, often over unsecured connections (e.g. public wireless networks). These arbitrary requirements aren't listed while you're trying to log in, so you may not remember whether you used a number or a symbol (it may seem a bad idea to make hackers privy to this, but they will be anyway if they have access to the registration process). The computational cost of retrieving or generating a new password may be low, but the human cost is not.

A longer password will always be better than a shorter one, even one containing all sorts of numbers and symbols. Numbers and symbols are only useful for preventing dictionary attacks, but if people use leeting (chat slang) then I'm afraid they do not solve the problem sysadmins are faced with, because hackers can simply apply the translation tables to their dictionaries and try again, which I'm sure would be very successful. There are leetspeak translation sites on the web (e.g. www.computerhope.com/cgi-bin/convert.pl). The only truly effective way to solve the problem is to require longer passwords (which also prevents the use of single words, because most common words have fewer than 8 characters, so asking for a 10-character password may be enough to preempt the choice of most dictionary words, though more characters can be required for greater safety).

Could this problem have been solved otherwise? It would certainly be too expensive, computationally speaking, to check every password against a dictionary in order to rule out every possible word in every language (and harder still to rule out anything having to do with the user's life, such as a forename or an anniversary date), but being asked to come up with silly characters has become unbearable for many users.

The xkcd comic gives you a foretaste of one side effect of my algorithmic approach to cryptography. One does not, and should not, have to sacrifice ease of recall for strength. Something as simple as composing a few random words has greater combinatorial strength than a leetspeak-corrupted single word or a password comprising a few random characters, with the added advantage that, even if longer, sentences are much easier to remember.
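A back-of-the-envelope comparison makes the point (my own illustration, with assumed alphabet and dictionary sizes rather than figures from the comic):

(* (a) an 8-character password drawn uniformly from the 95 printable ASCII
   characters vs. (b) a passphrase of 4 words drawn uniformly from a
   50,000-word dictionary; entropy in bits. *)
charPassword = 8*Log2[95.]       (* about 52.6 bits *)
wordPassphrase = 4*Log2[50000.]  (* about 62.4 bits *)

The longer passphrase wins, and it is the one a human can actually remember.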

There is no trick to this. Choosing a few words means a longer password, and putting them together circumvents dictionary attacks (performing dictionary attacks over full sentences is computationally too expensive, so there is no way for hackers to simply extend the technique).

One shouldn't need to choose between a good password and one that's easy to remember, because there are passwords that can be both. What one may not be able to do is have a short password that is easy to remember and also strong enough. Thus requirement 1 is the only important one, because one can also control the number of available dictionary words, e.g. by requiring a password length greater than that of most commonly used words. Zipf's law tells us that the required length will not be that great, because the longest words are used less frequently and are therefore also less likely to be picked as passwords (besides, systems could actually check that there is no word in any major language of the stipulated length).
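One can get a feel for this directly from Mathematica's English dictionary (a quick check of my own; the exact counts depend on the dictionary version):

lengths = StringLength /@ DictionaryLookup[];  (* lengths of all English dictionary words *)
Histogram[lengths]                             (* distribution of dictionary word lengths *)
Count[lengths, l_ /; l >= 10]                  (* how many words would survive a 10-character minimum *)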

Compare the result of putting 2 random short (but not that short) words together using Mathematica:

In[1]:= passwds = Table[
   StringJoin[RandomChoice[
     Select[DictionaryLookup[], (StringLength[#] < 6) && LowerCaseQ[#] &], 2]],
   {100}];  (* join 100 random pairs of lowercase English words of fewer than 6 letters *)

In[2]:= RandomChoice[passwds, 5]
Out[2]= {"rimecrave", "stoneobeys", "mixedtromp", "whiffmutt", "letopted"}

with their leetspeak versions that one would probably be required by sysadmins to use:

In[3]:= ToLeetSpeak /@ passwds  (* ToLeetSpeak is a user-defined helper that maps letters to leetspeak characters; not a built-in function *)
Out[3]= {"r1m3cr4v3", "570n30b3y5", "m1x3d7r0mp", "wh1ffmu77", "l370p73d"}

even though the joined words themselves do not exist in a standard English dictionary:

In[4]:=Select[passwds, MemberQ[DictionaryLookup[], #] &]
Out[4]={}

To have fun, you can also generate 2-word passwords in other languages using Mathematica:

In Spanish:

Table[StringJoin[RandomChoice[
   Select[DictionaryLookup[{"Spanish", All}], StringLength[#] < 6 &], 2]], {100}]

or French:

Table[StringJoin[RandomChoice[
   Select[DictionaryLookup[{"French", All}], StringLength[#] < 6 &], 2]], {100}]

Or any other of these languages:

In[5]:= DictionaryLookup[All]
Out[5]= {"Arabic", "BrazilianPortuguese", "Breton", "BritishEnglish", "Catalan", "Croatian", "Danish", "Dutch", "English", "Esperanto", "Faroese", "Finnish", "French", "Galician", "German", "Hebrew", "Hindi", "Hungarian", "IrishGaelic", "Italian", "Latin", "Polish", "Portuguese", "Russian", "ScottishGaelic", "Spanish", "Swedish"}

An alternative method (to compression) for approximating the algorithmic complexity of strings

Posted in Algorithmic information theory, Foundations of Computation, New Ideas on July 20th, 2011 by Hector Zenil – Be the first to comment

The method introduced in my doctoral dissertation was featured in the French edition of Scientific American, Pour La Science, in its July 2011 issue (No. 405), under the title Le défi des faibles complexités (The challenge of low complexities).

Jean-Paul Delahaye points out that:

Comme les très petites durées ou longueurs, les faibles complexités sont délicates à évaluer. Paradoxalement, les méthodes d’évaluation demandent des calculs colossaux.
(Like very small durations or lengths, low complexities are tricky to evaluate. Paradoxically, the evaluation methods require colossal calculations.)

and he continues:

Pour les petites séquences, cette mesure est stable et conforme à notre idée de la complexité; pour les grandes, elle est, d’après le théorème mentionné [the invariance theorem, my comment], conforme à la meilleure mesure de complexité unanimement admise, la complexité de Kolmogorov. Que demander de plus?
(For short strings, this measure is stable and conforms to our idea of complexity; for long strings, according to the aforementioned theorem [the invariance theorem-- my comment], it conforms to the best and universally accepted measure of complexity, Kolmogorov complexity. What more can one ask for?)



Imagine you are given 2 short strings and are asked to tell which one looks more random and which more structured. The strings are 0101 and 1011. Applying the idea of algorithmic complexity, the shorter the description the less random-looking. It would seem that the first string has a repeating pattern that can be taken advantage of in describing it. In plain English one may therefore say that the first string can be described as “zero and one twice” while the second one would require a longer description. In fact, there seem to be fewer descriptions for the first string than for the second (and also notice some ambiguity in plain English). The first may also allow the description “A zero followed by a one followed by a zero followed by a one” or “zero and one followed by the same” and perhaps other variations. Descriptions of the second one can include “one and zero followed by two ones” or “one zero one one,” just as the first one could simply be “zero one zero one,” which doesn’t look like a compressed version of the string but rather a straight translation into an expanded form of plain English.

All this by way of asking whether either of the two strings is without a doubt simpler than the other, or whether the apparent repetition in the first string makes us think that the string has a pattern despite the pattern being repeated only twice. Perhaps when one looks at such a string one gets the impression that it may belong to a larger string comprising alternations of 01, and one concludes that it is simpler than the second one. To leave the subjective realm, one would need to evaluate the algorithmic complexity of the strings and compare their respective values. The algorithmic complexity of a string is the length (in bits) of the shortest program producing the string in question when run on a universal Turing machine. The inconvenience is that there is no algorithm that, given a string, gives you the length of the shortest program that produces it (this follows by reduction to the halting problem), which means one cannot really measure the algorithmic complexity of a string with absolute certainty, because it is uncomputable. It doesn't mean, however, that one cannot approximate it; one can often do so in an effective and useful way.

Traditionally, the way to approach the algorithmic complexity of a string has been to use lossless compression algorithms. Lossless means that one can recover the original string from the compressed version. The use of lossless algorithms is preferable because there may be inconsistent ways of compressing strings that give the impression of compressing a string without corresponding to a measure of complexity (e.g. noise deletion is an effective way to shorten a string, but noise may be algorithmically random, and the aim of algorithmic complexity is precisely to measure how random something is, not merely to gauge whether it looks random). The length of the output of a compression algorithm is an upper bound on the algorithmic complexity of the string. While one can never tell for certain that a string is incompressible, if one succeeds in somehow shortening a string one can tell that its algorithmic complexity cannot be larger than the compressed length.

One does not want to cheat and claim that one can compress any string into a single bit if the decompression algorithm simply expands that bit into the desired string. A fair compression algorithm can be defined as one that transforms a string into two pieces: one is the compressed version and the other the instructions to decompress the string, together accounting for the final length of the compressed version. In other words, it is as if the decompression algorithm were added to the compressed string, so that the compressed string comes with its own decompression instructions. In the long run, a theorem (the invariance theorem) guarantees that complexity values converge.

But for short strings (which are often the ones useful for practical applications), adding the decompression instructions to the compressed version makes the compressed string often, if not always, longer than the string itself. If the string is, for example, shorter than the size of the decompression algorithm, there will be no way to compress the string into something shorter still, simply because the decompression instructions are at least of the length of the original string. Moreover, the result is so dependent on the size of the decompression algorithm (because it is the greatest contributor to the overall length) that the final value of the compressed length is too unstable under different lossless compression/decompression algorithms.

For example, if one tries to compress a short string using Mathematica, one gets the following results:
StringLength@Compress["0101"] = 30

Looking at the beginning of the compression line when plotting the lengths of strings (x axis) against their compressed lengths (y axis) one observes that it does not start at y=0 even when x=0.

This means that compressing the 4-bit string 0101 produced a result 30 characters long (even more in bits). In Mathematica, strings only begin to be compressed this way at around length 30. This is not a malfunction of Mathematica; it is the result of what I have explained above. The Mathematica Compress function is actually based on the Deflate lossless compression algorithm, a combination of the LZ77 algorithm and Huffman coding, among the most popular lossless compression algorithms available, used in formats like ZIP, GZIP, GIF and PNG (these last two are therefore lossless image formats).

Zooming in towards the origin of the plot, one can see that the beginning of the curve looks unstable. The string in question is a repetition of n 1s, with n on the x axis (which represents the different compressed lengths rather than individual strings, to avoid repetition and a step-like curve).
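Here is a sketch of how such a plot can be generated (my own minimal version; exact values depend on the Mathematica version):

(* Compressed length, in characters, of the string of n repeated 1s; the flat,
   nonzero start of the curve is the fixed overhead discussed above. *)
ListLinePlot[
 Table[StringLength@Compress[StringJoin[ConstantArray["1", n]]], {n, 1, 200}],
 AxesLabel -> {"n (string length)", "compressed length"}]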

If one tries to compress 1011 one gets none other than the same value:
StringLength@Compress["1011"] = 30

The instructions obviously take up some space in the final compressed length, and they cannot be compressed themselves (if they were, they would in any case be the same for all strings, taking us back to the same situation). So there is a limit to how well compression algorithms can handle short strings. And if one wished to tell which of these two strings is objectively more or less randomly complex by approximating their algorithmic complexity with a compression algorithm, it turns out that there is no way to do so.

On the other hand, given that the definition of algorithmic complexity based on compressibility says that the less compressible the more randomly complex a string, one could immediately say that a single bit, 0 or 1, is random for certain, i.e. has maximal algorithmic complexity, given that there is no way to further compress a single bit. In other words, there is no program shorter than 1 bit that can produce 0 or 1. The shortest descriptions of 0 and 1 are therefore 0 and 1 themselves. Hence 0 and 1, according to the compressibility approach, are random strings. It may seem to make no sense to characterize a single bit as random.

On the one hand, a single bit does not carry any information and on these grounds one may think of it as somehow random. If one thinks about whether one would have been able to predict 0 or 1 as the outcome of a process, given that there is no context because they occur alone, one may also conclude that their occurrence is somehow random. In other words, if you see a string like 010101 you may bet that the next bit is 0, but if you are provided with nothing there is no way you can favor any position on whether the bit to come is 0 or 1. So much for justifying that a single bit is random.

It is hard, however, to justify how 0 could look more random than, say, any other possible string. If 0 is random, how is it more complex than 00? Or 01? Intuition tells us that short strings shouldn't be that random (more random than, for example, longer random-looking strings), so if a single bit is the most random of all finite strings, how could there be such a sharp transition from maximal random complexity to the very low complexity of strings 2, 3 or 5 bits long?

Since intuition tells us that something random should also be uncommon and rare, what if one asks how common 0 or 1 are as results of a program? There is a measure that gives the probability of a program producing a given string when running on a universal Turing machine. This is the measure we used to develop a new method for evaluating the complexity of strings, as an alternative to the traditional use of compression algorithms. The new method aims in particular to solve the problem of evaluating the complexity of short strings, as discussed above. It is based on Levin-Solomonoff algorithmic probability and connects back to algorithmic (Kolmogorov-Chaitin) complexity by way of the Levin-Chaitin coding theorem.
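For reference, the standard statement of these two ingredients (background added here for clarity, not a result specific to our paper): the algorithmic probability of a string s is

m(s) = \sum_{p \,:\, U(p) = s} 2^{-|p|},

the probability that a prefix universal Turing machine U produces s when its program p is drawn by fair coin flips, and the coding theorem connects it to Kolmogorov complexity by

K(s) = -\log_2 m(s) + O(1),

so the more frequently a string is produced, the lower its algorithmic complexity.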

Algorithmic probability says that a single bit is not the most complex random string but actually the most structured possible one and, more importantly, that the complexity transition is smooth, more in accordance with intuition.

It may be that it makes sense for a single bit to be regarded as both the most simple and the most complex of strings from different perspectives. The advantage of the algorithmic probability approach is that it provides not only a different notion of the complexity of a single bit, one that is in keeping with intuition, but also an outcome different from that of the compressibility approach, even though the two measures are intimately related and asymptotically produce the same results in the long term (for longer strings). I think the two views reflect different aspects of what a single bit represents.

The paper presenting this novel method for evaluating the algorithmic complexity of short strings was first proposed and sketched in Greg Chaitin's 60th anniversary festschrift edited by Cris Calude (J-P. Delahaye & H. Zenil, "On the Kolmogorov-Chaitin complexity for short sequences," in Randomness and Complexity: From Leibniz to Chaitin, edited by C.S. Calude, World Scientific, 2007). The method uses an exhaustive and systematic search over Turing machines, inspired by Wolfram's NKS dictum, from which a frequency distribution of the outputs of the halting machines is built, and the Levin-Chaitin coding theorem is then applied to evaluate the algorithmic complexity of binary strings.
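To make the last step concrete, here is a toy version of the coding-theorem conversion only (the frequencies below are made-up illustrative numbers, not the distribution obtained from actually running the machines):

(* Turn an output frequency D(s) into a complexity estimate K(s) ~ -Log2[D(s)]:
   the more frequently a string is produced, the lower its estimated complexity. *)
frequencies = {"0" -> 0.25, "1" -> 0.25, "00" -> 0.04, "01" -> 0.03};
complexityEstimates = {First[#], -Log2[Last[#]]} & /@ frequencies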

Chaitin pointed out (regarding our method) that:

…the dreaded theoretical hole in the foundations of algorithmic complexity turns out, in practice, not to be as serious as was previously assumed.

The full technical article is available in ArXiv Numerical Evaluation of Algorithmic Complexity for Short Strings: A Glance into the Innermost Structure of Randomness.

You can also look at the slides of the presentation I delivered at the Alan Turing amphitheater at the Computer Science Department of the University of Lille 1:

“The World is Either Algorithmic or Mostly Random” awarded a 3rd Place Prize in this year’s FQXi contest

Posted in Algorithmic information theory, Complexity, Foundations of Computation, Foundations of Physics, General, New Ideas on June 10th, 2011 by Hector Zenil – 2 Comments

Based on the combined ratings of the contest community and the panel of expert reviewers appointed by the FQXi, which included members of the institute, I was awarded a 3rd Place Prize for my work The World is Either Algorithmic or Mostly Random in this year's FQXi contest on the topic Is Reality Digital or Analog?, sponsored by the Foundational Questions Institute. The winners were announced at this year's World Science Festival in New York City.

My work can be summarized in one line as an explanation of the complexification process of the world: the process whereby the universe has evolved from a more primitive (random) state to its current organized state.

The essay is a summary of the philosophical branch of my current scientific research on finite algorithmic information theory. This philosophical branch is concerned with the exploration of the possible connections between algorithmic complexity and the physical world (or what happens in it). I propose the notion that the universe is likely digital, not as a claim about what the universe is ultimately made of but rather about the way it unfolds. Central to the argument are the concepts of symmetry breaking and algorithmic probability, which are used as tools to compare the way patterns are distributed in our world to the way patterns are distributed in a simulated digital one. These concepts provide a framework for a discussion of the informational nature of reality. I argue that if the universe were analog, then the world would likely look more random, making it largely incomprehensible. The digital model has, however, an inherent beauty in its imposition of an upper limit and in the convergence in computational power to a maximal level of sophistication. That the world is digital, even if deterministic, doesn't necessarily mean that it is trivial or predictable, but rather that it is built up from operations that at the lowest scale are simple but that at a higher scale look complex and even random, though in appearance only.

How have we come from the early state of the universe (left) to the structures we find today (right)?

The arguments supporting my views are partially based on the findings of my research, epitomized by our most recent paper Numerical Evaluation of Algorithmic Complexity for Short Strings: A Glance into the Innermost Structure of Randomness available in ArXiv in which my co-author and I describe a method that combines several theoretical and experimental results to numerically approximate the algorithmic (Kolmogorov-Chaitin) complexity of bitstrings by using the concept of algorithmic probability, which is connected to algorithmic complexity by way of the (Levin-Chaitin) coding theorem.

An extended (and detailed) version of The World is Either Algorithmic or Mostly Random is forthcoming and will be eventually posted.