Loading....
Recent Article links:

Archive for December, 2006

International Union of History and Philosophy of Science Conference on calculability and constructivity.

A Conference of the International Union of History and Philosophy of Science (Joint Session of the Division of Logic, Methodology and Philosophy of Science and of the Division of History of Science and Technics) was held in Paris at the Ecole Normale Supérieure on November 17-18th 2006. It was organized by Jacques Dubucs. Appointed president of the joint conference by the IUHPS,  he is also the director of the IHPST and  was   my master’s thesis adviser.

The conference began with an interesting talk  by Yiannis Moschovakis, professor of math at UCLA, about elementary algorithms, which  according to him are usually defined in terms of mathematical models of computers with “unbounded memory”. His talk was based on his article entitled “What is an algorithm?” –a very fine piece, in my opinion–included in Mathematics Unlimited - 2001 and Beyond,  edited by B. Engquist and W. Schmid and published by Springer in 2001 (also available online). Such definitions, he argued, do not coincide with our basic intuitions about algorithms. He provided examples of definitions that eschewed the use of  abstract machines. For instance McCarthy, with the concept of recursive programs, or elementary (first-order) algorithms. He took the common example of the GCD and asked whether the Euclidean algorithm was optimal among all algorithms for some primitive (There have been some results on upper bounds of the form Ce(x,y)=2log and definitions based on partial and pointed algebras). Then he asked about isomorphism between algorithms and defined what he called a “recursor”—a tuple of monotone functions on complete posets which determine a system. It  models an algorithm. A natural notion of recursor isomorphism models algorithm identity (which he seems to believe is the finest notion of an algorithm). He went through concepts like “program congruency” and a related theorem (which is easily demonstrable for McCarthy programs) : Every program E is reducible to a unique - up to congruence - irreducible program cf(E) [where cf(E) means canonical form] which is the optimal algorithm. And then he continued through more abstract terms such as “referential intension” and “partial algebras,’ which are covered  in his paper . He defined a complexity measure which he called “complexity functions for programs” where the basic idea consists in counting the number of calls to some primitives. For example, the number of primitives q1,q2,…,qk in the computation of an algorithm M(x,y1,…,y_k) using E (a McCarthy program) where x can be a vector as input. He talked about a depth complexity measure too and stated a Church-Turing thesis for elementary algorithms which would be as follows:Every algorithm which computes a partial function f:A^n->A from q1,…,1k is faithfully represented by a recursive program on some logical extension B of a partial algebra A=(A,0,1,q1,…,qk). On the basis of this assumption it would seem  that the lower bound results obtained by the embedding method hold for all elementary algorithms.

After Moschovakis, Prof. Serge Grigorieff  of the Math Dept. of  the University of Paris VII gave a talk on foundations of randomness in terms of recursivity, which was of particular interest to me given my current research on the topic. According to him, there is no formal notion of a random object in probability theory; random variables are entities having nothing to do with random objects. He talked about  Berry’s paradox as instanced by the question “What is the smallest number that needs at least n words to specify it, where n is large” or by the phrase “the first undefinable ordinal” and about a solution replacing “specify” or “describe” with “compute”. Then he went through traditional definitions by Kolmogorov: K_f(x)=min{|p|:f(p)=x} where f can be interpreted as a computer or compiler, and p the programming language with no input and x of course the object whose complexity is to be measured. In terms of compressibility, if |x|=x then x is random and if K(x)=|x|-c then x is c-incompressible. We know of course that the problem is in general non-computable.
Following that Prof. Grigorieff stated the invariance theorem, which basically says that  f (the computer program) varies by a constant. The problem, according to Kolmogorov(1964) himself,  lies with these constants, since “reasonable” optimal functions will lead to complexity estimates. In 1965 Martin-Lof gave another equivalent definition: x would be c-random id Delta(x)<=c, i.e. x passes statistical tests with significal level c. We know then that incompressible=random, from Kolmogorov’s and Martin-Lof’s work. Thus if the size of x is equal to the size of p, which is the smallest program which produces x,  x would be random. Next he cited Chaitin’s concept of the  prefix-free domain, and finally pointed out the equivalence between  them all. However all these definitions are weak in some fashion. Take for instance the notion of invariance ( “reasonable” variance if you prefer) or fragility:  a random string a0,a1,… would be random but the same with some regularity inside, like a fixed number in particular positions ( a0,0,a1,0…), wouldn’t be more so. Kolmogorov extendeded his own idea to infinite objects but it did not work. Martin-Lof’s random sequences satisfy. I found some ideas related to Schnorr, Martingale and Solovay relevant to the concept of irreducibility. Michael Detlefsen, philosophy Professor at the Univ. of Notre Dame, USA, gave another interesting lecture, more philosophically oriented, in which he merged discussion of construction, computation, exhibition and proofs. He made some remarks on the concept of proof : A proof is a sequence of judgment. A proof needs to use reference to judge. We moved on to something that caught my attention since it is a topic to which I have given considerable thought and  is related to the role of mental or graphic representation of proofs and proofs that need more powerful tools and a higher language to prove a statement posed in a lower and less powerful language. For example, according to Detlefsen, Frege  (Frege on the existential requirement of proof) pointed out that the use of objects like the square root of -1 in proofs for Real Analysis would be immediately seen as distractors; some proofs use the sqrt of -1 when the magnitude does not occur obviously in the real analysis theorem to be proven.  Such proofs would collapse if the number 1 were to be taken out. From my point of view the use of sqrt of -1 and equivalent cases can have even more implications : they could mean that sometimes proofs use stronger axiomatization in order to force a proof in a less powerful  statement in a less powerful axiomatization (I am concerned, for example, with the case of Fermat’s last theorem, but that is a subject for a separate post). According to Frege (who was not talking about Fermat’s last theorem, of course) we import something foreign into arithmetic. Gauss himself asked the same question about the significance of this foreignness of symbols and even more powerful tools. A concern with “purity” played a large part in Frege’s logicism. These questions seem to have engaged everyone from  Proclus to Leibniz to Frege. They can be found in papers about reference and rigor. Following this line of thought, in math, figures do more than simply “refer” to the objects they represent. In geometry objects are represented by entities  of the same kind– line by lines, and circles by circles– but in algebra the use of signs to represent objects avoids  explicit reference. According to Hobbes, the prover maintains contact with the reality to be proved by exhibiting it, that is, by manifestly generating it through a process of efficient causation. According to Francis Maseres, visually exhibiting the objects of geometrical reasoning increases rational confidence in its premises.

Another interesting lecture was given by Maurice Margenstern, Computer Science professor at the University of P. Verlaine, Metz, France on the “Computer Science Revolution”. The title did not really reflect the rich content of the talk which brought home to me the  importance of two concepts almost completely ignored in computer science and which seem to be of fundamental and foundational value. We have heard a lot on the equivalences between proof, algorithm and program (Curry-Howard for example and the concept of Currying). However, an algorithm is a project of execution while a program is the execution itself. According to Margenstern, time is the key concept both in an algorithm and a program (and of course the arrow of time–my contribution). The “equals” in a proof or in a program often means “compute” and what is merely a description of something to do becomes an actual computation. For example, replacing “=” for “:=”, a non symmetrical relation, introduces the role of time in computation.I have much more to say in this regard but I’ll do so in a separate post, since it is part of my personal view and pertains to the basic requirement of a computation, which includes the notion of time of course, but also the notion of the carrier and the medium, all of which are matters requiring in-depth analysis. Computation is often referred to in terms of an input, an output, and an intermediate process, but we will analyze what is involved in detail inside an actual computation, which from my point of view is inseparable from certain physical constraints.

Before the last lecturer I took some random notes: Computability is an epistemological notion. Constructivism refined by Martin-Lof. 4 features of finitism: a domain D is admisible if it is r.e.

The last lecturer was Wilfried Sieg, Professor in the Philosophy Department at Carnegie Mellon University. His lecture was based on his most recent paper “Church without Dogma : Axioms for Computability”. Prof. Sieg’s talk drew attention to some comments made by Goedel about Turing (in a letter to Martin Davis). According to Goedel, Turing’s work includes an analysis of the concept of “mechanical procedure”(I think it worth drawing attention here to Goedel’s so-called dichotomy, which casts doubt on the validity of Turing’s approach to the human brain).  An additional comment from me: In the history of calculability several terms have served as equivalents for the vague concept in the first part of Church’s thesis: ”algorithm”, “effective procedure”, “verifiable by hand”, “computable function”, “effective function”, “feasible computor”, “mechanical procedure”, “finite combinatorial procedure”, “finite machine”and almost any permutation between them. According to Sieg there are two basic constraints for mechanical procedure:

- Boundness: There is a fixed bound on the number of configurations a computer can immediately reconize, and
- Locality: A computer can modify only immediately recognizable subconfigurations.  He talked about his own definition of a k-graph machine (equivalent to a string machine)–F: D->D (operation transforming states into states).He takes finiteness for granted, from which I deduce that he inclines to the view that  Church’s thesis has no content and is therefore neither a thesis nor a hypothesis. He presented a nice diagram in which he drew arrows from the concept of effective calculability to several computation systems like lambda-definability, general recursive functions and Turing machines.  All of them were proven to be equivalent. He gave us his definition of a computor, a human computer or mechanical device for instance, inspired in some way by Gandy machines as he himself expressed it, and he noted the two main constraints which, according to him, are basic to the definition of what is effectively computable. M=(S,T,G) is a T computor on S when S is a structural class, T a finite set of patterns and G a structural operation on G. Then he presented  a list of statements expressed in first-order logic which can be found in his original paper.

Dictionaries, analytical knowledge and new approaches to translating.

When I realized that a dictionary, as a whole, could be viewed as a non -syntactical knowledge container, in other words, that there was  no new knowledge in it, I wondered how something that was such a perfect auto-referential source could actually be useful.  For since every word is defined by other words in the same language, looking up the meaning of a given word would lead you to other words and yet others and eventually back to the first word, the word whose meaning you set out to learn. This would be the case even if  the dictionary were bilingual, and the meaning of the word you wished to check was given in a second language. Thus all dictionaries are perfectly circular, closed, auto-referential sources. However, I discovered that the analytical knowledge in a dictionary comes from the net supporting words connected with each other. Thus  if “chair” is strongly connected with “table”, “office”, “dining room”, etc. it should be easy to map it to its equivalent in any other language. Of course this overlooks languages in which the word “chair” does not exist,  reflecting the lack of  a comparable object in the culture and thus the lack of a cognitive representation of it. But such instances are rare since most cultures share a basic physical reality and human experience.

Of course problems arise with  words like  “personne” in French, which  maps onto “persona” and “nadie” in Spanish,  a noun and an adjective respectively with completely different connections and different supporting nets. Or conversely, when the verb “gustar” from Spanish maps onto”gouter” and “plaire” in French.  So even when it seems that all words are surjective, the general case is not bijective, and that applies to homonyms too, which often creates ambiguities in translation. In other words, any word in any language has its equivalent in any other language, whether a single word in one langauge becomes two or more words in another, or whether two or more words become one after mapping– but one word could mean two completely different things in another language. However, the supporting network  would be able to uncover this fact and solve a possible ambuiguity based on  context by extending the word network to encompass the ambiguity. In other words, if a subnet cannot be uniquely mapped, extending it should eventually solve the ambiguity. What one would need  is a corpus big enough to build such a network once and for all and then simply make comparisons at the network level. This could work even for completely new or unknown languages, either dead or living, assuming that they share a part of our actual reality and hence some part of our mental representations  (In a sense this is what Champollion did when he deciphered the Rosetta stone– he discovered a partial mapping of a subnetwork of words from an unknown language - Egyptian - to a subnetwork of a known one - Greek ). In the final analysis, each language has a single unique network (changing slightly through time but remaining well connected and strong enough to make  it unique and recognizable while being isomorphic with that of any other language).  Thus an entire language could be identified by its fingerprint -its network.

Having  established that, what about mining the world of all possible meanings, the world of all possible translations, and the world of all possible ideas? We wouldn’t have the problem of distinguishing between a coherent idea and a non-coherent one since the network would provide some minimal coherence. Thus the net-into-the -net approach would give us a way of translating from word to word and from phrase to phrase and from idea to idea.

The above ideas would apply to unilingual dictionaries, lets say English-English. The analytical knowledge in them again comes from the net connecting the words, so even if someone does not know English at all I would say that he would be able, albeit with considerable difficulty, to learn English just by deducing the net connecting objects, in other words, by mapping his own mental  representations of objects onto words in the  English dictionary. In the process he could encounter some ambiguities, but the further he goes, the more of these he would be able to resolve. On the other hand,  speakers of those languages in which “chair” does not exist, both in the language itself and as a real object in the culture,   would be able to deduce what  a chair is by tracking its  relations with the objects they know and for which they do have mental representations and the phonemes to externalize them. So the problem of translation, which began with the mapping of word onto word and then phrase onto phrase  with statistical tools,  becomes with this approach a matter of  mapping net to net.  Indeed this seems to be the approach adopted  by Meaningful Machines http://www.meaningfulmachines.com/-.  Such ideas have been around for a while, for example at WordNet: wordnet.princeton.edu/, but they somehow remain old-fashioned even as they are shifting the paradigm.

These ideas could be carried to the limit by taking the sum total of human languages and enquiring into the mapping between such a network and our cognitive representations. Such a move would provide grounds  for rebutting the Chinese room argument, since in the end it does not matter whether someone inside the room has no knowledge at all of a language; he would be able to map what he is mechanically translating onto his own mental representations, generating what, according to the argument, could not be generated: understanding. Because Searle’s  idea was, as I recall, to build up a case against A.I. in terms of the Turing test.

My conclusion: Meaning resides in the net.

This post is  related to a previous post titled “Meaning against A.I.”: http://www.mathrix.org/liquid/?p=29

NKS in Numb3rs

nksinnumb3rs.jpg

Amazon Cloud

ACF loading animated gif