Dec 02

Meaning is in the word net: cyclic self-referential definitions, dictionaries and found in translation

In the “Period 1 Cycle English Dictionary” published by “No way, Inc.” (it’s been said to be the most accurate dictionary ever) one can read:

dog (Noun) : a dog is a dog.

The lazy creators of this dictionary appear to have forgotten what is broadly accepted by common sense. A definition would not be a definition if it were cyclical and self-referential, simply because one would, theoretically, fall into an infinite loop trying to define an unknown word that has as part of its definition the word itself.

Let’s leave aside the fictional Period 1 Cycle English Dictionary, which I just made up, and consider how things work in a regular dictionary. Assume you want to know the definition of a word. Looking it up, you’ll encounter an ordered set of words in the language in question (if the dictionary is not a bilingual one). Let’s prove that, in the end, any definition of a word w in a closed finite dictionary is cyclic with period p_w:

Proof sketch:

We may hypothesize that the dictionary (D) is closed under the operation “the definition of” (i.e. all words in the definition of a word have a definition in the dictionary). One can also safely assume that all words in D have a non-empty definition (given the definition of “dictionary” itself!) and that the dictionary is finite. Then, if w is a word in D, consider the orbit of successive definitions d_w = {d_1, d_2, …, d_n, …}, where d_1 is the set of words in the definition of w and d_{n+1} is the set of words appearing in the definitions of the words in d_n. Because D is finite it has only finitely many subsets, so d_w cannot be an infinite sequence without repeating elements: some d_i must equal a later d_j, and from that point on the orbit repeats with a period p_w = j − i that depends on the word. Following definitions from w therefore always ends in a loop of words defined through one another, and it leads back to w itself whenever w lies on that loop. In this sense every word w in the dictionary is cyclic with period p_w.
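The orbit described in the proof sketch can be traced on a toy example. The following is a minimal sketch, assuming the dictionary is given as a Python mapping from each word to the list of words in its definition. The names `definition_orbit` and `TOY`, and the toy entries themselves, are invented for illustration and are not taken from any real dictionary.

```python
def definition_orbit(dictionary, word):
    """Follow successive definition sets of `word` until one repeats.

    Returns (orbit, start, period): the distinct sets seen in order, the
    index at which the repeating part begins, and the length of that part.
    Assumes the dictionary is closed: every word used in a definition
    is itself a key of the mapping.
    """
    seen = {}      # definition set -> index of its first appearance
    orbit = []
    current = frozenset(dictionary[word])
    while current not in seen:
        seen[current] = len(orbit)
        orbit.append(current)
        # Next step: every word used to define the words in the current set.
        current = frozenset(w for v in current for w in dictionary[v])
    start = seen[current]
    period = len(orbit) - start
    return orbit, start, period


# Hypothetical six-word closed dictionary, for illustration only.
TOY = {
    "big": ["large"],
    "large": ["big"],
    "dog": ["big", "animal"],
    "animal": ["living", "thing"],
    "living": ["animal"],
    "thing": ["animal"],
}

orbit, start, period = definition_orbit(TOY, "dog")
print(start, period)
```

Termination is guaranteed for the same reason the proof sketch gives: a finite dictionary has finitely many subsets, so the loop must eventually meet a set it has already stored. The function reports where the repetition starts as well as its length, because on other inputs the orbit may wander for a few steps before entering its loop.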

If all the definitions in a dictionary are then cyclic and enclosed in the dictionary itself, where does the analytic knowledge that a dictionary seems to provide come from? When one consults a dictionary in a given language, one already knows some words in that language, but assuming one doesn’t, would a dictionary be completely useless, as the issue of ultimately cyclic self-referential definitions seems to suggest?

One could conceive of a dictionary as a whole as a syntactical knowledge container with no actual knowledge in it, at least for someone who does not bring to it any knowledge of the relevant language. One may wonder how something so perfectly self-referential could actually be of use, as dictionaries seem to be. Is it because you always know some, or indeed many, words of a language already when you consult a dictionary? Since every word is defined by other words in the same language, looking up the meaning of a given word leads you to other words and yet others, and eventually back to the first word, the word whose meaning you set out to learn. This would be the case even if the dictionary were bilingual and the meaning of the word you wished to check were given in a second language. Thus all dictionaries are perfectly circular, closed, self-referential sources.

However, the analytical knowledge in a dictionary does not come from the definitions as words, but from the word net underneath, where the words are connected to each other in some, perhaps unique, fashion (modulo exact synonyms). That’s what quite successful projects such as WordNet and functions like WordData in Mathematica are about. The power of being able to analyze the language as a net in a computable way is priceless for the progress of computational linguistics and linguistics in general.

 

2-level depth wordnet for the word "chair"

For example, if “chair” is connected to “table”, “office”, “dining room”, etc., it should be easy to map it to its equivalent in any other language, unless the target language lacks the concept “chair” as a seat placed at a table in a dining room (which was perhaps the case in some Asian countries before colonialism), together with an explicit cognitive representation of it.

Of course, problems arise when considering mappings between words that have the same written form but different senses. The word “chair”, for example, is a seat, but may also mean the officer who presides at a meeting. Words that map to or from words belonging to different parts of speech pose a further problem, such as “personne” in French, which maps onto two completely different words in Spanish: “persona” and “nadie”, a noun and a pronoun respectively, with completely different connections and different supporting nets. So even when it seems that most relations might be surjective, the general case is certainly not bijective, and that applies to homonyms too, which often creates ambiguities in translation. However, the supporting network would be able to uncover this fact and resolve a possible ambiguity based on context by extending the word network to encompass the ambiguity. In other words, if a subnet cannot be uniquely mapped, extending it should eventually resolve the ambiguity.

What one would need is a corpus big enough to build such a network once and for all, and then simply make comparisons at the network level. This could work even for completely new or unknown languages, whether dead, living or artificial, assuming that they share a part of our actual reality and hence some part of our mental representations. (In a sense this is what Champollion did when he deciphered Egyptian hieroglyphs with the help of the Rosetta Stone: he discovered a partial mapping of a subnetwork of words from an unknown language, Egyptian, to a subnetwork of a known one, Greek.)

In the final analysis, each language has a single unique network (changing slightly through time but remaining well connected and strong enough to make it unique and recognizable, while being isomorphic with that of any other language). All languages could be identified by their fingerprints, their word nets. This kind of analysis would identify the lack of a word net structure in hoax languages, such as perhaps the Voynich manuscript.
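Resolving an ambiguity by looking at the surrounding subnet can be illustrated very roughly. The sketch below assumes a handful of word pairs are already mapped between the two languages, and it scores each candidate translation by how much its neighbours overlap with the translated context (a plain Jaccard overlap, chosen here only for simplicity). The function names and every word list are hypothetical toy data, not the output of any real word net.

```python
def jaccard(a, b):
    """Overlap between two collections of words, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(context, candidates, target_net, known_map):
    """Pick the candidate whose neighbours best overlap the translated context.

    context    -- source-language words found around the ambiguous word
    candidates -- possible target-language translations
    target_net -- target word -> set of its neighbours in the target net
    known_map  -- source word -> target word, for words already mapped
    """
    translated = {known_map[w] for w in context if w in known_map}
    return max(candidates, key=lambda c: jaccard(translated, target_net[c]))


# Hypothetical toy data: a few French-to-Spanish pairs assumed to be known,
# and hand-written neighbourhoods for the two Spanish candidates.
KNOWN = {"homme": "hombre", "femme": "mujer", "rien": "nada", "jamais": "nunca"}
SPANISH_NET = {
    "persona": {"hombre", "mujer", "gente"},
    "nadie": {"nada", "nunca", "ninguno"},
}

print(best_match(["rien", "jamais"], ["persona", "nadie"], SPANISH_NET, KNOWN))
print(best_match(["homme", "femme"], ["persona", "nadie"], SPANISH_NET, KNOWN))
```

Extending the subnet corresponds to passing a larger `context`: the more neighbours are taken into account, the less likely it is that two candidates tie.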

Having established that, what about mining the world of all possible meanings, the world of all possible translations, and the world of all possible ideas? We wouldn’t have the problem of distinguishing between a coherent idea and a non-coherent one, since the network would provide some minimal coherence. Thus the net-into-the-net approach would give us a way of translating from word to word, from phrase to phrase and from idea to idea as faithfully as possible in most cases, since in the end all of us as human beings share a single reality, though we perhaps approach it from different points of view.

Again, the analytical knowledge in a dictionary comes from the net connecting the words, so even someone who does not know English at all would, I would say, be able to learn English, albeit with considerable difficulty, just by deducing the net connecting objects, in other words, by mapping his own mental representations of objects onto words in the English dictionary. In the process he could encounter some ambiguities, but the further he goes, the more of these he would be able to resolve. On the other hand, speakers of languages in which “chair” exists neither as a word in the language nor as a real object in the culture would be able to deduce what a chair is by tracking its relations with the objects they know and for which they do have mental representations and the phonemes to externalize them. So the problem of translation, which began with the mapping of word onto word and then phrase onto phrase with statistical tools, becomes, with this approach, a matter of mapping net to net. Indeed this seems to be the approach adopted by Meaningful Machines.

These ideas could be carried to the limit by taking the sum total of human languages and enquiring into the mapping between such a network and our cognitive representations. Such a move would provide grounds for rebutting the Chinese room argument, since in the end it does not matter whether someone inside the room has no knowledge at all of a language; he would be able to map what he is mechanically translating onto his own mental representations, generating what, according to the argument, could not be generated: understanding. Searle’s idea was, as I recall, to build a case against A.I. by arguing for the meaninglessness of the Turing test and of true AI in general.

One may actually use a dictionary without knowing a single word in it! That is because there is a mapping between the word net in the dictionary and one’s own language, or even better, a mapping (not necessarily injective or surjective) between the word net of the dictionary and one’s personal cognitive word net.

Oversimplifying, translation might be reduced to the search for a homomorphism between algebraic structures, with each structure G={W,d} being a language dictionary, with W the set of words in that language and d the closed operation “definition of”. One can then see each definition as an oversimplification of a directed, ordered graph g={w,s}, with w the set of vertex words involved in the definition and s the ordered (since definitions may not be commutative) set of edges, for an ideally formal-enough language dictionary.
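To make the graph view concrete, here is a minimal sketch that treats each dictionary as a directed graph (an edge runs from a headword to every word in its definition) and checks whether a proposed word-to-word map is a graph homomorphism, i.e. whether it sends every edge of the source graph to an edge of the target graph. It ignores the ordering of words inside a definition, which is a further simplification. The names `edges` and `is_homomorphism` and the two three-word dictionaries are hypothetical.

```python
def edges(dictionary):
    """Directed edges (headword, word used in its definition)."""
    return {(head, w) for head, definition in dictionary.items() for w in definition}


def is_homomorphism(f, source, target):
    """True if the map f sends every edge of `source` to an edge of `target`."""
    target_edges = edges(target)
    return all((f[u], f[v]) in target_edges for u, v in edges(source))


# Hypothetical toy dictionaries with the same shape in two languages.
EN = {"chair": ["seat"], "seat": ["furniture"], "furniture": ["chair"]}
ES = {"silla": ["asiento"], "asiento": ["mueble"], "mueble": ["silla"]}

good = {"chair": "silla", "seat": "asiento", "furniture": "mueble"}
bad = {"chair": "silla", "seat": "mueble", "furniture": "asiento"}

print(is_homomorphism(good, EN, ES))
print(is_homomorphism(bad, EN, ES))
```

A map that passes this check preserves the "is defined in terms of" relation, which is the property the net-to-net view of translation relies on. Searching for such a map, rather than only verifying one, is a much harder problem and is not attempted here.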

The bad practice usually called cyclic definition is better described as the practice of period 1 cycles, i.e. words whose immediate definition contains the word itself.
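The distinction between a period 1 cycle and a longer one can be checked mechanically. The following is a minimal sketch, assuming again that a dictionary is a mapping from each word to the list of words in its definition. It uses a breadth-first search to find the smallest number of look-ups that lead from a word back to itself. The function name and the toy entries are invented for illustration.

```python
from collections import deque


def shortest_cycle(dictionary, word):
    """Smallest number of look-ups leading from `word` back to itself.

    Returns 1 for a period 1 cycle (the word appears in its own
    definition) and None if no chain of definitions returns to `word`.
    """
    queue = deque((w, 1) for w in dictionary.get(word, []))
    visited = set()
    while queue:
        current, depth = queue.popleft()
        if current == word:
            return depth
        if current in visited:
            continue
        visited.add(current)
        queue.extend((w, depth + 1) for w in dictionary.get(current, []))
    return None


# Hypothetical toy dictionary, for illustration only.
TOY_CYCLES = {
    "dog": ["a", "dog"],   # period 1: defined in terms of itself
    "a": ["one"],
    "one": ["a"],
    "big": ["large"],
    "large": ["big"],
    "cat": ["one"],        # leads into a loop but never back to "cat"
}

for entry in ("dog", "big", "cat"):
    print(entry, shortest_cycle(TOY_CYCLES, entry))
```

Only the first case, a result of 1, is the bad practice; longer cycles are unavoidable in any closed finite dictionary.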

This post is related to a previous post titled “Meaning against A.I.”
