A scholar with the ability and audacity to rebuild the Tower of Babel died a year ago, but his controversial project lives on
"Next time, the origins of pitch accent in Old Japanese," Sergei Starostin called out to his graduate students as they left his afternoon seminar in historical linguistics at the Russian State University for the Humanities. No one in the room that day, September 30, 2005, had any inkling that with these words he was saying farewell to an illustrious academic career and to life. Half an hour later, he lay dead in a corridor from a massive heart attack. Starostin was 52.
The devastating loss felt by his wife and two grown sons was shared by colleagues within the field of historical (or comparative) linguistics, where he was a renowned, if polarizing, figure. Ilia Peiros, himself a Russian linguist of international reputation, said: "Before his death he was probably the best comparative linguist in the world. He had an ability to learn languages quickly and well, to see patterns in their relationships and order them into systems, and to work extremely hard. Other linguists know a lot of languages but aren't productive. He was both. He was unique."
Proud but not boastful of his abilities, Starostin let his résumé do the talking. His shyness with strangers disappeared when asked about his esoteric field. Answers emerged slowly, in smooth paragraphs, dotted with footnotes referencing the centuries-long history of linguistics.
His facility in learning and speaking languages was not, he believed, singular. "I can name several people who are much better in this respect than I am," he told me. Forced to make an accounting, he would admit to oral fluency in English, German, French, Polish, and his native Russian; and to reading fluency in 13 other Slavic languages, along with Chinese, Japanese, Dutch, Spanish, Italian, Latin, Greek, and Sanskrit. The number of languages he could understand, however, was "unlimited," in his estimation. Armed with a dictionary, he was confident that he could decrypt any text. "Well, some grammar is also required," he conceded. "But it is usually no big deal."
The six books that he authored or co-authored suggest this claim was not overstated. Written over the last 20 years, the books span the linguistic continent of Asia: a reconstruction of Old Chinese phonology (a model of the phonetic system underlying Middle and Modern Chinese); an etymological dictionary of five Sino-Tibetan languages; a nearly 1,500-page etymological dictionary of North Caucasian (a family of about 35 languages, including Chechen, Ingush, and Lak, famous for their jaw-breaking intricacies, some with as many as 80 consonants); and a three-volume etymological dictionary of Altaic (a contested entity among some linguists, with Turkic, Tungus, and Mongolian as uncontested main branches but allied with Japanese and Korean, according to Starostin and his two Russian coauthors).
"For many linguists it would be the achievement of a lifetime to produce one etymological dictionary," says Peiros. "Sergei wrote three."
Starostin proposed answers to some of the most daunting questions in his field. From 2001 until his death he had been directing a colossal project that in Russia he chose to call the Tower of Babel and in the United States the Evolution of Human Languages, or EHL. This international effort, involving some 40 scholars on four continents, is nothing less than an attempt to provide an etymological map of every language ever spoken or written, living and extinct: some 6,000 by Starostin's estimate.
A sort of Human Genome Project for historical linguistics, EHL proposes to gather a core of the same words from every human tongue and, by comparing their phonetic and semantic roots, to discover unsuspected or previously unproven kinships between individual languages or language groups. The hope for the program is that, in Starostin's words, "an absolute majority of the world's languages can be reduced to a minimum number of huge language superfamilies." The ultimate goal upon completion is to link EHL's databases with others in archaeology, genetics, and climatology.
A committed and visionary generalist, a member of what the linguist Paul Benedict has called the "long rangers," Starostin spent years delving into the prehistory of language, speculating how superfamilies might be related in the distant past. The price he paid for such an audacious career path was exile from most circles of American historical linguistics to the outer regions of respectability. With fewer than 300 linguists in the world doing serious work on long-range comparisons, the discipline is small and perennially insecure about its scientific standards. Given the dearth of rigorous proof for some of Starostin's assertions, many American linguists felt within their rights to dismiss his research or at least to exclude him from their conferences and symposiums.
This neglect even followed his death, which went unnoticed beyond a few journals. Hearing the news of his passing in early October, I alerted the obituary department of The New York Times and provided them with contact numbers of eminent linguists in the United States and Europe. Several days later, having read nothing, I called back and asked what had happened. According to a Times obituary writer, Starostin was deemed "too controversial" for a write-up. I expressed amazement and said that even if such a charge were true, he was worthy of respect as a member of the Russian Academy of Sciences and the author of an immense body of scholarly literature. But I was told if the paper decided that he or his work deserved mention, the Science section would take care of it. So far, nothing has appeared.
When I told this story to William Baxter, associate professor of Chinese and linguistics at the University of Michigan, he was not surprised. "There are historical linguists who consider it their job to stamp out the kind of work that engaged Starostin. They've tried to intimidate anyone from even entertaining some of the ideas that Starostin investigated."
The physicist Murray Gell-Mann, who secured funding for the EHL project at the Santa Fe Institute in New Mexico, was also less than shocked that his friend's death received a news blackout in this country. "Most of the university establishment regarded Sergei's work as crazy and wrong. He was actually quite conservative and paid a lot of attention to detail. He used traditional comparative methods. But he was happy, unlike his colleagues, to apply these methods to long-range relationships and superfamilies. For many, this is a sign of insanity."
The search for a mother tongue has produced its share of crackpots, amateurs, and Ph.D.s alike, who believe they have discovered the primal language and identified the people who spoke it. It has become tiresome and frustrating for many historical linguists that the questions the public and popular press most commonly ask them (when did language begin, and where?) cannot in good conscience be answered for lack of convincing evidence.
Starostin was not a crackpot, however, and the hostility he faced seems out of proportion to his ideas and methods (not all of which, given the scale on which he worked, were first-rate or well-proven). At the same time, EHL's enormous scope offered Starostin an unrivaled Olympian view of the world's languages. If his reputation is still in flux a year after his death, this could have less to do with the quality of his scholarship than with the possibility that no one else is competent to assess in which of his many areas of activity he succeeded and failed. Perhaps the only person qualified to judge a body of work as diverse and far-reaching as Starostin's was Starostin himself.
Historical linguists have been rearranging, grafting onto, and pruning from the branches on the tree of human languages ever since 1786, when Sir William Jones, a British philologist and Supreme Court justice, delivered his celebrated talk to the Royal Asiatic Society in Calcutta. In that talk he argued that Greek, Latin, and Sanskrit, and probably Celtic, Gothic, and Old Persian "have sprung from some common source, which, perhaps, no longer exists."
A family has been defined by the anthropologist and linguist Merritt Ruhlen "as a group of languages more closely related to each other than to any language outside that group; the proof is a set of exclusively shared innovations that characterize those languages and no others." English, German, Dutch, Icelandic, and Norwegian, for example, are members of the Germanic family. All are more closely related to each other, and to the extinct languages of Gothic and Burgundian, than any of them are to French or Romanian. And yet all carry traces in their phonemes and morphemes, the most basic units of sound and meaning, that link them together in the Indo-European (IE) family hypothesized by Jones (and also by the 17th-century Dutch scholar Marcus Zuerius van Boxhorn, who called his superfamily "Scythian").
From 10 to 20 superfamilies are now accepted by mainstream academics. But they disagree, often bitterly, about which ones they are, how they are constituted, even what they should be called. Starostin was an authority on protolanguages, theoretical reconstructions that explain how families of languages might be related. Disputes arise about whether any of these reconstructions faithfully reflect actual languages; in no case is there written evidence of their existence. But this has not deterred linguists over the centuries from inventing such models to determine if certain languages might belong together in larger groups, much as paleontologists might propose a common ancestor (not yet discovered and perhaps fated not to be) that would link together one or more species in the fossil record and lift all of them into a more inclusive group.
Even more speculatively, Starostin believed that a careful study of superfamilies should allow researchers to date how and when many of them split off from one another, many thousands or even tens of thousands of years ago. Plotting linguistic evidence against genetic and archaeological findings, EHL's scholars aim to trace the routes of human migrations as widely and as far back as the data will lead them, perhaps with enough reliability to make a case for the beginning of all human language in a single location.
The project's foundation is a computer database of immense proportions that allows languages to be compared at a glance and in a matter of seconds, even ones so recondite that few people have a thorough grasp of them. Using a computer program devised by Starostin called STARLING, one can look up the word for, say, tooth in Hittite (an extinct Anatolian language) and check the cognates, if any, to the same word in Tocharian (another extinct language, once spoken in Chinese Turkestan). Or one could search all of the 145 or so IE languages to see if the word has common roots, called cognates, in any of the other superfamilies.
Unprecedented and experimental, EHL operated outside American academia, where linguists are loath to risk their hard-earned scientific credibility on prehistorical hypotheses. Starostin's invitations to lecture in this country came from anthropology or Near Eastern history departments. Historical linguistics in this country has never quite recovered from Noam Chomsky's radical and influential ideas, which proposed language as an innate cognitive system and ability. The theory put far more emphasis on the hard-wiring of the brain and the mechanisms of grammar, and less on the cultural history of speech.
"If you were a smart linguistics student in the '60s and '70s you would likely be swept away by what Chomsky was doing," says Baxter. "A lot of people went into historical linguistics because they didn't like math and Chomsky looked as if it required a lot of math. The field is not in great shape in the U.S. A lot of universities just gave up their traditional places for historical linguistics. There's a refugee mentality among those that remain."
Unlike some historical linguists, Starostin was not afraid of math and was thoroughly at home on a computer. On one of my visits to Santa Fe, he was completing negotiations with Google, which bought his Russian morphology program (the code that sorts through and recognizes key words) for its new search engine in that language.
"Russian grammar is quite tricky and presents quite a problem for the formalizer," he said.
If Starostin's database allows mass comparisons of etymologies at a speed and on a scale never before feasible, its conclusions about any language or superfamily will only be as good as the information programmed into it. Bernard Comrie, who directs the Department of Linguistics at the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, believes that the inherent problems with reconstructing any protolanguage are magnified to an almost impossible degree in a project as far-reaching as EHL.
"Every reconstruction is a hypothesis in which some of the details will necessarily be wrong," he says. "As you go further back in time, the details become less and less reliable. Most linguists won't say exactly how far back they will accept evidence. But even with a time depth of 6,000 years, exceptions start creeping in. You just don't have enough good etymologies."
Despite these caveats, Comrie strongly endorsed Starostin in his quest. "I think we'll learn a lot from this project," he said, "even if it's negative evidence about how certain families aren't related." He regarded Starostin as brilliant, although he thought that his ideas were often wrong. "People who are always right," Comrie said, "aren't always the ones who contribute the most to the field."
Starostin's office in the sunny pueblo-style complex at the Santa Fe Institute, the prestigious think tank for interdisciplinary research in the sciences, was rather forlorn: a blackboard, a table, two chairs, and some bookcases with a few dictionaries (Dravidian, Korean, Tlingit) that he was reviewing for the EHL databases. He liked to work from home. But the flickering computer screen on his desk contained much of what he had been doing for the previous 10 years, and he was eager to show off his program.
As the director of EHL, Starostin no longer programmed languages into the etymological database, relying on others in Russia, the United Kingdom, the Netherlands, Israel, and the United States to keep the system primed. But he needed to check their work and to cheerlead when the tedium of keying in and proofreading the minutiae of linguistics terminology threatened to cause delays. Almost none of the contributors can afford to work on EHL full-time. Nonetheless, after almost five years they have the continents of Europe, Asia, and Australia mapped, some 2,300 languages so far.
There is a web-based version of EHL, free to anyone with access to a computer (http://ehl.santafe.edu/main.html). But Starostin wanted to show me the desktop version, which has special features. As he sat down and opened up a file on the Altaic superfamily, he almost purred with happiness. He recited softly the names of obscure languages (Evenki, Chuvash, Nanai, Juchen) with the pride of a sommelier reading the carte des vins of his huge and exclusive wine cellar.
"This is the genealogical tree of the Tungus-Manchu languages," he explained. As he pressed a key, a network of horizontal lines sprouted across the screen. Each line was labeled with a language and plotted against a timeline that estimates when each branch diverged.
To simplify the test for relatedness, EHL uses what are called "Swadesh lists." Named after the American linguist Morris Swadesh (1909-1967), these are elemental vocabulary, words such as tongue, stone, finger, hand, blood, moon, found in languages across oceans and millennia. The list is kept deliberately small, between 100 and 200 words, in order to facilitate data collection and comparison and, if possible, to prevent cultural bias by fieldworkers and infiltration by loanwords (vocabulary assimilated from other languages), both of which can taint a sample.
Once dismissed as hopelessly inadequate for the study of a system as complex as human language, the lists are now viewed as a legitimate if crude tool for establishing kinship. In certain instances, comparing lists of the same words can be more revealing than testing for other language qualities. For example, English syntax has over the centuries diverged sharply from German, with the former shedding the elaborate case endings still evident in the latter. But in nouns such as mouth/mund, or fire/feuer, or hound/hund, the ancestry is transparent.
Although all words eventually decay over time, shifting in meanings or phonetic form, some are more stable than others. Pronouns (especially for the first-person plural, we and us), numbers (one and two), and body parts (tooth and tongue) are less apt to mutate drastically over long periods of time. Each of the 100 words used in the EHL databases is ranked for stability: we is number one; mountain is number 100.
By comparing Swadesh lists from two or more languages (or superfamilies) and then measuring the percentage of words that have common roots, Starostin claimed, EHL could estimate when one language or family separated from another. The claim rests on another disputed technique, "glotto-chronology," devised in the 1950s by Swadesh and the MIT linguist Robert Lees (1922-1996) as an attempt to do for linguistics what carbon-14 dating had done for archaeology: give researchers a handy method for dating a sample.
If vocabulary decays at a steady rate, they argued, the more cognates shared by two languages, the more recently they split; the fewer cognates, the more distant they are. Although scant evidence supports the notion that words decay at a uniform rate (a variable rate seems just as probable), the theory has some merit. The many cognates found in a Swadesh list for, say, English and German, suggest a more recent and less distant relationship than a comparison between the same words in English and Sanskrit. Indeed, the results from glotto-chronology roughly match the historical and archaeological record.
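The constant-rate reasoning above is usually written as the classic Swadesh-Lees formula, t = ln(c) / (2 ln(r)), where c is the fraction of list words that are still cognate and r is the fraction of core vocabulary a single language keeps per millennium. The factor of 2 appears because both languages lose words independently after the split. What follows is a minimal sketch of that textbook formula only. It is not the modified procedure built into STARLING, the function name is hypothetical, and the default retention rate of 0.86 for a 100-word list is a commonly quoted figure assumed here for illustration.

```python
import math


def divergence_time_millennia(shared_fraction, retention_rate=0.86):
    """Classic constant-rate estimate of time since two languages split.

    shared_fraction: share of list words that are cognate (0 < c <= 1).
    retention_rate:  assumed share of core words one language keeps per
                     1,000 years (0.86 is an assumption, not an EHL value).
    Returns the estimated separation time in millennia.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    if shared_fraction == 1.0:
        return 0.0  # identical lists: no measurable separation
    # Each lineage keeps r**t of the list, so the shared part is r**(2t).
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))


if __name__ == "__main__":
    # Hypothetical cognate shares, chosen only to show the trend.
    for c in (0.9, 0.7, 0.5):
        years = 1000 * divergence_time_millennia(c)
        print(f"{c:.0%} shared cognates -> roughly {years:.0f} years apart")
```

Because the estimate depends on the logarithm of the cognate share, small counting errors near 100 percent or near zero move the date a great deal, which is one reason such dates are best read as rough.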
"We can see on the tree that Ulcha and Orok are very closely related, with an approximate date of separation around the start of the 14th century, 1390 to be exact," continued Starostin. He pressed another key and the computer tabulated the number of cognates shared by the two languages. According to the common-word list used by the program, Ulcha and Orok have 96.6 percent correspondences.
"That means that each of them has preserved about 97 percent of their common vocabulary," he said. "You can press this and it gives you the exact number of matches." Another set of numbers whirled on the screen. "Just three mismatches: 89 out of 92. There are eight instances in which the word was either not recorded or borrowed. Those are excluded." He then traced a series of detailed etymologies for "bone" across dozens of languages from different superfamilies, dropping into Indo-European, Sino-Tibetan, Yeniseian, and Chukchi-Kamchatkan.
It was the range, power, and flexibility of these databases that first impressed Murray Gell-Mann. Winner of the 1969 Nobel Prize in Physics for his elegant theory describing interactions between all the elementary particles at the time, including the quark, which he named, Gell-Mann was a founding member in 1989 of the Santa Fe Institute, a place that sometimes seems to exist to nurture his many serious and far-flung scientific interests.
One of these is the history of language. A white-haired elfin figure who turns 77 this fall, he is an irrepressible showoff, fond of dazzling strangers with his verbal dexterity and his knowledge across all disciplines. (He was once offered the Lucasian Chair of Mathematics at Cambridge University, a position once held by Isaac Newton and now by Stephen Hawking.) He is famous for having delivered his Stockholm banquet speech in Swedish. "Guardian or keeper of the woods," were his first words to me in a Santa Fe restaurant, guessing the roots of my surname. During the meal he also seemed eager to exhibit his Etruscan vocabulary.
One of SFI's few permanent faculty members, Gell-Mann has helped to organize several workshops there in linguistics. At one of these, in 1997, focusing on distant relationships between families, he met Starostin and was so intrigued by his already formidable database and the promise it held for various fields, including anthropology and archaeology, that he vowed to bring the project to Santa Fe and find funding for it. Many people had said the same sort of thing to Starostin over the years. "I never expected that Murray actually meant it," he said dryly.
But Gell-Mann was a man of his word. Upon retiring in 1998 from the board of the MacArthur Foundation, where he helped to establish the so-called genius grants, he was given an award of $1 million to be donated to a charity or project of his choosing. He asked that $200,000 be given to nature conservation and that the remaining $800,000 go to launch EHL.
In many ways a Russian was a natural choice to lead a mission of this scope. For decades, Soviet linguists have accepted the existence of a superfamily called Nostratic. As a teenage student at Moscow State University, where he started to sit in on courses at age 13, Starostin was exposed to the belief that all languages are at some level related. He studied Japanese for five years as well as Chinese and did fieldwork in the Caucasus every summer, immersing himself in cultures that few other linguists in the world had access to. On the island of Sakhalin, he lived among the Nivkh, a Siberian people decimated by the Bolsheviks shortly after the Revolution.
Starostin may have been better equipped in other ways too. His father, Anatoliy Starostin, was a translator into Russian from about 30 languages, especially admired for his versions of the classical Persian poets Nizami and Saadi. In the 1950s and early 1960s, he had been an editor in the publishing house Khudozhestvennaya Literatura and was the official editor of Doctor Zhivago before the state cracked down on Boris Pasternak.
The National Public Radio journalist Anne Garrels remembers Sergei Starostin fondly from her years as an American television correspondent in Moscow. He was not yet 30 when she hired him in 1980 to improve her Russian. Starostin and his first wife, Tatiana, were, in the words of Garrels, "the classic intellectual couple, living on air. They had nothing. For me to bring them coffee was a huge deal. Whatever money they had, they spent on books." He would work all night and sleep until early afternoon, a schedule he favored all his life; Tatiana did her own post-graduate work in linguistics and raised their child, Georgiy, in a shoebox-sized apartment full of broken furniture. His keen ear for English made him the unofficial translator for underground Soviet bands eager to sing lyrics by the Beatles and Rolling Stones.
Without ever trying to go into exile, the Starostins spent much of their time with foreigners, a risky strategy for academics. He was detained by the KGB on more than one occasion and never allowed to travel outside the Soviet Union, not even to Bulgaria. Nor could he write for the best linguistic publications, a fact that may have harmed Starostin's reputation in the United States.
"To the extent that his work from that period overlaps with my own-on Old Chinese phonology-I can attest that Starostin did excellent work," says William Baxter. "But there's no tradition of peer review in their journals. He and his linguist friends were sitting around in each other's flats, presenting work to each other. They didn't have to get their papers into what we would regard as polished shape."
Errors, inevitable when juggling so many languages across many thousands of years, have allowed some linguists to dismiss the EHL project in the name of scientific accuracy. Sheila Embleton, author of a standard text, Statistics in Historical Linguistics, was alarmed at the tendency by Starostin "to play fast and loose with mathematical formulas." Even so, she counts herself among his admirers. "Work on distant relationships and a 'mother tongue' makes linguists nervous," she says. "There's a sense that it's all too big and unprovable and so they retreat. To me that's a shame. I'm not taking sides. But fear of making a mistake (automatically assuming a hypothesis is wrong that may turn out to be right) can lead to serious errors, too."
Critics of Starostin's project tend to come from two main camps. The first are linguists who resent anything associated with Joseph Greenberg. A professor for almost 40 years in the anthropology and social science departments at Stanford, Greenberg was undoubtedly one of the great linguists of the 20th century. He earned an international reputation in the late 1950s and early 1960s for pioneering work on universals, specific qualities or tendencies built into all languages. But it was his periodic reclassifications of the world's superfamilies (by himself he attempted what EHL is programmed to do) that truly riled historical linguists. He revolutionized the taxonomy of Africa in 1963 by proposing that its more than 1,000 languages could be arranged into four large groups. Attacked at the time, his theory is now widely accepted, even if the method he used to draw his conclusions, Mass (or Multilateral) Lexical Comparison, is still heretical.
Greenberg did not think it necessary to work out the rules that showed how various phonemes had evolved from one language into another, a technique that has marked the comparative method. With little data to work with other than dictionaries, and before computers were widespread, he mainly looked at sound patterns of groups of words and, noting their similarities and divergences from each other, reasoned out their larger kinships, in many cases correctly, as it has turned out.
When he went on to contend in his 1987 book, Language in the Americas, that the vast majority of native languages could be grouped into a superfamily he called "Amerind," the book and Greenberg himself were pilloried by American Indian specialists. They picked apart his etymologies and ridiculed the idea that close to 600 distinct tongues, once spoken from Alaska to Patagonia, could be so easily classified. Unlike his African classification, Greenberg's Amerind is highly controversial to this day.
Greenberg visited the Santa Fe Institute more than once before his death in 2001, and his most ardent disciple, Merritt Ruhlen, has been a frequent guest and a co-director of EHL. But Starostin, while an admirer of Greenberg, wanted the project to adhere to stricter standards, providing when at all possible the sound correspondences and laws that serve as traditional evidence of relatedness in the discipline. This did not prevent EHL from being unofficially shunned by American Indian language specialists, a problem that has left a conspicuous hole in the attempt to build a comprehensive database.
"Who's going to work on this for us?" asked Gell-Mann. "The people in American Indian languages are not just opposed to us, they are viciously opposed."
They are joined in their general distrust (or contempt) by a second camp of critics: linguists who find the techniques of EHL insufficiently rigorous. The misgivings of some are summarized in a paper given by Brett Kessler at a Harvard workshop in the spring of 2005, "Mathematical Modeling and Analysis of Language Diversification."
Kessler wrote his dissertation on statistical methods for testing historical connections between languages at Stanford and now teaches psychology at Washington University in St. Louis. In his Harvard paper, called "Verifying Historical Relationships Between Groups of Languages," he argued that in essence linguists searching for signs that suggest languages share a kinship are destined to find what they're looking for. As he says, "There will always be patterns and the more languages you look at, the more resemblances turn up."
From 16 languages in the Indo-European and Uralic superfamilies he made a Swadesh list of 100 words. Then he put them through a computer program to test their relationship. It recorded a high percentage of matches. Then he scrambled the data, so that words and their meanings were randomly attached. He got the same percentages.
"The random correspondences were as numerous as the actual ones," Kessler says. "Linguists don't have any way of verifying whether what they're seeing is true relatedness or the result of chance. You can convince yourself, but there isn't any method to convince the person in the next cubicle."
Not surprisingly, Starostin, Gell-Mann, and Ruhlen hotly disputed Kessler's results. Ruhlen wrote, "It's a caricature of both taxonomy and historical linguistics." He cited the intuited creations of Indo-European by Sir William Jones as an example of successful comparison among previously unconnected languages. "That everything is just an accident is one of the oldest criticisms of long-range comparisons and has been dealt with over and over."
Starostin was even more specific in his rebuttal. He agreed that after 6,000 to 8,000 years of separation between languages, "the vast majority of original cognates are lost and their phonologies are usually drastically different, since both phonological and lexical changes accumulate inevitably. At this point if you compare two (or more) modern languages, you may encounter a situation like the one Kessler observes," he wrote by e-mail.
But he cited a paper from 2000 by Baxter and Alexis Manaster Ramer that used a shorter Swadesh list of 35 words from Indo-European and Uralic. With the same randomizing procedure, and scrambling the data 10,000 times, it produced a significantly higher percentage of actual matches than the average random number.
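The scrambling procedure at issue in this exchange is a permutation test: count the matches between two word lists aligned by meaning, then shuffle one list so that words and meanings are randomly paired, and see how often chance alone does as well. The sketch below illustrates the idea under loud assumptions. The eight English and German word pairs are a toy sample, the "same first letter" criterion is a deliberately crude stand-in for a real sound-correspondence metric, and none of the function names come from Kessler's or Baxter and Manaster Ramer's work.

```python
import random


def count_matches(words_a, words_b, is_match):
    """Number of meaning slots where the two words count as a match."""
    return sum(1 for a, b in zip(words_a, words_b) if is_match(a, b))


def permutation_p_value(words_a, words_b, is_match, trials=10_000, seed=0):
    """Estimate how often random word-meaning pairings match as well.

    Returns (observed_matches, p_value). The p-value uses the common
    (hits + 1) / (trials + 1) correction so it is never exactly zero.
    """
    rng = random.Random(seed)
    observed = count_matches(words_a, words_b, is_match)
    shuffled = list(words_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break the link between word and meaning
        if count_matches(words_a, shuffled, is_match) >= observed:
            hits += 1
    return observed, (hits + 1) / (trials + 1)


def same_initial(a, b):
    """Toy match criterion (an assumption): identical first letter."""
    return a[0] == b[0]


# Toy data: the same eight meanings in English and German.
ENGLISH = ["mouth", "fire", "hound", "hand", "finger", "stone", "blood", "moon"]
GERMAN = ["mund", "feuer", "hund", "hand", "finger", "stein", "blut", "mond"]

if __name__ == "__main__":
    observed, p = permutation_p_value(ENGLISH, GERMAN, same_initial)
    print(f"observed matches: {observed}, estimated p-value: {p:.4f}")
```

A small p-value says the real pairing beats nearly all scrambled pairings; a large one reproduces the situation Kessler described, in which random correspondences are as numerous as the actual ones. The outcome depends heavily on the match criterion and the word list chosen, which is exactly where the two sides disagreed.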
In the EHL database, Indo-European and Uralic share 10 or more cognates, including those for I and we, thou, name, die, and not. And in the EHL reconstructions of Proto-Indo-European and Proto-Uralic, 34 percent of the words match, while modern English and Finnish (members of the IE and Uralic superfamilies, respectively) match in only 20 percent.
"If those were random," reasoned Starostin, "one would expect the same number of matches between Proto-Indo-European and Proto-Uralic as between English and Finnish, and this is quite evidently not the case. The fact that Kessler cannot provide a statistical proof for Indo-European and Uralic being related does not mean that they aren't: it just proves that he has not found a good solution."
In response, Kessler described EHL as a "really exciting idea, and I am very appreciative as well of the way Starostin and his colleagues have shared their data and results. But my experience with long-range theories has always been one of disappointment."
Even within the EHL project, participants disagree about methodology. Ruhlen is skeptical that one can trace human migrations with linguistic evidence. "We've gathered languages into about a dozen families," he says. "But we haven't shown how they developed, which broke off first. Geneticists are much better at showing how the human population broke up and its intermediate stages. They have a fairly convincing tree and we don't."
Glotto-chronology has been refined in the last 15 years. But many are uncomfortable with the theory of a constant rate of word decay as a criterion for language divergence. "I should be doing much more theoretical work on glotto," says Gell-Mann. "The math really isn't very good. It is something I really could be contributing to if I weren't so lazy. There's no harm in having these dates. We just have to realize that they're crude."
A more crucial issue is that the linguistic evidence from EHL is not correlating to the archaeological or genetic data. In his celebrated 2000 book, Genes, Peoples, and Languages, the Italian biologist Luigi Luca Cavalli-Sforza (a regular visitor to SFI) found a startling unity between the history of language and the history of human migrations. But EHL is finding that the two do not so easily match up.
"The crazy thing is language looks more recent than we want it to," said Gell-Mann. "The heresy used to be to suppose that language was new. But according to Sergei's formulas for when languages in Africa or Eurasia began to split, we can go back to maybe 20,000 years. Even with a few thousand years of slop, that's not when Homo sapiens began to paint in caves and use beautiful upper Paleolithic tools. That's more like 50,000 years ago. Suppose we do something to include all the other languages, maybe we could push it back to 25,000 years. But how are we going to get to 50?"
Starostin's death shook the foundations of EHL. It took six months for the new chain of command to reenergize the troops, with Ilia Peiros now overseeing the input of etymological data from Santa Fe in conjunction with Starostin's son, Georgiy, also a gifted linguist who directs the Sinology Department at the Institute of Oriental Cultures at the Russian State University for the Humanities in Moscow. The money donated from the MacArthur Foundation via Gell-Mann runs out in October. But Gell-Mann hopes to have a new donor soon.
Like any scholar, Starostin feared but also expected that over the years errors would be detected in EHL's thousands of etymologies. But as he pointed out, perhaps one-half of those in early Indo-European dictionaries have been replaced with better ones. Self-correction is a process integral to science and scholarship. He sounded baffled that American academics were not more excited by EHL, even if some parts may be too conjectural. "They're terrified of making mistakes," he said with a surprising flash of anger.
Starostin in the months before his death was learning more languages and preparing for a workshop at the Santa Fe Institute on human migrations. "I have to know more about Na-Dene," he stressed to me a week before he left Santa Fe for Moscow. John Bengston, a member of the EHL team, has gathered data to support a thesis, originally put forth in 1929 by Edward Sapir, that situates Basque in the Na-Dene family, along with a group of American Indian languages, including Tlingit, Navaho, and Kiowa Apache, a bewildering network of regions to connect in terms of human travels across the millennia.
In the e-mails we traded in the summer of 2005, Starostin sounded excited and relaxed. In May, Leiden University had presented him with his first doctor honoris causa and organized a symposium in his honor. He was relieved to be home in Russia and teaching again.
"I still like being in Moscow among old friends, colleagues, and students and somehow feel more comfortable here," he wrote. When the end came, "It was a typical Russian death, unfortunately," says Peiros. "Sergei never went to a doctor. Never had a checkup. He smoked, didn't exercise. He did everything to destroy himself. His father died in the Moscow airport of a heart attack at almost the same age."
Starostin believed that EHL had reduced the sum of all human languages to four superfamilies: Dene-Caucasian (the Sino-Tibetan, North Caucasian, Yenisseian, Burashaski, Na-Dene families, as well as Basque); Euroasiatic or Nostratic (including IE, Altaic, Dravidian, Uralic, Samoyedic, and Finno-Ugric); Afro-Asiatic (with the Semitic, Berber, and Chadic families, as well as ancient Egyptian); and Austric (a gigantic entity of almost 2,000 languages, its two major branches being Austronesian and Austroasiatic, and two minor ones, Miao-Yao and Mon-Khmer). Since Starostin's death, EHL researchers have investigated whether the language families of New Guinea and South Africa might belong with others to an ancient "Borean" superfamily.
Peiros stresses that the taxonomy is purely a hypothesis. Many of these groups are based on incomplete or almost nonexistent data. "We have some solid parts but a lot of vague parts, too," he says candidly. It may be that EHL is still caught between an older model of historical linguistics, based on traditional methods developed in the 19th century, and a newer, still embryonic style that relies more on inferential statistics. That is the view of William Baxter, who sees Starostin as an important transitional figure in the history of the field.
"In the next few decades I predict linguists, inspired by geneticists, will develop better algorithms that will allow us to settle matters of significance in data without the present acrimony," he says. He commends Starostin as "a pioneer in so many ways: for bringing the power of computers to the study of historical problems, for putting his data up on the Web for anyone to share and judge, and for challenging assumptions about, say, what is an acceptable time-depth to study, assumptions that are, when you examine them closely, little more than folklore."
At the same time, though, Baxter believes Starostin's mathematics was not deep enough to answer the demands he made of it. "A lot of his work was exploratory and can be improved upon. I don't think the particular hypotheses he came up with are as important as his willingness to go beyond the limits others had arbitrarily set up. He was intellectually fearless."
A volume of papers dedicated to Starostin's memory will be published by SFI by the end of the year. Based on talks given in Moscow last March by Gell-Mann, Peiros, and others, it can only graze over the many topics that consumed him. How far EHL can advance without his leadership is the unanswered question. It is a terrible loss to scholarship that he did not live to take the story of language further, perhaps back before the confusion of so many tongues. The Tower of Babel is now and for the foreseeable future an intriguing ruin.
"THERE ARE historical linguists who consider it their job to stamp out the kind of work that engaged Starostin."
"A LOT OF PEOPLE went into historical linguistics because they didn't like math and Chomsky looked as if it required a lot of math. The field is not in great shape in the U.S."
"WORK ON DISTANT relationships and a 'mother tongue' makes linguists nervous. . . . But fear of making a mistake can lead to serious errors, too."
Richard B. Woodward is an arts critic and journalist in New York.
The Man Who Loved Languages
Byline: Woodward, Richard B
Publication Date: 10-01-2006
Copyright Phi Beta Kappa Society Autumn 2006