FOXP2 and the Evolution of Language
This article traces the history and significance of the discovery that
FOXP2 plays a role in the development of speech. It is a remarkable
scientific detective story that has been unfolding for some time. In
its earlier stages, there was serious disagreement within the
scientific community about how the findings should be
interpreted, and this was set against a background of sensationalist
reporting by the popular press.
The story goes like this. The KE family
was brought to the attention of the scientific community in about 1990.
Across three generations of this family, about half the members suffer from a number
of problems, the most obvious of which is severe difficulty in speaking.
The speech of the affected people is so hard to understand that
they are taught signs as a supplement to speech as children. It is a
complicated condition that includes impairment in speech
articulation and other linguistic skills, as well as broader intellectual and
physical problems. From the outset it seemed quite likely, from the
pattern of inheritance, that the disorder was caused by a mutation in a
single autosomal gene with dominant inheritance. It is rather surprising that such a diffuse
condition should be linked to a single genetic defect, but it turned out to
be so for reasons that we shall see later.
From the beginning, professional scientists have held a range of views on whether the gene in question is a 'language-specific' or 'grammar-specific' gene. Those disagreements continue today in a somewhat abated form.
The disorder is not grammar- or speech-specific
In 1995, Vargha-Khadem et al published a paper investigating the
phenotype of the disorder and showing quite clearly that it is not
grammar- or speech-specific (1). They tested affected and unaffected
family members and concluded that the disorder had the following
characteristics: defects in processing words according to grammatical rules;
difficulty understanding more complex sentence structures, such as sentences with
embedded relative clauses; inability to form intelligible speech; defects in
the ability to move the mouth and face that are not associated with speaking
(relative immobility of the lower face and mouth, particularly the upper
lip); and significantly reduced IQ in the affected compared with the
unaffected in both the verbal and the non-verbal domain.
This last finding, about IQ, has been swept under the carpet by some commentators (including Nicholas Wade in the New York Times (2)). They argue that because the IQ ranges of the affected and unaffected overlap, and because some of the affected score above the population mean for non-verbal IQ, the disorder does not involve a general intellectual deficit and is therefore language-specific. But note that the mean non-verbal IQ of the affected is 86 (range 71 to 111), against a mean of 104 (range 84 to 119) for the unaffected. Not only do the means differ significantly between affected and unaffected family members, but three of the affected had non-verbal IQ scores below 85, which is the normal lower limit for classifying a speech defect as 'specific language impairment' (that is, a disorder that affects speech only and is not caused by more general cognitive problems). Note, however, that the KE disorder cannot be explained solely by a general cognitive deficiency, because it is present in individuals whose non-verbal IQ is close to or a little above the population average and because it is accompanied by deficiencies in motor control of the face and mouth.
The gene is found and is shown to cause brain abnormalities
In 1998, two important papers were
published. Fisher et al (3) carried out a linkage study
that narrowed the location of the mutation to a short stretch of
chromosome 7 (7q31). They designated the locus SPCH1.
Vargha-Khadem et al looked at functional and structural abnormalities in the brains of the affected subjects (4). It is a long and involved paper, but in essence they found several structural abnormalities (in size, measured by MRI scan) and functional abnormalities (in degree of activity, measured by PET scan) in the brains of the affected family members. The most significant abnormality was a bilateral reduction in the size of the caudate nucleus (a component of the basal ganglia), coupled with abnormally high activity in the left caudate nucleus during speech tasks. The caudate nucleus is implicated in motor co-ordination and also processes information that is being sent from other areas of the brain to the frontal lobe. Broca's area, important for speech production, was also smaller and over-activated during speech production in affected subjects.
So, at this stage, researchers
were faced with a diffuse condition that
affects grammar, speech production, non-verbal intelligence and
movement of the mouth and face unrelated to speech, with detectable brain pathologies,
all caused by a mutation in a single gene in chromosome region
7q31. Then in 2001, Lai et al (authors include Anthony Monaco) published
a paper (5) announcing that they had identified the
precise gene that is mutated in affected members of the KE family.
All hell broke loose in the popular press, with renewed claims for a gene
for language or a gene for grammar. Reading the popular reports, one would
think that humans have a gene for language that all other creatures lack.
That is not so and we'll see why in a minute. Interestingly, all the popular
reports missed the real scoop, which was the kind of gene
that Lai et al actually uncovered. Lai et al homed in on a gene within the region
that Fisher et al had determined. First they found that an individual
unrelated to the KE family, who has an almost identical suite of
deficiencies, had a gene in the 7q31 region that was broken by a
translocation: a chromosomal rearrangement whose breakpoint fell in
the middle of the gene they were looking for, thereby breaking it. When they
looked at the KE family they found a single point mutation in the same gene:
a G to A single nucleotide mutation in exon 14 that changes the amino acid
arginine to histidine. Strikingly, this mutation is found in all the affected
individuals in the KE family, in none of the unaffected individuals, and
in none of the controls from the normal population (364 control chromosomes tested). There
is no real doubt that this single substitution breaks the gene and
causes the condition.
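To make the arginine-to-histidine change concrete, here is a minimal Python sketch of how a single G to A substitution can turn an arginine codon into a histidine codon. The article does not say which arginine codon FOXP2 actually uses at this position, so the sketch simply walks through the four CGN arginine codons of the standard genetic code; the names `CODON_TABLE`, `mutate` and `g_to_a_outcomes` are made up for this illustration.

```python
# Illustration only: the actual FOXP2 codon is not given in the article.
# Entries are taken from the standard genetic code (DNA coding strand).
CODON_TABLE = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "CAT": "His", "CAC": "His", "CAA": "Gln", "CAG": "Gln",
}


def mutate(codon: str, position: int, new_base: str) -> str:
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


def g_to_a_outcomes() -> dict:
    """Map each CGN arginine codon to (mutant codon, mutant amino acid)
    after a G to A change at the second codon position."""
    outcomes = {}
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "Arg":
            continue
        mutant = mutate(codon, 1, "A")
        outcomes[codon] = (mutant, CODON_TABLE[mutant])
    return outcomes


if __name__ == "__main__":
    for codon, (mutant, amino_acid) in g_to_a_outcomes().items():
        print(f"{codon} (Arg) -> {mutant} ({amino_acid})")
```

Only two of the four CGN codons (CGT and CGC) give histidine after this change; the other two give glutamine. So an arginine-to-histidine replacement caused by a single G to A substitution is consistent with the original codon being one of those two, under the assumptions of this sketch.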
The broken gene codes for a transcription factor
Now we get to the really interesting part. The
popular press had gone bananas over the fact that a single gene (a gene for
language in popular parlance) was found to cause the condition. But that was
no news at all: we knew from the first studies of the family in 1990
that, according to the pattern of inheritance of the condition, it is a
single gene trait. No, the news that they all missed lay in the nature of
the gene itself. The gene that Lai et al identified contains what is known
variously as a forkhead or a winged-helix DNA-binding domain. There is a big
family of genes that contain this domain. They are called FOX genes and,
following the convention in gene nomenclature, this one was called FOXP2.
The forkhead domain is the key functional part of the protein that the gene encodes, and that protein is a transcription factor. What is a transcription factor? It is a protein that binds to regulatory regions (such as the promoter) of other genes and controls their transcription from DNA to RNA. In the simplest picture, a target gene is expressed in the presence of the transcription factor and silent in its absence, although transcription factors can also repress their targets. The forkhead domain is the critical portion of FOX proteins that allows them to bind to the regulatory regions of other genes. The particular amino acid that is mutated in the unfortunate members of the KE family is extremely highly conserved. In all cases where FOX genes have been sequenced, from yeast to people, that amino acid is an arginine: always, every time, except in affected members of the KE family, where the residue is a histidine. Just that one change breaks the gene as thoroughly as it is broken by a translocation that disrupts its open reading frame.
So what is the sensational finding? The key point, which all the popular reports missed, is that FOXP2 is a transcription factor. In other words, it has the potential to affect the expression of an unknown, but potentially large, number of other genes. No wonder the syndrome presents in such a diffuse way. We now know that a FOXP2 homologue is strongly expressed in the developing brain of the mouse embryo. So not only does FOXP2 potentially affect many other genes, but it is also likely to be important in the development of the brain. I expect that breaking FOXP2 in mice would result in some compromises to brain structure and function, and it is an experiment that someone is sure to do.
27th June 2005: That experiment has just been reported by Shu et al, in 'Altered ultrasonic vocalization in mice with a disruption in the FOXP2 gene' (7). They report that: 'Disruption of both copies of the Foxp2 gene caused severe motor impairment, premature death, and an absence of ultrasonic vocalizations that are elicited when pups are removed from their mothers. Disruption of a single copy of the gene led to modest developmental delay but a significant alteration in ultrasonic vocalization in response to such separation. Learning and memory appear normal in the heterozygous animals. Cerebellar abnormalities were observed in mice with disruptions in Foxp2, with Purkinje cells particularly affected. Our findings support a role for Foxp2 in cerebellar development and in a developmental process that subsumes social communication functions in diverse organisms.'
This is exactly as I predicted above, two years ago.
So the mutation in FOXP2 seems to cause brain defects during embryonic development that disrupt neural pathways essential for human speech, and it has other effects as well.
The molecular evolution of FOXP2
Now let's move on to 2002, when Enard et al (authors include Svante Pääbo and Anthony Monaco) published a paper (6) investigating the evolution of FOXP2. The first thing they note is how extremely highly conserved FOXP2 is. We have already seen that the amino acid mutated in the KE family is identical in every species investigated. The major splice form of the protein encoded by the gene (which has a pair of alternatively spliced exons) is 715 amino acids long, and it is identical, with no differences whatsoever, in chimpanzee, gorilla and rhesus macaque. Mouse FOXP2 differs from these three species in just one amino acid. Human FOXP2, however, differs from gorilla, chimpanzee and rhesus macaque in two further amino acids (and thus from mouse in three amino acids out of 715). So, in the 75 million years since the mouse and chimpanzee lineages diverged, only one change has occurred in FOXP2 (and that equates to 150 million years of evolution, because we do not know whether the change occurred in the mouse or the primate lineage and so must count both branches), whilst in the six million years since the human and chimpanzee lineages diverged, two changes have occurred in the human lineage alone.
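The contrast between the two lineages is easier to see as a rate. The short Python sketch below uses only the figures quoted above (one change in 150 million years of combined mouse and primate evolution, two changes in six million years on the human lineage). It is a back-of-envelope comparison, not the statistical analysis that Enard et al performed, and the function name is made up for this illustration.

```python
from fractions import Fraction


def changes_per_million_years(changes: int, million_years: int) -> Fraction:
    """Crude substitution rate: amino-acid changes per million years."""
    return Fraction(changes, million_years)


# Figures as quoted in the article text.
background_rate = changes_per_million_years(1, 150)  # mouse + primate branches
human_rate = changes_per_million_years(2, 6)         # human lineage only

# How many times faster the human lineage appears to have changed.
fold_increase = human_rate / background_rate

if __name__ == "__main__":
    print(f"background: {float(background_rate):.4f} changes per million years")
    print(f"human:      {float(human_rate):.4f} changes per million years")
    print(f"apparent fold increase: {fold_increase}")
```

On these numbers the human lineage has accumulated amino-acid changes about fifty times faster than the background rate. With only three changes in total, this is suggestive rather than conclusive on its own, which is why the other lines of evidence in the paper matter.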
Fig 1: Silent and replacement nucleotide substitutions mapped on a phylogeny of primates. Bars represent nucleotide changes. Grey bars indicate amino-acid changes - after reference (6)
Furthermore, the one change between chimp and mouse is predicted to be functionally unimportant, whereas both changes in the human lineage occur in the same exon of the gene and at least one of them is predicted to be functionally important (an asparagine to serine change that provides a target site for phosphorylation, a process that has been shown to mediate transcriptional regulation in forkhead transcription factors).
Next, Enard et al searched for evidence of a selective sweep in the region of the FOXP2 gene. A selective sweep occurs when a particularly beneficial mutation spreads rapidly through a population. The sweep not only eliminates alternative alleles of the gene in question but also reduces polymorphism at other sites that are physically close (linked) on the chromosome. Enard et al found strong evidence of this both in introns within FOXP2 and in recombining loci. They suggest that the selective sweep was caused by one or both of the human-specific changes, and they go on to speculate that this change was critical in the evolution of human speech, perhaps by enabling the fine oro-facial movements essential to speech, which are absent in affected members of the KE family. The mutation that affects half the members of the KE family is known to cause the problems with speech, but it is in an entirely different part of the gene from the changes that have occurred in the human lineage. What would happen if we reverted FOXP2 in humans to the ancestral form still found in monkeys, chimps and gorillas? It is an experiment that could probably be done but ought never to be done, being, of course, entirely unethical.
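The signature of a selective sweep, reduced polymorphism around the selected site, can be illustrated with a toy calculation. The Python sketch below computes nucleotide diversity (the average proportion of sites that differ between two sequences drawn from a sample) for two invented sets of short sequences. The sequences are hypothetical, chosen only to show the contrast, and this is not the statistical test that Enard et al applied to the real FOXP2 data.

```python
from itertools import combinations


def nucleotide_diversity(sequences: list) -> float:
    """Average number of pairwise differences per site across all pairs
    of equal-length, aligned sequences."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(a != b for a, b in zip(seq1, seq2)) for seq1, seq2 in pairs
    )
    return differences / (len(pairs) * length)


# Hypothetical samples (not real data): a region evolving without a
# recent sweep, and a region where one haplotype has recently swept
# through the population, leaving almost no variation.
NO_SWEEP_SAMPLE = ["ACGTACGT", "ACGAACGT", "TCGTACGA", "ACGTACCT"]
SWEPT_SAMPLE = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTACCT"]

if __name__ == "__main__":
    print("no sweep:", nucleotide_diversity(NO_SWEEP_SAMPLE))
    print("swept:   ", nucleotide_diversity(SWEPT_SAMPLE))
```

In the invented swept sample almost every sequence is a copy of the same haplotype, so diversity is much lower than in the other sample. A real analysis also has to rule out other causes of low diversity, such as population history, which a toy example cannot do.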
Finally, by looking at silent polymorphisms in the gene, Enard et al estimate that the changes to FOXP2 in the human lineage occurred between 10,000 and 100,000 years ago. They speculate that these changes were critical for the development of human speech as we understand it, and also for the development of fully human society and cognition.
The whole saga has been
an amazing scientific detective story; I have barely scratched the surface:
each individual paper referenced has as much detailed detective work as the whole story
I have told here and more.
We should beware of popular reports of scientific discoveries: almost all the popular reports of FOXP2 claimed that it was the gene for language or even more ludicrously the gene for grammar - the truth is more complicated and far more interesting than that. There are many popular reports of scientific discoveries which are equally sensationalised.
No-one should imagine that the development of language relied exclusively on a single mutation in FOXP2. There are many other changes that enable speech. Not least of these are profound anatomical changes that make the human supralaryngeal pathway entirely different from that of any other mammal. The larynx has descended so that it provides a resonant column for speech (but, as an unfortunate side-effect, predisposes humans to choking on food). Also, the nasal cavity can be closed, which prevents vowels from being nasalised and so increases their comprehensibility. These changes cannot have happened over such a short period as 100,000 years. Furthermore, the genetic basis for language will be found to involve many more genes that influence both cognitive and motor skills.
The human mind needs human cognition, and human cognition relies on human speech. We cannot envisage humanness without the ability to think abstractly, but abstract thought requires language. The FOXP2 findings underline how important it is to understand the molecular basis of the origin of human speech and, indeed, of the human mind. Ultimately, we will gain great insight from further unravelling the evolutionary roots of human speech, in contrast to Noam Chomsky's lack of interest in this subject.
Steven Pinker's view about FOXP2 is that the fixed human-specific mutations in the gene might enable fine oro-facial movements and so trigger the development of language.
My personal view is that the breaking of FOXP2
in the KE family is more likely to
have caused a cognitive deficiency during development in those affected than
a purely physical deficiency in oro-facial motor skills, and
that the motor deficiencies are a secondary phenomenon,
perhaps caused by lack of use.
It will not be easy to unravel the pathways by which language evolved in humans. If we are to have any hope of doing so, we will need close collaboration between linguists and biologists, who have, until recently, been rather suspicious of one another.
1. Vargha-Khadem et al, 'Praxic and nonverbal cognitive deficits in a large family with a genetically transmitted speech and language disorder', Proc Nat Acad Sci USA 92, 930-933 (1995)
2. Nicholas Wade, 'Early Voices: The Leap to Language', New York Times, July 15th 2003
3. Fisher et al, Nat Genet 18, 168-170 (1998)
4. Vargha-Khadem et al, 'Neural basis of an inherited speech and language disorder', Proc Nat Acad Sci USA 95, 12695-12700 (1998)
5. Lai et al, 'A forkhead-domain gene is mutated in a severe speech and language disorder', Nature 413, 519-523 (2001)
6. Enard et al, 'Molecular evolution of FOXP2, a gene involved in speech and language', Nature 418, 869-872 (2002)
7. Shu et al, 'Altered ultrasonic vocalization in mice with a disruption in the FOXP2 gene', published on-line in the Proceedings of the National Academy of Sciences