Ten years ago, in the pre-ancient DNA "Dark Ages", a big debate raged about the origin of R1a men in India. The stage had been set even earlier by the pioneering Eurasian heartland paper, which was the first (to my memory) to link M17 with steppe migrations and Indo-Iranians. Yet there was pushback as the distribution of M17 became better described and people started using Y-STRs to try to date its migrations and place them phylogeographically.
The two poles of the debate were "Out-of-India", which relied primarily on Y-STR based time estimates that seemed very old in India (even Paleolithic, if one used the wrong mutation rate), and "Into-India", which held that the R1a distribution pointed to its being brought into India by the Indo-Aryans in the conventional time frame of the "Aryan Invasion Theory" (AIT), ~3,500 years ago.
AIT has been much maligned because it has been received as a Western colonialist imposition on Indian history: a way to claim that Indian civilization was not native but European in origin. Europeans were certainly guilty of misusing AIT: for British colonials it represented a precedent for their colonization of India; for German National Socialists it was evidence for the greatness of the Aryan race and its past expansions eastward. It also played into internal Indian politics, espoused by some as a means of furthering their superiority as either descendants of "Aryan conquerors" or as oppressed victims of the same.
Of course, a misuse of a theory does not mean it is wrong, and if a new preprint based on ancient and modern DNA is correct, it means that AIT was basically correct: Indo-Aryans did come to India in the Late Bronze Age, via the steppe, and ultimately from central Europe.
The opposing Out-of-India theory is all but dead, although failed theories often have a long half-life, especially if they are espoused for psycho-political reasons. I would argue that Out-of-India was dead for thousands of years before it was conceived, since even in Homer's time it was known that "India" was not "one thing" but was peopled by Indians in the north and "Eastern Ethiopians" in the south (who differed from the western, "actual" Ethiopians of Africa in having straight rather than curly hair). These were the "Ancestral North Indians" and "Ancestral South Indians" that modern science has revealed. Out-of-India is little more than a nationalistic myth functioning as an antidote to this basic dichotomy, a way to imbue India's diverse citizens with a myth of common origins.
Yet, proponents of AIT (who have a non-trivial overlap with R1an enthusiasts) are also scratching their heads because, of the 27 ancient males from South Asia studied in the preprint, exactly one is R1a, and he happened to live after the time of the Buddha and not during the Bronze Age.
Both OIT enthusiasts (who expected abundant R1a in India and its environs since the Paleolithic) and AIT/R1an enthusiasts (who expected to see it come in ~3,500 years ago) are bound to be disappointed.
Perhaps the R1a Indo-Aryans did come to South Asia in a conventional AIT time frame and they haven't been sampled. Or, maybe they were, indeed, there, but were not R1ans. Or, maybe both sides missed the bigger story which is that the Indo-Aryans (so closely associated with India today) were simply not there as early as people have thought.
bioRxiv: doi: https://doi.org/10.1101/292581
The Genomic Formation of South and Central Asia
Vagheesh M Narasimhan, Nick J Patterson et al.
The genetic formation of Central and South Asian populations has been unclear because of an absence of ancient DNA. To address this gap, we generated genome-wide data from 362 ancient individuals, including the first from eastern Iran, Turan (Uzbekistan, Turkmenistan, and Tajikistan), Bronze Age Kazakhstan, and South Asia. Our data reveal a complex set of genetic sources that ultimately combined to form the ancestry of South Asians today. We document a southward spread of genetic ancestry from the Eurasian Steppe, correlating with the archaeologically known expansion of pastoralist sites from the Steppe to Turan in the Middle Bronze Age (2300-1500 BCE). These Steppe communities mixed genetically with peoples of the Bactria Margiana Archaeological Complex (BMAC) whom they encountered in Turan (primarily descendants of earlier agriculturalists of Iran), but there is no evidence that the main BMAC population contributed genetically to later South Asians. Instead, Steppe communities integrated farther south throughout the 2nd millennium BCE, and we show that they mixed with a more southern population that we document at multiple sites as outlier individuals exhibiting a distinctive mixture of ancestry related to Iranian agriculturalists and South Asian hunter-gatherers. We call this group Indus Periphery because they were found at sites in cultural contact with the Indus Valley Civilization (IVC) and along its northern fringe, and also because they were genetically similar to post-IVC groups in the Swat Valley of Pakistan.
By co-analyzing ancient DNA and genomic data from diverse present-day South Asians, we show that Indus Periphery-related people are the single most important source of ancestry in South Asia — consistent with the idea that the Indus Periphery individuals are providing us with the first direct look at the ancestry of peoples of the IVC — and we develop a model for the formation of present-day South Asians in terms of the temporally and geographically proximate sources of Indus Periphery-related, Steppe, and local South Asian hunter-gatherer-related ancestry. Our results show how ancestry from the Steppe genetically linked Europe and South Asia in the Bronze Age, and identifies the populations that almost certainly were responsible for spreading Indo-European languages across much of Eurasia.
Link
April 13, 2018
April 24, 2016
Jewish and Indian ancestry in the Bene Israel
PLoS ONE 11(3): e0152056. doi:10.1371/journal.pone.0152056
The Genetics of Bene Israel from India Reveals Both Substantial Jewish and Indian Ancestry
Yedael Y. Waldman, Arjun Biddanda, Natalie R. Davidson, Paul Billing-Ross, Maya Dubrovsky, Christopher L. Campbell, Carole Oddoux, Eitan Friedman, Gil Atzmon, Eran Halperin, Harry Ostrer, Alon Keinan
The Bene Israel Jewish community from West India is a unique population whose history before the 18th century remains largely unknown. Bene Israel members consider themselves as descendants of Jews, yet the identity of Jewish ancestors and their arrival time to India are unknown, with speculations on arrival time varying between the 8th century BCE and the 6th century CE. Here, we characterize the genetic history of Bene Israel by collecting and genotyping 18 Bene Israel individuals. Combining with 486 individuals from 41 other Jewish, Indian and Pakistani populations, and additional individuals from worldwide populations, we conducted comprehensive genome-wide analyses based on FST, principal component analysis, ADMIXTURE, identity-by-descent sharing, admixture linkage disequilibrium decay, haplotype sharing and allele sharing autocorrelation decay, as well as contrasted patterns between the X chromosome and the autosomes. The genetics of Bene Israel individuals resemble local Indian populations, while at the same time constituting a clearly separated and unique population in India. They are unique among Indian and Pakistani populations we analyzed in sharing considerable genetic ancestry with other Jewish populations. Putting together the results from all analyses point to Bene Israel being an admixed population with both Jewish and Indian ancestry, with the genetic contribution of each of these ancestral populations being substantial. The admixture took place in the last millennium, about 19–33 generations ago. It involved Middle-Eastern Jews and was sex-biased, with more male Jewish and local female contribution. It was followed by a population bottleneck and high endogamy, which can lead to increased prevalence of recessive diseases in this population. This study provides an example of how genetic analysis advances our knowledge of human history in cases where other disciplines lack the relevant data to do so.
Link
January 26, 2016
History of extant populations of India
The five components they speak of are ANI, ASI, AAA (Ancestral Austro-Asiatic), ATB (Ancestral Tibeto-Burman), and a distinct fifth ancestry in the Andaman archipelago.
The differentiation of the four main components seems clear enough on the figure (left). The big question is how and in what order the different components got into India. I would wager that ASI was first and I modify my New Year's wish to ask for some ancient DNA from India too.
An interesting bit from the paper:
...that the practice of endogamy was established almost simultaneously, possibly by decree of the rulers, in upper-caste populations of all geographical regions, about 70 generations before present, probably during the reign (319–550 CE) of the ardent Hindu Gupta rulers
How plausible is that to anyone familiar with Indian history?
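How "about 70 generations" maps onto a calendar date depends heavily on the generation length one assumes. The sketch below is only an illustration of that arithmetic: the three generation lengths (22.5, 25 and 29 years) and the 2016 reference year are my assumptions, not values taken from the quoted passage, and the function names are hypothetical.

```python
def generations_to_year_ce(generations, years_per_generation, reference_year=2016):
    """Convert an age in generations before present to an approximate calendar year (CE).

    Negative results mean years BCE (the missing year zero is ignored here).
    """
    return reference_year - generations * years_per_generation


def date_window(generations, gen_times=(22.5, 25.0, 29.0), reference_year=2016):
    """Return {assumed generation length: approximate calendar year} for one age."""
    return {
        g: generations_to_year_ce(generations, g, reference_year)
        for g in gen_times
    }


# "About 70 generations before present" under the assumed generation lengths.
window = date_window(70)
for gen_time, year in window.items():
    in_gupta_reign = 319 <= year <= 550  # reign dates as given in the quote
    print(f"{gen_time} yr/generation -> year {year:.0f} CE, within 319-550 CE: {in_gupta_reign}")
```

Only the shortest of these assumed generation lengths puts the estimate inside the 319–550 CE window; the longer ones push it centuries earlier, so the match with the Gupta period is sensitive to that one parameter.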
PNAS doi: 10.1073/pnas.1513197113
Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure
Analabha Basu, Neeta Sarkar-Roy, and Partha P. Majumder
India, occupying the center stage of Paleolithic and Neolithic migrations, has been underrepresented in genome-wide studies of variation. Systematic analysis of genome-wide data, using multiple robust statistical methods, on (i) 367 unrelated individuals drawn from 18 mainland and 2 island (Andaman and Nicobar Islands) populations selected to represent geographic, linguistic, and ethnic diversities, and (ii) individuals from populations represented in the Human Genome Diversity Panel (HGDP), reveal four major ancestries in mainland India. This contrasts with an earlier inference of two ancestries based on limited population sampling. A distinct ancestry of the populations of Andaman archipelago was identified and found to be coancestral to Oceanic populations. Analysis of ancestral haplotype blocks revealed that extant mainland populations (i) admixed widely irrespective of ancestry, although admixtures between populations was not always symmetric, and (ii) this practice was rapidly replaced by endogamy about 70 generations ago, among upper castes and Indo-European speakers predominantly. This estimated time coincides with the historical period of formulation and adoption of sociocultural norms restricting intermarriage in large social strata. A similar replacement observed among tribal populations was temporally less uniform.
Link
August 12, 2014
168 South Asian Genomes
PLoS ONE 9(8): e102645. doi:10.1371/journal.pone.0102645
The South Asian Genome
John C. Chambers et al.
The genetic sequence variation of people from the Indian subcontinent who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease which are highly prevalent amongst South Asians.
Link
March 27, 2014
Origins of the Tharu
European Journal of Human Genetics advance online publication 26 March 2014; doi: 10.1038/ejhg.2014.36
Unravelling the distinct strains of Tharu ancestry
Gyaneshwer Chaubey et al.
The northern region of the Indian subcontinent is a vast landscape interlaced by diverse ecologies, for example, the Gangetic Plain and the Himalayas. A great number of ethnic groups are found there, displaying a multitude of languages and cultures. The Tharu is one of the largest and most linguistically diverse of such groups, scattered across the Tarai region of Nepal and bordering Indian states. Their origins are uncertain. Hypotheses have been advanced postulating shared ancestry with Austroasiatic, or Tibeto-Burman-speaking populations as well as aboriginal roots in the Tarai. Several Tharu groups speak a variety of Indo-Aryan languages, but have traditionally been described by ethnographers as representing East Asian phenotype. Their ancestry and intra-population diversity has previously been tested only for haploid (mitochondrial DNA and Y-chromosome) markers in a small portion of the population. This study presents the first systematic genetic survey of the Tharu from both Nepal and two Indian states of Uttarakhand and Uttar Pradesh, using genome-wide SNPs and haploid markers. We show that the Tharu have dual genetic ancestry as up to one-half of their gene pool is of East Asian origin. Within the South Asian proportion of the Tharu genetic ancestry, we see vestiges of their common origin in the north of the South Asian Subcontinent manifested by mitochondrial DNA haplogroup M43.
Link
February 28, 2014
4.2 kiloyear event and the demise of Indus Valley megacities
From the paper:
The 4.2 ka aridification event is regarded as one of the most severe climatic changes in the Holocene, and affected several Early Bronze Age populations from the Aegean to the ancient Near East (Cullen et al., 2000; Weiss and Bradley, 2001). This study demonstrates that the climate changes at that time extended to the plains of northwestern India. The Kotla Dahar record alone cannot fully explain the role of climate change in the cultural evolution of the Indus civilization. The Indus settlements spanned a diverse range of environmental and ecological zones (Wright, 2010; Petrie, 2013); therefore, correlation of evidence for climate change and the decline of Indus urbanism requires a comprehensive assessment of the relationship between settlement and climate across a substantial area (Weiss and Bradley, 2001; Petrie, 2013). The impact of the abrupt climate event in India and West Asia records, and that observed at Kotla Dahar, on settled life in the Indus region warrants further investigation.
Plato (or rather the Egyptian priest speaking through Plato) may have been the first one to note the differential survival of people as a result of natural catastrophes. It is hard to imagine that such an extreme event would not unbalance agricultural economies leading to famine and also endanger the supply systems on which early cities were based. The failure of cities would in turn lead to a failure of governing elites centered on them and a power vacuum which new elites (armed with bronze weapons at this time) might take advantage of. Climate may have ended the Bronze Age civilization itself 1000 years later.
Geology doi: 10.1130/G35236.1
Abrupt weakening of the summer monsoon in northwest India ∼4100 yr ago
Yama Dixit et al.
Climate change has been suggested as a possible cause for the decline of urban centers of the Indus Civilization ∼4000 yr ago, but extant paleoclimatic evidence has been derived from locations well outside the distribution of Indus settlements. Here we report an oxygen isotope record of gastropod aragonite (δ18Oa) from Holocene sediments of paleolake Kotla Dahar (Haryana, India), which is adjacent to Indus settlements and documents Indian summer monsoon (ISM) variability for the past 6.5 k.y. A 4‰ increase in δ18Oa occurred at ca. 4.1 ka marking a peak in the evaporation/precipitation ratio in the lake catchment related to weakening of the ISM. Although dating uncertainty exists in both climate and archaeological records, the drought event 4.1 ka on the northwestern Indian plains is within the radiocarbon age range for the beginning of Indus de-urbanization, suggesting that climate may have played a role in the Indus cultural transformation.
Link
August 08, 2013
Major admixture in India took place ~4.2-1.9 thousand years ago (Moorjani et al. 2013)
A new paper on the topic of Indian population history has just appeared in the American Journal of Human Genetics. In previous work it was determined that Indians trace their ancestry to two major groups, Ancestral North Indians (ANI) (= West Eurasians of some kind), and Ancestral South Indians (ASI) (= distant relatives of Andaman Islanders, existing today only in admixed form). The new paper demonstrates that admixture between these two groups took place ~4.2-1.9 thousand years ago.
My own analysis of Dodecad Project South Indian Brahmins arrived at a date of 4.1ky, and of North Indian Brahmins, a date of 2.3ky, which seems to be in good agreement with these results.
The authors also report that "we find that Georgians along with other Caucasus groups are consistent with sharing the most genetic drift with ANI". I had made a post on the differential relationship of ANI to Caucasus populations which seems to agree with this, and, of course, in various ADMIXTURE analyses, the component which I've labeled "West Asian" tends to be the major west Eurasian element in south Asia.
Here are the estimated admixture proportions/times from the paper:
Sadly, the warm and moist climate of India, and the adoption of cremation have probably destroyed any hope of studying much of its recent history with ancient DNA. On the other hand, the caste system has probably "fossilized" old socio-linguistic groups, allowing us to tell much by studying their differences and correlating them with groups outside India.
Coverage elsewhere: Gene Expression, HarappaDNA
Related podcast on BBC.
AJHG doi:10.1016/j.ajhg.2013.07.006
Genetic Evidence for Recent Population Mixture in India
Priya Moorjani et al.
Most Indian groups descend from a mixture of two genetically divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners, Caucasians, and Europeans; and Ancestral South Indians (ASI) not closely related to groups outside the subcontinent. The date of mixture is unknown but has implications for understanding Indian history. We report genome-wide data from 73 groups from the Indian subcontinent and analyze linkage disequilibrium to estimate ANI-ASI mixture dates ranging from about 1,900 to 4,200 years ago. In a subset of groups, 100% of the mixture is consistent with having occurred during this period. These results show that India experienced a demographic transformation several thousand years ago, from a region in which major population mixture was common to one in which mixture even between closely related groups became rare because of a shift to endogamy.
Link
The authors caution about this evidence of admixture:
It is also important to emphasize what our study has not shown. Although we have documented evidence for mixture in India between about 1,900 and 4,200 years BP, this does not imply migration from West Eurasia into India during this time. On the contrary, a recent study that searched for West Eurasian groups most closely related to the ANI ancestors of Indians failed to find any evidence for shared ancestry between the ANI and groups in West Eurasia within the past 12,500 years [3] (although it is possible that with further sampling and new methods such relatedness might be detected). An alternative possibility that is also consistent with our data is that the ANI and ASI were both living in or near South Asia for a substantial period prior to their mixture. Such a pattern has been documented elsewhere; for example, ancient DNA studies of northern Europeans have shown that Neolithic farmers originating in Western Asia migrated to Europe about 7,500 years BP but did not mix with local hunter gatherers until thousands of years later to form the present-day populations of northern Europe. [15, 16, 44, 45]
This is of course true, because admixture postdates migration and it is conceivable that the West Eurasian groups might not have admixed with ASI populations immediately after their arrival into South Asia. On the other hand, a long period of co-existence without admixture would be against much of human history (e.g., the reverse movement of the Roma into Europe, who picked up European admixture despite strong social pressure against it by both European and Roma communities, or the absorption of most Native Americans by incoming European, and later African, populations in post-Columbian times). It is difficult to imagine really long reproductive isolation between neighboring peoples.
Such reproductive isolation would require a cultural shift from a long period of endogamy (ANI migration, followed by ANI/ASI co-existence without admixture) to exogamy ~4.2-1.9kya (to explain the thoroughness of blending that left no group untouched), and then back to fairly strict endogamy (within the modern caste system). It might be simpler to postulate only one cultural shift (migration with admixture soon thereafter, with later introduction of endogamy which greatly diminished the admixture).
The authors cite the evidence from neolithic Sweden which does, indeed, suggest that the neolithic farmers this far north were "southern European" genetically and had not (yet) mixed with contemporary hunter-gatherers, as they must have done eventually. But, perhaps farmers and hunters could avoid each other during first contact, when Europe was sparsely populated. It is not clear whether the same could be said for India ~4 thousand years ago with the Indus Valley Civilization providing evidence for a large indigenous population that any intrusive group would have encountered. In any case, the problem of when the West Eurasian element arrived in India will probably be solved by relating it to events elsewhere in Eurasia, and, in particular, to the ultimate source of the "Ancestral North Indians".
It is also possible that some of the ANI-ASI admixture might actually pre-date migration. At present it's anyone's guess where the original limes between the west Eurasian and ASI worlds were. There is some mtDNA haplogroup M in Iran and Central Asia, which is otherwise rare in west Eurasia, so it is not inconceivable that ASI may have once extended outside the Indian subcontinent: the fact that it is concentrated today in southern India (hence its name) may indicate only the area of this element's maximum survival, rather than the extent of its original distribution. In any case, all mixture must have taken place somewhere in the vicinity of India.
A second interesting finding of the paper is that admixture dates in Indo-European groups are later than in Dravidian groups. This is demonstrated quite clearly in the rolloff figure on the left. Moreover, it does not seem that the admixture times for Indo-Europeans coincide with the appearance of the Indo-Aryans, presumably during the 2nd millennium BC: they are much later. I believe that this is fairly convincing evidence that north India has been affected by subsequent population movements from central Asia of "Indo-Scythian"-related populations, for which there is ample historical evidence. So, the difference in dates might be explained by secondary (later) admixture with other West Eurasians after the arrival of Indo-Aryans. Interestingly, the paper does not reject simple ANI-ASI admixture "often from tribal and traditionally lower-caste groups," while finding evidence for multiple layers of ANI ancestry in several other populations.
My own analysis of Dodecad Project South Indian Brahmins arrived at a date of 4.1ky, and of North Indian Brahmins, a date of 2.3ky, which seems to be in good agreement with these results.
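As a rough illustration of where rolloff-style dates like these come from: admixture LD is expected to decay approximately exponentially with genetic distance, at a rate set by the number of generations since mixture. The sketch below is not the rolloff program itself. It fits that decay on made-up, noise-free data, and both the function name `fit_generations` and the 29 years per generation used for the conversion to years are assumptions made for illustration.

```python
import math

def fit_generations(distances_morgans, weighted_ld):
    """Estimate n in LD(d) = A * exp(-n * d) by least squares on log(LD)."""
    logs = [math.log(v) for v in weighted_ld]
    k = len(distances_morgans)
    mean_d = sum(distances_morgans) / k
    mean_l = sum(logs) / k
    sxy = sum((d - mean_d) * (l - mean_l)
              for d, l in zip(distances_morgans, logs))
    sxx = sum((d - mean_d) ** 2 for d in distances_morgans)
    # The slope of log(LD) against distance is -n.
    return -sxy / sxx

# Assumption: generation time used to turn generations into years.
YEARS_PER_GENERATION = 29

# Synthetic (made-up) decay curve: 0.5 cM to 20 cM, expressed in Morgans.
distances = [0.005 * i for i in range(1, 41)]
true_generations = 100
curve = [0.02 * math.exp(-true_generations * d) for d in distances]

generations = fit_generations(distances, curve)
years = generations * YEARS_PER_GENERATION
print(f"{generations:.1f} generations, about {years:.0f} years")
```

Real data are noisy, so the actual method fits the curve with an affine term and obtains standard errors by jackknife; none of that is attempted here.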
The authors also report that "we find that Georgians along with other Caucasus groups are consistent with sharing the most genetic drift with ANI". I had made a post on the differential relationship of ANI to Caucasus populations which seems to agree with this, and, of course, in various ADMIXTURE analyses, the component which I've labeled "West Asian" tends to be the major west Eurasian element in south Asia.
Here are the estimated admixture proportions/times from the paper:
Sadly, the warm and moist climate of India, and the adoption of cremation have probably destroyed any hope of studying much of its recent history with ancient DNA. On the other hand, the caste system has probably "fossilized" old socio-linguistic groups, allowing us to tell much by studying their differences and correlating them with groups outside India.
Coverage elsewhere: Gene Expression, HarappaDNA
Related podcast on BBC.
AJHG doi:10.1016/j.ajhg.2013.07.006
Genetic Evidence for Recent Population Mixture in India
Priya Moorjani et al.
Most Indian groups descend from a mixture of two genetically divergent populations: Ancestral North Indians (ANI) related to Central Asians, Middle Easterners, Caucasians, and Europeans; and Ancestral South Indians (ASI) not closely related to groups outside the subcontinent. The date of mixture is unknown but has implications for understanding Indian history. We report genome-wide data from 73 groups from the Indian subcontinent and analyze linkage disequilibrium to estimate ANI-ASI mixture dates ranging from about 1,900 to 4,200 years ago. In a subset of groups, 100% of the mixture is consistent with having occurred during this period. These results show that India experienced a demographic transformation several thousand years ago, from a region in which major population mixture was common to one in which mixture even between closely related groups became rare because of a shift to endogamy.
Link
July 22, 2013
Y chromosomes in Lingayat and Vokkaliga Dravidians from SW India
From the paper:
The virtual absence of Z283 subclades, namely Z280 and M458, and the total representation of R1a1a-derived samples by the Z93 marker in our dataset support an earlier observation that the M198 chromosome likely differentiated in the region between Eastern Europe and South Asia (Pamjav et al., 2012), and subsequently expanded in opposite directions. However, it will require additional R1a1a* samples from different populations across Eurasia to comprehensively evaluate the geographic origins, distribution and ethno-linguistic associations of the individual M198-derived lineages (Pamjav et al., 2012).
This study extends the results of Pamjav et al. 2012 which found only Z93 within R1a1 in mainland Indian populations. The authors estimate 12.8ky as the age of R-Z93 in the Lingayat, but since this uses the evolutionary mutation rate it should actually be divided by a factor of 3.6 which translates into ~1,500BC. So, it seems quite likely that R-Z93 moved from Central->South Asia during the Bronze Age, both on account of its age and the fact that it is a subset of Central Asian diversity. Haplogroup R2 with a nominal age of ~22ky in the Lingayat seems more like a Neolithic lineage.
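The arithmetic behind that rescaling can be spelled out in a few lines. This is only a sketch of the division described above: the function names are made up for illustration, and the reference year of 2013 (the year of the post) is an assumption used to turn "years ago" into a calendar date.

```python
def rescale_age_ky(age_ky, rate_ratio=3.6):
    """Divide an age based on the evolutionary rate (in ky) by the rate ratio."""
    return age_ky / rate_ratio

def ky_ago_to_calendar_year(age_ky, reference_year=2013):
    """Convert an age in ky to a calendar year; negative values are years BC.

    Assumption: ages are counted back from reference_year, and the missing
    year 0 is ignored, which is harmless at this level of precision.
    """
    return reference_year - round(age_ky * 1000)

z93_ky = rescale_age_ky(12.8)   # R-Z93 in the Lingayat
r2_ky = rescale_age_ky(22.0)    # R2 in the Lingayat

print(f"R-Z93: {z93_ky:.2f} ky, calendar year {ky_ago_to_calendar_year(z93_ky)}")
print(f"R2:    {r2_ky:.2f} ky, calendar year {ky_ago_to_calendar_year(r2_ky)}")
```

The rescaled R-Z93 age lands in the mid-2nd millennium BC, and the rescaled R2 age at roughly 6ky, which is why the latter looks more Neolithic than Bronze Age.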
Also of interest:
Another haplogroup that is associated with the spread of agriculture from the Fertile Crescent and Anatolia regions is J2-M172 (Cinnioğlu et al., 2004 and Semino et al., 2004). According to Sahoo et al. (2006), only J2 lineages, originating from West Asia rather than Central Asia, represent an external contribution to the Indian paternal gene pool. In particular, subclade J2a-M410 is believed to have entered through the northwestern corridor and subsequently diffused to the south and east (Sahoo et al., 2006 and Thangaraj et al., 2010). This haplogroup is present exclusively in the Lingayat (6.93%), except for one individual from Vokkaliga, suggesting gene flow from West Asia (Sahoo et al., 2006 and Thangaraj et al., 2010). Interestingly, four J2b2-M241 Lingayat males displayed a null allele at DYS458 and failed to produce the AMGY PCR amplicon while their X homolog (AMGX) amplified successfully. Comparison of Y-STR haplotypes of the affected males from the present study with those from the literature (Cadenas et al., 2006), demonstrated a high level of allele sharing, implying shared paternal lineages or a recent common ancestry for these groups of individuals.
According to the paper the Lingayat are a community that originally attracted members from across the caste system, while the Vokkaliga are involved in farming. An uneven distribution of haplogroup J2a has been previously observed, so I guess this paper adds to this evidence.
Gene Available online 7 May 2013
Indigenous and foreign Y-chromosomes characterize the Lingayat and Vokkaliga populations of Southwest India
Shilpa Chennakrishnaiah et al.
Previous studies have shown that India's vast coastal rim played an important role in the dispersal of modern humans out of Africa but the Karnataka state, which is located on the southwest coast of India, remains poorly characterized genetically. In the present study, two Dravidian populations, namely Lingayat (N = 101) and Vokkaliga (N = 102), who represent the two major communities of the Karnataka state, were examined using high-resolution analyses of Y-chromosome single nucleotide polymorphisms (Y-SNPs) and seventeen short tandem repeat (Y-STR) loci. Our results revealed that the majority of the Lingayat and Vokkaliga paternal gene pools are composed of four Y-chromosomal haplogroups (H, L, F* and R2) that are frequent in the Indian subcontinent. The high level of L1-M76 chromosomes in the Vokkaligas suggests an agricultural expansion in the region, while the predominance of R1a1a1b2-Z93 and J2a-M410 lineages in the Lingayat indicates gene flow from neighboring south Indian populations and West Asia, respectively. Lingayat (0.9981) also exhibits a relatively high haplotype diversity compared to Vokkaliga (0.9901), supporting the historical record that the Lingayat originated from multiple source populations. In addition, we detected ancient lineages such as F*-M213, H*-M69 and C*-M216 that may be indicative of genetic signatures of the earliest settlers who reached India after their migration out of Africa.
Link
July 08, 2013
Continuity of microblade technology in India since 45ka
An interesting new paper extends continuity of microblade technologies in India to ~45ka, and hence makes it probable that these were introduced by AMH together with the UP colonization of the rest of Eurasia.
I am not convinced by the authors' suggestion that early modern humans were "tropically adapted". Personally, my idea du jour is to derive them from the Saharo-Arabian belt. In any case, as an advocate of "early OOA" (in the sense of pre-UP/LSA), it makes sense to me that modern humans in Eurasia would be initially climate-limited and at a disadvantage vis-à-vis archaic Eurasians in regions to which the newcomers were maladapted. In my opinion, it was the technological revolution of ~50ka that was responsible for the extension of their range at the expense of other Eurasians.
PLoS ONE 8(7): e69280. doi:10.1371/journal.pone.0069280
Continuity of Microblade Technology in the Indian Subcontinent Since 45 ka: Implications for the Dispersal of Modern Humans
Sheila Mishra et al.
We extend the continuity of microblade technology in the Indian Subcontinent to 45 ka, on the basis of optical dating of microblade assemblages from the site of Mehtakheri, (22° 13' 44″ N Lat 76° 01' 36″ E Long) in Madhya Pradesh, India. Microblade technology in the Indian Subcontinent is continuously present from its first appearance until the Iron Age (~3 ka), making its association with modern humans undisputed. It has been suggested that microblade technology in the Indian Subcontinent was developed locally by modern humans after 35 ka. The dates reported here from Mehtakheri show this inference to be untenable and suggest alternatively that this technology arrived in the Indian Subcontinent with the earliest modern humans. It also shows that modern humans in Indian Subcontinent and SE Asia were associated with differing technologies and this calls into question the “southern dispersal” route of modern humans from Africa through India to SE Asia and then to Australia. We suggest that modern humans dispersed from Africa in two stages coinciding with the warmer interglacial conditions of MIS 5 and MIS 3. Competitive interactions between African modern humans and Indian archaics who shared an adaptation to tropical environments differed from that between modern humans and archaics like Neanderthals and Denisovans, who were adapted to temperate environments. Thus, while modern humans expanded into temperate regions during warmer climates, their expansion into tropical regions, like the Indian Subcontinent, in competition with similarly adapted populations, occurred during arid climates. Thus modern humans probably entered the Indian Subcontinent during the arid climate of MIS 4 coinciding with their disappearance from the Middle East and Northern Africa. The out of phase expansion of modern humans into tropical versus temperate regions has been one of the factors affecting the dispersal of modern humans from Africa during the period 200–40 ka.
Link
June 14, 2013
Prince William's Indian matrilineage?
William has a hint of Indian in his DNA, find British researchers
Researchers have sourced William’s Indian ancestry to Eliza Kewark, his great-great-great-great-great grandmother, who was assumed to be Armenian, but now has been revealed as an Indian by genetic research.
...
“Through genealogy we traced two living direct descendants of Eliza and by reading the sequence of their mtDNA, we showed not only that they matched, but also that it belongs to a haplogroup called R30b, thus determining Eliza Kewark’s haplogroup,” the research team revealed.
The haplogroup, which is a group of related ancestral lineages, in this case was revealed to be rare and found only in South Asia. Other related branches of R30a and R30* are also entirely South Asian.
“This confirms therefore that the mtDNA of Eliza Kewark of Surat was of Indian heritage. R30b is rare even in India, where roughly 0.3 per cent of people carry this lineage,” the researchers revealed.
Chaubey et al. (2008) is an article that touches upon the subject of R30b.
Personally, I wouldn't be so quick in discounting the traditional genealogical story. A lineage that occurs at a frequency of 0.3% will almost certainly be missed in any small sample if it occurs at similar trace frequencies in other populations.
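To put a number on that intuition, here is a minimal sketch assuming individuals are sampled independently and that the lineage has the same 0.3% frequency quoted for India. The sample sizes and function names are made up for illustration.

```python
import math

def prob_missed(frequency, sample_size):
    """Probability that a lineage of the given frequency is absent from a sample."""
    return (1.0 - frequency) ** sample_size

def sample_size_for_detection(frequency, detection_prob=0.95):
    """Smallest sample size that contains the lineage with the given probability."""
    return math.ceil(math.log(1.0 - detection_prob) / math.log(1.0 - frequency))

FREQUENCY = 0.003  # 0.3%, the figure quoted for R30b in India

for n in (50, 100, 500):
    print(f"n = {n}: missed with probability {prob_missed(FREQUENCY, n):.2f}")

print("sample size for a 95% chance of detection:",
      sample_size_for_detection(FREQUENCY))
```

Under these assumptions a sample of 100 misses the lineage about three times out of four, and something on the order of a thousand individuals is needed before absence becomes informative.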
May 30, 2013
ESHG 2013 abstracts
A couple of weeks before the ESHG 2013 conference, here are a few abstracts that caught my eye. Feel free to point to more interesting presentations in the comments section.
- J16.05 - Checking the hypothesis of a Balkan origin of the Armenians
- P17.5 - Y chromosome haplogroup analysis to estimate genetic origin of Balts
- J16.27 - Armenian Highland as a transition corridor for the spread of Neolithic agriculturists
- J16.14 - In Search of the Origin of Haplogroup J1-P58
- J16.03 - Y-chromosome haplogroup analysis in the Besermyan ethnic group
- P17.4 - Mitochondrial DNA Analysis of the Southeast European Genetic Variation Reveals a New, Local Subbranching in Hg X2
- P16.085 - Population diversity and history of the Indian subcontinent: Uncovering the deeper mosaic of sub-structuring and the intricate network of dispersals
- J16.43 - Ancient mtDNA diversity in Bulgaria
- P16.072 - Mitochondrial DNA diversity in medieval and modern Romanian population
- P16.012 - Frequency analysis of the CCR5-Delta32 allele in medieval and modern Romanian population
- J16.78 - The gene pool of Argyn in the context of generic structure of Kazakhs according to data on SNP-Y-Chromosome markers.
January 14, 2013
Gene flow between Indian populations and Australasia ~4,000 years ago
Only the press release is available so far; I will add the paper abstract when I see it on the PNAS website:
Researcher Irina Pugach and colleagues now analysed genetic variation from across the genome from aboriginal Australians, New Guineans, island Southeast Asians, and Indians. Their findings suggest substantial gene flow from India to Australia 4,230 years ago. i.e. during the Holocene and well before European contact. “Interestingly,” says Pugach, “this date also coincides with many changes in the archaeological record of Australia, which include a sudden change in plant processing and stone tool technologies, with microliths appearing for the first time, and the first appearance of the dingo in the fossil record. Since we detect inflow of genes from India into Australia at around the same time, it is likely that these changes were related to this migration.”
Their analyses also reveal a common origin for populations from Australia, New Guinea and the Mamanwa – a Negrito group from the Philippines – and they estimated that these groups split from each other about 36,000 years ago. Mark Stoneking says: “This finding supports the view that these populations represent the descendants of an early ‘southern route’ migration out of Africa, while other populations in the region arrived later by a separate dispersal.” This also indicates that Australians and New Guineans diverged early in the history of Sahul, and not when the lands were separated by rising sea waters around 8,000 years ago.
A relationship between Indian and Australasian populations has long been suspected on various grounds (e.g., HGDP Papuans often show membership in a "South Asian" ancestral component at low levels of resolution). It will be interesting to see the model proposed in the new paper about the admixture event leading to modern Australasians.
UPDATE: Ed Yong covers the story in Nature News:
Some aboriginal Australians can trace as much as 11% of their genomes to migrants who reached the island around 4,000 years ago from India, a study suggests. Along with their genes, the migrants brought different tool-making techniques and the ancestors of the dingo, researchers say1.
From World News Australia:
The study suggests that in addition to an earlier northern route of migration out of Africa, into Asia, and then South East Asia about 60,000 to 70,000 years ago, the second wave occurred much later, arriving during the Holocene period about 4,230 years ago.
...
“About that point in the archaeological record, there were significant changes in the use of stone tools, in hunting techniques and significantly, the introduction of the dingo,” Professor Cooper said.
...
There are other theories that may support the evidence of a more recent influx of migrants from India, including that they brought with them a disease of epidemic proportions that wiped out earlier Aboriginal populations.
UPDATE II: I added the abstract.
PNAS doi: 10.1073/pnas.1211927110
Genome-wide data substantiate Holocene gene flow from India to Australia
Irina Pugach et al.
The Australian continent holds some of the earliest archaeological evidence for the expansion of modern humans out of Africa, with initial occupation at least 40,000 y ago. It is commonly assumed that Australia remained largely isolated following initial colonization, but the genetic history of Australians has not been explored in detail to address this issue. Here, we analyze large-scale genotyping data from aboriginal Australians, New Guineans, island Southeast Asians and Indians. We find an ancient association between Australia, New Guinea, and the Mamanwa (a Negrito group from the Philippines), with divergence times for these groups estimated at 36,000 y ago, and supporting the view that these populations represent the descendants of an early “southern route” migration out of Africa, whereas other populations in the region arrived later by a separate dispersal. We also detect a signal indicative of substantial gene flow between the Indian populations and Australia well before European contact, contrary to the prevailing view that there was no contact between Australia and the rest of the world. We estimate this gene flow to have occurred during the Holocene, 4,230 y ago. This is also approximately when changes in tool technology, food processing, and the dingo appear in the Australian archaeological record, suggesting that these may be related to the migration from India.
Link
December 10, 2012
Roma origins once more (Moorjani et al. 2012)
I had first noticed that this new paper by Moorjani et al. was referenced by Loh et al., and it has now been posted on arXiv. In the last week, a couple of other papers on the same topic (Mendizabal et al. on autosomal DNA and Rai et al. on a Y-chromosome founder lineage) have also appeared.
All three studies appear to converge on NW India as the place of origin of the European Roma, and on a recent admixture between this "Proto-Roma" population and Europeans. It will be interesting to see if there are any substantial differences between Moorjani et al. and Mendizabal et al. in the reconstruction of Roma origins. There is also an appendix on updates to rolloff and other topics of a technical nature that ought to be useful to readers irrespective of their interest in this particular population.
It'll probably take me a while to digest everything in this paper, but I will make one quick observation after (virtually) leafing through the article; the observation that {CEU, ANI} form a clade with Adygei as an outgroup is used to infer admixture proportions. I recently had a blog post on the differential relationship of ANI to Caucasus populations, in which I showed that while D(CEU, Adygei; South Asian, Onge) was positive, and significant in some cases -- indicating CEU being more closely related to ANI (Ancestral North Indians) than Adygei -- the reverse was the case for D(CEU, Georgian/Lezgin; South Asian, Onge).
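For readers unfamiliar with the notation, here is a minimal sketch of a D-statistic computed from allele frequencies, using the standard estimator with numerator (w - x)(y - z) and a heterozygosity-based denominator. The frequencies below are made up purely to show the sign convention, the function name is hypothetical, and the block jackknife needed to assess significance is not included.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies in four populations.

    Positive values mean W shares more alleles with Y than X does
    (equivalently, X shares more with Z).
    """
    numerator = sum((wi - xi) * (yi - zi)
                    for wi, xi, yi, zi in zip(w, x, y, z))
    denominator = sum((wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
                      for wi, xi, yi, zi in zip(w, x, y, z))
    return numerator / denominator

# Made-up frequencies at two SNPs, standing in for
# W = CEU, X = Adygei, Y = a South Asian group, Z = Onge.
ceu = [0.9, 0.1]
adygei = [0.5, 0.5]
south_asian = [0.8, 0.2]
onge = [0.5, 0.5]

d = d_statistic(ceu, adygei, south_asian, onge)
print(f"D(CEU, Adygei; South Asian, Onge) = {d:.2f}")
```

Swapping the first two populations flips the sign, which is all that "the reverse was the case" amounts to when Georgians or Lezgins take the place of Adygei.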
A second observation was inspired by the following figure:
High IBD sharing with Romanians makes sense, because there is good evidence (e.g., presence of Y-haplogroup E-V13) that the Roma picked up European ancestry in the Balkans. So, I'm fairly sure that we are seeing a real signal that the Roma have Romanian-like recent European ancestors. But, we ought to be vigilant, because it is possible that some Romanians may have Roma ancestry too! This was the case in a couple of individuals from the Romanian sample of Behar et al. (2010).
This is a more general issue: IBD sharing occasionally involves strictly (or mostly) unidirectional gene flow, e.g., gene flow between European and African Americans largely went the EA->AA way, so an AA sharing with an EA more often than not involves EA->AA gene flow.
But, in other cases, the direction of gene flow is more obscure (so, e.g., sharing between German, Magyar, and Slavic speakers, and Jews in the old Austro-Hungarian Empire). This issue often comes up in the genealogical community, with a typical example being a couple of individuals (let's call them Klaus and Mikolaj) discovering a shared IBD segment, and Klaus thinking he's found a Polish ancestor, and Mikolaj a German one.
In any case, as the authors themselves note, it will be interesting to use more European reference populations. This might indicate whether the Roma picked up European ancestry in one particular region, carrying it with them as they expanded into the Balkans and beyond, or whether they picked it up by interacting with different host populations (e.g., Greek Gypsies with Greeks, Romanian Gypsies with Romanians, and so on).
arXiv:1212.1696 [q-bio.PE]
Reconstructing Roma history from genome-wide data
Priya Moorjani et al.
The Roma people, living throughout Europe, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1000-1500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry-deriving from a combination of European and South Asian sources- and that the date of admixture of South Asian and European ancestry was about 850 years ago. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which we hypothesize was followed by a major demographic expansion once the population arrived in Europe.
Link
All three studies appear to converge on NW India as the place of origin of the European Roma, and on a recent admixture between this "Proto-Roma" population and Europeans. It will be interesting to see if there are any substantial differences between Moorjani et al. and Mendizabal et al. in the reconstruction of Roma origins. There is also an appendix on updates to rolloff and other topics of a technical nature that ought to be useful to readers irrespective of their interest in this particular population.
It'll probably take me a while to digest everything in this paper, but I will make one quick observation after (virtually) leafing through the article; the observation that {CEU, ANI} form a clade with Adygei as an outgroup is used to infer admixture proportions. I recently had a blog post on the differential relationship of ANI to Caucasus populations, in which I showed that while D(CEU, Adygei; South Asian, Onge) was positive, and significant in some cases -- indicating CEU being more closely related to ANI (Ancestral North Indians) than Adygei -- the reverse was the case for D(CEU, Georgian/Lezgin; South Asian, Onge).
A second observation was inspired by the following figure:
High IBD sharing with Romanians makes sense, because there is good evidence (e.g., presence of Y-haplogroup E-V13) that the Roma picked up European ancestry in the Balkans. So, I'm fairly sure that we are seeing a real signal that the Roma have Romanian-like recent European ancestors. But, we ought to be vigilant, because it is possible that some Romanians may have Roma ancestry too! This was the case in a couple of individuals from the Romanian sample of Behar et al. (2010).
This is a more general issue: IBD sharing occasionally involves strictly -or mostly- unidirectional gene flow, e.g., sharing between European and African Americans largely went EA->AA way, so an AA sharing with a EA more often than not involves EA->AA gene flow.
But, in other cases, the direction of gene flow is more obscure (so, e.g., sharing between German, Magyar, and Slavic speakers, and Jews in the old Austro-Hungarian Empire). This issue often comes up in the genealogical community, with a typical example being a couple of individuals (let's call them Klaus and Mikolaj) discovering a shared IBD segment, and Klaus thinking he's found a Polish ancestor, and Mikolaj a German one.
In any case, as the authors themselves note, it will be interesting to use more European reference populations, and this might indicate whether the Roma picked up European ancestry in one particular region, carrying it with them as they expanded into the Balkans and beyond, or whether they picked it up by interacting with different host populations (e.g., Greek Gypsies with Greeks, Romanian Gypsies with Romanians, and so on).
arXiv:1212.1696 [q-bio.PE]
Reconstructing Roma history from genome-wide data
Priya Moorjani et al.
The Roma people, living throughout Europe, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1000-1500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry, deriving from a combination of European and South Asian sources, and that the date of admixture of South Asian and European ancestry was about 850 years ago. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which we hypothesize was followed by a major demographic expansion once the population arrived in Europe.
Link
December 06, 2012
Romani origins and admixture (Mendizabal et al.)
I will comment on the paper in this space after I read it. For the time being here's a link to the press release.
"From a genome-wide perspective, Romani people share a common and unique history that consists of two elements: the roots in northwestern India and the admixture with non-Romani Europeans accumulating with different magnitudes during the out-of-India migration across Europe," Kayser said. "Our study clearly illustrates that understanding the Romani's genetic legacy is necessary to complete the genetic characterization of Europeans as a whole, with implications for various fields, from human evolution to the health sciences."
The results seem to complement a recent Y-chromosome study of the major founder lineage of European Roma, H-M82.
Current Biology dx.doi.org/10.1016/j.cub.2012.10.039
Reconstructing the Population History of European Romani from Genome-wide Data
Isabel Mendizabal et al.
The Romani, the largest European minority group with approximately 11 million people [1], constitute a mosaic of languages, religions, and lifestyles while sharing a distinct social heritage. Linguistic [2] and genetic [3, 4, 5, 6, 7 and 8] studies have located the Romani origins in the Indian subcontinent. However, a genome-wide perspective on Romani origins and population substructure, as well as a detailed reconstruction of their demographic history, has yet to be provided. Our analyses based on genome-wide data from 13 Romani groups collected across Europe suggest that the Romani diaspora constitutes a single initial founder population that originated in north/northwestern India ∼1.5 thousand years ago (kya). Our results further indicate that after a rapid migration with moderate gene flow from the Near or Middle East, the European spread of the Romani people was via the Balkans starting ∼0.9 kya. The strong population substructure and high levels of homozygosity we found in the European Romani are in line with genetic isolation as well as differential gene flow in time and space with non-Romani Europeans. Overall, our genome-wide study sheds new light on the origins and demographic history of European Romani.
Link
November 29, 2012
Pinpointing Roma origins: Out of Northwestern India
Interestingly, there has been recent evidence that, besides H-M82, R-Z93 might represent a second founder haplogroup of the European Roma populations; it will be interesting to study it in the future in order to confirm the scenario presented in this new paper.
From the paper:
"This first genetic evidence of this nature allows us to develop a more detailed picture of the paternal genetic history of European Roma, revealing that the ancestors of present scheduled tribes and scheduled caste populations of northern India, traditionally referred to collectively as the Ḍoma, are the likely ancestral populations of modern European Roma. Our findings corroborate the hypothesized cognacy of the terms Rroma and Ḍoma and resolve the controversy about the Gangetic plain and the Punjab in favour of the northwestern portion of the diffuse widespread range of the Ḍoma ancestral population of northern India."
A paper about Roma origins based on autosomal DNA is also apparently in the works, so it will be interesting to see how it might tie in with the Y-chromosome evidence.
PLoS ONE 7(11): e48477. doi:10.1371/journal.pone.0048477
The Phylogeography of Y-Chromosome Haplogroup H1a1a-M82 Reveals the Likely Indian Origin of the European Romani Populations
Niraj Rai et al.
Linguistic and genetic studies on Roma populations inhabited in Europe have unequivocally traced these populations to the Indian subcontinent. However, the exact parental population group and time of the out-of-India dispersal have remained disputed. In the absence of archaeological records and with only scanty historical documentation of the Roma, comparative linguistic studies were the first to identify their Indian origin. Recently, molecular studies on the basis of disease-causing mutations and haploid DNA markers (i.e. mtDNA and Y-chromosome) supported the linguistic view. The presence of Indian-specific Y-chromosome haplogroup H1a1a-M82 and mtDNA haplogroups M5a1, M18 and M35b among Roma has corroborated that their South Asian origins and later admixture with Near Eastern and European populations. However, previous studies have left unanswered questions about the exact parental population groups in South Asia. Here we present a detailed phylogeographical study of Y-chromosomal haplogroup H1a1a-M82 in a data set of more than 10,000 global samples to discern a more precise ancestral source of European Romani populations. The phylogeographical patterns and diversity estimates indicate an early origin of this haplogroup in the Indian subcontinent and its further expansion to other regions. Tellingly, the short tandem repeat (STR) based network of H1a1a-M82 lineages displayed the closest connection of Romani haplotypes with the traditional scheduled caste and scheduled tribe population groups of northwestern India.
Link
South Indian Y chromosomes (+ a little complaining about methods)
The table of haplogroup frequencies (left) may prove quite useful, but I am fairly disappointed with what appears to be the state of the art in recent published research on Y chromosome variation. This is not to belittle the tremendous amount of labor and money needed to collect and genotype large representative samples of individuals; only to express hope that better use of the collected samples could be achieved.
First of all, it is inconceivable to me how scientists can continue to use the 3x slower "evolutionary mutation rate" for their analyses of Y-chromosome ages on the basis of Y-STR markers. I have done my small part in my Y-STR series to show that this mutation rate is applicable only for a rather specific demographic history, and completely unsuitable to real growing human populations where Y-STR variance accumulates at close to the genealogical rate. And, my observations merely elaborated quantitatively what was already present in Zhivotovsky et al. (2006) but has been completely ignored since:
"In simulations of a neutral process with average rate of increase m = 1, the number of surviving haplogroups rapidly decreased with time and corresponded well with the theory of mutant survival (Li 1955, p. 242), and the average size of the surviving haplogroups increased each generation by a value rapidly approaching 0.5 (data not shown), which agrees with asymptotic fraction of 2/t of haplotypes that survive at generation t (Athreya and Ney 1972, p. 19). The accumulated variance increased almost linearly (fig. 1), at a rate of increase about 0.00028 per generation; that is, the actual rate of accumulation microsatellite variation was about 3.6 times less than that predicted from the germ line mutation rate. This corresponds perfectly to the 3- to 4-fold difference observed between germ line and evolutionarily effective mutation rate."
The issue is all but resolved in the amateur "genetic genealogy" community, but even professional geneticists often use either the genealogical or the evolutionary rate, or take an agnostic stance by reporting results based on both rates. To arrive at strong conclusions about a topic on the basis of a mutation rate that is, to say the least, controversial, without even acknowledging the existence of a controversy, is unsatisfactory. Y-chromosome researchers ought to copy the attitude of those working with autosomal DNA, where a corresponding mutation rate controversy was not swept under the carpet, but acknowledged (e.g., in the recent Meyer et al. high-coverage Denisova paper), with the implications of the uncertainty during the present "transitional" period quantified in the form of wider confidence intervals.
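To make the stakes concrete, here is a minimal sketch of the arithmetic behind variance-based dating, in which the age in generations is simply the observed repeat-count variance divided by the per-generation rate. The effective rate of 0.00028 and the 3.6-fold ratio are taken from the Zhivotovsky et al. passage; the observed variance and the 25-year generation length are hypothetical values chosen purely for illustration.

```python
# Rates from the quoted Zhivotovsky et al. (2006) passage
EVOLUTIONARY_RATE = 0.00028   # effective variance accumulation per generation
RATE_RATIO = 3.6              # germ line rate / evolutionary rate
GENEALOGICAL_RATE = EVOLUTIONARY_RATE * RATE_RATIO  # roughly 0.001

# Assumptions for illustration only (not from the post or the paper)
YEARS_PER_GENERATION = 25
observed_variance = 0.20      # hypothetical average Y-STR variance in a haplogroup


def age_in_years(variance, rate, years_per_generation=YEARS_PER_GENERATION):
    """Linear-clock age estimate: (variance / rate) generations, in years."""
    return variance / rate * years_per_generation


age_genealogical = age_in_years(observed_variance, GENEALOGICAL_RATE)
age_evolutionary = age_in_years(observed_variance, EVOLUTIONARY_RATE)

print(f"genealogical rate: {age_genealogical:,.0f} years")
print(f"evolutionary rate: {age_evolutionary:,.0f} years")
print(f"ratio: {age_evolutionary / age_genealogical:.1f}x")
```

With these illustrative inputs the same variance dates to roughly 5,000 years under the genealogical rate and roughly 18,000 years under the evolutionary rate, so the choice of rate alone can move a haplogroup from the Bronze Age to the Pleistocene.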
This "mutation rate" issue notwithstanding, it was also recently shown by Busby et al. that Y-STR based estimates depend on the set of Y-STRs used, with markers exhibiting linear behavior across different time spans. This does not invalidate their use as molecular clocks, but highlights the need not only to select a set of Y-STRs, but also either (i) to demonstrate that the selected set exhibits linear behavior for the time span of interest, or (ii) to correct for deviations from linearity. Again, this type of modelling of microsatellite behavior was recently achieved for autosomal STRs by Sun et al. Note that such deviations result in a slower rate than the genealogical one, but the mechanism whereby this is produced is completely different from the one proposed by Zhivotovsky et al.: it is not drift in a non-growing (m=1) population that reduces the effective rate, but rather "saturation" of the mutation process, whereby the variance at fast-mutating markers grows sub-linearly with time, because of physical constraints on their possible range of values.
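The saturation effect can be illustrated with a toy model. The sketch below is not the analysis of Busby et al. or Sun et al.; it assumes a symmetric single-step mutation model in which repeat counts cannot leave a fixed range, and it tracks the exact probability distribution of a lineage's repeat count rather than simulating. The mutation rate, range, and time span are hypothetical values chosen so that the effect is easy to see.

```python
def variance_trajectory(mu, half_range, generations):
    """Repeat-count variance per generation under a bounded stepwise model.

    A lineage starts at the centre of the allowed range. Each generation it
    mutates with probability mu, moving one repeat up or down with equal
    probability; a step that would leave the range is not taken.
    """
    size = 2 * half_range + 1
    dist = [0.0] * size
    dist[half_range] = 1.0  # founder haplotype
    variances = []
    for _ in range(generations):
        nxt = [0.0] * size
        for i, p in enumerate(dist):
            nxt[i] += p * (1.0 - mu)
            nxt[min(i + 1, size - 1)] += p * mu / 2.0
            nxt[max(i - 1, 0)] += p * mu / 2.0
        dist = nxt
        mean = sum(i * p for i, p in enumerate(dist))
        variances.append(sum((i - mean) ** 2 * p for i, p in enumerate(dist)))
    return variances


# Hypothetical fast-mutating marker confined to 7 allowed repeat counts
MU = 0.01
HALF_RANGE = 3
GENERATIONS = 5000

bounded = variance_trajectory(MU, HALF_RANGE, GENERATIONS)
for t in (1, 100, 1000, 5000):
    # Second column: bounded model; third column: linear expectation mu * t
    print(t, round(bounded[t - 1], 3), round(MU * t, 3))
```

Early on the bounded variance tracks the linear expectation mu * t, but it can never exceed the variance of a uniform distribution over the allowed range (here (7² - 1) / 12 = 4), so a linear clock applied to such a marker underestimates old ages.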
I don't expect Y-STR based age estimation to have much to offer in the coming years. But the third set of the 1000 Genomes Project is on its way, and this will include a variety of South Asian samples. Very soon we will be in a good position to study the time depth of common ancestry between e.g., European and South Asian Y-chromosomes within various haplogroups using point mutations, and these are not plagued by many of the problems associated with Y-STR variation and its interpretation.
Finally, I can't help but notice that this paper has not acknowledged the tremendous progress in resolving the Y chromosome phylogeny made by non-academic researchers. With the current state of our knowledge, the claim that haplogroup R1a1 is "autochthonous" in India is not tenable. Even if one discounts all the evidence provided by SNP discoveries in the commercial testing world (and why should one?), finer-scale structure within this haplogroup has now been officially published and appears to be inconsistent with a South Asian origin of this haplogroup.
Certainly, not all is resolved; for example, the representation of tribal populations in commercial DNA testing is almost non-existent, and a sampling of their Y-SNP diversity is urgently needed. A very useful paradigm of research is that of recent work on the most basal clade of the Y-chromosome phylogeny (A00) in which the identification of very unique Y-chromosomes by genetic genealogists was combined with academic samples of "indigenous" peoples to produce new knowledge.
Much of population genetic research will benefit from such consilience between academics and amateurs. This is not an idle hope, but a recognition that this field is one in which the public not only has a substantial interest but can also do something about it. Many might be interested in Mars exploration, but without Elon Musk's bank account, most are consigned to being consumers of information about the Red Planet. Hopefully, better ways of combining the efforts of research scientists and the educated public can be identified and used in the near future.
PLoS ONE 7(11): e50269. doi:10.1371/journal.pone.0050269
Population Differentiation of Southern Indian Male Lineages Correlates with Agricultural Expansions Predating the Caste System
GaneshPrasad ArunKumar et al.
Previous studies that pooled Indian populations from a wide variety of geographical locations, have obtained contradictory conclusions about the processes of the establishment of the Varna caste system and its genetic impact on the origins and demographic histories of Indian populations. To further investigate these questions we took advantage that both Y chromosome and caste designation are paternally inherited, and genotyped 1,680 Y chromosomes representing 12 tribal and 19 non-tribal (caste) endogamous populations from the predominantly Dravidian-speaking Tamil Nadu state in the southernmost part of India. Tribes and castes were both characterized by an overwhelming proportion of putatively Indian autochthonous Y-chromosomal haplogroups (H-M69, F-M89, R1a1-M17, L1-M27, R2-M124, and C5-M356; 81% combined) with a shared genetic heritage dating back to the late Pleistocene (10–30 Kya), suggesting that more recent Holocene migrations from western Eurasia contributed less than 20% of the male lineages. We found strong evidence for genetic structure, associated primarily with the current mode of subsistence. Coalescence analysis suggested that the social stratification was established 4–6 Kya and there was little admixture during the last 3 Kya, implying a minimal genetic impact of the Varna (caste) system from the historically-documented Brahmin migrations into the area. In contrast, the overall Y-chromosomal patterns, the time depth of population diversifications and the period of differentiation were best explained by the emergence of agricultural technology in South Asia. These results highlight the utility of detailed local genetic studies within India, without prior assumptions about the importance of Varna rank status for population grouping, to obtain new insights into the relative influences of past demographic events for the population structure of the whole of modern India.
Link
October 17, 2012
The tangled web of humanity
Indian populations are composed of two ancestral components: Ancestral North Indians (ANI) and Ancestral South Indians (ASI), discovered by Reich et al. (2009). In that paper, it was also shown that ASI forms a clade with East Eurasians, while ANI does so with West Eurasians.
Patterson et al. (2012) published a different pattern: non-Sardinian Europeans have North Eurasian-like ancestry that links them to Amerindian populations. It is thus possible that ASI and the East Eurasian-like admixture in North Europeans may share a common evolutionary history:
Now, consider a hypothetical population of the Indian Cline. A European population is related to it both via its ANI/West Eurasian ancestry and via its ASI ancestry, because the East_Eurasian component in Europeans shares a portion of ancestry (indicated by the red arrow) with ASI.
Sardinians lack (or have less of) this "red arrow" portion of ancestry.
It is also possible that ANI itself may have some East_Eurasian ancestry, like Europeans do; this is not indicated in the figure. More on this later.
Consider the following D-statistic:
D(European, Sardinian, Indian, San)
As we shall see, this takes positive values, consistent with the idea of gene flow between Europeans and Indians to the exclusion of Sardinians. However, this gene flow may involve either the West Eurasian component in the ancestry of Indians (i.e., this component is more related to Europeans than to Sardinians), or the ASI component (which is related to Europeans via the common "red arrow" portions of ancestry).
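For readers unfamiliar with the statistic, here is a minimal sketch of how a D-statistic can be computed from allele frequencies, assuming the frequency-based form described in Patterson et al. (2012). The function name and the toy frequencies are made up for illustration, and the sketch omits the block jackknife that qpDstat uses to attach a Z-score to each value.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies in four populations.

    Positive values indicate that W shares more alleles with Y (or X with Z)
    than expected if W and X formed a clade relative to Y and Z.
    """
    numerator = 0.0
    denominator = 0.0
    for pw, px, py, pz in zip(w, x, y, z):
        numerator += (pw - px) * (py - pz)
        denominator += (pw + px - 2 * pw * px) * (py + pz - 2 * py * pz)
    return numerator / denominator


# Toy frequencies at two SNPs (hypothetical, not real data):
# "european" and "indian" drift in the same direction away from the others.
european = [0.9, 0.1]
sardinian = [0.5, 0.5]
indian = [0.8, 0.2]
san = [0.5, 0.5]

d = d_statistic(european, sardinian, indian, san)
d_swapped = d_statistic(sardinian, european, indian, san)
print(d, d_swapped)
```

Swapping the first two populations flips the sign, which is why the order of populations in D(European, Sardinian, Indian, San) matters when reading the results.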
We can figure out what is going on by trying different Indian populations along the Indian Cline, and seeing whether the D-statistic is inflated/deflated in populations of greater ANI/ASI ancestry.
Here are the results:
Population        Russian   Orcadian  French    Lithuanians  ANI (%)
Mala              0.0153    0.0120    0.0088    0.0131       38.86
Madiga            0.0153    0.0122    0.0091    0.0111       40.66
Chenchu           0.0157    0.0108    0.0088    0.0115       40.76
Bhil              0.0149    0.0115    0.0086    0.0124       42.96
Satnami           0.0166    0.0125    0.0091    0.0126       43.06
Kurumba           0.0156    0.0117    0.0095    0.0121       43.26
Kamsali           0.0139    0.0105    0.0088    0.0098       44.56
Vysya             0.0130    0.0099    0.0083    0.0102       46.26
Lodi              0.0143    0.0124    0.0092    0.0125       49.96
Naidu             0.0138    0.0104    0.0092    0.0108       50.16
Tharu             0.0150    0.0112    0.0095    0.0118       51.06
Velama            0.0126    0.0107    0.0083    0.0095       54.76
Srivastava        0.0144    0.0124    0.0091    0.0116       56.46
Meghawal          0.0131    0.0107    0.0088    0.0117       60.36
Vaish             0.0143    0.0144    0.0099    0.0128       62.66
Kashmiri_Pandit   0.0119    0.0116    0.0090    0.0116       70.66
Sindhi            0.0106    0.0112    0.0095    0.0111       73.76
Pathan            0.0098    0.0114    0.0087    0.0106       76.96
For each Indian Cline population, the last column lists the ANI percentage as estimated by Reich et al. (2009), and the other columns give the D-statistic of the form given above for each pairing of that Indian population with a European population.
We can plot the D-statistic vs. ANI for each of our European populations:
The correlation coefficients confirm the visual impression, that for the HGDP Russians there is a significantly negative relationship between ANI admixture in an Indian Cline population and the D-statistic:
Russian      Orcadian     French      Lithuanians
-0.8631118   0.08670188   0.1870127   -0.1889908
In other words, the evidence for gene flow between Russians and Indians is maximized when south Indian (ASI-rich) populations are used.
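The correlations can be re-derived from the table above. The sketch below computes the Pearson correlation between each European column and the ANI column; it uses the rounded values as printed in the table, so the output may differ from the coefficients reported here in the last few digits. The variable and function names are my own.

```python
from math import sqrt

# (population, Russian, Orcadian, French, Lithuanians, ANI) as printed in the table
ROWS = [
    ("Mala", 0.0153, 0.0120, 0.0088, 0.0131, 38.86),
    ("Madiga", 0.0153, 0.0122, 0.0091, 0.0111, 40.66),
    ("Chenchu", 0.0157, 0.0108, 0.0088, 0.0115, 40.76),
    ("Bhil", 0.0149, 0.0115, 0.0086, 0.0124, 42.96),
    ("Satnami", 0.0166, 0.0125, 0.0091, 0.0126, 43.06),
    ("Kurumba", 0.0156, 0.0117, 0.0095, 0.0121, 43.26),
    ("Kamsali", 0.0139, 0.0105, 0.0088, 0.0098, 44.56),
    ("Vysya", 0.0130, 0.0099, 0.0083, 0.0102, 46.26),
    ("Lodi", 0.0143, 0.0124, 0.0092, 0.0125, 49.96),
    ("Naidu", 0.0138, 0.0104, 0.0092, 0.0108, 50.16),
    ("Tharu", 0.0150, 0.0112, 0.0095, 0.0118, 51.06),
    ("Velama", 0.0126, 0.0107, 0.0083, 0.0095, 54.76),
    ("Srivastava", 0.0144, 0.0124, 0.0091, 0.0116, 56.46),
    ("Meghawal", 0.0131, 0.0107, 0.0088, 0.0117, 60.36),
    ("Vaish", 0.0143, 0.0144, 0.0099, 0.0128, 62.66),
    ("Kashmiri_Pandit", 0.0119, 0.0116, 0.0090, 0.0116, 70.66),
    ("Sindhi", 0.0106, 0.0112, 0.0095, 0.0111, 73.76),
    ("Pathan", 0.0098, 0.0114, 0.0087, 0.0106, 76.96),
]
EUROPEANS = ("Russian", "Orcadian", "French", "Lithuanians")


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


ani = [row[5] for row in ROWS]
correlations = {
    name: pearson([row[col] for row in ROWS], ani)
    for col, name in enumerate(EUROPEANS, start=1)
}

for name, r in correlations.items():
    print(f"{name:12s} {r:+.4f}")
```

Only the Russian column shows a strong (negative) relationship with ANI; the other three coefficients are close to zero.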
The lack of a clear pattern in the other three populations is itself interesting. One possible explanation involves East Eurasian-like admixture in the ANI, a conjecture which would make sense, given that all non-Sardinian continental West Eurasians seem to possess it.
If that is true, then as we go "south" along the Indian Cline, ASI-related admixture inflates the D-statistic by increasing the "red arrow" overlap with the East Eurasian-like admixture in Europeans. As we go "north" along this cline, the D-statistic decreases due to the reduction in ASI, but also increases due to East Eurasian-like admixture in ANI, and the superposition of the two processes yields no clear pattern.
In any case, this is an interesting example of a crisscrossing type of admixture where unrelated processes (east Eurasian-like admixture in Russians and ASI admixture in Indians) combine to present an unusual effect.
October 14, 2012
Differential relationship of ANI to Caucasus populations
The observation in Reich et al. (2009) that Ancestral North Indians (ANI) and CEU (HapMap White Utahns) form a clade to the exclusion of Adygei (a NW Caucasian HGDP population) has always puzzled me, because in my ADMIXTURE experiments, the dominant West Eurasian component in South Asia has always been one centered in the Caucasus rather than Europe, an observation also confirmed by Metspalu et al. (2011).
I have now used the qpDstat program of ADMIXTOOLS to calculate some D-statistics using a wide variety of West Asian populations that have appeared in the literature since 2009 (mainly Behar et al. 2010, and Yunusbayev et al. 2011), in addition to the Adygei. This analysis is based on 87,925 SNPs. I have kept SNPs included in the Rutgers map for Illumina chips, since most of the datasets merged with the Reich et al. (2009) dataset were genotyped on such chips, and applied a --geno 0.01 flag after merging the various datasets.
The following populations were considered:
North_Kannadi, Sindhi, Pathan, Kashmiri_Pandit, Brahmins_from_Uttar_Pradesh_M, Iyer_D, Iyengar_D, CEU30, Onge, Adygei, Lezgins, Georgians, Ukranians_Y, Abhkasians_Y, Chechens_Y, North_Ossetians_Y, Armenians_Y, Kurds_Y, Iranians_19, Romanians_14, Bulgarians_Y, Greek_D
I calculated D-statistics of the form:
D(CEU30, non-CEU West Eurasian; South Asian, Onge)
I report, for each South Asian population, the score for non-CEU West Eurasian being Adygei, and the most negative Z-score:
It is clear that while CEU are more related to Indian cline populations than Adygei are, at least for the case of the Pathans, they are less related to them than Georgians are. The Georgian population is one of the modal populations of the West Asian autosomal component.
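To make the sign convention of these statistics concrete, here is a minimal sketch of a D-statistic computed from population allele frequencies. It follows the usual frequency-based definition (a sum of products of frequency differences, normalized by a heterozygosity-like term). The function name and the toy frequencies are assumptions made up for illustration, not data from this analysis, and the block jackknife that qpDstat uses to obtain Z-scores is not reproduced.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (lists of equal length).

    Negative values mean X shares more alleles with Y (relative to Z)
    than W does; positive values mean the reverse.
    """
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d)
              for a, b, c, d in zip(w, x, y, z))
    return num / den


# Hypothetical toy frequencies at three SNPs (not real data).
ceu = [0.2, 0.8, 0.5]
other_west_eurasian = [0.6, 0.4, 0.5]
south_asian = [0.7, 0.3, 0.5]
onge = [0.1, 0.9, 0.5]

d_value = d_statistic(ceu, other_west_eurasian, south_asian, onge)
print(d_value)  # negative: the non-CEU population is closer to the South Asian one
```

In this toy case the second population tracks the South Asian frequencies more closely than CEU does, so the statistic comes out negative, which is the pattern reported above for Georgians.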
The full set of results can be found here. It appears that North Ossetians (who are also from the NW Caucasus) follow the Adygei pattern, while Abkhazians, Lezgins, and Armenians appear more related to ANI than CEU are, similar to the Georgian pattern.
Interestingly, D(CEU, Iranian; South Asian, Onge) appears positive, and this is probably not because CEU are more related to ANI than Iranians are, but because Iranians also have ASI admixture.
Ukrainians do not appear more closely related to ANI than CEU are, rather the opposite. This is consistent with the recent f3-statistics analysis of South Indian Brahmins, in which the strongest signals of admixture involved populations from Western Europe, the Balkans, and West Asia, but not from eastern Europe.
All the available evidence suggests that ANI is most related to populations of the South and NE Caucasus, and not to those of the NW Caucasus like Adygei. To confirm this, I calculated some additional D-statistics (also included in the spreadsheet):
All in all, this seems to be very consistent with my working model of Eurasian prehistory. It is also in agreement with proposals for a genetic relationship between Indo-European and NE Caucasian/Hurrian and/or early contacts between it and Kartvelian. No such relationship, as far as I can tell, has been seriously advanced with respect to NW Caucasian languages.
A valuable lesson from this analysis is that now that multiple West Asian populations have been genotyped, caution must be exercised when using the HGDP Adygei, because they are clearly not representative of the different language families (NE/S Caucasian and Indo-European) present in West Asia. Surprises may lurk even at the sub-1000km scale in a region as diverse as the Caucasus.
October 08, 2012
rolloff analysis of North Indian Brahmins as Orcadian+North Kannadi
In a previous experiment, I tested the Dodecad Project South Indian Brahmin sample (Iyer and Iyengar) using Orcadians and North Kannadi as reference populations. In the current one, I use the same references to investigate admixture in the Uttar Pradesh Brahmins included in the Metspalu et al. (2011) dataset. A total of 473,837 SNPs are used in this experiment.
I first verified that f3(Brahmins_from_Uttar_Pradesh_M; Orcadian, North_Kannadi) is negative, using qp3Pop:
Source 1 Source 2 Target f_3 std. err Z SNPs
result: Orcadian North_Kannadi Brahmins_from_Uttar_Pradesh_M -0.007882 0.000359 -21.951 463297
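As a rough illustration of what the row above reports, the sketch below computes a plain f3 statistic from allele frequencies and recovers the Z-score as the estimate divided by its standard error. This is a simplified version under stated assumptions: the function names and toy frequencies are hypothetical, and qp3Pop's actual estimator includes sample-size corrections and a block-jackknife standard error that are not reproduced here.

```python
def f3(target, source1, source2):
    """Plain f3(Target; Source1, Source2): mean of (c - a) * (c - b) over SNPs.

    A clearly negative value indicates the target's frequencies lie
    between those of the two sources, as expected under admixture.
    """
    products = [(c - a) * (c - b) for c, a, b in zip(target, source1, source2)]
    return sum(products) / len(products)


def z_score(estimate, std_err):
    """Z-score as the estimate divided by its standard error."""
    return estimate / std_err


# Hypothetical toy frequencies: the target sits between the two sources.
toy_f3 = f3([0.5, 0.4], [0.2, 0.1], [0.8, 0.6])

# Values taken from the qp3Pop row above.
reported_z = z_score(-0.007882, 0.000359)
print(toy_f3, reported_z)
```

Dividing the reported f_3 by its standard error gives roughly -21.95, matching the tabulated Z up to rounding of the printed figures.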
The exponential fit can be seen below:
The estimated age is 79.706 +/- 9.197 generations, or 2,310 +/- 270 years. This is about a thousand years younger than the signal observed for the South Indian Brahmin group. A possible explanation is that South Indian Brahmins migrated to South India, and hence did not intermarry with successive waves of invaders into India in historical times. Uttar Pradesh, on the other hand, received multiple invasions from the direction of Central Asia:
Most of the invaders of North India passed through the Gangetic plains of what is today Uttar Pradesh. Control over this region was of vital importance to the power and stability of all of India's major empires, including the Maurya (320–200 BC), Kushan (100–250 CE), Gupta (350–600 CE), and Gurjara-Pratihara (650–1036 CE) empires.[11] Following the Huns invasions that broke the Gupta empire, the Ganges-Yamuna Doab saw the rise of Kannauj.[12] During the reign of Harshavardhana (590–647 CE), the Kannauj empire reached its zenith.[12]
It will be interesting to see whether a young admixture signal also exists in my 5-strong sample of Jatts, since that population has traditions of "Scythian" origins.
October 03, 2012
rolloff analysis of South Indian Brahmins as Armenian+Chamar
The first analysis of this population showed that there were negative f3(Brahmin; X, Y) signals when X was any of a variety of West European, Balkan, and West Asian populations, and Y was either the Chamar or the North Kannadi. In the first analysis I used Orcadians and North Kannadi. I have now carried out a new rolloff analysis on 470,559 SNPs, using Armenians_Y and Chamar_M as the reference populations.
The exponential fit can be seen below.
The admixture date is 142.814 +/- 15.010 generations, or 4,140 +/- 440 years, which seems to correspond quite well with commonly accepted dates for the formation of Indo-Iranian.
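The dating logic can be sketched as follows: admixture linkage disequilibrium decays roughly as exp(-n·d) with genetic distance d (in Morgans), where n is the number of generations since admixture, and n is then converted to years. The code below is a minimal illustration on noiseless synthetic data, not the rolloff program itself; the function names are hypothetical, the fit is a simple log-linear regression (rolloff fits a curve with an additional constant term and estimates its error by jackknife), and the 29 years per generation is an assumption that happens to reproduce the rounded figures quoted in these posts.

```python
import math

YEARS_PER_GENERATION = 29  # assumption; reproduces the rounded dates in the text


def fit_decay_rate(distances_morgans, correlations):
    """Estimate n in A(d) = A0 * exp(-n * d) by least squares on log(A)."""
    logs = [math.log(c) for c in correlations]
    count = len(distances_morgans)
    mean_x = sum(distances_morgans) / count
    mean_y = sum(logs) / count
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(distances_morgans, logs))
    sxx = sum((x - mean_x) ** 2 for x in distances_morgans)
    return -sxy / sxx  # the slope of log(A) against d is -n


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert an admixture date in generations to years."""
    return generations * years_per_generation


# Synthetic, noiseless decay curve built from the date reported above.
true_generations = 142.814
distances = [0.001 * k for k in range(1, 51)]  # 0.1 cM to 5 cM, in Morgans
curve = [0.02 * math.exp(-true_generations * d) for d in distances]

estimated_generations = fit_decay_rate(distances, curve)
print(estimated_generations, generations_to_years(estimated_generations))
```

Under this assumption, 142.814 generations corresponds to about 4,140 years, and the 79.706 generations reported for the Uttar Pradesh Brahmins to about 2,310 years.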
I have previously observed that:
These patterns can be well explained, I believe, if we accept that Indo-Iranians are partially descended not only from the early Proto-Indo-Europeans of the Near East, but also from a second element that had conceivably "South Asian" affiliations. The most likely candidate for the "second element" is the population of the Bactria Margiana Archaeological Complex (BMAC). The rise and demise of the BMAC fits well with the relative shallowness of the Indo-Iranian language family and its 2nd millennium BC breakup, and the BMAC has been assigned an Indo-Iranian identity on other grounds by its excavator. As climate change led to the decline and abandonment of BMAC sites, its population must have spread outward: to the Iranian plateau, the steppe, and into South Asia, reinforcing the linguistic differentiation that must have already begun over the extensive territory of the complex.
Quite possibly, as the West Asian element began mixing with the Sardinian-like population in Greece, another branch of the Indo-Europeans made its appearance east of the Caspian, in the territory of the BMAC, admixing with South Asian-like populations. Thus, it might seem that the Graeco-Aryan clade of Indo-European broke down during the Bronze Age, with one branch heading off to the Balkans, and another to the east.
This scenario would also explain how the likely J2-bearing population associated with the earliest Proto-Indo-Europeans may have acquired the contrasting pattern I have previously described: the western (cis-Caspian) population would have admixed with R1b-bearers who occupy the "small arc" west and south of the Caspian, while the eastern (trans-Caspian) populations would have admixed with R1a-bearers who occupy the "large arc" in the flatlands north and east of the Caspian. It would also explain how the "western" branch (Graeco-Armenian) would have picked up Sardinian-like "Atlantic_Med" admixture, which is absent in the "eastern" Indo-Iranian branch.
At the same time, this scenario would explain the lack of "North European" admixture in the "western" branch (since this was shielded by the Caucasus and Black Sea from the northern Europeoids who may have lived north of these barriers), and explain its presence in the "eastern" branch (since the BMAC agriculturalists were in contact with presumably northern Europeoid groups inhabiting the steppelands, unhindered by any major physical barriers). (The relative absence of this admixture in the Graeco-Armenian branch may be advanced on the strength of its absence in Armenians, the evidence of a Sardinian-like Iron Age individual from Bulgaria, and the historical-era timing of admixture for the Greek population.)
It would be interesting to carry out similar experiments on Iranian groups, to see if they, too, present a similar pattern of admixture.