We sequenced great ape genomes to a mean of 25-fold coverage per individual (Table 1, Supplementary Information and Supplementary Table 1), sampling natural diversity by selecting captive individuals of known wild-born origin as well as individuals from protected areas in Africa (Fig. 1a). We also included nine human genomes: three African and six non-African individuals11. Variants were called using the software package GATK (ref. 12) (Methods), applying several quality filters, including conservative allele balance filters, and requiring that genomes showed <2% contamination between samples (Methods and Supplementary Information). In order to assess the quality of single nucleotide variant (SNV) calls, we performed three sets of independent validation experiments with concordance rates ranging from 86% to 99% depending on allele frequency, the great ape population analysed and the species reference genome used (Supplementary Information and Supplementary Table 2). In total, we discovered 84.0 million fixed substitutions and 88.8 million segregating sites of high quality (Table 1 and Supplementary Table 3), providing the most comprehensive catalogue of great ape genetic diversity to date. From these variants we also constructed a list of potentially ancestry-informative markers (AIMs) for each of the surveyed populations, although a larger sampling of some subspecies is still required (Supplementary Information).
We initially explored the genetic relationships between individuals by constructing neighbour-joining phylogenetic trees from both autosomal and mitochondrial genomes (Supplementary Information). The autosomal tree identified separate monophyletic groupings for each species or subspecies designation (Supplementary Fig. 8.5.1) and supports a split of extant chimpanzees into two groups. Nigeria–Cameroon and western chimpanzees form a monophyletic clade (>97% of all autosomal trees); central and eastern chimpanzees form a second group (72% of all autosomal trees).
Genome-wide patterns of heterozygosity (Fig. 1b) reveal a threefold range in single nucleotide polymorphism (SNP) diversity. Non-African humans, eastern lowland gorillas, bonobos and western chimpanzees show the lowest genetic diversity (∼0.8 × 10⁻³ heterozygotes per base pair (bp)). In contrast, central chimpanzees, western lowland gorillas and both orangutan species show the greatest genetic diversity (1.6 × 10⁻³ to 2.4 × 10⁻³ heterozygotes per bp). These differences are also reflected by measures of inbreeding from runs of homozygosity13 (Fig. 1c and Supplementary Information). Bonobos and western lowland gorillas, for example, have similar distributions of tracts of homozygosity as human populations that have experienced strong genetic bottlenecks (Karitiana and Papuan). Eastern lowland gorillas appear to represent the most inbred population, with evidence that they have been subjected to both recent and ancient inbreeding.
To examine the level of genetic differentiation between individuals we performed a principal component analysis (PCA) of SNP genotypes (Supplementary Information). Chimpanzees were stratified between subspecies with PC1 separating western and Nigeria–Cameroon chimpanzees from the eastern and central chimpanzees and PC2 separating western and Nigeria–Cameroon chimpanzees. In gorillas, PC1 clearly separates eastern and western gorillas, whereas the western lowland gorillas are distributed along a gradient of PC2, with individuals from the Congo and western Cameroon positioning in opposite directions along the axis. The isolated Cross River gorilla is genetically more similar to Cameroon western lowland gorillas and can be clearly differentiated with PC3 (Supplementary Fig. 8.2.9).
We explored the level of shared ancestry among individuals within each group14 using an admixture model (FRAPPE). In chimpanzees, the four known subspecies are clearly distinguished when fitting the model using four ancestry components (K = 4) (Fig. 1d). Additional substructure is identified among the eastern chimpanzees Vincent and Andromeda (K = 6), who hail from the most eastern sample site (Gombe National Park, Tanzania). As in Gonder et al.2, we have identified three Nigeria–Cameroon samples (Julie, Tobi and Banyo, K = 3–5) with components of central chimpanzee ancestry. However, taking central chimpanzees and the remaining Nigeria–Cameroon chimpanzees as ancestral populations shows no evidence of gene flow by either the F3 statistic or HapMix. This indicates that these three samples are not the result of a recent admixture and may represent a genetically distinct population (Supplementary Information).
In gorillas, following the separation of eastern and western lowland species (K = 2), an increasing number of components further subdivide western lowland populations distinguishing Congolese and Cameroonian gorillas—a pattern consistent with the structure observed in the PCA analysis (Supplementary Fig. 8.2.9). One striking observation is the extent of admixed ancestry predicted for captive individuals when compared to wild-born. Our analysis suggests that most captive individuals included in this study are admixed from two or more genetically distinct wild-born populations leading to an erosion of phylogeographic signal. This finding is consistent with microsatellite analyses of captive gorillas15 and the fact that great ape breeding programs have not been managed at the subspecies level.
As great apes have been evolving on separate lineages since the middle Miocene, we attempted to reconstruct the history of these various species and subspecies by applying methods sensitive to branching processes, changes in effective population size (Ne), and gene flow occurring at different time scales. Using a combination of speciation times inferred from a haploid pairwise sequential Markovian coalescent (PSMC) analysis16, a coalescent hidden Markov model (CoalHMM)3 and incomplete lineage sorting approaches, we were able to estimate the most ancient split times and effective population sizes among the great ape species. By combining these estimates with an approximate Bayesian computation (ABC)17 analysis applied to the more complex chimpanzee phylogeny, we constructed a composite model of great ape population history over the last ∼15 million years (Fig. 2). This model presents a complete overview of great ape divergence and speciation events in the context of historical effective population sizes.
PSMC analyses of historical Ne (Fig. 3) suggest that the ancestral Pan lineage had the largest effective population size of all lineages >3 million years ago (Myr), after which time the population of the common ancestor of both bonobos and chimpanzees experienced a dramatic decline. Both PSMC and ABC analyses support a model of subsequent increase in chimpanzee Ne starting ∼1 Myr, before their divergence into separate subspecies. Following an eastern chimpanzee increase in Ne (∼500 thousand years ago, kyr), the central chimpanzees reached their zenith ∼200–300 kyr followed by the western chimpanzee ∼150 kyr. Although the PSMC profiles of the two subspecies within each of the major chimpanzee clades (eastern/central and Nigeria–Cameroon/western) closely shadow each other between 100 kyr and 1 Myr, the western chimpanzee PSMC profile is notable for its initial separation from that of the other chimpanzees, followed by its sudden rise and decline (Fig. 3 and Supplementary Information). The different gorilla species also show variable demographic histories over the past ∼200 kyr. Eastern lowland gorillas have the smallest historical Ne, consistent with smaller present-day populations and a history of inbreeding (Fig. 1c). A comparison of effective population sizes with the ratio of non-synonymous to synonymous substitutions finds that selection has acted more efficiently in populations with higher Ne, consistent with neutral theory (Supplementary Information).
Although the phylogeny of bonobos and western, central and eastern common chimpanzees has been well established based on genetic data18, there is still uncertainty regarding their relationship to Nigeria–Cameroon chimpanzees2,19. Regional neighbour-joining trees and a maximum-likelihood tree estimated from allele frequencies both show that Nigeria–Cameroon and western chimpanzees form a clade. A complex demographic history has been previously reported for chimpanzees with evidence of asymmetrical gene flow among different subspecies. For instance, migration has been identified from western into eastern chimpanzees4, two subspecies that are currently geographically isolated. We find support for this using the D-statistic, a model-free approach that tests for unequal levels of allele sharing between an outgroup and two populations that have more recently diverged (D(H,W;E,C)>16 s.d.). However, no previous genome-wide analysis that has examined gene flow included chimpanzees from the Nigeria–Cameroon subspecies, and a comparison of them with eastern chimpanzees results in a highly significant D-statistic (D(H,E;W,N)>25 s.d.). Furthermore, TreeMix, a model-based approach that identifies gene flow events to explain allele frequency patterns not captured by a simple branching phylogeny, infers a signal of gene flow between Nigeria–Cameroon and eastern chimpanzees (P = 2 × 10⁻³⁰⁰). A more detailed treatment of gene flow applying different models and methods may be found in the Supplementary Information.
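As a rough illustration of how a D-statistic of this kind is commonly computed, the sketch below uses the standard ABBA-BABA form with a delete-one block jackknife for the standard error. The function names, the per-block input format and the equal weighting of blocks are assumptions made for illustration; this is not the pipeline used in the study.

```python
import math

def d_statistic(blocks):
    """D = (ABBA - BABA) / (ABBA + BABA), pooled over genomic blocks.

    blocks: list of (n_abba, n_baba) site-pattern counts, one pair per block.
    """
    abba = sum(a for a, _ in blocks)
    baba = sum(b for _, b in blocks)
    return (abba - baba) / (abba + baba)

def jackknife_z(blocks):
    """Return (D, Z) from a delete-one block jackknife.

    Assumes equally weighted blocks whose counts are not all identical,
    otherwise the jackknife variance is zero and Z is undefined.
    """
    n = len(blocks)
    d_all = d_statistic(blocks)
    # Recompute D with each block left out in turn.
    leave_one_out = [d_statistic(blocks[:i] + blocks[i + 1:]) for i in range(n)]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean) ** 2 for d in leave_one_out)
    return d_all, d_all / math.sqrt(variance)

# Hypothetical counts, for illustration only.
example_blocks = [(30, 10), (28, 12), (35, 9), (31, 11)]
d_value, z_score = jackknife_z(example_blocks)
```

A tree without gene flow predicts D close to zero, so a Z-score many standard deviations from zero points to excess allele sharing on one side; the sign depends on the ordering convention chosen for the populations.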
Genetic diversity is depressed at or close to genes in almost all species (Supplementary Fig. 11.1) with the effect less pronounced in subspecies with lower estimated Ne, consistent with population genetic theory. When we compare the relative level of X chromosome and autosomal (X/A) diversity across great apes as a function of genetic distance from genes, the eastern lowland gorillas and Bornean orangutans are outliers, with substantially reduced X/A diversity compared to the neutral expectation of 0.75, regardless of the distance to genes. This pattern is consistent with a recent reduction in effective population size20, clearly visible in the PSMC analysis for both species (Fig. 3). However, bonobos also demonstrate a relatively constant level of X/A diversity regardless of distance from genes, with values very much in line with neutral expectations. All other subspecies demonstrate a pattern consistent with previous studies in humans21 where X/A diversity is lower than 0.75 close to genes and higher farther away from genes.
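The neutral expectation of 0.75 reflects chromosome counts: a breeding female and male together carry three X chromosomes for every four copies of each autosome. A minimal sketch, assuming the standard textbook formulas for effective population size with separate numbers of breeding females and males (the function name and arguments are illustrative and do not come from the study):

```python
def x_to_autosome_ne_ratio(n_females, n_males):
    """Expected ratio of X-linked to autosomal effective population size."""
    # Autosomal Ne with unequal numbers of breeding females and males.
    ne_autosome = 4 * n_females * n_males / (n_females + n_males)
    # X-linked Ne: females carry two X chromosomes, males carry one.
    ne_x = 9 * n_females * n_males / (4 * n_males + 2 * n_females)
    return ne_x / ne_autosome

equal_sex_ratio = x_to_autosome_ne_ratio(100, 100)
```

With equal numbers of breeding females and males the ratio is 0.75; under these formulas a skewed breeding sex ratio shifts the expectation away from that value.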
It has been proposed that loss of gene function may represent a common evolutionary mechanism to facilitate adaptation to changes in an environment22. There has been speculation that the success of humans may have, in part, been catalysed by an excess of beneficial loss-of-function mutations23. We thus characterized the distribution of fixed loss-of-function mutations among different species of great apes identifying nonsense and frameshift mutations resulting from SNVs (n = 806) and indels (n = 1080) in addition to gene deletion events (n = 96) (Supplementary Table 4). We assigned these events to the phylogeny and determined that the number of fixed loss-of-function mutations scales proportionally to the estimated branch lengths (R² = 0.987 SNVs, R² = 0.998 indels). In addition, we found no evidence of distortion on the terminal branches of the tree compared to point mutations based on a maximum likelihood analysis (Supplementary Information). Thus, the human branch in particular showed no excess of fixed loss-of-function mutations even after accounting for human-specific pseudogenes24 (Supplementary Information).
Our analysis provides one of the first genome-wide views of the major patterns of evolutionary diversification among great apes. We have generated the most comprehensive catalogue of SNPs for chimpanzees (27.2 million), bonobos (9.0 million), gorillas (19.2 million) and orangutans (24.3 million) (Table 1) to date and identified several thousand AIMs, which provides a useful resource for future analyses of ape populations. Humans, western chimpanzees and eastern gorillas all show a remarkable dearth of genetic diversity when compared to other great apes. It is striking, for example, that sequencing of 79 great ape genomes identifies more than double the number of SNPs obtained from the recent sequencing of more than a thousand diverse humans25—a reflection of the unique out-of-Africa origin and nested phylogeny of our species.
We provide strong genetic support for distinct populations and subpopulations of great apes with evidence of additional substructure. The common chimpanzee shows the greatest population stratification when compared to all other lineages with multiple lines of evidence supporting two major groups: the western and Nigeria–Cameroon and the central and eastern chimpanzees. The PSMC analysis indicates a temporal order to changes in ancestral effective population sizes over the last two million years, previous to which the Pan genus suffered a dramatic population collapse. Eastern chimpanzee populations reached their maximum size first, followed by the central and western chimpanzee. The Nigeria–Cameroon chimpanzee population size appears much more constant.
Despite their rich evolutionary history, great apes have experienced drastic declines in suitable habitat in recent years26, along with declines in local population sizes of up to 75% (ref. 27). These observations highlight the urgency to sample from wild ape populations to more fully understand reservoirs of genetic diversity across the range of each species and to illuminate how basic demographic processes have affected it. The >80 million SNPs we identified in this study may now be used to characterize patterns of genetic differentiation among great apes in sanctuaries and zoos and, thus, are of great importance for the conservation of these endangered species with regard to their original range. These efforts will greatly enhance conservation planning and management of apes by providing important information on how to maintain genetic diversity in wild populations for future generations.
--> Next: Response to Dennis Venema's Blog Adam, Eve and Population Genetics: A Reply to Dr. Richard Buggs (Part 1)
Email to Dennis Venema about human population bottlenecks
Posted 29th September 2017
A few months ago, I was reading a new book by Dennis Venema and Scot McKnight entitled Adam and the Genome. I was surprised to find a claim within the book that the past effective population size of humans has definitely never dropped below 10,000 individuals and that this is a fact of comparable scientific certainty to heliocentrism. I emailed Dennis Venema, the biologist author of the book, to query this. Unfortunately, he has not yet responded. I therefore remain unconvinced that it is a scientific impossibility for human beings to have all descended from a single couple. If I am wrong, though, I would like to know. I therefore post my email here, in hopes of garnering responses to my objections. When I have time, I may post a blog on this topic, which would be more polished and tie up some of the loose ends [Note added on 28th October 2017: I have now blogged about this here].
From: Richard Buggs
Subject: Human population bottlenecks
Date: 10 May 2017 at 19:05:18 BST
To: Dennis Venema
I hope you don’t mind me emailing you out of the blue. I am a biologist and a Christian. I have recently been asked by a couple of friends from my church in London about the issue of a historical Adam and Eve. They are studying for ordination at a local Church of England theological college, and were set an essay on the topic by one of their lecturers. In this context, I have been reading your recent book “Adam and the Genome”.
Whilst I very much sympathise with your desire to bring science and the bible together, and help Christians not to be alienated from science, I have a few concerns about the parts of chapter three in your book on past effective population sizes. This is an area that I have had a fair bit of exposure to over the past few years, having, for example, published PSMC analyses in my research (see Nature 541: 212–216). [I then mentioned a couple of my MSS in preparation.]
I was a bit surprised that you categorically state in your book that the past human effective population size has definitely never dropped below 10,000 individuals and say that this is a fact of comparable scientific certainty to heliocentrism. Most people working in the field take reconstructions of effective population size with a pinch of salt. I well remember my surprise as a newly graduated PhD student attending a summer school on molecular evolution at Edinburgh University in 2005 on hearing Gil McVean from Oxford say over breakfast that effective population size is a nebulous concept. As I am sure you know, effective population size is a measure of a population’s susceptibility to drift, rather than an attempt to measure census population size. I would be very hesitant to rely too heavily on any estimate of past effective population size.
To get more specific, I think you are mistaken when you say this:
"If a species were formed through such an event [by a single ancestral breeding pair] or if a species were reduced in numbers to a single breeding pair at some point in its history, it would leave a telltale mark on its genome that would persist for hundreds of thousands of years— a severe reduction in genetic variability for the species as a whole”
It is easy to have misleading intuitions about the population genetic effects of a short, sudden bottleneck. For example, Ernst Mayr suggested that many species had passed through extreme bottlenecks in founder events. He argued that extreme loss of diversity in such events would promote evolutionary change. His intuition about loss of diversity in bottlenecks was wrong, though, and his argument lost much of its force when population geneticists (M. Nei, T. Maruyama and R. Chakraborty 1975 Evolution, 29(1):1-10) showed that even a bottleneck of a single pair would not lead to massive decreases in genetic diversity, if followed by rapid population growth. When two individuals are taken at random from an existing large population, they will on average carry 75% of its heterozygosity (M. Slatkin and L. Excoffier 2012 Genetics 191:171–181). From a bottleneck of a single fertilised female, if population size doubles every generation, after many generations the population will have over half of the heterozygosity of the population before the bottleneck (Barton and Charlesworth 1984, Ann. Rev. Ecol. Syst. 15:133-64). If population growth is faster than this, the proportion of heterozygosity maintained will be higher.
This means that a single pair of individuals can carry a great deal of heterozygosity with them through a bottleneck, if they come from an ancestral population with high diversity, and they will pass that on to the population they found, so long as it grows rapidly.
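The figures above can be reproduced approximately with the simplest idealised drift recursion, in which expected heterozygosity is multiplied by (1 - 1/(2N)) each generation. The sketch below is only that: the function name, the defaults and the assumption of exact doubling with no other complications are illustrative, and the cited papers may use more detailed models.

```python
def retained_heterozygosity(founders=2, growth=2.0, generations=50):
    """Fraction of ancestral heterozygosity expected to remain after a bottleneck.

    Applies H_next = H * (1 - 1/(2N)) once for the founders and once for each
    later generation, with N multiplied by `growth` every generation.
    """
    fraction = 1.0
    n = float(founders)
    for _ in range(generations + 1):
        fraction *= 1.0 - 1.0 / (2.0 * n)
        n *= growth
    return fraction

founding_pair = retained_heterozygosity(generations=0)  # sampling two individuals
doubling = retained_heterozygosity()                     # doubling every generation
faster = retained_heterozygosity(growth=4.0)             # quadrupling every generation
```

Under these assumptions the founding pair alone retains 0.75 of the ancestral heterozygosity, doubling growth levels off at roughly 0.58, and faster growth retains more.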
As you will know, it is a feature of humans that despite our current census population size of over seven billion individuals, we have lower genetic diversity than the world’s much smaller current day population of chimpanzees. The average human has 3.1 million single nucleotide variants (SNVs), but the average chimp has 5.7 million (Prado-Martinez et al. 2013 Nature). African humans have approximately 1.1 heterozygous SNVs in every 1000bp, whereas central chimpanzees have approximately 1.75 (Prado-Martinez et al. 2013 Nature). Thus, if two central African chimpanzees were taken today and used to found an isolated population that experienced explosive population growth, the new population would have similar levels of genetic variability to modern humans.
I am not stating these figures because existing populations of chimpanzee gave rise to modern humans, but simply to show that it is hard to see how overall levels of SNP diversity and heterozygosity in modern humans could exclude the possibility of a past bottleneck of two individuals.
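The arithmetic behind this comparison can be checked in a few lines. This is a back-of-the-envelope sketch that takes the heterozygosity figures quoted above and assumes the idealised recursion H_next = H * (1 - 1/(2N)) with the population doubling every generation; the variable names are illustrative.

```python
CHIMP_HET_PER_KB = 1.75  # central chimpanzees, heterozygous SNVs per 1000 bp
HUMAN_HET_PER_KB = 1.1   # African humans, heterozygous SNVs per 1000 bp

# Heterozygosity carried by two individuals sampled from the chimpanzee population.
pair_het = 0.75 * CHIMP_HET_PER_KB

# Long-run retention if the founded population doubles every generation.
retention = 1.0
n = 2.0
for _ in range(51):
    retention *= 1.0 - 1.0 / (2.0 * n)
    n *= 2.0
long_run_het = retention * CHIMP_HET_PER_KB
```

Here `pair_het` is about 1.31 and `long_run_het` about 1.01 heterozygous sites per 1000 bp, which brackets the human figure of 1.1, before any new mutations are counted.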
On top of this, we need to add in the fact that explosive population growth in humans has allowed many new mutations to rapidly accumulate in human populations, accounting for many SNPs with low minor allele frequencies (A. Keinan and A. G. Clark (2012) Science 336 (6082): 740-743).
I am also concerned about your interpretation of PSMC analysis. I do not think that a PSMC analysis that never drops below an Ne of 10,000 can be used to prove that a sudden, short bottleneck never happened. Because a single couple can carry with them 0.75 of the heterozygosity of their ancestral population, we would not expect a huge number of coalescence events at the bottleneck, and those that did occur would be smeared out over a long period of time around the bottleneck. In the original Li and Durbin 2011 paper, the authors note:
“The simulations did, however, reveal a limitation of PSMC in recovering sudden changes in effective population size. For example, the instantaneous reduction from 12,000 to 1,200 at 100 kyr ago in the simulation was spread over several preceding tens of thousands of years in the PSMC reconstruction.” (Li and Durbin 2011).
Work by a graduate student in Beth Shapiro’s lab has shown that the PSMC method cannot accurately reconstruct sharp bottlenecks (https://users.soe.ucsc.edu/~chkcole/Shapiro.html).
In general, I am concerned that the studies you cite did not set out to test the hypothesis that humans have passed through a single-couple bottleneck. They are simply trying to reconstruct the most probable past effective population sizes of humans given the standard assumptions of population genetic models. I personally would feel ill at ease claiming that they prove that a short sudden bottleneck is impossible.
Sorry to send an email of such length, but I wanted to let you know that in my view you seem to be on very shaky ground here, and in danger of alienating Christians from science on the basis of a wrong interpretation of the current literature. In this case, I think you are being a bit over-zealous for science, and insisting on an overly literal interpretation of the past Ne literature. I would encourage you to step back a bit from the strong claims you are making that a two-person bottleneck is disproven. Maybe you could write a blog dialing back on this a bit?
With best regards,