Author post: Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle

This guest post is by Jared Decker on his preprint (with colleagues) “Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle”, arXived here. The post is a response to the review posted by Joe Pickrell here.

I have posted an updated version of my preprint on arXiv. Because Joe Pickrell posted his review of my preprint “Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle” on Haldane’s Sieve, I thought readers might enjoy seeing my response. I have really enjoyed having the process open to the public.

Reviewer’s comments are in blue.
My comments are in black.
Quotes from the manuscript are in Arial font.

Reviewer #1 [Joe Pickrell]

Overall comments:

1. A lot of interpretation depends on the robustness of the inferred population graph from TreeMix. It would be extremely helpful to see that the estimated graph is consistent across different random starting points. The authors could run TreeMix, say, five different times, and compare the results across runs. I expect that many of the inferred migration edges will be consistent, but a subset will not. It’s probably most interesting to focus interpretation on the edges that are consistent.

We followed Reviewer #1’s recommendation and have included 6 phylogenetic networks (the original network and 5 replicates) as supplementary Figure S4. The admixed histories of several of the sample populations are quite complex, and as seen in Figure S4, the same relationships can be represented multiple ways. For example, if population A is admixed between populations B, C, and D, it can be placed sister to population B with migration edges from C and D, or it can be placed sister to C with migration edges from B and D. We have tried to note in the manuscript when migration edges are not consistent. But one of the main points of the paper, introgression from an ancestral population into the African taurine clade, is consistent across all replicates.

To the third paragraph of the Admixture in Europe subsection we added, “The placement of Italian breeds is not consistent across independent TreeMix runs (Figure S4), likely due to their complicated history of admixture.”

In the second to last paragraph of the manuscript we state, “In TreeMix replicates, Texas Longhorn and Romosinuano are either sister to admixed Anatolian breeds or they receive a migration edge that originates near Brahman (Figure S4).”

2. Throughout the manuscript, inference from genetics is mixed in with evidence from other sources. At points it sometimes becomes unclear which points are made strictly from genetics and which are not.

We have edited the manuscript by adding citations to clarify which inference is from genetics and which is from previous studies or breed histories.

For example, the authors write, “Anatolian breeds are admixed between European, African, and Asian cattle, and do not represent the populations originally domesticated in the region”. It seems possible that the first part of that statement (about admixture) could be their conclusion from the genetic data, but it’s difficult to make the second statement (about the original populations in the region) from genetics, so presumably this is based on other sources.

We edited this sentence to say, “Anatolian breeds (AB, EAR, TG, ASY, and SAR) are admixed between blue European-like, grey African-like, and green indicine-like cattle (Figures 5 and 6), and we infer they do not represent the taurine populations originally domesticated in this region due to a history of admixture.”

In general, I would suggest splitting the results internal to this paper apart from the other statements and making a clear firewall between their results and the historical interpretation of the results (right now the authors have a “Results and Discussion” section, but it might be easiest to do this by splitting the “Results” from the “Discussion”. But this is up to the authors.).

The corresponding authors of this manuscript (Decker and Taylor) prefer to have the results and discussion sections combined, so we appreciate Reviewer #1 leaving that decision up to us. But we recognize that he brings up a valid point and have strived to make the distinction between results and discussion clearer throughout the manuscript.

3. Related to the above point, could the authors add subsection headings to the results/discussion section? Right now the topic of the paper jumps around considerably from paragraph to paragraph, and at points I had difficulty following. One possibility would be to organize subheading by the claims made in the abstract, e.g. “Cline of indicine introgression into Africa”, “wild African auroch ancestry”, etc.

Subsection headings have been added.

Specific comments:

There are quite a few results claimed in this paper, so I’m going to split my comments apart by the results reported in the abstract. As mentioned above, it would be nice if the authors clearly stated exactly which pieces of evidence they view as supporting each of these, perhaps in subheadings in the Results section. In italics is the relevant sentence in the abstract, followed by my thoughts:

Using 19 breeds, we map the cline of indicine introgression into Africa.

This claim is based on interpretation of the ADMIXTURE plot in Figure 5. I wonder if a map might make this point more clearly than Figure 5, however; the three-letter population labels in Figure 5 are not very easy to read, especially since most readers will have no knowledge of the geographic locations of these breeds.

Map added as Figure 5, with previous ADMIXTURE figure as Figure 6 so that readers can still see individual breed ancestries.

“We infer that African taurine possess a large portion of wild African auroch ancestry, causing their divergence from Eurasian taurine.”

This claim appears to be largely based on the interpretation of the treemix plot in Figure 4. This figure shows an admixture edge from the ancestors of the European breeds into the African breeds. As noted above, it seems important that this migration edge be robust across different treemix runs. Also, labeling this ancestry as “wild African auroch ancestry” seems like an interpretation of the data rather than something that has been explicitly tested, since the authors don’t have wild African aurochs in their data.

This migration edge is robust across 6 different TreeMix runs. The edge is from a node that is ancestral to European, Asian, and African taurine, and this node is approximately halfway between the common ancestor of domesticated indicine and the common ancestor of domesticated taurine. African auroch are extinct. Most, if not all, bovine ancient DNA samples come from much colder climates than northern Africa. So we are unable to sample African aurochs.

But, we feel it is a strength of the TreeMix analysis to identify introgression from ancestral populations that have not been sampled. We feel the interpretation that the introgression is from African auroch is the most parsimonious explanation of our PCA, ADMIXTURE, and TreeMix results.

Additionally, the authors claim that this result shows “there was not a third domestication process, rather there was a single origin of domesticated taurine”. I may be missing something, but it seems that genetic data cannot distinguish whether a population was “domesticated” or “wild”. That is, it seems plausible that the source population tentatively identified in Figure 4 may have been independently domesticated. There may be other sources of evidence that refute this interpretation, but this is another example of where it would be useful to have a firewall between the genetic results and the interpretation in light of other evidence. The speculation about the role of disease resistance in introgression is similarly not based on evidence from this paper and should probably be set apart.

The claim that there was a single origin of domesticated taurine is based upon the topology of the phylogenetic network, as European, Asian, and African taurine all share a common ancestor, and the Asian clade is sister to the rest of the ingroup. This rules out the possibility of a separate domestication in Africa as a separate domestication would cause African domesticates to be sister to the rest of taurus. Larson and Burger (2013) do not consider admixture a separate domestication, and we choose to follow their definition. Two domestications with the resulting population in Africa a mixture of the two is not very parsimonious. The most parsimonious explanation is admixture from a wild relative.

We agree that we have not tested the influence of trypanosomiasis resistance on driving admixture, but we feel it is an interesting hypothesis that explains the force that drove admixture. We have rephrased the sentence as:

“We hypothesize that the introgression in Africa may have been driven by trypanosomiasis resistance in African auroch which may be the source of resistance in African taurine populations [48].”

“We detect exportation patterns in Asia and identify a cline of Eurasian taurine/indicine hybridization in Asia.”

The cline of taurine/indicine hybridization is based on interpretation of ADMIXTURE plots and some follow-up f4 statistics. I found this difficult to follow, especially since a significant f4 statistic can have multiple interpretations. Perhaps the authors could draw out the proposed phylogeny for these breeds and explain the reasons they chose particular f4 statistics to highlight.

We have added a map figure so that the ADMIXTURE estimates will be easier to interpret in a geographic frame. We also added, “From previous research [3] and Figures 2 and 3, these relationships should be tree-like if there were no admixture. For 53 of the possible 280 tests, the Z-score was more extreme than ±2.575829. The most extreme test statistics were f4(Wagyu, Mongolian; Simmental, Shorthorn) = -0.003 (Z-score = -5.21, other rearrangements of these groups had Z-scores of 7.32 and 16.55) and f4(Hanwoo, Wagyu; Piedmontese, Shorthorn) = 0.002 (Z-score = 4.90, other rearrangements of these groups had Z-scores of 21.79 and 27.77).”

While the f4 statistics do have multiple interpretations, we do feel confident that the ADMIXTURE analysis highlights which interpretation is the most likely.
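A minimal sketch of how the cutoff quoted above can be read, assuming (this is not stated in the manuscript) that ±2.575829 is the two-sided 1% critical value of a standard normal distribution. The dictionary keys simply label the two most extreme tests reported above; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: the cutoff is the two-sided critical value for a 1% significance level.
significance_level = 0.01
threshold = NormalDist().inv_cdf(1 - significance_level / 2)

# Z-scores quoted in the response above.
reported_z = {
    "f4(Wagyu, Mongolian; Simmental, Shorthorn)": -5.21,
    "f4(Hanwoo, Wagyu; Piedmontese, Shorthorn)": 4.90,
}

# A test is flagged when its Z-score is more extreme than the cutoff in either direction.
significant = {name: abs(z) > threshold for name, z in reported_z.items()}

# Share of the 280 possible tests that passed the cutoff (53 of 280).
share_significant = 53 / 280

print(f"cutoff = {threshold:.6f}")
for name, flag in significant.items():
    print(name, "significant" if flag else "not significant")
print(f"share of tests past the cutoff = {share_significant:.3f}")
```

Under that assumption, both reported tests sit well beyond the cutoff, and a little under a fifth of the 280 tests exceed it.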

“We also identify the influence of species other than Bos taurus in the formation of Asian breeds.”

The conclusion that species other than Bos taurus have introgressed into Asian breeds seems to be based on interpretation of branch lengths in the trees in Figures 2-3 and some f3 statistics. The interpretation of branch lengths is extremely weak evidence for introgression, probably not even worth mentioning. The f3 statistics are potentially quite informative though. For the breeds in question (Brebes and Madura), which pairs of populations give the most negative f3 statistics? This is difficult information to extract from Supplementary Table 2, where the populations appear to be sorted alphabetically. A table showing the (for example) five most negative f3 statistics could be quite useful here.

Supplementary Table 2 has been updated to report the 5 most negative statistics. The Z-scores for Brebes are smaller than -18 and the Z-scores for Madura are smaller than -13. We also note that these results are supported by the ADMIXTURE analysis.

In general, if the SNP ascertainment scheme is not extremely complicated (can the authors describe the ascertainment scheme for this array?), a negative f3 statistic is very strong evidence that a target population is admixed, while a significant f4 statistic only means that at least one of the four populations in the statistic is admixed. This might be a useful property for the authors.

The SNPs were ascertained in multiple ways: each was either a SNP in the reference Hereford animal, discovered from Sanger resequencing of 9 breeds, or discovered from reduced representation sequencing of Angus, Holstein, or a pool of breeds. Most of the SNPs were ascertained in Hereford, Angus, or Holstein.

“We detect the pronounced influence of Shorthorn cattle in the formation of European breeds.”

This conclusion appears to be based on interpretation of ADMIXTURE plots in Figures S6-S9. Interpreting these types of plots is notoriously difficult. I wonder if the f3 statistics might be useful here: do the authors get negative f3 statistics in the populations they write “share ancestry with Shorthorn cattle” when using the Durham shorthorns as one reference?

Durham Shorthorn is the ancestral breed of Beef Shorthorn, Milking Shorthorn, and Lincoln Red (reference 30 from the manuscript), and as these are direct relationships (tree-like) we wouldn’t expect significant f-statistics. We added Table S3 to report the negative f3 statistics for Maine Anjou, Santa Gertrudis, and Beefmaster. We suspect Belgian Blue have undergone too much change in allele frequencies due to intense selection and small effective population sizes since admixture to produce significant f3 statistics. We have edited the sentence to say:

“As shown in Figures S6 through S9, Table S3, and from their breed histories [31], many breeds share ancestry with Shorthorn cattle, including Milking Shorthorn, Beef Shorthorn, Lincoln Red, Maine-Anjou, Belgian Blue, Santa Gertrudis, and Beefmaster.”

Charolais and Holstein did not produce significant f3 statistics. Although they did produce significant f4 statistics, we chose not to report these.

“Iberian and Italian cattle possess introgression from African taurine.”

This conclusion is based on ADMIXTURE plots and treemix; it would be interesting to see the results from f3 statistics as well.

We added this as the last paragraph of the Admixture in Europe subsection.

“We also used f-statistics to explore the evidence for African taurine introgression into Spain and Italy. We did not see any significant f3 statistics, but this test may be underpowered because of the low-level of introgression. With Italian and Spanish breeds as a sister group and African breeds, including Oulmès Zaer, as the other sister group, we see 321 significant tests out of 1911 possible tests. Of these 321 significant tests, 218 contained Oulmes Zaer. We also calculated f4 statistics with the Spanish breeds as sister and the African taurine breeds as sister (excluding Oulmes Zaer). With this setup, out of the possible 675 tests we only see 1 significant test, f4(Berrenda en Negro, Pirenaica;Lagune, N’Dama (ND2)) = 0.0007, Z-score = 3.064. With Italian cattle as sister and African taurine as sister (excluding Oulmes Zaer), we see 17 significant tests out of 90 possible. Patterson et al. [27] define the f4-ratio as f4(A, O; X, C)/f4(A, O; B, C), where A and B are a sister group, C is sister to (A,B), X is a mixture of B and C, and O is the outgroup. This ratio estimates the ancestry from B, denoted as α. We calculated this ratio using Shorthorn as A, Montbeliard as B, Lagune as C, Morucha as X, and Hariana as O. We chose Shorthorn, Montbeliard, Lagune, and Hariana as they appeared the least admixed in the ADMIXTURE analyses. We chose Morucha because it is solid red with African ancestry in Figure S10. This statistic estimates that Morucha is 91.23% European (α = 0.0180993/0.0198386) and 8.77% African, which is similar to the proportion estimated by TreeMix. The multiple f4 statistics with Italian breeds as sister and African breeds as the opposing sister support African admixture into Italy. The f4-ratio test with Morucha also supports our conclusion of African admixture in Spain.”

We understand that the f4 statistics are not as easily interpreted, but the f4-ratio seems to have a straightforward interpretation.
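The f4-ratio arithmetic quoted above can be reproduced in a few lines. This is only a sketch of the final division, using the two f4 values given in the manuscript passage; it assumes, following the stated definition, that the first value is f4(A, O; X, C) and the second is f4(A, O; B, C). The variable names are illustrative.

```python
# f4 values quoted in the manuscript passage above.
f4_with_target = 0.0180993     # f4(A, O; X, C), X = Morucha (assumed to be the numerator)
f4_with_reference = 0.0198386  # f4(A, O; B, C), B = Montbeliard (assumed to be the denominator)

# alpha estimates the share of ancestry in X that comes from B (European).
alpha = f4_with_target / f4_with_reference

european_pct = 100 * alpha
african_pct = 100 * (1 - alpha)

print(f"{european_pct:.2f}% European, {african_pct:.2f}% African")
```

The remainder, 1 - α, is attributed to the population related to C (Lagune, African taurine), which is how the 8.77% figure arises.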

“American Criollo cattle are shown to be of Iberian, and not African, descent.”

I found this difficult to follow; the authors write that these breeds “derive 7.5% of their ancestry from African taurine introgression”, so presumably they are in fact partially of African descent?

We reworded as:

“American Criollo cattle are shown to be imported from Iberia, and not directly from Africa, and African ancestry is inherited via Iberian ancestors.”

“Indicine introgression into American cattle occurred in the Americas, and not Europe”

This conclusion seems difficult to make from genetic data. The authors identify “indicine” ancestry in American cattle, so I don’t see how they can determine whether this happened before or after a migration without temporal information. It would be helpful if the authors walk the reader through each logical step they’re making so that the reader can decide whether they believe the evidence for each step.

We added this sentence:

“To reiterate, Iberian cattle do not have indicine ancestry, American Criollo breeds originated from exportations from Iberia, Brahman cattle were developed in the United States in the 1880’s [31], and American Criollo breeds carry indicine ancestry, and the introgression likely occurred from Brahman cattle.”

Other responses [NB: these are responses to comments from another reviewer, but his/her comments are not printed]:

We have attempted to make the manuscript easier to read for a wider audience, but welcome additional feedback.
NOTE TO BLOG READERS: Please send me your feedback as well! @pop_gen_JED

We have rearranged the nodes of Figure 4 and we believe it is now easier to read.

The position of a migration edge denotes where in time or ancestry the migration occurred. The more basal a migration edge is placed, the earlier in time the migration occurred or the more divergent the source population was.

As mentioned above, the placement of the migration edges is meaningful, so we prefer to keep the information displayed in this manner. We have added a brief explanation of TreeMix to the manuscript under the TreeMix analysis paragraph of the Methods section.

The geographic origin of all the populations is given in Table S1. We have edited these two sentences to say,

We find that the Indonesian Brebes (BRE) and Madura (MAD) breeds have significant Bos javanicus (BALI) ancestry demonstrated by the short branch lengths in Figures 2-4, shared ancestry with Bali in ADMIXTURE analyses (light green in Figures S7-S9), and significant f3 statistics (Table S2). The Indonesian Pesisir and Aceh and the Chinese Hainan and Luxi breeds also have Bali ancestry (migration edge c in Figure 4, and light green in Figures S8 and S9).

We agree that the reference to Murray adds confusion and have deleted these references from the manuscript.

We added “previously suggested” to this statement to identify that these two waves have previously been inferred in the literature from archeology and genetics. We also feel that the use of “possibly” suggests that this is an interpretation and not a concrete result. With regard to the evidence to support our interpretation, we see two analyses, ADMIXTURE and TreeMix, suggesting two clades of indicine introgression.

Durham Shorthorns are the ancestors of Beef Shorthorns, Milking Shorthorns, Lincoln Reds, Belgian Blues, and Maine Anjous. We added a parenthetical with a citation to reference 31 to clarify this.

Table 1 was moved to the supplement.

One of the main assumptions and conclusions of McTavish et al. is that there are no pure taurines in Africa; all cattle in Africa have indicine ancestry. Our results suggest that this is not true and pure taurines do exist in Africa. We have added, “Thus, we conclude that contrary to the assumptions and conclusions of [57] cattle with pure taurine ancestry do exist in Africa.”

Added “The f3 and f4 statistics look for correlations in allele frequencies that are not compatible with a bifurcating tree; these statistics provide support for admixture in the history of the tested populations [26,27].” as the first sentence of the f3 and f4 statistics subsection in the Methods section.
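To make these definitions concrete, here is a minimal sketch of the plain f3 and f4 estimators computed from population allele frequencies. The allele frequencies are made up for illustration, the function names are hypothetical, and the sketch leaves out the finite-sample bias correction and the block-jackknife standard errors that real analyses use to obtain Z-scores.

```python
def f3(target, ref1, ref2):
    """Mean of (c - a) * (c - b) over SNPs; clearly negative values suggest the target is admixed."""
    products = [(c - a) * (c - b) for c, a, b in zip(target, ref1, ref2)]
    return sum(products) / len(products)


def f4(pop_a, pop_b, pop_c, pop_d):
    """Mean of (a - b) * (c - d) over SNPs; expected to be zero if ((A,B),(C,D)) is tree-like."""
    products = [(a - b) * (c - d) for a, b, c, d in zip(pop_a, pop_b, pop_c, pop_d)]
    return sum(products) / len(products)


# Hypothetical allele frequencies at three SNPs for two source populations.
source_1 = [0.1, 0.8, 0.4]
source_2 = [0.7, 0.2, 0.5]

# A toy admixed population: an even mixture of the two sources at every SNP.
mixed = [0.5 * a + 0.5 * b for a, b in zip(source_1, source_2)]

# An unrelated pair with identical frequencies, so the (C, D) branch has zero length.
other = [0.3, 0.3, 0.6]

print("f3(mixed; source_1, source_2) =", f3(mixed, source_1, source_2))
print("f4(source_1, source_2; other, other) =", f4(source_1, source_2, other, other))
```

In this toy case the mixed population sits between its two sources at every SNP, so each product in f3 is negative, which is the signal the negative f3 statistics in Tables S2 and S3 rely on.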

If cattle were separately domesticated in Africa they would be the most divergent taurine clade. But TreeMix finds, separate from user intervention, that the best model for the relationship between indicine, Asian taurine, African taurine, and European taurine is indicine as the outgroup, European and African taurine as sister groups, and Asian taurine as the most divergent taurine group, i.e. (indicine,(Asian taurine, (African taurine, European taurine))). But this model also includes admixture from an unsampled ancestral population that is approximately equally divergent from taurine and indicine. Our sampling is quite extensive and has sampled populations across Europe and Africa. But we are unable to sample African auroch as they are extinct. Rather than separate domestication and admixture being indistinguishable, the gene frequencies suggest that there was introgression into African domesticated taurine from an ancestral population. We strongly feel the most parsimonious explanation is introgression from African auroch.
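The topological argument can be sketched with nested tuples standing in for the two trees. The first tree is the topology reported above; the second is a hypothetical arrangement for a separate African domestication, in which African taurine is the most divergent taurine clade (the placement of Asian and European taurine within it is an assumption made only for illustration). The helper names are hypothetical, and migration edges are ignored.

```python
def tips(node):
    """Return the set of tip labels under a node (a label string or a tuple of child nodes)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def sister(tree, tip):
    """Return the tip labels of the clade sister to `tip`, or None if `tip` is absent."""
    if isinstance(tree, str):
        return None
    for index, child in enumerate(tree):
        if child == tip:
            siblings = [c for j, c in enumerate(tree) if j != index]
            return set().union(*(tips(s) for s in siblings))
    for child in tree:
        found = sister(child, tip)
        if found is not None:
            return found
    return None


# Topology reported by TreeMix in the response above.
inferred = ("indicine", ("Asian taurine", ("African taurine", "European taurine")))

# Hypothetical topology expected under a separate African domestication.
separate_origin = ("indicine", ("African taurine", ("Asian taurine", "European taurine")))

print(sister(inferred, "African taurine"))
print(sister(separate_origin, "African taurine"))
```

In the inferred tree African taurine is sister to European taurine alone, whereas under the hypothetical separate-domestication tree it would be sister to all other taurine cattle.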

From Stock and Gifford-Gonzalez 2013, “The central fact around which disparate speculations about the origins of African cattle turn is one upon which all can also agree: northern Africa was home to wild aurochsen, Bos primigenius, from the Middle Pleistocene onwards (Linseele 2004).” We have added citations to Stock and Gifford-Gonzalez 2013 and Linseele 2004 to our manuscript.

Other changes:

Changed “elucidate” to “reveal” in Author Summary.

Second paragraph of TreeMix subsection of Methods section: Changed “migration rate” to “migration proportion”.

Results section, Worldwide patterns subsection, 2nd paragraph, 6th sentence: Changed “(Figure 4)” to “(Figures 4 and 5, discussed in detail in the following subsections)”.

Results section, Divergence within the taurine lineage subsection, 1st paragraph: Added “We also see some runs of TreeMix placing a migration edge from Chianina cattle to Asian taurines (Figure S4).”

3 thoughts on “Author post: Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle”

  1. Thanks, Joe, for taking the time to review this paper.

    It would be good to see TreeMix run with higher degrees of freedom than shown in this paper. There might be some Italian breeds that admix with some of the African breeds if this were done.

  2. Marnie,
    Yes, we really appreciate Joe taking the time to review this paper!

    The admixture in Italy might be quite complex. In my mind it is still an open question. TreeMix does attempt to model their admixture history, migration edge b in Figure 4. Also in Figure S4 the Italian breeds are placed in very different locations in the 6 different networks. But I’m not sure that additional migration edges will clarify this history. The residuals for Romagnola (RMG) and Chianina (CHIA) are not very large in Figure S2. When two more migrations were added for a total of 19, migrations were added from Shorthorn to Charolais, and from Brahman to Texas Longhorn. You also begin to worry about overfitting the data as the number of migration edges increases.

    Jared
