Sunday, December 23, 2007

In the Lab 5: Data Analysis and Results

Last in my series of posts on phylogenetic research. See Parts 1, 2, 3, and 4.

This is where I admit I'm getting cheap on you, for several reasons. First, I haven't completely finished my own results, and I won't be posting them here because they are intended for publication. Second, I am still a student and not a master of phylogenetic analysis, so to speak. I am only acquainted with a couple of analysis methods, which I will describe here. Third, I am really running out of time, as I depart for winter break in Costa Rica in just four days. When I return next semester, I will be able to give a more complete summary of the different phylogeographic analyses. For now, you get Analysis Lite.

We started off with an idea for a study and birds in the field. We've collected blood from those birds, purified DNA, PCR'ed, sequenced, and cleaned up the data. Now we have our completed data set: sequences for multiple loci, from individual Palm-Tanagers across Hispaniola. Remember from Part 1 that our sampling consisted of up to five individuals from sites across Hispaniola:

The most basic level of analysis I use is the statistical parsimony haplotype network, constructed using a program called TCS. These networks are a phylogenetic reconstruction of relationships for any given locus. Unlike trees, the networks allow a more reticulated relationship. Trees force each individual to a terminal branch, whereas networks are used when relatedness is higher and ancestral alleles may still be present. It is a nifty way of visualizing a complicated network of genetic similarity. A simple example follows:

In the above example, the network consists of ~30 individuals. The color-filled circles each represent a unique haplotype (a unique DNA sequence). Each haplotype's circle is sized in proportion to how many individuals share that haplotype. The smallest colored circles have only one individual with that haplotype, while the large central circle has ~15 individuals sharing an identical sequence. The lines and black nodes that connect the network together represent the relationships among the DNA sequences. Each line separating haplotypes represents one base pair change. The black nodes represent inferred ancestral alleles, intermediate haplotypes that are no longer present or were not sampled. For example: the smallest red circle on the left differs from the larger red circle by four base pairs. The two red circles directly next to each other differ by one base pair. Finally, to complete the analysis, the colors represent the locality each individual comes from (they correspond in this example to the map above). Thus the best-represented haplotype above is shared by ~15 individuals from 6 localities.
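The bookkeeping behind such a network can be sketched in a few lines of Python. This is only an illustration of the two quantities the figure encodes (how many individuals share each haplotype, and how many base pairs separate two haplotypes); it is not how TCS builds its networks. The sequences, locality names, and function names below are made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data, NOT from the study: (locality color, aligned sequence)
samples = [
    ("red", "ACGTACGTAC"),
    ("red", "ACGTACGTAC"),
    ("red", "ACGTACGTAT"),
    ("blue", "TCGAACCTAC"),
    ("green", "TCGAACCTAC"),
    ("blue", "TCGAACCTAC"),
]


def collapse_haplotypes(samples):
    """Group identical sequences into haplotypes.

    Returns (counts, localities): how many individuals carry each
    haplotype (the circle size) and which localities they come from
    (the circle colors).
    """
    counts = Counter(seq for _, seq in samples)
    localities = defaultdict(set)
    for locality, seq in samples:
        localities[seq].add(locality)
    return counts, dict(localities)


def count_differences(seq_a, seq_b):
    """Number of base pair changes between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


counts, localities = collapse_haplotypes(samples)
for haplotype, n in counts.most_common():
    print(haplotype, n, sorted(localities[haplotype]))

# Steps separating the two red haplotypes, and red from the shared one
print(count_differences("ACGTACGTAC", "ACGTACGTAT"))
print(count_differences("ACGTACGTAC", "TCGAACCTAC"))
```

In the toy data the most common haplotype is carried by three individuals from two localities, which is the same kind of information the large central circle conveys in the figure.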

Analysis of haplotype networks is quite simple: you just eyeball it. In the example above, there seem to be two distinct groups separated by six base pair changes. One group consists solely of individuals from the red locality, on the western peninsula on the map above. The other consists of a mix of every other locality sampled on the island. Note in particular that in this second group, every locality is represented in the single central haplotype. This is indicative of panmixia: there are no barriers to gene flow between these localities. We can infer that there is some barrier to gene flow separating the red population from the rest (although sample sizes in the example network are very low, which may skew the analysis by missing rarer alleles that could connect the populations). These two clades (red/everything else) are reciprocally monophyletic, the buzzword of genetic studies. Individuals from the red location do not occur in the other clade, and vice versa. Reciprocal monophyly, especially with large sample sizes, is a very strong indication of a lack of gene flow between populations.
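The eyeballing can be backed up with a quick check of which haplotypes are shared between groups of localities. The sketch below uses made-up sequences and a hypothetical helper, `shared_haplotypes`. Note that finding no shared haplotypes is only a necessary condition for reciprocal monophyly, not a demonstration of it; that still depends on how the haplotypes connect in the network or tree.

```python
# Hypothetical toy data, NOT from the study: (locality color, aligned sequence)
samples = [
    ("red", "AAAA"),
    ("red", "AAAT"),
    ("blue", "CCCC"),
    ("green", "CCCC"),
    ("purple", "CCCC"),
    ("green", "CCCG"),
]


def shared_haplotypes(samples, group_a, group_b):
    """Return the set of haplotypes found in both groups of localities."""
    haps_a = {seq for locality, seq in samples if locality in group_a}
    haps_b = {seq for locality, seq in samples if locality in group_b}
    return haps_a & haps_b


# Red versus everything else: an empty set means no haplotype is shared.
red_vs_rest = shared_haplotypes(samples, {"red"}, {"blue", "green", "purple"})
print(red_vs_rest)

# Blue versus green: a shared common haplotype, as in the panmictic group.
blue_vs_green = shared_haplotypes(samples, {"blue"}, {"green"})
print(blue_vs_green)
```

With small samples, an empty result can simply mean that rare shared alleles were missed, which is the same caveat that applies to the network itself.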

Now that you have a handle on haplotype networks, check out this example of a more complicated network:

Looking at how phylogenetic breaks, such as the monophyletic red clade identified above, overlay with topography is the basis of phylogeography. It allows us to identify likely topographic barriers to gene flow that allowed population isolation and differentiation. It is important to note, though, that we can only infer barriers; this kind of study is not itself a test of any barrier.

There are more detailed methods of analysis, such as tree-based phylogenetic reconstruction. I also intend to use an analysis called Isolation with Migration, which generates a likely model of the isolation of distinct clades, with migration rates between them.

When all is done, I hope to turn my poster from the AOU meeting into a published paper:

Now, just so you don't feel completely let down (all method and no results!), I will cover some of our group's work that was published this year. All of the methods of study I am applying to the Palm-Tanagers, Andrea has already applied to the Chat-Tanagers (Calyptophilus). These two species, unlike the Palm-Tanagers, are high-elevation specialists, found in the mountain ranges on Hispaniola.

Eastern Chat-Tanager (Calyptophilus frugivorus)
Photo courtesy of Andrea Townsend

Chat-Tanager ranges and sample locations

The haplotype networks for four loci (mtDNA ND2 and three nuclear introns) follow, built from 48 samples in total. Locality colors correspond to the map above.

(Figures from Townsend et al. 2007)

A brief glance at the four networks will indicate the general trend: two distinct clades corresponding to the red/orange/yellow western localities and the blue/green/purple eastern localities. The genus has been considered by various sources as anywhere from one to four species; this work pretty much solidifies the taxonomy as two distinct species. The mitochondrial DNA ND2 gene is most telling: the black bar separating the two clades indicates approximately 100 base pair substitutions, giving the two species approximately 12% uncorrected divergence for this mtDNA locus. This is a very large, ancient difference. The standard avian molecular clock and the IM analysis indicate that the taxa diverged approximately 9 million years ago.
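"Uncorrected divergence" here is the simple proportion of aligned sites at which two sequences differ (often called the p-distance), with no model-based correction for multiple substitutions at the same site. A minimal Python sketch follows, using invented 20 bp sequences rather than the real ND2 data. The function name and the choice to skip gaps and ambiguous bases are my own assumptions for the example.

```python
def uncorrected_divergence(seq_a, seq_b):
    """Proportion of compared sites that differ between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch; programs differ in how they treat them).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
                if a in valid and b in valid]
    if not compared:
        raise ValueError("no comparable sites")
    differences = sum(a != b for a, b in compared)
    return differences / len(compared)


# Hypothetical toy sequences: 2 differences over 20 compared sites
western = "ACGTACGTACGTACGTACGT"
eastern = "ACGTACGAACGTACGTACCT"
divergence = uncorrected_divergence(western, eastern)
print(f"{divergence:.1%} uncorrected divergence")
```

Because it ignores repeated changes at the same site, the uncorrected figure tends to underestimate the true amount of change as sequences become more divergent.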

The nuclear introns show the same pattern as the mtDNA, but with much less actual divergence (1-2%). Also, two of the introns are not reciprocally monophyletic, with a handful of rare alleles falling in the 'wrong' clade. This may indicate a very limited amount of gene flow between the two species.

Within each respective clade, there is no population structure at all. As in the simple example I explained above, all localities are represented in the common haplotypes with few outliers. This indicates that gene flow among the populations within each species is common.

We can now examine what topographic barriers may be important. Recall the topography of Hispaniola and compare it with the species range maps above.

The mountain-dwelling Calyptophilus occur in distinct populations in each of the distinct east-west mountain ranges. We have two distinct clades encompassing several mountain ranges each. There may be limited gene flow between the two clades, but there is no structure within either population. This indicates that, despite the high-elevation specialist populations being separated by deep valleys and long distances, disjunct mountain ranges do not present a barrier to gene flow in either species. If the mountains aren't a cause for the divergence and speciation of Calyptophilus on Hispaniola, what is?

Recall two things: the ancient divergence of 9 million years, and the paleohistoric tendency of the intermontane valleys to flood. In fact, Hispaniola 15 million years ago consisted of two paleo-island blocks, which merged approximately 9 million years ago to form the island we know today. The divide between the two island blocks is the deep (below sea level) valley that separates the southwestern peninsula from the rest of the island (the green area with the large lagoons in the topo map, the heavier red line in the species range map). This region is where the two species come into contact. These facts suggest that Calyptophilus did not diverge on the single island of Hispaniola, but instead diverged allopatrically on two separate paleo-island blocks. Back in Part 1 we began this study to see if Hispaniola was big enough to support in situ speciation. The evidence from Calyptophilus phylogeography suggests that it is not.

Look for me to be writing more when I have results from the rest of the endemic species pairs on Hispaniola! This concludes my series. I hope you enjoyed it.


Andrea K. Townsend, Christopher C. Rimmer, Steven C. Latta, and Irby J. Lovette. 2007. Ancient differentiation in the single-island radiation of endemic Hispaniolan chat-tanagers (Aves: Calyptophilus). Molecular Ecology 16: 3634-3642.
