Beyond Simple Homology Searches: Multiple Sequence Alignments and Phylogenetic Trees
Abstract
Phylogenetic trees represent hypotheses about evolutionary relationships between organisms or nucleotide or amino acid sequences. Because the best BLAST hit often does not represent the most closely related sequence, phylogenetic analyses are an essential extension of inquiry into any new protein or gene. In this unit, the reader will first learn how to create a multiple sequence alignment using ClustalX. He or she will then learn how to use that alignment to build a neighbor‐joining phylogeny using the program Geneious. Finally, the user will learn how to interpret the phylogeny in light of the research questions. Curr. Protoc. Essential Lab. Tech. 1:11.3.1‐11.3.17. © 2009 by John Wiley & Sons, Inc.
Keywords: phylogeny; alignment; neighbor‐joining; homology; ClustalX; Geneious
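As a rough illustration of what a neighbor‐joining program does with a matrix of pairwise distances, the following Python sketch carries out only the first joining step: it computes the Q criterion for every pair of taxa, picks the pair with the smallest value, and estimates the two branch lengths leading to the new internal node. It is a minimal sketch and not the Geneious implementation; the function name `nj_first_join`, the taxon labels, and the toy distance matrix are assumptions made for illustration and are not data from this unit.

```python
from itertools import combinations


def nj_first_join(names, dist):
    """Return the first pair joined by neighbor-joining, its Q value,
    and the branch lengths from each member of the pair to the new node.

    names: list of taxon labels (hypothetical labels in the example below)
    dist:  symmetric matrix of pairwise distances, same order as names
    """
    n = len(names)
    if n < 3:
        raise ValueError("neighbor-joining needs at least three taxa")
    # Sum of distances from each taxon to all others.
    totals = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i, j in combinations(range(n), 2):
        # Q criterion: small values mark pairs that are close to each
        # other and, on average, far from everything else.
        q = (n - 2) * dist[i][j] - totals[i] - totals[j]
        if best_q is None or q < best_q:
            best_pair, best_q = (i, j), q
    i, j = best_pair
    # Branch lengths from the joined taxa to their new common node.
    len_i = dist[i][j] / 2 + (totals[i] - totals[j]) / (2 * (n - 2))
    len_j = dist[i][j] - len_i
    return (names[i], names[j]), best_q, (len_i, len_j)


# Toy distance matrix for five hypothetical taxa (illustrative values only).
names = ["a", "b", "c", "d", "e"]
dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

pair, q_value, branch_lengths = nj_first_join(names, dist)
print(pair, q_value, branch_lengths)
```

A complete neighbor‐joining run would replace the joined pair with the new node, recompute the distances to that node, and repeat until the tree is fully resolved.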
Table of Contents
- Overview and Principles
- Strategic Planning
- Basic Protocol 1: Creating a Multiple Sequence Alignment
- Basic Protocol 2: Making a Phylogenetic Tree
- Commentary
- Literature Cited
- Figures
Figures
- Figure 11.3.1 This phylogeny shows the evolutionary relationships between four extant species. The nodes labeled A, B, and C represent most recent common ancestors: A, common ancestor of all four species; B, common ancestor of human, chimp, and mouse; C, common ancestor of human and chimp.
- Figure 11.3.2 This phylogeny shows the relationship between various orthologs and paralogs of a gene. Prior to the divergence of humans and chimps, this gene underwent a gene‐duplication event (indicated by the horizontal bar). That gene‐duplication event resulted in two paralogs: A and B. Speciation between humans and chimps resulted in the orthologs “human A” and “chimp A,” and the orthologs “human B” and “chimp B.”
- Figure 11.3.3 Phylogeny showing the relationships between taxa 1, 2, and 3 as a polytomy, representing unresolved relationships between these taxa.
- Figure 11.3.4 Results of a GenBank query for cytochrome oxidase I sequences from select Plasmodium species. This search returns several full‐length mitochondrial sequences. The genes of interest can be extracted from these sequences as described in the text.
- Figure 11.3.5 The Geneious interface. Across the top is the toolbar. The left panel shows folders of the local documents and links to NCBI database searches. The right panel contains a tutorial and Help files. The top panel shows the contents of the selected documents folder. The bottom panel is the sequence/tree viewer. To the right in the bottom panel are a series of options that allow you to change the way you view the sequences. This screenshot shows the alignment of Plasmodium sequences that were imported from ClustalX.
- Figure 11.3.6 Alignment of sequences of the cytochrome oxidase I gene from seven species of the malaria parasite Plasmodium. Sequences were loaded into ClustalX from FASTA‐formatted files and aligned using the default parameters. Sequence positions in each column are hypothesized to have positional homology. The first 130 bp of the alignment are shown.
- Figure 11.3.7 Tree building options in Geneious. To build a phylogeny as described in the text, select HKY as the distance model, neighbor‐joining as the tree building method, and an outgroup (if you have one) from the list of sequences in the alignment.
- Figure 11.3.8 Neighbor‐joining phylogeny of seven species of Plasmodium based on the cytochrome oxidase I gene, shown in the Tree viewer panel of Geneious. You can return to the alignment used to build this phylogeny by clicking on “Alignment View” at the top of the panel.
- Figure 11.3.9 Tree building options in Geneious. To bootstrap a phylogeny, select the box labeled “Resample tree.”
- Figure 11.3.10 The bootstrapped phylogeny of Plasmodium species. Bootstrap support for each set of relationships is shown to the right of the node where two lineages diverge.
- Figure 11.3.11 BLAST search of I. quamoclit DFundefined. The results of this BLAST search will be used to select sequences to use in building a phylogeny to assess the function of DFundefined.
- Figure 11.3.12 Alignment of DFR sequences. The first three sequences are much longer than the other sequences. These sequences must be trimmed, and the alignment repeated, before using them in a phylogenetic analysis.
- Figure 11.3.13 Alignment of DFR sequences imported into Geneious.
- Figure 11.3.14 Bootstrapped phylogeny of DFR, showing that DFundefined is most closely related to DFR‐B of related species. Thus, we can hypothesize that the function of DFundefined is most like the function of DFR‐B.
- Figure 11.3.15 Sequence alignment with an unrealistic number of gaps.
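The bootstrapping mentioned in Figures 11.3.9 and 11.3.10 rests on resampling the columns of the alignment with replacement, rebuilding a tree from each resampled alignment, and reporting how often each grouping reappears. The Python sketch below shows only the resampling step, under stated assumptions: the function name `bootstrap_alignment`, the taxon names, and the short toy sequences are hypothetical and are not the Plasmodium or DFR data used in this unit, and the code does not reproduce how Geneious performs its “Resample tree” option internally.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap pseudo-replicate of an alignment.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths)
    rng:       random.Random instance, passed in so runs are reproducible
    """
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Draw column indices with replacement; the same column may be picked
    # several times and others not at all.
    cols = [rng.randrange(length) for _ in range(length)]
    # Apply the same column draw to every sequence so that columns
    # (hypothesized homologous positions) stay intact.
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


# Toy alignment of three hypothetical taxa (illustrative sequences only).
toy_alignment = {
    "taxon1": "ATGCCA-TGA",
    "taxon2": "ATGCTA-TGA",
    "taxon3": "ATGATAGTGA",
}

rng = random.Random(42)
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
print(len(replicates), "pseudo-replicate alignments generated")
```

Each pseudo‐replicate would then be passed to the same tree‐building method as the original alignment; the bootstrap support for a node is the percentage of replicate trees that contain the same grouping.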
Literature Cited
Baum, D.A., Smith, S.D., and Donovan, S.S.S. 2005. Evolution: The tree‐thinking challenge. Science 310:979‐980.
Des Marais, D.L. and Rausher, M.D. 2008. Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature 454:762‐765.
Drummond, A.J., Ashton, B., Cheung, M., Heled, J., Kearse, M., Moir, R., Stones‐Havas, S., Thierer, T., and Wilson, A. 2008. Geneious v4.0. http://www.geneious.com/.
Eisen, J.A. 1998. Phylogenomics: Improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. 8:163‐167.
Felsenstein, J. 2004. Inferring Phylogenies. Sinauer Associates, Inc., Sunderland, Mass.
Hall, B.G. 2004. Phylogenetic Trees Made Easy: A How‐To Manual, 2nd Edition. Sinauer Associates, Inc., Sunderland, Mass.
Hall, B.G. 2007. Phylogenetic Trees Made Easy: A How‐To Manual, 3rd Edition. Sinauer Associates, Inc., Sunderland, Mass.
Hasegawa, M., Kishino, H., and Yano, T. 1985. Dating of the human‐ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22:160‐174.
Hayakawa, T., Culleton, R., Otani, H., Horii, T., and Tanabe, K. 2008. Big bang in the evolution of extant malaria parasites. Mol. Biol. Evol. 25:2233‐2239.
Jukes, T.H. and Cantor, C.R. 1969. Evolution of protein molecules. In Mammalian Protein Metabolism (H.N. Munro, ed.), pp. 21‐132. Academic Press, New York.
Landan, G. and Graur, D. 2008. Characterization of pairwise and multiple sequence alignment errors. Gene. Epub June 3, 2008.
Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., and Higgins, D.G. 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23:2947‐2948.
McHardy, A.C. and Rigoutsos, I. 2007. What's in the mix: Phylogenetic classification of metagenome sequence samples. Curr. Opin. Microbiol. 10:499‐503.
Mu, J., Joy, D.A., Duan, J., Huang, Y., Carlton, J., Walker, J., Barnwell, J., Beerli, P., Charleston, M.A., Pybus, O.G., and Su, X. 2005. Host switch leads to emergence of Plasmodium vivax malaria in humans. Mol. Biol. Evol. 22:1686‐1693.
Posada, D. and Crandall, K. 1998. MODELTEST: Testing the model of DNA substitution. Bioinformatics 14:817‐818.
Tamura, K. and Nei, M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10:512‐526.
Thompson, J.D., Plewniak, F., and Poch, O. 1999. A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Res. 27:2682‐2690.
Zufall, R. and Rausher, M. 2004. Genetic changes associated with floral adaptation restrict future evolutionary potential. Nature 428:847‐850.
Zwickl, D.J. and Hillis, D.M. 2002. Increased taxon sampling greatly reduces phylogenetic error. Syst. Biol. 51:588‐598.
Key References
Hall, 2007. See above.
A “cookbook” for phylogenetic reconstruction. Recommended for beginning users interested in learning parsimony and maximum likelihood methods, in addition to distance methods. Relies largely on the program MEGA.
Felsenstein, 2004. See above.
A comprehensive guide to phylogenetic methodology and application. Recommended for those who want to delve deeply into the subject of phylogenetic inference.
Graur, D. and Li, W. 2000. Fundamentals of Molecular Evolution. Sinauer Associates, Inc., Sunderland, Mass.
Recommended reading for an understanding of the evolutionary biology behind the methods of phylogenetic reconstruction.