Computing Multiple Sequence/Structure Alignments with the T‐Coffee Package

Internet, 2013-12-31

Abstract

In this unit, we describe assembly of a multiple sequence alignment using the T‐Coffee package. T‐Coffee is much more flexible than most related methods (e.g., ClustalW) because it makes it possible to combine many alternative alignments into a single one, based on an estimate of consistency between these alignments. This strategy can be especially useful when one has to decide among the output produced by several alternative methods. Curr. Protoc. Bioinform. 29:3.8.1‐3.8.25. © 2010 by John Wiley & Sons, Inc.

Keywords: sequence alignment; multiple sequence alignment; T‐Coffee

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Introduction
Basic Protocol 1: Computing Multiple Sequence Alignments
Basic Protocol 2: Profile Alignments from Large Data Sets: Aligning Alignments
Support Protocol 1: Reformatting Sequences, Alignments, Structures, and Libraries
Support Protocol 2: Evaluating the Local Score of an Alignment
Support Protocol 3: Generating and Using T‐Coffee Libraries
Alternate Protocol 1: Combining and Comparing Alignments
Basic Protocol 3: Combining Sequences and Structures
Guidelines for Understanding Results
Commentary
Literature Cited
Figures
Tables

Figures

Figure 3.8.1 Default output of T‐Coffee.

Figure 3.8.2 Output of T‐Coffee when requesting several different output formats with the ‐output flag.

Figure 3.8.3 Choosing subgroups with the help of a phylogenetic tree.

Figure 3.8.4 The score_html output of T‐Coffee where residues are colored according to their CORE index.

Figure 3.8.5 An ASCII representation of the CORE index: every residue is replaced with its CORE index.

Figure 3.8.6 A CORE color‐coded multiple alignment that includes two structures.

Literature Cited

	Armougom, F., Moretti, S., Poirot, O., Audic, S., Dumas, P., Schaeli, B., Keduas, V., and Notredame, C. 2006. Expresso: Automatic incorporation of structural information in multiple sequence alignments using 3D‐Coffee. Nucleic Acids Res. 34:W604‐W608.
	Bairoch, A., Bucher, P., and Hofmann, K. 1997. The PROSITE database, its status in 1997. Nucleic Acids Res. 25:217‐221.
	Do, C.B., Mahabhashyam, M.S., Brudno, M., and Batzoglou, S. 2005. ProbCons: Probabilistic consistency‐based multiple sequence alignment. Genome Res. 15:330‐340.
	Duret, L. and Abdeddaim, S. 2000. Multiple alignments for structural, functional or phylogenetic analyses of homologous sequences. In Bioinformatics: Sequence, Structure and Databanks (D. Higgins and W. Taylor, eds.) pp. 51‐76. Oxford University Press, Oxford.
	Jones, D.T. 1999. Protein secondary structure prediction based on position‐specific scoring matrices. J. Mol. Biol. 292:195‐202.
	Katoh, K., Misawa, K., Kuma, K., and Miyata, T. 2002. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059‐3066.
	Lassmann, T. and Sonnhammer, E.L. 2002. Quality assessment of multiple alignment programs. FEBS Lett. 529:126‐130.
	Moretti, S., Armougom, F., Wallace, I.M., Higgins, D.G., Jongeneel, C.V., and Notredame, C. 2007. The M‐Coffee Web server: A meta‐method for computing multiple sequence alignments by combining alternative alignment methods. Nucleic Acids Res. 35:W645‐W648.
	Morgenstern, B., Frech, K., Dress, A., and Werner, T. 1998. DIALIGN: Finding local similarities by multiple sequence alignment. Bioinformatics 14:290‐294.
	Mulder, N.J., Apweiler, R., Attwood, T.K., Bairoch, A., Barrell, D., Bateman, A., Binns, D., Biswas, M., Bradley, P., Bork, P., Bucher, P., Copley, R.R., Courcelle, E., Das, U., Durbin, R., Falquet, L., Fleischmann, W., Griffiths‐Jones, S., Haft, D., Harte, N., Hulo, N., Kahn, D., Kanapin, A., Krestyaninova, M., Lopez, R., Letunic, I., Lonsdale, D., Silventoinen, V., Orchard, S.E., Pagni, M., Peyruc, D., Ponting, C.P., Selengut, J.D., Servant, F., Sigrist, C.J., Vaughan, R., and Zdobnov, E.M. 2003. The InterPro database: 2003 brings increased coverage and new features. Nucleic Acids Res. 31:315‐318.
	Ng, P.C. and Henikoff, S. 2002. Accounting for human polymorphisms predicted to affect protein function. Genome Res. 12:436‐446.
	Notredame, C. 2002. Recent progress in multiple sequence alignment: A survey. Pharmacogenomics 3:131‐144.
	Notredame, C. and Abergel, C. 2003. Using multiple alignment methods to assess the quality of genomic data analysis. In Bioinformatics and Genomes (M. Andrade, ed.) pp. 150‐175. Springer Verlag, New York.
	Notredame, C., Higgins, D.G., and Heringa, J. 2000. T‐Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302:205‐217.
	Phillips, A., Janies, D., and Wheeler, W. 2000. Multiple sequence alignment in phylogenetic analysis. Mol. Phylogenet. Evol. 16:317‐330.
	Ramensky, V., Bork, P., and Sunyaev, S. 2002. Human non‐synonymous SNPs: Server and survey. Nucleic Acids Res. 30:3894‐3900.
	Rausch, T., Emde, A.‐K., Weese, D., Döring, A., Notredame, C., and Reinert, K. 2008. Segment‐based multiple sequence alignment. Bioinformatics 24:i187‐i192.
	Taylor, W.R. and Orengo, C.A. 1989. Protein structure alignment. J. Mol. Biol. 208:1‐22.
	Thompson, J.D., Koehl, P., Ripp, R., and Poch, O. 2005. BAliBASE 3.0: Latest developments of the multiple sequence alignment benchmark. Proteins 61:127‐136.
	Wallace, I.M., O'Sullivan, O., Higgins, D.G., and Notredame, C. 2006. M‐Coffee: Combining multiple sequence alignment methods with T‐Coffee. Nucleic Acids Res. 34:1692‐1699.
Key References
	Kemena, C. and Notredame, C. 2009. Upcoming challenges for multiple sequence alignment methods in the high‐throughput era. Bioinformatics 25:2455‐2465.
	A review of recent developments in methods implementing template‐based alignments.
	Moretti, S., Wilm, A., Higgins, D.G., Xenarios, I., and Notredame, C. 2008. R‐Coffee: A Web server for accurately aligning noncoding RNA sequences. Nucleic Acids Res. 36:W10‐W13.
	A paper describing a Web server running a version of T‐Coffee able to align RNA.
	Notredame et al., 2000. See above.
	The original paper describing the T‐Coffee algorithm and the one that should be cited as a reference for T‐Coffee.
	O'Sullivan, O., Suhre, K., Abergel, C., Higgins, D.G., and Notredame, C. 2004. 3DCoffee: Combining protein sequences and structures within multiple sequence alignments. J. Mol. Biol. 340:385‐395.
	A paper describing the combination of sequences and structures.
	Taylor and Orengo, 1989. See above.
	First description of SAP structure‐structure alignment method used by T‐Coffee.
	Wilm, A., Higgins, D.G., and Notredame, C. 2008. R‐Coffee: A method for multiple alignment of non‐coding RNA. Nucleic Acids Res. 36:e52.
	A paper describing a novel algorithm in T‐Coffee for the alignment of RNA sequences.
Internet Resources
	http://www.tcoffee.org
	T‐Coffee home page.
