Using Weeder for the Discovery of Conserved Transcription Factor Binding Sites

Internet, 2013-12-31

Abstract

One of the greatest challenges facing modern molecular biology is the understanding of the complex mechanisms regulating gene expression. A fundamental step in this process requires the characterization of motifs involved in the regulation of gene expression at transcriptional and post-transcriptional levels. In particular, transcription is modulated by the interaction of transcription factors with their corresponding binding sites. Weeder is a software package freely available for noncommercial users as a stand-alone or Web-based application for the automatic discovery of conserved motifs in a set of related DNA sequences from coregulated genes. The motifs found are likely to represent instances of binding sites for some common transcription factor regulating the genes of the set. The program has been designed to make its usage as simple as possible and to require very little prior knowledge about the length and conservation of the motifs to be found.
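To make the idea of a "conserved motif" concrete, the following minimal Python sketch checks which sequences in a set contain an approximate occurrence of a candidate motif. It is a brute-force illustration only, not Weeder's algorithm or interface: the function names, the mismatch threshold, and the toy sequences are all assumptions made up for this example. Only the motif TTGGCGCGAA comes from the figure captions of this protocol.

```python
from typing import List, Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_match(motif: str, sequence: str) -> Optional[Tuple[int, int]]:
    """Return (mismatches, start) of the best window, or None if the
    sequence is shorter than the motif. Ties keep the leftmost window."""
    m = len(motif)
    best = None
    for start in range(len(sequence) - m + 1):
        d = hamming(motif, sequence[start:start + m])
        if best is None or d < best[0]:
            best = (d, start)
    return best


def sequences_with_motif(
    motif: str, sequences: List[str], max_mismatches: int
) -> List[Tuple[int, int, int]]:
    """List (sequence index, start, mismatches) for every sequence whose
    best window is within the allowed number of mismatches."""
    hits = []
    for idx, seq in enumerate(sequences):
        match = best_match(motif.upper(), seq.upper())
        if match is not None and match[0] <= max_mismatches:
            hits.append((idx, match[1], match[0]))
    return hits


# Hypothetical toy input (not taken from G1S.cycle.fasta).
toy_sequences = [
    "ACGTTTGGCGCGAATCA",  # exact occurrence
    "GGTTGGCGCCAAACGT",   # occurrence with one substitution
    "ACGTACGTACGTACGT",   # no close occurrence
]
hits = sequences_with_motif("TTGGCGCGAA", toy_sequences, max_mismatches=1)
print(hits)
```

A motif that appears, allowing a few substitutions, in most sequences of a coregulated set is a candidate binding site. A real tool must also search over unknown motifs and lengths and assess statistical significance, which this sketch does not attempt.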

Keywords: transcription regulation; gene expression; transcription factor binding sites

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Basic Protocol 1: Finding Conserved TFBSs Using Weeder
Support Protocol 1: Obtaining and Installing Weeder
Alternate Protocol 1: Finding Conserved TFBSs Using the Online Version of Weeder
Guidelines for Understanding Results
Commentary
Literature Cited
Figures

Figures

Figure 2.11.1 The main Web page of Weeder, allowing users to download the source code and executable files for the stand‐alone implementation, or to access the Web input form.

Figure 2.11.2 The beginning of the output for file G1S.cycle.fasta.

Figure 2.11.3 An example of detailed motif report, including the consensus, the motif instances in the sequences, and frequency matrices built from motif instances.

Figure 2.11.4 The Weeder input form.

Figure 2.11.5 Confirmation message, including number of sequences read from the input, and E‐mail address that will receive the results.

Figure 2.11.6 Highest‐scoring motifs of length 10 for files G1S.cycle.fasta and yeast.random.fasta.

Figure 2.11.7 Sequence logo of motif TTGGCGCGAA found in the sequence set G1S.cycle.fasta.
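The frequency matrices mentioned in Figure 2.11.3 and the sequence logo in Figure 2.11.7 can both be derived from a list of aligned motif instances. The sketch below, in Python, shows one common way to do this: count each base per column, normalize to frequencies, and compute the per-column information content (2 bits minus the Shannon entropy, without small-sample correction) that determines letter-stack heights in a logo. The function names and the four example instances are hypothetical; they are not Weeder output, and Weeder's own matrices may be computed differently.

```python
import math
from typing import Dict, List

ALPHABET = "ACGT"  # assumption: instances contain only these four bases


def count_matrix(instances: List[str]) -> Dict[str, List[int]]:
    """Count occurrences of each base at each motif position."""
    length = len(instances[0])
    if any(len(s) != length for s in instances):
        raise ValueError("all motif instances must have the same length")
    counts = {base: [0] * length for base in ALPHABET}
    for instance in instances:
        for pos, base in enumerate(instance.upper()):
            counts[base][pos] += 1
    return counts


def frequency_matrix(counts: Dict[str, List[int]]) -> Dict[str, List[float]]:
    """Convert counts to per-column relative frequencies."""
    n = sum(counts[base][0] for base in ALPHABET)
    return {base: [c / n for c in counts[base]] for base in ALPHABET}


def column_information(freqs: Dict[str, List[float]]) -> List[float]:
    """Information content per column in bits: 2 - Shannon entropy."""
    length = len(freqs["A"])
    info = []
    for pos in range(length):
        entropy = -sum(
            freqs[base][pos] * math.log2(freqs[base][pos])
            for base in ALPHABET
            if freqs[base][pos] > 0
        )
        info.append(2.0 - entropy)
    return info


# Hypothetical instances resembling the motif TTGGCGCGAA.
toy_instances = ["TTGGCGCGAA", "TTGGCGCCAA", "TTTGCGCGAA", "ATGGCGCGAA"]
counts = count_matrix(toy_instances)
freqs = frequency_matrix(counts)
info = column_information(freqs)
for base in ALPHABET:
    print(base, counts[base])
print([round(bits, 2) for bits in info])
```

Fully conserved columns reach the maximum of 2 bits, while variable columns score lower, which is why conserved positions appear as tall letters in a sequence logo.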

Internet Resources
	http://www.pesolelab.it
	Web site for downloading and/or running Weeder.
	http://genome.ucsc.edu/
	UCSC genome browser.
	http://www.biobase.de/
	The TRANSFAC database (commercial, subscription required).
	http://www.cbil.upenn.edu/tess/
	The TESS transcription element search system.
	http://bayesweb.wadsworth.org/gibbs/gibbs.html
	The Gibbs motif sampler home page.
	http://meme.sdsc.edu/meme/intro.html
	The MEME motif discovery tool home page.
	http://ural.wustl.edu/
	Various tools for sequence analysis including motif finding algorithm CONSENSUS.