Using Weeder for the Discovery of Conserved Transcription Factor Binding Sites
Abstract
One of the greatest challenges facing modern molecular biology is the understanding of the complex mechanisms regulating gene expression. A fundamental step in this process requires the characterization of motifs involved in the regulation of gene expression at transcriptional and post-transcriptional levels. In particular, transcription is modulated by the interaction of transcription factors with their corresponding binding sites. Weeder is a software package freely available for noncommercial users as a stand-alone or Web-based application for the automatic discovery of conserved motifs in a set of related DNA sequences from coregulated genes. The motifs found are likely to represent instances of binding sites for some common transcription factor regulating the genes of the set. The program has been designed to make its usage as simple as possible and to require very little prior knowledge about the length and conservation of the motifs to be found.
Keywords: transcription regulation; gene expression; transcription factor binding sites
Table of Contents
- Basic Protocol 1: Finding Conserved TFBSs Using Weeder
- Support Protocol 1: Obtaining and Installing Weeder
- Alternate Protocol 1: Finding Conserved TFBSs Using the Online Version of Weeder
- Guidelines for Understanding Results
- Commentary
- Literature Cited
- Figures
Figures
- Figure 2.11.1 The main Web page of Weeder, allowing users to download the source code and executable files for the stand‐alone implementation, or to access the Web input form.
- Figure 2.11.2 The beginning of the output for file G1S.cycle.fasta.
- Figure 2.11.3 An example of detailed motif report, including the consensus, the motif instances in the sequences, and frequency matrices built from motif instances.
- Figure 2.11.4 The Weeder input form.
- Figure 2.11.5 Confirmation message, including number of sequences read from the input, and E‐mail address that will receive the results.
- Figure 2.11.6 Highest‐scoring motifs of length 10 for files G1S.cycle.fasta and yeast.random.fasta.
- Figure 2.11.7 Sequence logo of motif TTGGCGCGAA found in the sequence set G1S.cycle.fasta.
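The caption of Figure 2.11.3 mentions frequency matrices built from motif instances, and Figure 2.11.7 shows a logo for the motif TTGGCGCGAA. As a rough illustration of how such a matrix relates to a consensus, the following Python sketch counts bases per position over a set of aligned instances. It is not Weeder's own code or output format: the function names and the four example instances are hypothetical, chosen only so that their consensus matches the motif named in Figure 2.11.7.

```python
BASES = "ACGT"

def frequency_matrix(instances):
    """Count each base at each position of equal-length motif instances."""
    seqs = [s.upper() for s in instances]
    if not seqs:
        raise ValueError("at least one motif instance is required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all motif instances must have the same length")
    # zip(*seqs) yields one tuple of bases per motif position (column).
    return [{b: column.count(b) for b in BASES} for column in zip(*seqs)]

def consensus(matrix):
    """Pick the most frequent base per position (ties go to the first of A, C, G, T)."""
    return "".join(max(BASES, key=lambda b: column[b]) for column in matrix)

# Hypothetical instances, invented for illustration only.
example_instances = ["TTGGCGCGAA", "TTGGCGCCAA", "ATGGCGCGAA", "TTTGCGCGAA"]

if __name__ == "__main__":
    matrix = frequency_matrix(example_instances)
    for position, column in enumerate(matrix, start=1):
        print(position, column)
    print("consensus:", consensus(matrix))
```

A sequence logo such as the one in Figure 2.11.7 is drawn from the same per-position counts, with letter heights scaled by information content rather than raw frequency.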
Literature Cited
Bailey, T.L. and Elkan, C. 1994. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc. Int. Conf. Intell. Syst. Mol. Biol. 2:28‐36.
Bulyk, M.L. 2003. Computational prediction of transcription‐factor binding site locations. Genome Biol. 5:201.
Crooks, G.E., Hon, G., Chandonia, J.M., and Brenner, S.E. 2004. WebLogo: A sequence logo generator. Genome Res. 14:1188‐1190.
Hertz, G.Z. and Stormo, G.D. 1999. Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics 15:563‐577.
Kent, W.J., Hsu, F., Karolchik, D., Kuhn, R.M., Clawson, H., Trumbower, H., and Haussler, D. 2005. Exploring relationships and mining data with the UCSC Gene Sorter. Genome Res. 15:737‐741.
Narasimhan, C., LoCascio, P., and Uberbacher, E. 2003. Background rareness‐based iterative multiple sequence alignment algorithm for regulatory element detection. Bioinformatics 19:1952‐1963.
Pavesi, G., Mauri, G., and Pesole, G. 2004a. In silico representation and discovery of transcription factor binding sites. Brief Bioinform. 5:217‐236.
Pavesi, G., Mereghetti, P., Mauri, G., and Pesole, G. 2004b. Weeder Web: Discovery of transcription factor binding sites in a set of sequences from co‐regulated genes. Nucl. Acids Res. 32:W199‐W203.
Schneider, T.D. and Stephens, R.M. 1990. Sequence logos: A new way to display consensus sequences. Nucl. Acids Res. 18:6097‐6100.
Sinha, S. and Tompa, M. 2002. Discovery of novel transcription factor binding sites by statistical overrepresentation. Nucl. Acids Res. 30:5549‐5560.
Stormo, G.D. 2000. DNA binding sites: Representation and discovery. Bioinformatics 16:16‐23.
Thijs, G., Lescot, M., Marchal, K., Rombauts, S., De Moor, B., Rouze, P., and Moreau, Y. 2001. A higher‐order background model improves the detection of promoter regulatory elements by Gibbs sampling. Bioinformatics 17:1113‐1122.
Thompson, W., Rouchka, E.C., and Lawrence, C.E. 2003. Gibbs Recursive Sampler: Finding transcription factor binding sites. Nucl. Acids Res. 31:3580‐3585.
Tompa, M., Li, N., Bailey, T.L., Church, G.M., De Moor, B., Eskin, E., Favorov, A.V., Frith, M.C., Fu, Y., Kent, W.J., Makeev, V.J., Mironov, A.A., Noble, W.S., Pavesi, G., Pesole, G., Regnier, M., Simonis, N., Sinha, S., Thijs, G., van Helden, J., Vandenbogaert, M., Weng, Z., Workman, C., Ye, C., and Zhu, Z. 2005. Assessing computational tools for the discovery of transcription factor binding sites. Nat. Biotechnol. 23:137‐144.
van Helden, J. 2003. Regulatory sequence analysis tools. Nucl. Acids Res. 31:3593‐3596.
Whitfield, M.L., Sherlock, G., Saldanha, A.J., Murray, J.I., Ball, C.A., Alexander, K.E., Matese, J.C., Perou, C.M., Hurt, M.M., Brown, P.O., and Botstein, D. 2002. Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol. Biol. Cell. 13:1977‐2000.
Internet Resources
http://www.pesolelab.it
Web site for downloading and/or running Weeder.
http://genome.ucsc.edu/
UCSC genome browser.
http://www.biobase.de/
The TRANSFAC database (commercial, subscription required).
http://www.cbil.upenn.edu/tess/
The TESS transcription element search system.
http://bayesweb.wadsworth.org/gibbs/gibbs.html
The Gibbs motif sampler home page.
http://meme.sdsc.edu/meme/intro.html
The MEME motif discovery tool home page.
http://ural.wustl.edu/
Various tools for sequence analysis including motif finding algorithm CONSENSUS.