Identifying Structural Noncoding RNAs Using RNAz

Internet 2013-12-31

Abstract

The functions of many noncoding RNAs and cis‐acting regulatory elements of mRNAs depend on a defined RNA secondary structure. RNAz predicts such functional RNA structures on the basis of thermodynamic stability and evolutionary conservation of homologous sequences. It can be used to efficiently filter multiple alignments for noncoding RNA candidates in genomic screens. Curr. Protoc. Bioinform. 19:12.7.1‐12.7.18. © 2007 by John Wiley & Sons, Inc.

Keywords: RNA structure; noncoding RNA; structure conservation; comparative genomics; gene prediction

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Introduction
Basic Protocol 1: Using RNAz to Analyze a Simple Alignment
Basic Protocol 2: Analyzing More Complex Alignments
Alternate Protocol 1: Using alifoldz.pl
Basic Protocol 3: Using RNAz to Perform a Large‐Scale Genomic Screen
Alternate Protocol 2: Using the RNAz Web Server
Support Protocol 1: Installing Necessary Software
Guidelines for Understanding Results
Commentary
Literature Cited
Figures

Figures

Figure 12.7.1 ClustalW‐formatted alignment of an iron responsive element (IRE) conserved in vertebrates. This format is read by RNAz. The first line must include the word CLUSTAL; the conservation line with asterisks is optional. The alignment blocks can be of any length (default: 60 columns).
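
The format rules in this caption can be illustrated with a minimal Python sketch. It is not part of RNAz, and the function name parse_clustal, the sequence names, and the toy sequences are hypothetical. It assumes that sequence lines start with a name and that the optional conservation line starts with whitespace.

```python
def parse_clustal(text):
    """Parse a ClustalW-formatted alignment into {name: aligned_sequence}."""
    lines = text.splitlines()
    # The first line must include the word CLUSTAL.
    if not lines or "CLUSTAL" not in lines[0]:
        raise ValueError("first line must contain the word CLUSTAL")
    sequences = {}
    for line in lines[1:]:
        # Skip blank lines and the optional conservation line,
        # which is assumed to begin with whitespace.
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]
        # Blocks may have any length; chunks are concatenated per sequence.
        sequences[name] = sequences.get(name, "") + chunk
    if len({len(seq) for seq in sequences.values()}) > 1:
        raise ValueError("aligned sequences differ in length")
    return sequences


# Hypothetical toy alignment split over two blocks (not the IRE alignment).
example = """CLUSTAL W (1.83) multiple sequence alignment

seqA      GGCUAC-GUAGCC
seqB      GGCUACAGUAGCC
          ****** ******

seqA      AAUU
seqB      AAUU
          ****
"""

alignment = parse_clustal(example)
for name, seq in alignment.items():
    print(name, seq)
```

Rejecting input whose first line lacks the word CLUSTAL mirrors the requirement stated in the caption; the length check is an extra sanity test added for this sketch.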

Figure 12.7.2 RNAz output of the alignment shown in Figure . The output consists of two parts: the header shows important characteristics of the input alignment and all scores calculated by RNAz; the lower part explicitly shows the secondary structure predictions for the single sequences and the consensus structure prediction for the alignment.

Figure 12.7.3 Graphical output of RNAz. (A) Structure‐annotated alignment. The consensus structure is shown in dot/bracket notation in the first line. The color of the shading indicates how many different types of letter combinations form a given base pair: red, ochre, and green mean that there are 1, 2, and 3 different base‐pair combinations, respectively. If a base pair cannot be formed in one or more sequences, the colors are shown faded to different degrees (not visible in this example because no sequence in the alignment is incompatible with the consensus fold). (B) RNA secondary structure drawing. A model of the consensus secondary structure is shown. Variable positions are circled (one circle, consistent mutation; two circles, compensatory mutation). The coloring scheme is the same as in panel A.

Figure 12.7.4 alifoldz.pl output of the alignment shown in Figure . The header shows the program settings used. The complete alignment was scored in both the forward and reverse‐complement directions. A sample size of 100 was used to calculate the z‐scores. The table below the header shows the results of the calculation: the consensus MFE of the native alignment, the mean and standard deviation of the MFEs of 100 random alignments, and the z‐score. A significant z‐score of −6.4 is observed in the forward direction. Note that, because of the random component of the algorithm, your results on the same alignment may differ.
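
The z‐score in this table relates the consensus MFE of the native alignment to the mean and standard deviation of the MFEs of the randomized alignments. The following Python sketch shows that arithmetic only. The function name z_score and the energy values are hypothetical, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator alifoldz.pl uses.

```python
import statistics


def z_score(native_mfe, random_mfes):
    """Return (native - mean) / sd for a native MFE and a list of random MFEs."""
    mean = statistics.mean(random_mfes)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(random_mfes)
    if sd == 0:
        raise ValueError("random MFEs have zero variance; z-score is undefined")
    return (native_mfe - mean) / sd


# Hypothetical energies in kcal/mol; a real run would use 100 random alignments.
random_mfes = [-4, -6, -5, -5]
z = z_score(-10, random_mfes)
print(round(z, 2))
```

A strongly negative value means that the native alignment folds into a more stable consensus structure than the randomized alignments do.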

Figure 12.7.5 Overview of the analysis pipeline described in .

Figure 12.7.6 File formats used for genomic screens. (A) Multiple sequence alignment in MAF format. It consists of several alignment blocks. Each block begins with a line starting with “a score=” that gives the alignment score. RNAz requires this line to be present, although it ignores the value of the score. This line is followed by two or more sequence lines starting with s. Each of these lines contains six fields: (1) a unique identifier of the source sequence, (2) the start position of the aligned subsequence with respect to this source sequence, (3) the length of the aligned subsequence without gaps, (4) + or -, indicating whether the sequence is in the same reading direction as the source sequence or is its reverse complement, (5) the length of the complete source sequence, (6) the aligned subsequence with gaps. All six fields must be present, but RNAz ignores field 5; if the correct value is not available, an arbitrary value can be used. (B) BED annotation file format. This format is used mainly because of its simplicity. In its basic form, it consists of four tab‐delimited fields: (1) sequence identifier, (2) start coordinate, (3) end coordinate, and (4) name of the entry. (C) For BLAST annotation, we use a database of sequences in FASTA format. Each entry starts with a header line consisting of a leading > and the name of the sequence, followed by the sequence itself.
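
The MAF and BED field layouts described in this caption can be sketched in Python as follows. This is an illustration rather than code from the RNAz package. The names MafSeq, parse_maf, and to_bed, as well as the identifiers and coordinates in the example block, are hypothetical. The sketch assumes zero‐based MAF start positions and handles only + strand entries when writing BED lines.

```python
from collections import namedtuple

# One "s" line: the six fields listed in the caption, in order.
MafSeq = namedtuple("MafSeq", "src start size strand src_size text")


def parse_maf(text):
    """Return a list of alignment blocks, each a list of MafSeq records."""
    blocks, current = [], None
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            current = None  # a blank line closes the current block
            continue
        if fields[0] == "a":
            current = []  # the score value on this line is not used
            blocks.append(current)
        elif fields[0] == "s":
            if current is None:
                raise ValueError("sequence line found before an 'a' line")
            if len(fields) != 7:
                raise ValueError("an 's' line needs six fields after the 's'")
            src, start, size, strand, src_size, aligned = fields[1:]
            if strand not in ("+", "-"):
                raise ValueError("strand must be + or -")
            # Field 5 is kept as raw text because it may hold an arbitrary value.
            rec = MafSeq(src, int(start), int(size), strand, src_size, aligned)
            if len(aligned.replace("-", "")) != rec.size:
                raise ValueError("field 3 must equal the ungapped length")
            current.append(rec)
    return blocks


def to_bed(rec, name):
    """Format a + strand MafSeq as a four-field, tab-delimited BED line."""
    if rec.strand != "+":
        raise ValueError("this sketch converts + strand entries only")
    return "\t".join([rec.src, str(rec.start), str(rec.start + rec.size), name])


# Hypothetical MAF block with made-up identifiers and coordinates.
maf_text = """a score=0
s hg17.chr1 100 8 + 1000 GGCA-UGCC
s mm5.chr4 250 9 + 2000 GGCAAUGCC
"""

blocks = parse_maf(maf_text)
bed_line = to_bed(blocks[0][0], "locus1")
print(bed_line)
```

Entries on the - strand are left out of to_bed because converting their coordinates requires a correct value in field 5, which the caption notes may be arbitrary.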

Literature Cited

	Athanasius F. Bompfünewerer Consortium, Backofen, R., Bernhart, S.H., Flamm, C., Fried, C., Fritzsch, G., Hackermüller, J., Hertel, J., Hofacker, I.L., Missal, K., Mosig, A., Prohaska, S.J., Rose, D., Stadler, P.F., Tanzer, A., Washietl, S., and Will, S. 2007. RNAs everywhere: Genome‐wide annotation of structured RNAs. J. Exp. Zool. B Mol. Dev. Evol. 308:1‐25.
	Bompfünewerer, A., Flamm, C., Fried, C., Fritzsch, G., Hofacker, I., Lehmann, J., Missal, K., Mosig, A., Müller, B., Prohaska, S., Stadler, B., Stadler, P., Tanzer, A., Washietl, S., and Witwer, C. 2005. Evolutionary patterns of non‐coding RNAs. Theory Biosci. 123:301‐369.
	Frith, M.C., Pheasant, M., and Mattick, J.S. 2005. The amazing complexity of the human transcriptome. Eur. J. Hum. Genet. 13:894‐897.
	Griffiths‐Jones, S., Moxon, S., Marshall, M., Khanna, A., Eddy, S.R., and Bateman, A. 2005. Rfam: Annotating non‐coding RNAs in complete genomes. Nucl. Acids Res. 33:D121‐D124.
	Hofacker, I.L., Fontana, W., Stadler, P.F., Bonhoeffer, S., Tacker, M., and Schuster, P. 1994. Fast folding and comparison of RNA secondary structures. Monatsh. Chem. 125:167‐188.
	Hofacker, I.L., Fekete, M., and Stadler, P.F. 2002. Secondary structure prediction for aligned RNA sequences. J. Mol. Biol. 319:1059‐1066.
	Martens, J.A., Laprade, L., and Winston, F. 2004. Intergenic transcription is required to repress the Saccharomyces cerevisiae SER3 gene. Nature 429:571‐574.
	Washietl, S. and Hofacker, I.L. 2004. Consensus folding of aligned sequences as a new measure for the detection of functional RNAs by comparative genomics. J. Mol. Biol. 342:19‐30.
	Washietl, S., Hofacker, I.L., and Stadler, P.F. 2005. Fast and reliable prediction of noncoding RNAs. Proc. Natl. Acad. Sci. U.S.A. 102:2454‐2459.
Key References
	Hofacker et al., 2002. See above.
	This paper describes the basic algorithm to predict a consensus structure for aligned sequences. All programs described in this unit build upon this algorithm.
	Washietl and Hofacker, 2004. See above.
	This paper introduces the alifoldz.pl algorithm and demonstrates that only comparative analysis has enough statistical power to predict functional structures with reasonable accuracy.
	Washietl et al., 2005. See above.
	This paper provides the reader with a detailed description of the RNAz algorithm.
Internet Resources
	http://www.tbi.univie.ac.at/~wash/RNAz/
	Download the latest version of RNAz; read online manuals.
	http://rna.tbi.univie.ac.at/RNAz
	RNAz Web server.