Using MZEF to Find Internal Coding Exons

Abstract

 

MZEF (Michael Zhang's Exon Finder) was designed to help identify one of the most important classes of exons, i.e. the internal coding exons, in human genomic DNA sequences. It is neither for predicting intronless genes, nor for assembling predicted exons into complete gene models. There is also a mouse version (mMZEF) and an Arabidopsis version (aMZEF). This unit presents the Unix and Web versions of MZEF and reviews how to interpret the MZEF results.

     
 
GO TO THE FULL PROTOCOL:
PDF or HTML at Wiley Online Library

Table of Contents

  • Basic Protocol 1: Using MZEF to Analyze Genomic DNA Sequences Via the Web Interface
  • Basic Protocol 2: Using the Command‐Line Unix Version of MZEF to Analyze Genomic DNA Sequences
  • Alternate Protocol 1: Using the Interactive Unix Version MZEF to Analyze Genomic DNA Sequences
  • Guidelines for Understanding Results
  • Commentary
  • Appendix
  • Literature Cited
  • Figures
     
 


Figures

  •   Figure 4.2.1 The screen‐dump from an example run, using M12523.fasta as the input sequence with all the default parameters.
  •   Figure 4.2.2 Prediction results from the command‐line Unix version of MZEF (prior probability = 0.02; overlap = 1).
  •   Figure 4.2.3 Prediction results from the interactive Unix version of MZEF (prior probability = 0.04; overlap = 0).
  •   Figure 4.2.4 Prediction results from AAT. Note that all the predictions are on the forward strand, and that no exons are predicted on the reverse strand.
  •   Figure 4.2.5 Prediction results from MZEF‐SPC.
  •   Figure 4.2.6 A classifier C separates N = 13 sample points in K = 2 feature space. Error = 1.
  •   Figure 4.2.7 Quadratic decision boundary for normal distributions.
  •   Figure 4.2.8 Linear decision boundary for normal distributions when Σ+ = Σ−.
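
The last two captions refer to a standard property of discriminant analysis with normally distributed classes: the decision boundary is quadratic in general and becomes linear when the two class covariance matrices are equal. The following is a minimal sketch of that property in a K = 2 feature space. It is not MZEF's code; the function names, the two features, and all means, covariances, and priors are hypothetical values chosen only for illustration.

```python
import math

def inverse_2x2(m):
    """Return (inverse, determinant) of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det <= 0:
        # A valid (positive definite) covariance matrix has a positive determinant.
        raise ValueError("covariance matrix must have a positive determinant")
    return ((d / det, -b / det), (-c / det, a / det)), det

def log_discriminant(x, mean, cov, prior):
    """g(x) = -1/2 ln|S| - 1/2 (x - m)^T S^-1 (x - m) + ln(prior)."""
    inv, det = inverse_2x2(cov)
    dx = (x[0] - mean[0], x[1] - mean[1])
    mahalanobis = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
                   + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return -0.5 * math.log(det) - 0.5 * mahalanobis + math.log(prior)

def qda_score(x, positive, negative):
    """Positive score: x is assigned to the positive class; the boundary is score = 0."""
    return log_discriminant(x, *positive) - log_discriminant(x, *negative)

# Hypothetical class models: (mean, covariance, prior probability).
IDENTITY = ((1.0, 0.0), (0.0, 1.0))
POS = ((2.0, 2.0), IDENTITY, 0.5)
NEG_SAME = ((0.0, 0.0), IDENTITY, 0.5)                    # same covariance as POS
NEG_WIDE = ((0.0, 0.0), ((4.0, 0.0), (0.0, 4.0)), 0.5)    # different covariance

def midpoint_gap(negative):
    """Score at the midpoint minus the mean of the endpoint scores (0 if the score is linear)."""
    a, b, mid = (0.0, 0.0), (2.0, 2.0), (1.0, 1.0)
    ends = 0.5 * (qda_score(a, POS, negative) + qda_score(b, POS, negative))
    return qda_score(mid, POS, negative) - ends
```

With equal covariances the quadratic terms of the two discriminants cancel, so `midpoint_gap(NEG_SAME)` is zero and the boundary is a straight line. With `NEG_WIDE` the gap is nonzero, which reflects a curved (quadratic) boundary.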

Literature Cited

   Bishop, C.M. 1996. Neural Networks for Pattern Recognition. Oxford, Clarendon Press.
   Box, G.E.P. and Cox, D.R. 1964. An analysis of transformations. J. R. Statist. Soc. B 26:211‐252.
   Chen, T. and Zhang, M.Q. 1998. POMBE: A fission yeast gene‐finding and exon‐intron structure prediction system. Yeast 14:701‐710.
   Davuluri, R., Grosse, I., and Zhang, M.Q. 2001. Computational identification of promoters and first exons in the human genome. Nature Genet. 29:412‐417.
   Fisher, R.A. 1936. The use of multiple measurements in taxonomic problems. Ann. Eugen. 7:179‐188.
   Fukunaga, K. 1990. Introduction to Statistical Pattern Recognition, 2nd ed. Academic Press, San Diego.
   International Human Genome Sequencing Consortium. 2001. Initial sequencing and analysis of the human genome. Nature 409:860‐921.
   Ioshikhes, I. and Zhang, M.Q. 2000. Large‐scale human promoter mapping using CpG islands discrimination. Nature Genet. 26:61‐63.
   Minghetti, P.P., Ruffner, D.E., Kuang, W.J., Dennison, O.E., Hawkins, J.W., Beattie, W.G., and Dugaiczyk, A. 1986. Molecular structure of the human albumin gene is revealed by nucleotide sequence within q11‐22 of chromosome 4. J. Biol. Chem. 261:6747‐6757.
   Modrek, B. and Lee, C.A. 2002. A genomic view of alternative splicing. Nature Genet. 30:13‐19.
   Solovyev, V.V., Salamov, A.A., and Lawrence, C.B. 1994. Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames. Nucl. Acids Res. 22:5156‐5163.
   Tabaska, J.E. and Zhang, M.Q. 1999. Detection of polyadenylation signals in human DNA sequences. Gene 231:77‐86.
   Tabaska, J.E., Davuluri, R., and Zhang, M.Q. 2001. A novel 3′‐terminal exon recognition algorithm. Bioinformatics 17:602‐607.
   Thanaraj, T.A. and Robinson, A.J. 2000. Prediction of exact boundaries of exons. Briefings in Bioinformatics 1:343‐356.
   Zhang, M.Q. 1997. Identification of protein coding regions in the human genome by quadratic discriminant analysis. Proc. Natl. Acad. Sci. U.S.A. 94:565‐568.
   Zhang, M.Q. 1998a. Identification of protein‐coding regions in Arabidopsis thaliana genome based on quadratic discriminant analysis. Plant Mol. Biol. 37:803‐806.
   Zhang, M.Q. 1998b. Identification of human gene core‐promoters in silico. Genome Res. 8:319‐326.
   Zhang, M.Q. 1998c. Statistical features of human exons and their flanking regions. Hum. Mol. Genet. 7:919‐932.
   Zhang, M.Q. 2000. Discriminant analysis and its application in DNA sequence motif recognition. Briefings in Bioinformatics 1:331‐342.
Key References
   Zhang, 1997. See above.
   This is the original MZEF paper.
   Zhang, 1998c. See above.
   This has human exon classification and feature statistics.
   Zhang, 2000. See above.
   This is a tutorial on discriminant analysis and has examples on how to combine MZEF with other programs.
Internet Resources
   http://www.cshl.org/genefinder
   MZEF Web server
   http://www.cshl.org/mzhanglab
   Papers and other related information for MZEF
   ftp://cshl.org/pub/science/mzhanglab
   FTP site for MZEF