Sequence Databases: Integrated Information Retrieval and Data Submission

Abstract

 

This unit describes the NCBI's Entrez database browser. Entrez integrates DNA and protein sequence data, three-dimensional structures, and taxonomic information with the associated abstracts and citations contained in PubMed (MEDLINE). The Entrez information space can be searched with conventional queries (authors, gene names, map location) as well as by bibliographic association (articles that are related to one another) and by sequence homology. Also described are the procedures for submitting new data, updates, and corrections to the sequence databases.
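As a rough illustration of the "conventional search queries" mentioned above, the following Python sketch combines an author name and a gene name into a single Entrez query and builds a request URL for it. It is not taken from the unit: the use of NCBI's E-utilities ESearch endpoint, the `[Author]` and `[Gene Name]` field tags, the helper names `build_entrez_query` and `build_esearch_url`, and the example search values are all assumptions chosen for illustration. The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: the NCBI E-utilities ESearch endpoint, which the unit does not mention.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_entrez_query(author=None, gene=None):
    """Join optional author and gene-name terms into one Entrez query string."""
    terms = []
    if author:
        terms.append(f"{author}[Author]")
    if gene:
        terms.append(f"{gene}[Gene Name]")
    # Entrez combines fielded terms with Boolean operators such as AND.
    return " AND ".join(terms)


def build_esearch_url(term, db="nucleotide", retmax=20):
    """Return an ESearch URL for the given query; no network request is made."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical example values, used only to show the shape of a query.
query = build_entrez_query(author="Boguski MS", gene="BRCA1")
url = build_esearch_url(query)
print(query)
print(url)
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) would return the identifiers of matching records. Searches by bibliographic association or sequence homology follow precomputed links between records and are not covered by this sketch.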

     
 
GO TO THE FULL PROTOCOL:
PDF or HTML at Wiley Online Library

Table of Contents

  • Introduction to Entrez
  • Data Submission: General Considerations
  • Submitting a Sequence to the Nucleotide Database
  • Submitting an Update or Correction to an Existing GenBank Entry
  • Submitting EST, STS, or GSS Data
  • Submitting High‐Throughput Genome Sequences (HTGS)
  • Conclusion
  • Literature Cited
  • Figures
     
 

Figures

  • Figure 6.7.1 The cumulative growth of biomedical research data. (A) Growth of MEDLINE. (B) Growth in the number of nucleotide sequences in GenBank (Benson et al., ). (C) Growth in the number of protein sequences in the “nonredundant” set of proteins in GenBank. (D) Growth of protein and nucleic acid three‐dimensional structures represented in the Brookhaven Protein Data Bank, April 2000 (http://www.rcsb.org/pdb/). These figures occasionally induce panic or despair in individuals who fear that it is impossible to make sense of so much data; however, modern information retrieval systems such as Entrez (Fig. ) have succeeded in integrating this data, reducing redundancy, and providing powerful and convenient user interfaces.
  • Figure 6.7.2 Data sources and links that constitute the Entrez system.

Literature Cited
   Adams, M.D., Kelley, J.M., Gocayne, J.D., Dubnick, M., Polymeropoulos, M.H., Xiao, H., Merril, C.R., Wu, A., Olde, B., Moreno, R.F., Kerlavage, A.R., McCombie, W.R., and Venter, J.C. 1991. Complementary DNA sequencing: Expressed sequence tags and human genome project. Science 252:1651‐1656.
   Barrell, B.G. and Clark, B.F.C. 1974. Handbook of Nucleic Acid Sequences. Joynson‐Bruvvers, Oxford.
   Baxevanis, A.D., Boguski, M.S., and Ouellette, B.F.F. 1997. Computational analysis of DNA and protein sequences. In Genome Analysis: A Laboratory Manual (B. Birren, E.D. Green, S. Klapholz, R.M. Myers, and J. Roskams, eds.) pp. 533‐586. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
   Benson, D.A., Karsch‐Mizrachi, I., Lipman, D.J., Ostell, J., Rapp, B.A., and Wheeler, D.L. 2000. GenBank. Nucl. Acids Res. 28:15‐18.
   Boguski, M. and McEntyre, J. 1994. I think therefore I publish. Trends Biochem. Sci. 19:71.
   Boguski, M.S., Lowe, T.M., and Tolstoshev, C.M. 1993. dbEST: Database for “Expressed Sequence Tags”. Nature Genet. 4:332‐333.
   Church, D.M., Stotler, C.J., Rutter, J.L., Murrell, J.R., Trofatter, J.A., and Buckler, A.J. 1994. Isolation of genes from complex sources of mammalian genomic DNA using exon amplification. Nature Genet. 6:98‐105.
   Cockerill, M. 1994. A versatile tool for retrieving molecular sequences. Trends Biochem. Sci. 19:94‐96.
   Harper, R. 1994. Access to DNA and protein databases on the Internet. Current Opin. Biotechnol. 5:4‐18.
   Khan, A.S., Wilcox, A.S., Polymeropoulos, M.H., Hopkins, J.A., Stevens, T.J., Robinson, M., Orpana, A.K., and Sikela, J.M. 1992. Single pass sequencing and physical and genetic mapping of human brain cDNAs. Nature Genet. 2:180‐185.
   Kans, J.A. and Ouellette, B.F.F. 1998. Submitting DNA sequences to the databases. In Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (A.D. Baxevanis and B.F.F. Ouellette, eds.) pp. 319‐353. John Wiley & Sons, New York.
   Okubo, K., Hori, N., Matoba, R., Niiyama, T., Fukushima, A., Kojima, Y., and Matsubara, K. 1992. Large scale cDNA sequencing for analysis of quantitative and qualitative aspects of gene expression. Nature Genet. 2:173‐179.
   Schuler, G.D., Epstein, J.A., Ohkawa, H., and Kans, J.A. 1996. Entrez: Molecular biology database and retrieval system. Methods Enzymol. 266:141‐162.
   Smith, M.W., Holmsen, A.L., Wei, Y.H., Peterson, M., and Evans, G.A. 1994. Genomic sequence sampling: A strategy to high resolution sequence‐based physical mapping of complex genomes. Nature Genet. 7:40‐47.
   Smith, T.F. 1990. The history of the genetic sequence databases. Genomics 6:702‐707.
   Waterston, R., Martin, C., Craxton, M., Huynh, C., Coulson, A., Hillier, L., Durbin, R., Green, P., Shownkeen, R., Halloran, N., Metzstein, M., Hawkins, T., Wilson, R., Berks, M., Du, Z., Thierry‐Mieg, J., and Sulston, J. 1992. A survey of expressed genes in Caenorhabditis elegans. Nature Genet. 1:114‐123.
Internet Resources
   DNA Data Bank of Japan (DDBJ; Center for Information Biology, National Institute of Genetics), 1111 Yata, Mishima, Shizuoka 411, Japan; Fax 81‐559‐81‐6849. e‐mail submissions: ddbjsub@ddbj.nig.ac.jp, updates: ddbjupdt@ddbj.nig.ac.jp, information: ddbj@ddbj.nig.ac.jp, home page: http://www.ddbj.nig.ac.jp/, WWW submissions: http://sakura.ddbj.nig.ac.jp/
   European Molecular Biology Laboratory (EMBL), EMBL Outstation, European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom. e‐mail submissions: datasubs@ebi.ac.uk, updates: update@ebi.ac.uk, information: datalib@ebi.ac.uk, home page: http://www.ebi.ac.uk, WWW submissions: http://www.ebi.ac.uk/Submissions/index.html, WebIn: http://www.ebi.ac.uk/embl/Submission/webin.html
   National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), Bldg. 38A, Room 8N‐803, 8600 Rockville Pike, Bethesda, Maryland 20894; Telephone: 301‐496‐2475; Fax: 301‐480‐9241. e‐mail submissions: gb-sub@ncbi.nlm.nih.gov, EST/GSS/STS: batch-sub@ncbi.nlm.nih.gov, updates: update@ncbi.nlm.nih.gov, information: info@ncbi.nlm.nih.gov, home page: http://www.ncbi.nlm.nih.gov/, WWW submissions: http://www.ncbi.nlm.nih.gov/Genbank/index.html, BankIt: http://www.ncbi.nlm.nih.gov/BankIt/.
Appendix: Sample GenBank Records