Sequence Databases: Integrated Information Retrieval and Data Submission
Internet
Abstract
This unit describes the NCBI's Entrez database browser. Entrez integrates DNA and protein sequence data, three‐dimensional structures, and taxonomic information with the associated abstracts and citations contained in PubMed (MEDLINE). The Entrez information space can be searched with conventional queries (authors, gene names, map location), by bibliographic association (articles that are related to one another), and by sequence homology. Also described are the procedures for submitting new data, updates, and corrections to the sequence databases.
Table of Contents
- Introduction to Entrez
- Data Submission: General Considerations
- Submitting a Sequence to the Nucleotide Database
- Submitting an Update or Correction to an Existing GenBank Entry
- Submitting EST, STS, or GSS Data
- Submitting High‐Throughput Genome Sequences (HTGS)
- Conclusion
- Literature Cited
- Figures
Figures
- Figure 6.7.1 The cumulative growth of biomedical research data. (A) Growth of MEDLINE. (B) Growth in the number of nucleotide sequences in GenBank (Benson et al., 2000). (C) Growth in the number of protein sequences in the “nonredundant” set of proteins in GenBank. (D) Growth of protein and nucleic acid three‐dimensional structures represented in the Brookhaven Protein Data Bank, April 2000 (http://www.rcsb.org/pdb/). These figures occasionally induce panic or despair in individuals who fear that it is impossible to make sense of so much data; however, modern information retrieval systems such as Entrez (Fig. 6.7.2) have succeeded in integrating these data, reducing redundancy, and providing powerful and convenient user interfaces.
- Figure 6.7.2 Data sources and links that constitute the Entrez system.
Literature Cited
Adams, M.D., Kelley, J.M., Gocayne, J.D., Dubnick, M., Polymeropoulos, M.H., Xiao, H., Merril, C.R., Wu, A., Olde, B., Moreno, R.F., Kerlavage, A.R., McCombie, W.R., and Venter, J.C. 1991. Complementary DNA sequencing: Expressed sequence tags and human genome project. Science 252:1651‐1656.
Barrell, B.G. and Clark, B.F.C. 1974. Handbook of Nucleic Acid Sequences. Joynson‐Bruvvers, Oxford.
Baxevanis, A.D., Boguski, M.S., and Ouellette, B.F.F. 1997. Computational analysis of DNA and protein sequences. In Genome Analysis: A Laboratory Manual (B. Birren, E.D. Green, S. Klapholz, R.M. Myers, and J. Roskams, eds.) pp. 533‐586. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Benson, D.A., Karsch‐Mizrachi, I., Lipman, D.J., Ostell, J., Rapp, B.A., and Wheeler, D.L. 2000. GenBank. Nucl. Acids Res. 28:15‐18.
Boguski, M. and McEntyre, J. 1994. I think therefore I publish. Trends Biochem. Sci. 19:71.
Boguski, M.S., Lowe, T.M., and Tolstoshev, C.M. 1993. dbEST: Database for “Expressed Sequence Tags”. Nature Genet. 4:332‐333.
Church, D.M., Stotler, C.J., Rutter, J.L., Murrell, J.R., Trofatter, J.A., and Buckler, A.J. 1993. Isolation of genes from complex sources of mammalian genomic DNA using exon amplification. Nature Genet. 6:98‐105.
Cockerill, M. 1994. A versatile tool for retrieving molecular sequences. Trends Biochem. Sci. 19:94‐96.
Harper, R. 1994. Access to DNA and protein databases on the Internet. Curr. Opin. Biotechnol. 5:4‐18.
Khan, A.S., Wilcox, A.S., Polymeropoulos, M.H., Hopkins, J.A., Stevens, T.J., Robinson, M., Orpana, A.K., and Sikela, J.M. 1992. Single pass sequencing and physical and genetic mapping of human brain cDNAs. Nature Genet. 2:180‐185.
Kans, J.A. and Ouellette, B.F.F. 1998. Submitting DNA sequences to the databases. In Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (A.D. Baxevanis and B.F.F. Ouellette, eds.) pp. 319‐353. John Wiley & Sons, New York.
Okubo, K., Hori, N., Matoba, R., Niiyama, T., Fukushima, A., Kojima, Y., and Matsubara, K. 1992. Large scale cDNA sequencing for analysis of quantitative and qualitative aspects of gene expression. Nature Genet. 2:173‐179.
Schuler, G.D., Epstein, J.A., Ohkawa, H., and Kans, J.A. 1996. Entrez: Molecular biology database and retrieval system. Methods Enzymol. 266:141‐162.
Smith, M.W., Holmsen, A.L., Wei, Y.H., Peterson, M., and Evans, G.A. 1994. Genomic sequence sampling: A strategy to high resolution sequence‐based physical mapping of complex genomes. Nature Genet. 7:40‐47.
Smith, T.F. 1990. The history of the genetic sequence databases. Genomics 6:702‐707.
Waterston, R., Martin, C., Craxton, M., Huynh, C., Coulson, A., Hillier, L., Durbin, R., Green, P., Shownkeen, R., Halloran, N., Metzstein, M., Hawkins, T., Wilson, R., Berks, M., Du, Z., Thierry‐Mieg, J., and Sulston, J. 1992. A survey of expressed genes in Caenorhabditis elegans. Nature Genet. 1:114‐123.
Internet Resources
DNA Data Bank of Japan (DDBJ; Center for Information Biology, National Institute of Genetics), 1111 Yata, Mishima, Shizuoka 411, Japan; Fax: 81‐559‐81‐6849. e‐mail submissions: ddbjsub@ddbj.nig.ac.jp, updates: ddbjupdt@ddbj.nig.ac.jp, information: ddbj@ddbj.nig.ac.jp, home page: http://www.ddbj.nig.ac.jp/, WWW submissions: http://sakura.ddbj.nig.ac.jp/
European Molecular Biology Laboratory (EMBL), EMBL Outstation, European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom. e‐mail submissions: datasubs@ebi.ac.uk, updates: update@ebi.ac.uk, information: datalib@ebi.ac.uk, home page: http://www.ebi.ac.uk, WWW submissions: http://www.ebi.ac.uk/Submissions/index.html, WebIn: http://www.ebi.ac.uk/embl/Submission/webin.html
National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), Bldg. 38A, Room 8N‐803, 8600 Rockville Pike, Bethesda, Maryland 20894; Telephone: 301‐496‐2475; Fax: 301‐480‐9241. e‐mail submissions: gb-sub@ncbi.nlm.nih.gov, EST/GSS/STS: batch-sub@ncbi.nlm.nih.gov, updates: update@ncbi.nlm.nih.gov, information: info@ncbi.nlm.nih.gov, home page: http://www.ncbi.nlm.nih.gov/, WWW submissions: http://www.ncbi.nlm.nih.gov/Genbank/index.html, BankIt: http://www.ncbi.nlm.nih.gov/BankIt/.
Appendix: Sample GenBank Records