The Importance of Biological Databases in Biological Discovery
Abstract
Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, the Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource are also covered. Non‐sequence‐centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG) are also discussed. Curr. Protoc. Bioinform. 34:1.1.1‐1.1.6. © 2011 by John Wiley & Sons, Inc.
Keywords: biological database; sequence database; structure database; model organisms; biological pathways
Table of Contents
- Overview
- Disclaimer
- Literature Cited
- Figures
Figures
- Figure 1.1.1 Exponential growth of GenBank. Data obtained from the NCBI Web site. Note that the period of accelerated growth after 1997 coincides with the completion of the HGP's genetic and physical mapping goals, setting the stage for high‐accuracy, high‐throughput sequencing, as well as the development of new sequencing technologies (Collins et al., 1998, 2003; Green et al., 2011).
- Figure 1.1.2 The number of nucleotide bases currently in GenBank for the 20 most‐sequenced organisms. The figures do not include chloroplast or mitochondrial sequences from these organisms. Raw data are from GenBank release 181.0. These organisms alone account for over 68 billion bases in GenBank. Please note that the number of bases present for each organism can be in excess of the actual size of its genome. The status of individual sequencing efforts can be found on the International Sequencing Consortium and NHGRI Web sites (see Internet Resources).
Literature Cited
Collins, F.S., Patrinos, A., Jordan, E., Chakravarti, A., Gesteland, R., Walters, L., and Members of the DOE and NIH Planning Groups. 1998. New goals for the U.S. Human Genome Project: 1998‐2003. Science 282:682‐689.
Collins, F.S., Green, E.D., Guttmacher, A.E., and Guyer, M.S., on behalf of the U.S. National Human Genome Research Institute. 2003. A vision for the future of genomics research. Nature 422:835‐847.
Galperin, M.Y. and Cochrane, G.R. 2011. The 2011 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucleic Acids Res. 39:D1‐D6.
Green, E.D., Guyer, M.S., and the National Human Genome Research Institute. 2011. Charting a course for genomic medicine from basepairs to bedside. Nature 470:204‐213.
Internet Resources
http://www.ensembl.org
Ensembl Web site.
http://www.hgmd.cf.ac.uk
Human Gene Mutation Database (HGMD).
http://www.intlgenome.org
International Sequencing Consortium.
http://www.genome.ad.jp/kegg/
Kyoto Encyclopedia of Genes and Genomes (KEGG).
http://informatics.jax.org
Mouse Genome Informatics at the Jackson Laboratory.
http://ncbi.nlm.nih.gov
National Center for Biotechnology Information (GenBank).
http://ncbi.nlm.nih.gov/Entrez/
NCBI Entrez Web site.
http://genome.gov
National Human Genome Research Institute (NHGRI).
http://ncbi.nlm.nih.gov/omim
Online Mendelian Inheritance in Man (OMIM).
http://www.pdb.org
Protein Data Bank (PDB).
http://arabidopsis.org
The Arabidopsis Information Resource (TAIR).
http://genome.ucsc.edu
University of California at Santa Cruz (UCSC) Genome Browser.
http://www.genome.gov/10002154
Status of genome sequencing projects funded by the National Human Genome Research Institute.
http://compbio.dfci.harvard.edu/tgi/
The Gene Index Project.
http://wormbase.org
WormBase.