Searching WormBase for Information about Caenorhabditis elegans
Abstract
WormBase is the major public biological database for the nematode Caenorhabditis elegans. It is meant to be useful to any biologist who wants to use C. elegans, whatever his or her specialty. WormBase contains information about the genomic sequence of C. elegans, its genes and their products, and its higher‐level traits such as gene expression patterns and neuronal connectivity. WormBase also contains genomic sequences and gene structures of C. briggsae and C. remanei, two closely related worms. These data are interconnected, so that a search beginning with one object (such as a gene) can be directed to related objects of a different type (e.g., the DNA sequence of the gene or the cells in which the gene is active). One can also perform searches for complex data sets. The WormBase developers group actively invites suggestions for improvements from the database users. WormBase's source code and underlying database are freely available for local installation and modification.
Keywords: Caenorhabditis elegans; WormBase; nematode; genomic annotation; gene expression pattern; RNAi; neuronal connectivity
Table of Contents
- Basic Protocol 1: Navigating the WormBase Home Page
- Basic Protocol 2: Performing a Database Search
- Basic Protocol 3: Examining a Gene in C. elegans
- Basic Protocol 4: Examining a Molecular Sequence in C. elegans
- Basic Protocol 5: Finding Protein Features
- Basic Protocol 6: Searching for Gene Products with Particular Sequence Motifs
- Basic Protocol 7: Using the Genome Browser
- Basic Protocol 8: Viewing the C. briggsae Genome and its Synteny with C. elegans
- Basic Protocol 9: Finding Sequence Similarities with Blast
- Basic Protocol 10: Mining Gene Data with WormMart
- Basic Protocol 11: Downloading a Batch of Sequences
- Basic Protocol 12: Examining the Genomic Content of a Classical Genetic Interval
- Basic Protocol 13: Using Other WormBase Searches
- Alternate Protocol 1: Installing and Running WormBase Locally
- Guidelines for Understanding Results
- Commentary
- Literature Cited
- Figures
Figures
Figure 1.8.1 The Home page of the WormBase Web site, showing a general database search for zyg‐1 and the Web Site Directory. This page gives several different entry points for WormBase's diverse data. An example is shown of the simplest and broadest search (for Anything) with a single keyword. A menu of the most‐used database searches lines the top of the page, while a list of more specialized data fills the Web Site Directory on the page's left side.
Figure 1.8.2 Results of the database search in Figure . Having searched the entire database for anything matching zyg‐1, one sees a plethora of disparate results: genes with zyg‐1 names, protein‐coding sequences (CDSes), expression patterns, and archived research papers. The advantage of this sort of search is that it lowers the chance of missing a wanted item, but it necessarily requires picking and choosing among this sort of data slurry. Alternatively, one could pick a specific data class in the Find pull‐down menu (e.g., “Any Gene” or “Cell”; Fig. ) and get narrower, but better focused results.
Figure 1.8.3 The top of the Gene Summary page for zyg‐1. WormBase organizes its data around a few key hubs. Gene Summary pages are perhaps the most important single such hub; they are intended to give a compact but full summary of everything known about a given gene in C. elegans. Even in this excerpt, one can get summarized gene function and orthology, a list of transcripts and their experimental evidence, links to DNA and protein sequences, a C. briggsae ortholog, and external database records.
Figure 1.8.4 Genetic and genomic information from the Gene Summary page for zyg‐1. Further down the same page as in Figure is a small but detailed diagram of the gene's DNA structure, with links to transcripts, sequenced clones, and alleles. Along with this are given exact nucleotide coordinates and the meiotic gene map position.
Figure 1.8.5 The Sequence Summary page of F59E12.2 (linked to the zyg‐1 Gene Summary page by the link under sequence name). Most data for the exact nucleotide sequence is too detailed to be of immediate interest on a Gene Summary page, so it is given its own Sequence Summary page instead (linked to the Gene and Protein pages). These data are most useful in designing cloning experiments or direct perturbations of DNA function such as RNAi. Further down this page are another schematic diagram, a BLAST search launcher, exact coordinates of exons and introns in the genomic sequences, and a list of available cDNA clones.
Figure 1.8.6 Part of the CE28571 (zyg‐1) Protein page, with a schematic diagram of CE28571's exons, protein motifs, low‐complexity domains (defined by the SEG program; Wootton, ), and similarities to proteins in other eukaryotic species. As with nucleotide sequences, proteins have enough detailed information to require their own specialized pages. WormBase's Protein pages give both text and diagrams to let a user map individual sequence features with respect to one another and to the protein's exonic coding sequences. The sequence features shown range from very generic (signal, low‐complexity, and predicted transmembrane) to broadly distributed but specific motifs (e.g., “tyrosine protein kinase”) and then to individual BLAST matches with highly similar proteins in other organisms. Diagramming all of these allows the user to quickly see what parts of the protein are likely to have distinct functions.
Figure 1.8.7 Protein motifs identified by a “ribonucleoprotein” search term. WormBase has an extensive catalog of protein motifs, taken from both the PFAM and the InterPro compilations. Keyword searches of these motifs are one way to subdivide a general protein type into several types with detailed functional differences.
Figure 1.8.8 Proteins identified as sharing a single motif. Motifs are evolutionarily mobile; they can be spread among homologous proteins or transferred horizontally between nonhomologous ones. Accordingly, each motif in WormBase is listed with the full set of proteins encoding it. This gives one way of identifying every gene product in C. elegans likely to participate in a shared biochemical function.
Figure 1.8.9 A view of the entire mitochondrial chromosome (mtDNA) in the Genome Browser. Like the Gene Summary page, the Genome Browser provides a central hub around which complex data can be economically organized. Here we see its view expanded to an entire chromosome. The view is customizable with many different user‐selected tracks (a few of which are visible).
Figure 1.8.10 An expanded Genome Browser view of the F59E12.2 (zyg‐1) sequence, with added tracks for ESTs, mRNAs, and C. briggsae homologies. Where the Gene Summary page gives a text‐oriented, human‐readable summary of zyg‐1, the Genome Browser here gives a view rooted in its DNA structure. Picking just a few tracks allows this view to link gene coexpression (through operons), likely regulatory sequences (i.e., noncoding DNA highly conserved in C. briggsae), direct evidence for gene activity (ESTs and a cDNA), a genomic clone (archived in GenBank), and complexities of the gene's structure (including a nested gene with an entirely dissimilar mutant phenotype).
Figure 1.8.11 A view of 1 Mb of genomic DNA, centered on the F59E12.2 (zyg‐1) sequence. Genome Browser views are customizable not only in their contents but in their size. Shown here is a tracked view spanning 1 Mb of genomic DNA. As the view grows, fine details are merged into a general map; this works best when one is looking for features that vary over a scale of tens or hundreds of thousands of nucleotides.
Figure 1.8.12 A view of 100 bp of genomic DNA immediately to the 5′ side of F59E12.2 (zyg‐1). The opposite extreme of size selection is this 100‐nucleotide view of zyg‐1's 5′‐flank. This view lists individual nucleotides and is ideal for fine resolution of transgenic construct or cis‐regulatory sites. As in larger views, multiple tracks can be chosen to make easy comparisons of diverse features (e.g., cDNAs versus predicted start sites).
Figure 1.8.13 The Genome Browser showing the C. briggsae ortholog of zyg‐1. C. briggsae's genome is also available through the Genome Browser. This view of zyg‐1 confirms that its complex structure is indeed conserved in C. briggsae, while also showing small differences in intron size.
Figure 1.8.14 The Synteny Viewer showing the zyg‐1/bli‐2 cluster in C. elegans and C. briggsae. Here the zyg‐1 loci from two Caenorhabditis species are shown in syntenic alignment, making their precise similarities and differences obvious. Like the Genome Browser, this view can be expanded to take in large chromosomal spans or contracted to single DNA sites. A particularly good use of this viewer is in working out the clearest possible view of an evolutionarily complex syntenic region.
Figure 1.8.15 A BLASTP search of WormPep release 147 with the human dymeclin (DYM) protein, which when mutated leads to Dyggve‐Melchior‐Clausen or Smith‐McCort dysplasia. BLAST searches in WormBase not only give hit results, but also give hyperlinks to their database records, making it easy to go from a positive search result to its Gene Summary page or to a view of its genomic region. Both strong and weak hits can be informative, since they can identify both orthologs and paralogs of a query sequence. Searches have a default cut‐off E‐value of 0.01, but this can be adjusted by the user for more or less stringency (and hits).
Figure 1.8.16 The Filter menu of WormMart, with filters set to select for pqn‐ genes in C. elegans with uncoordinated RNAi phenotypes. WormMart gives the user a menu with which one or more of a great many different conditions can be imposed on data. Each condition is itself simple, but the freedom of users to choose and mix them with a graphical interface makes highly complex searches practical. This particular search started by choosing the WS140 data release (shown in the Summary on the right‐hand side) and its Gene data set. This still leaves the user with over 40,000 objects to sort through. In this simple search, the user has selected only those genes falling into the pqn class, which includes ∼100 genes encoding prion‐like proteins with domains highly enriched for glutamine (Q) or asparagine (N).
Figure 1.8.17 The Output menu for selecting sequence attributes, showing several different choices of gene substructure. After filtering, data in WormMart need to be exported, and again, many different choices of output contents and format exist. One particularly useful form is sequence output in which the user picks some type of gene structure (e.g., 5′ flanks, introns, or exons) for mass export from a selected gene set (selected by choices like those shown in Fig. ). As a given option for sequence export is picked, a small schematic diagram of the gene is marked in red to clarify what the option means in practice. Since the sequences are exported in FASTA format, the headers for these FASTA records can themselves be loaded with user‐selected data (e.g., gene names).
Figure 1.8.18 Final results of the search in Figure . Another option for user‐selected output is to have tables listing gene features rather than nucleotide sequences. This output was generated from the pqn‐ search shown in Figure by selecting (in addition to the pqn gene class) for molecular and classical gene names, RNAi phenotypes, and conserved orthologous protein groups (KOGs). As with the Genome Browser, a strength of these user‐selected outputs is the ability to quickly compare disparate data sets in an easily scanned, well‐aligned format.
Figure 1.8.19 The graphical output from a search for genetic markers in the vicinity of hid‐3. Classical genetics in C. elegans remains crucial for finding new biological functions. Here the user has a gene map for the region around the uncloned hid‐3 gene that integrates cloned genes, uncloned loci, predicted genes, and STS markers. Such a view makes it straightforward to design fine‐scale STS mapping and to identify other loci that might be allelic to hid‐3.
Figure 1.8.20 Part of the tabular output from a search for hid‐3 markers. Graphic and tabular gene maps have complementary uses. The graphical map in Figure lets the user take in a genetic region intuitively at a glance; this table lists the exact identity and details of its contents. Details include the meiotic map position, alleles, and laboratory strains for each gene in a region.
Figure 1.8.21 Results of a Gene Ontology (GO) search for “RNA splicing”. GO allows genes to be classified by their shared biochemical or biological roles whether or not their products have any similarity to each other. While this classification is powerful, it can be difficult to decipher because there are a great many GO terms, most with complex meanings. To help make sense of this complexity, searches in WormBase for GO terms give tables listing not only the names of terms, but also their definitions and the genes associated with them. Searching with a simple phrase such as “RNA splicing” can give many different results with highly detailed meanings.
Figure 1.8.22 Summary of the “RNA splicing” GO term in WormBase, with its connections to genes and protein motifs. Each GO term has its own summary page, accessible either through a term search (as in Fig. ) or through gene or protein motif pages. The broadly defined “RNA splicing” term is seen here to encompass two different genes and three different protein motifs. One link on this page leads to a browsable version of the entire Gene Ontology system.
Figure 1.8.23 A detailed view of the “RNA splicing, via transesterification reactions with bulged adenosine as nucleophile” GO term in WormBase: defined, shown in the context of other GO terms, and connected to genes and protein motifs. This is another view of a GO term, this time with a browsable context. As in Figure , links are given for associated genes and protein motifs, but here one can also see how this rather specialized term fits into the overall Gene Ontology. Note that this GO term is not actually the narrowest one possible, but is itself a parent term for three even more specialized terms (at the end of the derivation).
Figure 1.8.24 An expanded view of neuronal lineages. WormBase gives a graphically browsable diagram of C. elegans' entire developmental lineage, from the fertilized egg to the adult body. Here is shown a small subset of that lineage, starting from the progenitor cell P1. Each node can be either collapsed or expanded by clicking on it to give simplified or elaborated views; here all the nodes have been expanded. Each cell type is given a hypertext link to its own Cell page.
Figure 1.8.25 The Cell Report page for AS1, a neuron in the P1 lineage seen in Figure . Clicking on the AS1 link in P1's lineage leads to this report, summarizing developmental and functional traits for this cell type. A single cell can belong to more than one group, defined either by cell class or by organ or tissue. Cells can be major progenitors of a lineage branch (blast cells), intermediates during development, or terminally differentiated. They can also have many different gene expression patterns associated with them, either generically (e.g., a gene expressed in neurons will implicitly be expressed in AS1) or more or less specifically (some genes may be expressed in only AS1, while others may be expressed in some well defined set of cells including AS1). Although WormBase has tended so far to emphasize a gene‐centric view of the organism, Cell pages are likely to become increasingly detailed hubs of information rivaling the Gene Summary pages and Genome Browser as WormBase's contents extend to integrative, physiological data.
Figure 1.8.26 Summary for a chosen set of neurons. Neurons in C. elegans have somewhat cryptic names (e.g., AFD for “amphid finger neuron”). The tabular output from a Neuron search decodes these names by listing their human‐readable identities, their membership in neuronal groups (by shared ganglia or shared traits), and their developmental lineage abbreviations (totally different from their differentiated neuron abbreviations). Numbers of their gap junctions and synaptic connections are also given, with their identities detailed on a neuron's Cell page (Fig. ).
Figure 1.8.27 Diagram of the ADAL neuron. This is another Cell Report page, for a sensory neuron of the head. Pages for sensory and some other neurons include small diagrams of their structures in the body, with the pharynx given as a background for orientation. More specialized and fully detailed anatomical views are available in WormAtlas (http://www.wormatlas.org). Each neuron page also gives a detailed list of neuron‐by‐neuron connections, determined from electron‐microscopic serial sectioning of an entire worm's nervous system.
Figure 1.8.28 Synaptic connections in WormBase. Each neuron in C. elegans has fully identified chemical synapses or gap junctions linking it to other neurons or muscles; these links are fully tabulated in WormBase.
Figure 1.8.29 Results for an Expression Pattern search with “AS neurons”. Gene expression patterns can be searched with terms for predefined cell groups, extracted from the primary literature. The resulting table gives, for each pattern found, the gene driving it and a summary of the cells it includes. Since these expression patterns are usually driven by a gene's entire promoter and since metazoan promoters can have complex, multiple cis‐regulatory elements, the patterns can be heterogeneous and extensive. However, some genes are solely expressed by a single cell type in the whole animal, while yet others appear to have truly ubiquitous expression in all somatic cells. Hypertext links from these search results can lead to a gene, its DNA sequence, or a detailed report about a single expression pattern (e.g., “Expr217”). The expression pattern report, in turn, will list the exact reagents used to determine the pattern (e.g., a defined DNA region or an antibody).
Figure 1.8.30 A diagram showing overlaps between zyg‐1's canonical (sequenced) genomic cosmid clone and other (unsequenced) clones. There are many more clones produced by the C. elegans genome project than have actually been sequenced. The default view in the Genome Browser gives only those cosmid or YAC clones that were actually used in genomic sequencing. However, for actual experiments on individual genes, a non‐canonical cosmid's insert may better encompass the gene's full 5′ and 3′ flanks. The clone map search allows users to see the entire set of clones available for a gene region.
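The adjustable E‐value cut‐off described in the caption of Figure 1.8.15 can be illustrated with a minimal Python sketch. This is not WormBase's own code: the Hit record, the filter_hits function, and the hit names are hypothetical, and treating the cut‐off as inclusive is an assumption. Only the default value of 0.01 comes from the caption.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLAST hit (hypothetical record, for illustration only)."""
    subject: str   # made-up identifier of the matched protein
    evalue: float  # expected number of chance matches this good


def filter_hits(hits, max_evalue=0.01):
    """Keep hits at or below the E-value cut-off, strongest first.

    The 0.01 default mirrors the cut-off quoted in the Figure 1.8.15
    caption; the inclusive comparison (<=) is an assumption.
    """
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: h.evalue)


# Made-up hits: one strong, one borderline, one weak.
hits = [
    Hit("hypothetical_C", 0.5),
    Hit("hypothetical_A", 1e-50),
    Hit("hypothetical_B", 0.005),
]

default_hits = filter_hits(hits)                  # stringent default
relaxed_hits = filter_hits(hits, max_evalue=1.0)  # admits weaker hits

for h in default_hits:
    print(h.subject, h.evalue)
```

Raising max_evalue admits weaker hits, which may include paralogs or distant homologs, while lowering it keeps only the strongest matches.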
Literature Cited
Ailion, M. and Thomas, J.H. 2003. Isolation and characterization of high‐temperature‐induced dauer formation mutants in Caenorhabditis elegans. Genetics 165:127‐144. | |
Ashrafi, K., Chang, F.Y., Watts, J.L., Fraser, A.G., Kamath, R.S., Ahringer, J., and Ruvkun, G. 2003. Genome‐wide RNAi analysis of Caenorhabditis elegans fat regulatory genes. Nature 421:268‐272. | |
Bairoch, A., Apweiler, R., Wu, C.H., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Natale, D.A., O'Donovan, C., Redaschi, N., and Yeh, L.S. 2005. The Universal Protein Resource (UniProt). Nucl. Acids Res. 33:D154‐D159. | |
Balakrishnan, R., Christie, K.R., Costanzo, M.C., Dolinski, K., Dwight, S.S., Engel, S.R., Fisk, D.G., Hirschman, J.E., Hong, E.L., Nash, R., Oughtred, R., Skrzypek, M., Theesfeld, C.L., Binkley, G., Dong, Q., Lane, C., Sethuraman, A., Weng, S., Botstein, D., and Cherry, J.M. 2005. Fungal BLAST and Model Organism BLASTP Best Hits: New comparison resources at the Saccharomyces Genome Database (SGD). Nucl. Acids Res. 33:D374‐D377. | |
Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths‐Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E.L., Studholme, D.J., Yeats, C., and Eddy, S.R. 2004. The Pfam protein families database. Nucl. Acids Res. 32:D138‐D141. | |
Brenner, S. 1974. The genetics of Caenorhabditis elegans. Genetics 77:71‐94. | |
C. elegans Sequencing Consortium. 1998. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science 282:2012‐2018. | |
Chalfie, M. and White, J. 1988. The nervous system. In The Nematode Caenorhabditis elegans (W.B. Wood., ed.) pp. 337‐391. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. | |
Chen, N., Harris, T.W., Antoshechkin, I., Bastiani, C., Bieri, T., Blasiar, D., Bradnam, K., Canaran, P., Chan, J., Chen, C.K., Chen, W.J., Cunningham, F., Davis, P., Kenny, E., Kishore, R., Lawson, D., Lee, R., Mµller, H.M., Nakamura, C., Pai, S., Ozersky, P., Petcherski, A., Rogers, A., Sabo, A., Schwarz, E.M., Van Auken, K., Wang, Q., Durbin, R., Spieth, J., Sternberg, P.W., and Stein, L.D. 2005. WormBase: A comprehensive data resource for Caenorhabditis biology and genomics. Nucl. Acids Res. 33:D383‐D389. | |
Cho, S., Jin, S.W., Cohen, A., and Ellis, R.E. 2004. A phylogeny of Caenorhabditis reveals frequent loss of introns during nematode evolution. Genome Res. 14:1207‐1220. | |
Cohn, D.H., Ehtesham, N., Krakow, D., Unger, S., Shanske, A., Reinker, K., Powell, B.R., and Rimoin, D.L. 2003. Mental retardation and abnormal skeletal development (Dyggve‐Melchior‐Clausen dysplasia) due to mutations in a novel, evolutionarily conserved gene. Am. J. Hum. Genet. 72:419‐428. | |
Costanzo, M.C., Crawford, M.E., Hirschman, J.E., Kranz, J.E., Olsen, P., Robertson, L.S., Skrzypek, M.S., Braun, B.R., Hopkins, K.L., Kondu, P., Lengieza, C., Lew‐Smith, J.E., Tillberg, M., and Garrels, J.I. 2001. YPD, PombePD and WormPD: Model organism volumes of the BioKnowledge library, an integrated resource for protein information. Nucl. Acids Res. 29:75‐79. | |
Drysdale, R.A., Crosby, M.A., and FlyBase Consortium. 2005. FlyBase: Genes and gene models. Nucl. Acids Res. 33:D390‐D395. | |
Eeckman, F.H. and Durbin, R. 1995. ACeDB and Macace. Methods Cell Biol. 48:583‐605. | |
El Ghouzzi, V., Dagoneau, N., Kinning, E., Thauvin‐Robinet, C., Chemaitilly, W., Prost‐Squarcioni, C., Al‐Gazali, L.I., Verloes, A., Le Merrer, M., Munnich, A., Trembath, R.C., and Cormier‐Daire, V. 2003. Mutations in a novel gene Dymeclin (FLJ20071) are responsible for Dyggve‐Melchior‐Clausen syndrome. Hum. Mol. Genet. 12:357‐364. | |
Fire, A., Xu, S., Montgomery, M.K., Kostas, S.A., Driver, S.E., and Mello, C.C. 1998. Potent and specific genetic interference by double‐stranded RNA in Caenorhabditis elegans. Nature 391:806‐811. | |
Ge, H., Walhout, A.J., and Vidal, M. 2003. Integrating “omic” information: A bridge between genomics and systems biology. Trends Genet. 19:551‐560. | |
GuhaThakurta, D., Schriefer, L.A., Waterston, R.H., and Stormo, G.D. 2004. Novel transcription regulatory elements in Caenorhabditis elegans muscle genes. Genome Res. 14:2457‐2468. | |
Gunsalus, K.C., Ge, H., Schetter, A.J., Goldberg, D.S., Han, J.D., Hao, T., Berriz, G.F., Bertin, N., Huang, J., Chuang, L.S., Li, N., Mani, R., Hyman, A.A., Sonnichsen, B., Echeverri, C.J., Roth, F.P., Vidal, M., and Piano, F. 2005. Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. Nature 436:861‐865. | |
Harris, M.A., Clark, J., Ireland, A., Lomax, J., Ashburner, M., Foulger, R., Eilbeck, K., Lewis, S., Marshall, B., Mungall, C., Richter, J., Rubin, G.M., Blake, J.A., Bult, C., Dolan, M., Drabkin, H., Eppig, J.T., Hill, D.P., Ni, L., Ringwald, M., Balakrishnan, R., Cherry, J.M., Christie, K.R., Costanzo, M.C., Dwight, S.S., Engel, S., Fisk, D.G., Hirschman, J.E., Hong, E.L., Nash, R.S., Sethuraman, A., Theesfeld, C.L., Botstein, D., Dolinski, K., Feierbach, B., Berardini, T., Mundodi, S., Rhee, S.Y., Apweiler, R., Barrell, D., Camon, E., Dimmer, E., Lee, V., Chisholm, R., Gaudet, P., Kibbe, W., Kishore, R., Schwarz, E.M., Sternberg, P., Gwinn, M., Hannick, L., Wortman, J., Berriman, M., Wood, V., de la Cruz, N., Tonellato, P., Jaiswal, P., Seigfried, T., and White, R. 2004. The Gene Ontology (GO) database and informatics resource. Nucl. Acids Res. 32:D258‐D261. | |
Hubbard, T., Andrews, D., Caccamo, M., Cameron, G., Chen, Y., Clamp, M., Clarke, L., Coates, G., Cox, T., Cunningham, F., Curwen, V., Cutts, T., Down, T., Durbin, R., Fernandez‐Suarez, X.M., Gilbert, J., Hammond, M., Herrero, J., Hotz, H., Howe, K., Iyer, V., Jekosch, K., Kahari, A., Kasprzyk, A., Keefe, D., Keenan, S., Kokocinsci, F., London, D., Longden, I., McVicker, G., Melsopp, C., Meidl, P., Potter, S., Proctor, G., Rae, M., Rios, D., Schuster, M., Searle, S., Severin, J., Slater, G., Smedley, D., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Storey, R., Trevanion, S., Ureta‐Vidal, A., Vogel, J., White, S., Woodwark, C., and Birney, E. 2005. Ensembl 2005. Nucl. Acids Res. 33:D447‐D453. | |
Hughes‐Davies, L., Huntsman, D., Ruas, M., Fuks, F., Bye, J., Chin, S.F., Milner, J., Brown, L.A., Hsu, F., Gilks, B., Nielsen, T., Schulzer, M., Chia, S., Ragaz, J., Cahn, A., Linger, L., Ozdag, H., Cattaneo, E., Jordanova, E.S., Schuuring, E., Yu, D.S., Venkitaraman, A., Ponder, B., Doherty, A., Aparicio, S., Bentley, D., Theillet, C., Ponting, C.P., Caldas, C., and Kouzarides, T. 2003. EMSY links the BRCA2 pathway to sporadic breast and ovarian cancer. Cell 115:523‐535. | |
Jones, S.J., Riddle, D.L., Pouzyrev, A.T., Velculescu, V.E., Hillier, L., Eddy, S.R., Stricklin, S.L., Baillie, D.L., Waterston, R., and Marra, M.A. 2001. Changes in gene expression associated with developmental arrest and longevity in Caenorhabditis elegans. Genome Res. 11:1346‐1352. | |
Jorgensen, E.M. and Mango, S.E. 2002. The art and design of genetic screens: Caenorhabditis elegans. Nat. Rev. Genet. 3:356‐369. | |
Kamath, R.S., Martinez‐Campos, M., Zipperlen, P., Fraser, A.G., and Ahringer, J. 2001. Effectiveness of specific RNA‐mediated interference through ingested double‐stranded RNA in Caenorhabditis elegans. Genome Biol. 2:RESEARCH0002. | |
Kamath, R.S., Fraser, A.G., Dong, Y., Poulin, G., Durbin, R., Gotta, M., Kanapin, A., Le Bot, N., Moreno, S., Sohrmann, M., Welchman, D.P., Zipperlen, P., and Ahringer, J. 2003. Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature 421:231‐237. | |
Kasprzyk, A., Keefe, D., Smedley, D., London, D., Spooner, W., Melsopp, C., Hammond, M., Rocca‐Serra, P., Cox, T., and Birney, E. 2004. EnsMart: A generic system for fast and flexible access to biological data. Genome Res. 14:160‐169. | |
Kent, W.J. 2002. BLAT: The BLAST‐like alignment tool. Genome Res. 12:656‐664. | |
Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., and Haussler, D. 2002. The human genome browser at UCSC. Genome Res. 12:996‐1006. | |
Kiontke, K., Gavin, N.P., Raynes, Y., Roehrig, C., Piano, F., and Fitch, D.H. 2004. Caenorhabditis phylogeny predicts convergence of hermaphroditism and extensive intron loss. Proc. Natl. Acad. Sci. U.S.A. 101:9003‐9008. | |
Korf, I., Yandell, M., and Bedell, J. 2003. BLAST. O'Reilly & Associates, Inc., Sebastopol, Calif. | |
Krause, M. 1995. Techniques for analyzing transcription and translation. Methods Cell Biol. 48:513‐529. | |
Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E.L. 2001. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 305:567‐580. | |
Li, S., Armstrong, C.M., Bertin, N., Ge, H., Milstein, S., Boxem, M., Vidalain, P.‐O., Han, J.‐D.J., Chesneau, A., Hao, T., Goldberg, D.S., Li, N., Martinez, M., Rual, J.‐F., Lamesch, P., Xu, L., Tewari, M., Wong, S.L., Zhang, L.V., Berriz, G.F., Jacotot, L., Vaglio, P., Reboul, J., Hirozane‐Kishikawa, T., Li, Q., Gabel, H.W., Elewa, A., Baumgartner, B., Rose, D.J., Yu, H., Bosak, S., Sequerra, R., Fraser, A., Mango, S.E., Saxton, W.M., Strome, S., van den Heuvel, S., Piano, F., Vandenhaute, J., Sardet, C., Gerstein, M., Doucette‐Stamm, L., Gunsalus, K.C., Harper, J.W., Cusick, M.E., Roth, F.P., Hill, D.E., and Vidal, M. 2004. A map of the interactome network of the metazoan C. elegans. Science 303:540‐543. | |
Lippincott‐Schwartz, J. and Patterson, G.H. 2003. Development and use of fluorescent protein markers in living cells. Science 300:87‐91. | |
Lupas, A. 1997. Predicting coiled‐coil regions in proteins. Curr. Opin. Struct. Biol. 7:388‐393. | |
Mello, C. and Fire, A. 1995. DNA transformation. Methods Cell Biol. 48:451‐482. | |
Merke, D.P. and Bornstein, S.R. 2005. Congenital adrenal hyperplasia. Lancet 365:2125‐2136. | |
Miller, D.M. and Shakes, D.C. 1995. Immunofluorescence microscopy. Methods Cell Biol. 48:365‐394. | |
Mount, D.W. 2004. Bioinformatics: Sequence and Genome Analysis, 2nd. ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. | |
Mulder, N.J., Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Binns, D., Bradley, P., Bork, P., Bucher, P., Cerutti, L., Copley, R., Courcelle, E., Das, U., Durbin, R., Fleischmann, W., Gough, J., Haft, D., Harte, N., Hulo, N., Kahn, D., Kanapin, A., Krestyaninova, M., Lonsdale, D., Lopez, R., Letunic, I., Madera, M., Maslen, J., McDowall, J., Mitchell, A., Nikolskaya, A.N., Orchard, S., Pagni, M., Ponting, C.P., Quevillon, E., Selengut, J., Sigrist, C.J., Silventoinen, V., Studholme, D.J., Vaughan, R., and Wu, C.H. 2005. InterPro, progress and status in 2005. Nucl. Acids Res. 33:D201‐D205. | |
Mµller, H.M., Kenny, E.E., and Sternberg, P.W. 2004. Textpresso: An ontology‐based information retrieval and extraction system for biological literature. PLoS Biol. 2:E309. | |
Nardone, J., Lee, D.U., Ansel, K.M., and Rao, A. 2004. Bioinformatics for the ‘bench biologist’: How to find regulatory regions in genomic DNA. Nat. Immunol. 5:768‐774.
O'Connell, K.F., Leys, C.M., and White, J.G. 1998. A genetic screen for temperature‐sensitive cell‐division mutants of Caenorhabditis elegans. Genetics 149:1303‐1321.
Pogue, D. 2005. Mac OS X: The Missing Manual, Tiger Edition. O'Reilly & Associates, Inc., Sebastopol, Calif.
Reese, G., Yarger, R.J., and King, T. 2002. Managing and Using MySQL, 2nd ed. O'Reilly & Associates, Inc., Sebastopol, Calif.
Riddle, D.L., Blumenthal, T., Meyer, B.J., and Priess, J.R. (eds.). 1997. C. elegans II. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Schuler, G.D. 1997. Sequence mapping by electronic PCR. Genome Res. 7:541‐550.
Schwartz, R.L., Phoenix, T., and Foy, B.D. 2005. Learning Perl, 4th ed. O'Reilly & Associates, Inc., Sebastopol, Calif.
Schwarz, E.M., Stein, L.D., and Sternberg, P.W. 2002. Caenorhabditis elegans databases. Curr. Genomics 3:111‐119.
Sieburth, D., Ch'ng, Q., Dybbs, M., Tavazoie, M., Kennedy, S., Wang, D., Dupuy, D., Rual, J.F., Hill, D.E., Vidal, M., Ruvkun, G., and Kaplan, J.M. 2005. Systematic analysis of genes required for synapse structure and function. Nature 436:510‐517.
Simmer, F., Moorman, C., Van Der Linden, A.M., Kuijk, E., Van Den Berghe, P.V., Kamath, R., Fraser, A.G., Ahringer, J., and Plasterk, R.H. 2003. Genome‐wide RNAi of C. elegans using the hypersensitive rrf‐3 strain reveals novel gene functions. PLoS Biol. 1:E12.
Simpson, P.T., Reis‐Filho, J.S., Gale, T., and Lakhani, S.R. 2005. Molecular evolution of breast cancer. J. Pathol. 205:248‐254.
Stajich, J.E., Block, D., Boulez, K., Brenner, S.E., Chervitz, S.A., Dagdigian, C., Fuellen, G., Gilbert, J.G., Korf, I., Lapp, H., Lehvaslaiho, H., Matsalla, C., Mungall, C.J., Osborne, B.I., Pocock, M.R., Schattner, P., Senger, M., Stein, L.D., Stupka, E., Wilkinson, M.D., and Birney, E. 2002. The Bioperl toolkit: Perl modules for the life sciences. Genome Res. 12:1611‐1618.
Stein, L.D. and Thierry‐Mieg, J. 1998. Scriptable access to the Caenorhabditis elegans genome sequence and other ACEDB databases. Genome Res. 8:1308‐1315.
Stein, L.D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J.E., Harris, T.W., Arva, A., and Lewis, S. 2002. The Generic Genome Browser: A building block for a model organism system database. Genome Res. 12:1599‐1610.
Stein, L.D., Bao, Z., Blasiar, D., Blumenthal, T., Brent, M.R., Chen, N., Chinwalla, A., Clarke, L., Clee, C., Coghlan, A., Coulson, A., D'Eustachio, P., Fitch, D.H., Fulton, L.A., Fulton, R.E., Griffiths‐Jones, S., Harris, T.W., Hillier, L.W., Kamath, R., Kuwabara, P.E., Mardis, E.R., Marra, M.A., Miner, T.L., Minx, P., Mullikin, J.C., Plumb, R.W., Rogers, J., Schein, J.E., Sohrmann, M., Spieth, J., Stajich, J.E., Wei, C., Willey, D., Wilson, R.K., Durbin, R., and Waterston, R.H. 2003. The genome sequence of Caenorhabditis briggsae: A platform for comparative genomics. PLoS Biol. 1:E45.
Stone, M., Ockman, S., and DiBona, C. (eds.). 1999. Open Sources: Voices From the Open Source Revolution. O'Reilly & Associates, Inc., Sebastopol, Calif.
Sulston, J.E., Schierenberg, E., White, J.G., and Thomson, J.N. 1983. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev. Biol. 100:64‐119.
Swan, K.A., Curtis, D.E., McKusick, K.B., Voinov, A.V., Mapa, F.A., and Cancilla, M.R. 2002. High‐throughput gene mapping in Caenorhabditis elegans. Genome Res. 12:1100‐1105.
Tabara, H., Motohashi, T., and Kohara, Y. 1996. A multi‐well version of in situ hybridization on whole mount embryos of Caenorhabditis elegans. Nucl. Acids Res. 24:2119‐2124.
Tisdall, J.D. 2003. Mastering Perl for Bioinformatics. O'Reilly & Associates, Inc., Sebastopol, Calif.
Welsh, M., Dalheimer, M.K., Dawson, T., and Kaufman, L. 2002. Running Linux, 4th ed. O'Reilly & Associates, Inc., Sebastopol, Calif.
White, J.G., Southgate, E., Thomson, J.N., and Brenner, S. 1986. The structure of the nervous system of Caenorhabditis elegans. Philos. Trans. R. Soc. Lond., B, Biol. Sci. 314:1‐340.
Wicks, S.R., Yeh, R.T., Gish, W.R., Waterston, R.H., and Plasterk, R.H. 2001. Rapid gene mapping in Caenorhabditis elegans using a high density polymorphism map. Nat. Genet. 28:160‐164.
Wood, W.B. (ed.). 1988. The nematode Caenorhabditis elegans. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
Wootton, J.C. 1994. Non‐globular domains in protein sequences: Automated segmentation using complexity measures. Comput. Chem. 18:269‐285.
Key References
Brenner, 1974. See above.
The beginning of modern C. elegans research, this article remains vividly readable and gives a clear introduction to the goals and tactics of worm genetics.
C. elegans Sequencing Consortium, 1998. See above.
This summarizes the first findings from the near‐completion (∼98%) of the C. elegans genome, and gives useful background information about how the genomic sequence was acquired and organized. (Final closure of the last 2% of gaps in the genome sequence was achieved in November 2002, four years later.)
Harris et al., 2004. See above.
Gene Ontology has become a recognized common vocabulary for functionally annotating gene products in both WormBase and many other genomic databases.
Mulder et al., 2005. See above.
Like Gene Ontology terms, InterPro motifs have become a common vocabulary for genome annotation in WormBase and elsewhere.
Stein et al., 2002. See above.
The genome browser described here is extensively used in WormBase.
Sulston et al., 1983. See above.
The lineage browser in WormBase is an attempt to provide a Web interface for small slices of the entire set of findings given here.
White et al., 1986. See above.
The neuronal connections and anatomy stored in WormBase are largely derived from this work.
Internet Resources
http://wormbase.org
The main Web site for WormBase. It is meant to be maximally stable, at the cost of being slightly less bleeding‐edge than the development site.
http://dev.wormbase.org
The development Web site for WormBase. This site runs the latest release of the WormBase data, allowing any bugs in the data or their presentation to be caught before the release is put on the main site. This is therefore the site to use for the very latest information on C. elegans (the main site lags by two weeks, the interval between successive data releases). New additions to the site software are tested here first as well.
http://www.wormbook.org
WormBook is an online anthology of over 70 articles reviewing a great deal of current knowledge about C. elegans biology in 2005. All articles are provided in both HTML and PDF format, and are freely downloadable. Along with its two predecessors (Wood, 1988; Riddle et al., 1997), WormBook is strongly recommended for anybody starting work on this organism or studying it.
http://ws150.wormbase.org
The Web site for the archival WS150 release of WormBase's data. The advantage of this site is that the data do not change, and thus can be used for reliable cross‐comparison of bioinformatic analyses by different research groups at different times. Similar sites exist for several earlier releases numbered in multiples of ten (e.g., WS140: http://ws140.wormbase.org), and future archival sites are planned for roughly once every 7 months.
http://www.its.caltech.edu/~wormbase/userguide
The User's Guide for WormBase.
ftp://ftp.wormbase.org/pub/wormbase
This FTP site contains archives of the wormpep and wormrna files, and the core software for running WormBase as a local installation.
http://caltech.wormbase.org
The California Institute of Technology mirror site for WormBase, maintained by Erich Schwarz (emsch@its.caltech.edu).
http://imbb.wormbase.org
The Greek mirror site for WormBase, maintained by Nektarios Tavernarakis (tavernarakis@imbb.forth.gr).
ftp://ftp.sanger.ac.uk/pub2/wormbase
This FTP site contains the complete releases of WormBase's data (typically the most recent two, along with permanently archived releases such as WS100 through WS150).
http://www.wormatlas.org
The public atlas of C. elegans anatomy, with several key references for worm anatomy available online, including White et al. (1986).
http://elegans.swmed.edu
A useful site with links to worm literature and genetic strain information, maintained by Leon Avery (leon@eatworms.swmed.edu).
http://wormbase.org/about/about_Celegans.html
Links to several other C. elegans databases are collected here.
http://www.geneontology.org
This site gives extensive documentation for the Gene Ontology system, increasingly used for functional annotation in WormBase.