Using the Generic Synteny Browser (GBrowse_syn)

Internet, 2013-12-31

Abstract

Genome browsers are software that allows the user to view genome annotations in the context of a reference sequence, such as a chromosome, contig, or scaffold. The Generic Genome Browser (GBrowse) is an open-source genome browser package developed as part of the Generic Model Organism Database Project (see UNIT ; Stein et al., 2002). The increasing number of sequenced genomes has led to a corresponding growth in the field of comparative genomics, which requires methods to view and compare multiple genomes. Using the same software framework as GBrowse, the Generic Synteny Browser (GBrowse_syn) allows the comparison of colinear regions of multiple genomes using the familiar GBrowse-style Web page. Like GBrowse, GBrowse_syn can be configured to display any organism, and is currently the synteny browser used for model organisms such as C. elegans (WormBase; http://www.wormbase.org; see UNIT ) and Arabidopsis (TAIR; http://www.arabidopsis.org; see UNIT ). GBrowse_syn is part of the GBrowse software package and can be downloaded from the Web and run on any Unix-like operating system, such as Linux, Solaris, or Mac OS X. GBrowse_syn is still under active development. This unit covers installation and configuration as part of the current stable version of GBrowse (v. 1.71). Curr. Protoc. Bioinform. 31:9.12.1-9.12.25. © 2010 by John Wiley & Sons, Inc.

Keywords: GBrowse; genome browser; synteny

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Introduction
Basic Protocol 1: Configuring and Using GBrowse_syn
Basic Protocol 2: Browsing OrthoCluster Synteny Blocks with GBrowse_syn
Alternate Protocol 1: Loading MERCATOR into the GBrowse_syn Database
Support Protocol 1: Installing GBrowse_syn in the Unix/Linux Environment
Commentary
Literature Cited
Figures
Tables

Figures

Figure 9.12.1 The first few lines of rice.aln, a CLUSTAL‐formatted alignment file. Note that this is simply a formatting convention and does not imply that the CLUSTAL program was used to generate the data.

Figure 9.12.2 A sample of the genome annotations for the “rice” data source. These annotations are in GFF3 format, which is explained in detail in UNIT . This sample contains three gene models in a three‐level containment hierarchy (gene > mRNA > CDS).

Figure 9.12.3 Complete configuration file for the “oryza” data source that is installed as an example with the GBrowse package. This file is similar in structure to a GBrowse configuration file, as described in UNIT . In addition to the connection information for the joining database, this file specifies the location of the configuration files for the species to be compared in GBrowse_syn and the theme color and tracks to load for each species.

Figure 9.12.4 The rice_synteny.conf configuration file. Minimal information is required, as this is not intended as a stand‐alone genome browser. A detailed list of configurable options for GBrowse configuration files can be found in UNIT . Note that the [EG] track is referenced by the main configuration file oryza.synconf in Figure .

Figure 9.12.5 The startup screen for the Oryza sativa sample data source included with the GBrowse package. Clicking on one of the example segment links is a good way to get started browsing.

Figure 9.12.6 Example segment rice 3:16050173..1606497. With the default options, shaded polygons with grid lines are shown. The grid lines correspond to mapped sequence coordinates in the aligned segments.

Figure 9.12.7 An excerpt from the GMOD (Generic Model Organism Database) Wiki pages that describes Web page features for GBrowse_syn. These features continue to be updated and changes are posted to the Wiki.

Figure 9.12.8 A five‐species whole‐genome DNA sequence alignment comparison from WormBase (http://www.wormbase.org), showing regions that are co‐linear with Caenorhabditis elegans genomic segment X:1085001..1115000. The displayed region uses the default settings for the display options shown in the bottom panels of the image.

Figure 9.12.9 Alignment chaining. (A) Alignment of a segment of the rice and wild‐rice genomes with the alignment data provided. (B) The same region with the “chain alignments” option selected. Same‐strand alignments with monotonically increasing (or decreasing) coordinates are merged or connected by dashed lines where there are gaps. This example allows gaps of up to 50 kb between chained alignments. Note the loss of two genes in domestic versus wild rice.
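
The chaining rule described in this caption can be illustrated with a short sketch. This is not GBrowse_syn's own implementation (GBrowse is written in Perl); it is a simplified greedy version in Python, and the `Alignment` record, its field names, and the functions `can_chain` and `chain_alignments` are all hypothetical names chosen for illustration. It assumes alignments do not overlap and applies the 50 kb gap limit to both genomes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Alignment:
    """One pairwise alignment block (hypothetical record, not a GBrowse_syn type)."""
    ref_start: int
    ref_end: int
    tgt_start: int
    tgt_end: int
    strand: int  # +1 = same orientation, -1 = inverted


def can_chain(prev: Alignment, nxt: Alignment, max_gap: int) -> bool:
    """Return True if nxt may follow prev in a chain (prev comes first on the reference)."""
    if prev.strand != nxt.strand:
        return False  # only same-strand alignments are chained
    ref_gap = nxt.ref_start - prev.ref_end
    if prev.strand > 0:
        tgt_gap = nxt.tgt_start - prev.tgt_end  # target coordinates must increase
    else:
        tgt_gap = prev.tgt_start - nxt.tgt_end  # target coordinates must decrease
    return 0 <= ref_gap <= max_gap and 0 <= tgt_gap <= max_gap


def chain_alignments(alignments: List[Alignment],
                     max_gap: int = 50_000) -> List[List[Alignment]]:
    """Greedily group alignments, sorted by reference start, into chains."""
    chains: List[List[Alignment]] = []
    for aln in sorted(alignments, key=lambda a: a.ref_start):
        if chains and can_chain(chains[-1][-1], aln, max_gap):
            chains[-1].append(aln)
        else:
            chains.append([aln])
    return chains


# Made-up coordinates, for illustration only.
example = [
    Alignment(100, 200, 1_000, 1_100, +1),
    Alignment(300, 400, 1_200, 1_300, +1),          # small gap, same strand: chained
    Alignment(100_000, 100_100, 5_000, 5_100, +1),  # gap over 50 kb: new chain
    Alignment(100_200, 100_300, 4_800, 4_900, -1),  # strand change: new chain
]
chains = chain_alignments(example)
```

Each resulting chain corresponds to one merged or dash-connected block in panel B; a browser would draw the chain from the first member's start to the last member's end.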

Figure 9.12.10 The first five lines of genome_bri.text, an example of a genome annotation file. Each row represents one gene. The five columns are gene name, reference sequence, start, end, and strand, where start and end are 1‐based coordinates relative to the reference sequence and strands 1 and −1 are the plus (+) and minus (−) strands, respectively.
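
As a rough illustration of the row layout described in this caption, the following Python sketch reads one row into a typed record. The `Gene` record and `parse_gene_line` function are hypothetical names, the sample row is made up rather than taken from the figure, and the sketch assumes the columns are separated by whitespace (the caption does not state the delimiter).

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """One row of the annotation file (hypothetical record for illustration)."""
    name: str
    ref: str
    start: int   # 1-based, relative to the reference sequence
    end: int     # 1-based, relative to the reference sequence
    strand: str  # "+" or "-"


def parse_gene_line(line: str) -> Gene:
    """Parse 'name ref start end strand', assuming whitespace-separated columns."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
    name, ref, start, end, strand = fields
    strand_code = int(strand)
    if strand_code not in (1, -1):
        raise ValueError(f"strand must be 1 or -1, got {strand!r}")
    return Gene(name, ref, int(start), int(end), "+" if strand_code == 1 else "-")


# Made-up row, not copied from the figure.
gene = parse_gene_line("geneA\tchrI\t1000\t2500\t-1")
```

Mapping 1 and -1 to "+" and "-" up front keeps later steps (for example, writing GFF-style output) free of format-specific strand codes.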

Figure 9.12.11 The configuration file orthologs.synconf. Note that coordinate‐sparse data require the use of the grid coordinates = exact option.

Figure 9.12.12 The ele.conf species configuration file. Note that the Bio::DB::GFF adapter is used for the GFF2 gene annotation data.

Figure 9.12.13 The starting page for the “orthocluster” data source.

Figure 9.12.14 Example segment chrX:255000..275000. With the default options, shaded polygons with grid lines are shown. Note that the grid lines correspond to orthologous gene boundaries.

Figure 9.12.15 The welcome screen for a new, unconfigured GBrowse_syn installation.

Literature Cited
	Bray, N. and Pachter, L. 2004. MAVID: Constrained ancestral alignment of multiple sequences. Genome Res. 14:693‐699.
	Dewey, C.N. 2006. Whole‐Genome Alignments and Polytopes for Comparative Genomics. EECS Department, University of California, Berkeley, California.
	Dewey, C.N. 2007. Aligning multiple whole genomes with Mercator and MAVID. Methods Mol. Biol. 395:221‐236.
	Kent, W.J., Baertsch, R., Hinrichs, A., Miller, W., and Haussler, D. 2003. Evolution's cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes. Proc. Natl. Acad. Sci. U.S.A. 100:11484‐11489.
	Ng, M.P., Vergara, I.A., Frech, C., Chen, Q., Zeng, X., Pei, J., and Chen, N. 2009. OrthoClusterDB: An online platform for synteny blocks. BMC Bioinformatics 10:192.
	Schwartz, S., Kent, W.J., Smit, A., Zhang, Z., Baertsch, R., Hardison, R.C., Haussler, D., and Miller, W. 2003. Human‐mouse alignments with BLASTZ. Genome Res. 13:103‐107.
	Stein, L.D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J.E., Harris, T.W., Arva, A., and Lewis, S. 2002. The generic genome browser: A building block for a model organism system database. Genome Res. 12:1599‐1610.
	Zeng, X., Pei, J., Vergara, I.A., Nesbitt, M., Wang, K., and Chen, N. 2008. OrthoCluster: A new tool for mining synteny blocks and applications in comparative genomics. In 11th International Conference on Extending Database Technology (EDBT), March 25‐30, 2008, Nantes, France.