Using CisGenome to Analyze ChIP‐chip and ChIP‐seq Data

Source: Internet, 2013-12-31

Abstract

Chromatin immunoprecipitation (ChIP) coupled with genome tiling array hybridization (ChIP‐chip) and ChIP followed by massively parallel sequencing (ChIP‐seq) are high‐throughput approaches to profiling genome‐wide protein‐DNA interactions. Both technologies are increasingly used to study transcription‐factor binding sites and chromatin modifications. CisGenome is an integrated software system for analyzing ChIP‐chip and ChIP‐seq data. This unit describes basic functions of CisGenome and how to use them to find genomic regions with protein‐DNA interactions, visualize binding signals, associate binding regions with nearby genes, search for novel transcription‐factor binding motifs, and map existing DNA sequence motifs to user‐supplied genomic regions to define their exact locations. Curr. Protoc. Bioinform. 33:2.13.1‐2.13.45. © 2011 by John Wiley & Sons, Inc.

Keywords: transcription factor; chromatin immunoprecipitation; tiling array; next generation sequencing; motif; gene regulation

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Introduction
Basic Protocol 1: ChIP‐chip Peak Calling for Affymetrix Tiling Array Data
Basic Protocol 2: Visualization
Basic Protocol 3: Peak‐Gene Association
Basic Protocol 4: DNA Sequence Retrieval
Basic Protocol 5: De Novo Motif Discovery
Basic Protocol 6: Motif Mapping
Basic Protocol 7: ChIP‐chip Peak Calling for Other Tiling Array Platforms
Basic Protocol 8: ChIP‐seq Peak Calling (One‐Sample Analysis)
Basic Protocol 9: ChIP‐seq Peak Calling (Two‐Sample Analysis)
Support Protocol 1: Installing CisGenome
Support Protocol 2: Installing Genome Databases
Guidelines for Understanding Results
Commentary
Literature Cited
Figures
Tables

Figures

Figure 2.13.1 Overview of the CisGenome basic data analysis pipeline.

Figure 2.13.2 The CisGenome graphical user interface (GUI) and menu system. The menu for creating an Affymetrix tiling array data set is shown as an example.

Figure 2.13.3 The dialog for adding BPMAP files to an Affymetrix ChIP‐chip data set.

Figure 2.13.4 The dialog for adding CEL files to an Affymetrix ChIP‐chip data set.

Figure 2.13.5 The newly created tiling array data set shown in the CisGenome Project Explorer. Double‐clicking a CEL file opens a CisGenome Browser window displaying a heat map of the array image.

Figure 2.13.6 The dialog for normalizing an Affymetrix tiling array data set.

Figure 2.13.7 ChIP‐chip peak calling. Before peak detection, a normalized tiling array data set (circle 1.10) must be available in the Project Explorer, and several basic peak‐calling parameters must be supplied in a dialog.

Figure 2.13.8 ChIP‐chip peak calling results. Peaks are summarized in a COD file shown in the right window. A number of BAR files are also created to store enrichment signals. Both the COD file and the BAR files are added to the Project Explorer on the left.

Figure 2.13.9 CisGenome Browser. (A) The shortcut icon for the browser. (B) The first page of the browser.

Figure 2.13.10 The browser page for choosing the browser session type.

Figure 2.13.11 A newly created, empty browser session.

Figure 2.13.12 The browser page for choosing the data track type.

Figure 2.13.13 The track configuration page in CisGenome Browser.

Figure 2.13.14 CisGenome Browser showing different types of data. Tools for adjusting the display styles are highlighted.

Figure 2.13.15 Peak‐gene association. (A) The dialog for annotating peaks with nearby genes. (B) The annotation results returned in a COD file.

Figure 2.13.16 DNA sequence retrieval. (A) The parameter configuration dialog. (B) Returned files. The DNA sequences are returned in FASTA format (top). If cross‐species conservation scores are requested, conservation scores for each sequence are returned as well. The conservation scores can be returned in a text format (bottom left), in BED format (bottom right), or in a binary CS format (not shown).

Figure 2.13.17 The parameter configuration dialog for de novo motif discovery.

Figure 2.13.18 An example of the summary file produced by de novo motif discovery.

Figure 2.13.19 Motif matrix files produced by de novo motif discovery. (A) Each motif matrix is stored in a MAT file. (B) The list of motifs is stored in a MATL file. (C) Double‐clicking the MATL file in Project Explorer opens CisGenome Browser to display sequence logos of the motifs. The last motif in this example matches the known Gli motif.

Figure 2.13.20 An example of the CONS file for describing a motif consensus sequence.

Figure 2.13.21 Mapping a motif matrix to a list of genomic regions. (A) The parameter configuration dialog. (B) The mapped motif sites are saved to a COD file.

Figure 2.13.22 Input data format for calling peaks from ChIP‐chip experiments based on non‐Affymetrix tiling array platforms.

Figure 2.13.23 The parameter configuration dialog for normalizing ChIP‐chip data from a text file.

Figure 2.13.24 Converting ChIP‐chip data in a text file to a tiling array data set consisting of BAR files. (A) The parameter configuration dialog. (B) The converted data set shown in Project Explorer.

Figure 2.13.25 A sample ALN file.

Figure 2.13.26 Loading aligned reads for ChIP‐seq peak calling. (A) The parameter configuration dialog for loading the ALN file. (B) Loaded data shown in Project Explorer.

Figure 2.13.27 FDR computation for a one‐sample ChIP‐seq experiment. (A) The parameter configuration dialog. (B) The results are returned in a table that summarizes statistical properties of the data.

Figure 2.13.28 Peak calling from one‐sample ChIP‐seq data. (A) The parameter configuration dialog. (B) The detected peaks are reported in a COD file.

Figure 2.13.29 Data for two‐sample ChIP‐seq analysis loaded into CisGenome.

Figure 2.13.30 FDR computation for a two‐sample ChIP‐seq experiment. (A) The parameter configuration dialog. (B) The results are returned in a table that summarizes statistical properties of the data.

Figure 2.13.31 The parameter configuration dialog for two‐sample ChIP‐seq peak calling.

Figure 2.13.32 An example of the CisGenome.ini file.

Figure 2.13.33 Loading a genome database into the CisGenome GUI. (A) In the file open dialog, choose the file named [species]_[assembly].cgw in the genome database folder. (B) The loaded database shown in Project Explorer.

Literature Cited
	Barski, A., Cuddapah, S., Cui, K., Roh, T.Y., Schones, D.E., Wang, Z., Wei, G., Chepelev, I., and Zhao, K. 2007. High‐resolution profiling of histone methylations in the human genome. Cell 129:823‐837.
	Bolstad, B.M., Irizarry, R.A., Astrand, M., and Speed, T.P. 2003. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19:185‐193.
	Cawley, S., Bekiranov, S., Ng, H.H., Kapranov, P., Sekinger, E.A., Kampa, D., Piccolboni, A., Sementchenko, V., Cheng, J., Williams, A.J., Wheeler, R., Wong, B., Drenkow, J., Yamanaka, M., Patel, S., Brubaker, S., Tammana, H., Helt, G., Struhl, K., and Gingeras, T.R. 2004. Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116:499‐509.
	Crooks, G.E., Hon, G., Chandonia, J.M., and Brenner, S.E. 2004. WebLogo: A sequence logo generator. Genome Res. 14:1188‐1190.
	The Gene Ontology Consortium. 2000. Gene ontology: Tool for the unification of biology. Nat. Genet. 25:25‐29.
	Jensen, S.T., Liu, X.S., Zhou, Q., and Liu, J.S. 2004. Computational discovery of gene regulatory binding motifs: A Bayesian perspective. Stat. Sci. 19:188‐204.
	Ji, H. and Wong, W.H. 2005. TileMap: Create chromosomal map of tiling array hybridizations. Bioinformatics 21:3629‐3636.
	Ji, H., Jiang, H., Ma, W., Johnson, D.S., Myers, R.M., and Wong, W.H. 2008. An integrated software system for analyzing ChIP‐chip and ChIP‐seq data. Nat. Biotechnol. 26:1293‐1300.
	Johnson, D.S., Mortazavi, A., Myers, R.M., and Wold, B. 2007. Genome‐wide mapping of in vivo protein‐DNA interactions. Science 316:1497‐1502.
	Liu, J.S., Neuwald, A.F., and Lawrence, C.E. 1995. Bayesian models for multiple local sequence alignment and Gibbs sampling strategies. J. Amer. Statist. Assoc. 90:1156‐1170.
	Liu, X.S., Brutlag, D.L., and Liu, J.S. 2002. An algorithm for finding protein‐DNA binding sites with applications to chromatin‐immunoprecipitation microarray experiments. Nat. Biotechnol. 20:835‐839.
	Mikkelsen, T.S., Ku, M., Jaffe, D.B., Issac, B., Lieberman, E., Giannoukos, G., Alvarez, P., Brockman, W., Kim, T.K., Koche, R.P., Lee, W., Mendenhall, E., O'Donovan, A., Presser, A., Russ, C., Xie, X., Meissner, A., Wernig, M., Jaenisch, R., Nusbaum, C., Lander, E.S., and Bernstein, B.E. 2007. Genome‐wide maps of chromatin state in pluripotent and lineage‐committed cells. Nature 448:553‐560.
	Ren, B., Robert, F., Wyrick, J.J., Aparicio, O., Jennings, E.G., Simon, I., Zeitlinger, J., Schreiber, J., Hannett, N., Kanin, E., Volkert, T.L., Wilson, C.J., Bell, S.P., and Young, R.A. 2000. Genome‐wide location and function of DNA binding proteins. Science 290:2306‐2309.
	Robertson, G., Hirst, M., Bainbridge, M., Bilenky, M., Zhao, Y., Zeng, T., Euskirchen, G., Bernier, B., Varhol, R., Delaney, A., Thiessen, N., Griffith, O.L., He, A., Marra, M., Snyder, M., and Jones, S. 2007. Genome‐wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat. Methods 4:651‐657.
	Siepel, A., Bejerano, G., Pedersen, J.S., Hinrichs, A.S., Hou, M., Rosenbloom, K., Clawson, H., Spieth, J., Hillier, L.W., Richards, S., Weinstock, G.M., Wilson, R.K., Gibbs, R.A., Kent, W.J., Miller, W., and Haussler, D. 2005. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15:1034‐1050.
