Using CisGenome to Analyze ChIP‐chip and ChIP‐seq Data
Abstract
Chromatin immunoprecipitation (ChIP) coupled with genome tiling array hybridization (ChIP‐chip) and ChIP followed by massively parallel sequencing (ChIP‐seq) are high‐throughput approaches to profiling genome‐wide protein‐DNA interactions. Both technologies are increasingly used to study transcription‐factor binding sites and chromatin modifications. CisGenome is an integrated software system for analyzing ChIP‐chip and ChIP‐seq data. This unit describes basic functions of CisGenome and how to use them to find genomic regions with protein‐DNA interactions, visualize binding signals, associate binding regions with nearby genes, search for novel transcription‐factor binding motifs, and map existing DNA sequence motifs to user‐supplied genomic regions to define their exact locations. Curr. Protoc. Bioinform. 33:2.13.1‐2.13.45. © 2011 by John Wiley & Sons, Inc.
Keywords: transcription factor; chromatin immunoprecipitation; tiling array; next generation sequencing; motif; gene regulation
Table of Contents
- Introduction
- Basic Protocol 1: ChIP‐chip Peak Calling for Affymetrix Tiling Array Data
- Basic Protocol 2: Visualization
- Basic Protocol 3: Peak‐Gene Association
- Basic Protocol 4: DNA Sequence Retrieval
- Basic Protocol 5: De Novo Motif Discovery
- Basic Protocol 6: Motif Mapping
- Basic Protocol 7: ChIP‐chip Peak Calling for Other Tiling Array Platforms
- Basic Protocol 8: ChIP‐seq Peak Calling (One‐Sample Analysis)
- Basic Protocol 9: ChIP‐seq Peak Calling (Two‐Sample Analysis)
- Support Protocol 1: Installing CisGenome
- Support Protocol 2: Installing Genome Databases
- Guidelines for Understanding Results
- Commentary
- Literature Cited
- Figures
- Tables
Figures
- Figure 2.13.1 Overview of the CisGenome basic data analysis pipeline.
- Figure 2.13.2 The CisGenome graphical user interface (GUI) and menu system. The menu for creating an Affymetrix tiling array data set is shown as an example.
- Figure 2.13.3 The dialog for adding BPMAP files to an Affymetrix ChIP‐chip data set.
- Figure 2.13.4 The dialog for adding CEL files to an Affymetrix ChIP‐chip data set.
- Figure 2.13.5 The newly created tiling array data set shown in the CisGenome Project Explorer. Double‐clicking a CEL file opens a CisGenome Browser window displaying a heat map of the array image.
- Figure 2.13.6 The dialog for normalizing an Affymetrix tiling array data set.
- Figure 2.13.7 ChIP‐chip peak calling. Before peak detection, a normalized tiling array data set (circle 1.10) must be available in the Project Explorer, and several basic peak calling parameters must be provided in a dialog.
- Figure 2.13.8 ChIP‐chip peak calling results. Peaks are summarized in a COD file shown in the right window. A number of BAR files are also created to store enrichment signals. Both the COD file and the BAR files are added to the Project Explorer on the left.
- Figure 2.13.9 CisGenome Browser. (A) The shortcut icon for the browser. (B) The first page of the browser.
- Figure 2.13.10 The browser page for choosing the browser session type.
- Figure 2.13.11 A newly created, empty browser session.
- Figure 2.13.12 The browser page for choosing the data track type.
- Figure 2.13.13 The track configuration page in CisGenome Browser.
- Figure 2.13.14 CisGenome Browser showing different types of data. Tools to adjust the display styles are highlighted.
- Figure 2.13.15 Peak‐gene association. (A) The dialog for annotating peaks with nearby genes. (B) The annotation results returned in a COD file.
- Figure 2.13.16 DNA sequence retrieval. (A) The parameter configuration dialog. (B) Returned files. The DNA sequences are returned in FASTA format (top). If cross‐species conservation scores are requested, conservation scores for each sequence are returned as well. The conservation scores can be returned in a text format (bottom left), in BED format (bottom right), or in a binary CS format (not shown).
- Figure 2.13.17 The parameter configuration dialog for de novo motif discovery.
- Figure 2.13.18 An example of the summary file produced by de novo motif discovery.
- Figure 2.13.19 Motif matrix files produced by de novo motif discovery. (A) Each motif matrix is stored in a MAT file. (B) The list of motifs is stored in a MATL file. (C) Double‐clicking the MATL file in Project Explorer opens CisGenome Browser to display sequence logos of the motifs. The last motif in this example matches the known Gli motif.
- Figure 2.13.20 An example of the CONS file for describing a motif consensus sequence.
- Figure 2.13.21 Mapping a motif matrix to a list of genomic regions. (A) The parameter configuration dialog. (B) The mapped motif sites are saved to a COD file.
- Figure 2.13.22 Input data format for calling peaks from ChIP‐chip experiments based on non‐Affymetrix tiling array platforms.
- Figure 2.13.23 The parameter configuration dialog for normalizing ChIP‐chip data from a text file.
- Figure 2.13.24 Converting ChIP‐chip data in a text file to a tiling array data set consisting of BAR files. (A) The parameter configuration dialog. (B) The converted data set shown in Project Explorer.
- Figure 2.13.25 A sample ALN file.
- Figure 2.13.26 Loading aligned reads for ChIP‐seq peak calling. (A) The parameter configuration dialog for loading the ALN file. (B) Loaded data shown in Project Explorer.
- Figure 2.13.27 FDR computation for a one‐sample ChIP‐seq experiment. (A) The parameter configuration dialog. (B) The results are returned in a table that summarizes statistical properties of the data.
- Figure 2.13.28 Peak calling from one‐sample ChIP‐seq data. (A) The parameter configuration dialog. (B) The detected peaks are reported in a COD file.
- Figure 2.13.29 Data for two‐sample ChIP‐seq analysis loaded into CisGenome.
- Figure 2.13.30 FDR computation for a two‐sample ChIP‐seq experiment. (A) The parameter configuration dialog. (B) The results are returned in a table that summarizes statistical properties of the data.
- Figure 2.13.31 The parameter configuration dialog for two‐sample ChIP‐seq peak calling.
- Figure 2.13.32 An example of the CisGenome.ini file.
- Figure 2.13.33 Loading a genome database into the CisGenome GUI. (A) In the file open dialog, choose the file named [species]_[assembly].cgw in the genome database folder. (B) The loaded database shown in Project Explorer.