Digital Gene Expression by Tag Sequencing on the Illumina Genome Analyzer

Internet, 2013-12-31

Abstract

This unit provides a protocol for performing digital gene expression profiling on the Illumina Genome Analyzer sequencing platform. Tag sequencing (Tag‐seq) is an implementation of the LongSAGE protocol on the Illumina sequencing platform that increases utility while reducing both the cost and time required to generate gene expression profiles. The ultra‐high‐throughput sequencing capability of the Illumina platform allows the cost‐effective generation of libraries containing an average of 20 million tags, a 200‐fold improvement over classical LongSAGE. Tag‐seq has less sequence composition bias, leading to a better representation of AT‐rich tag sequences, and allows a more accurate profiling of a subset of the transcriptome characterized by AT‐rich genes expressed at levels below the threshold of detection of LongSAGE (Morrissy et al., 2009). Curr. Protoc. Hum. Genet. 65:11.11.1‐11.11.36 © 2010 by John Wiley & Sons, Inc.

Keywords: gene expression; Tag‐seq; Illumina; RNA; cDNA; tag; PCR

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Table of Contents

Introduction
Basic Protocol 1: First‐ and Second‐Strand cDNA Synthesis for Tag‐seq Library Construction
Basic Protocol 2: Tag Generation
Basic Protocol 3: PCR and Fragment Isolation
Basic Protocol 4: Preparing the Library for Illumina Sequencing
Alternate Protocol 1: Amplified Tag‐seq Library Construction (Tag‐seqLite)
Basic Protocol 5: Data Analysis
Reagents and Solutions
Commentary
Literature Cited
Figures
Tables

Materials

Basic Protocol 1: First‐ and Second‐Strand cDNA Synthesis for Tag‐seq Library Construction

Materials

RNAseZap (Ambion)
DEPC‐treated H₂O (appendix 2D)
Total RNA sample: typically isolated using TRIzol (Invitrogen), AllPrep Mini Kit (Qiagen), or RiboPure Kit (Ambion), and DNase I treated
Oligo(dT) magnetic beads (Invitrogen)
Lysis/Binding buffer (from I‐SAGE Long Kit; Invitrogen, cat. no. T5000‐03)
Wash Buffer B (from I‐SAGE Long Kit; Invitrogen, cat. no. T5000‐03)
1× First‐Strand Buffer (Invitrogen)
5× First‐Strand Buffer (Invitrogen)
4 U/µl RNaseOUT (Invitrogen)
0.1 M DTT (Invitrogen)
5 M betaine (Sigma), prepared using nuclease‐free H₂O
10 mM dNTP mix (10 mM each dNTP; Invitrogen)
200 U/µl SuperScript II Reverse Transcriptase (Invitrogen)
5× Second‐Strand buffer (Invitrogen)
10 U/µl E. coli DNA ligase (Invitrogen)
10 U/µl E. coli DNA polymerase (Invitrogen)
2 U/µl E. coli RNase H (Invitrogen)
Wash Buffer C (from I‐SAGE Long Kit; Invitrogen, cat. no. T5000‐03)
0.5 M EDTA (Invitrogen)
Wash Buffer D (from I‐SAGE Long Kit; Invitrogen, cat. no. T5000‐03)
10× Buffer 4 (New England Biolabs)

Textured nitrile gloves (Fisher)
Bench Coat (bench protection paper; Fisher)
RNase‐free 1.5‐ml non‐stick (siliconized) microcentrifuge tubes (Ambion)
Magnetic stand (Invitrogen, cat. no. R670‐01)
50‐ml conical polypropylene tubes (BD Falcon)
Clay Adams Nutator Shaker (VWR Scientific)
Thermomixers, 1.5 ml (Eppendorf)

Basic Protocol 2: Tag Generation

Materials

100× bovine serum albumin (BSA; New England Biolabs)
10× Buffer 4 (New England Biolabs)
10 U/µl Nla III restriction endonuclease (New England Biolabs)
cDNA‐containing magnetic beads (protocol 1, step 49)
Wash Buffer C (from I‐SAGE Long Kit; Invitrogen, cat. no. T5000‐03)
Wash Buffer D (from I‐SAGE Long Kit; Invitrogen, cat. no. T5000‐03)
10× ligase buffer (from I‐SAGE Long Kit; Invitrogen, cat. no. T5000‐03)
DEPC‐treated H₂O (appendix 2D)
10 µM GEX Adapter 1 (from Illumina Tag Profiling Sample Prep Kit, cat. no. FC‐102‐1005)
5 U/µl T4 DNA ligase (Invitrogen)
DNA Away (Molecular BioProducts)
32 mM (800×) S‐adenosylmethionine (SAM)
10× (and 1×) Buffer 4 (New England Biolabs)
2 U/µl Mme I restriction endonuclease (New England Biolabs)
1 U/µl shrimp alkaline phosphatase (Invitrogen)
2‐ml Phase‐Lock Gel (PLG) tube (heavy; Fisher)
Phenol/chloroform/isoamyl alcohol (PCI; Fisher)
3 M sodium acetate, pH 5.5 (Ambion; also see appendix 2D)
20 mg/ml mussel glycogen
100% and ice‐cold 70% ethanol
1.5 µM GEX Adapter 2 (from Illumina Tag Profiling Sample Prep Kit, cat. no. FC‐102‐1005)

Thermomixers, 1.5 ml (Eppendorf)
Textured nitrile gloves (Fisher)
RNase‐free 1.5‐ml non‐stick (siliconized) microcentrifuge tubes (Ambion)
16°C water bath (Fisher Isotemp 3016)
Magnetic stand (Invitrogen, cat. no. R670‐01)
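Several reagents in the list above are given as fold‐concentrated stocks (100× BSA, 10× Buffer 4, and 32 mM SAM labelled 800×). The following is a minimal Python sketch of the dilution arithmetic this labelling implies. The 100‐µl reaction volume and the function names are assumptions made for illustration only; they do not come from the protocol.

```python
def working_concentration(stock_conc, fold):
    """Concentration at 1x for a stock labelled as an N-fold concentrate."""
    return stock_conc / fold


def stock_volume_ul(fold, final_volume_ul):
    """Volume of an N-fold stock that gives 1x in final_volume_ul."""
    return final_volume_ul / fold


# 32 mM SAM labelled 800x: concentration at 1x, converted from mM to µM
sam_1x_um = working_concentration(32.0, 800) * 1000

# Hypothetical 100-µl reaction (assumed volume, not taken from the protocol)
reaction_ul = 100.0
volumes = {
    "10x Buffer 4": stock_volume_ul(10, reaction_ul),
    "100x BSA": stock_volume_ul(100, reaction_ul),
}

print(f"SAM at 1x: {sam_1x_um:.0f} µM")
for reagent, vol in volumes.items():
    print(f"{reagent}: {vol:.1f} µl per {reaction_ul:.0f} µl reaction")
```

The same two helpers apply to any stock in the list that is labelled by fold concentration; the actual reaction volumes should be taken from the full protocol.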

Basic Protocol 3: PCR and Fragment Isolation

Materials

RNaseZap (Ambion)
Ultrapure Water (Invitrogen)
5× HF buffer (Finnzymes)
DMSO (Invitrogen)
10 mM dNTP mix (10 mM each dNTP; Invitrogen)
GEX PCR Primers 1 and 2 (from Illumina Tag Profiling Sample Prep Kit, cat. no. FC‐102‐1005)
2 U/µl Phusion Hot Start DNA Polymerase (Finnzymes)
40% acrylamide (19:1 acrylamide:bis; BioRad)
50× TAE buffer (appendix 2D)
10% (w/v) ammonium persulfate (BioRad; prepare immediately before use)
TEMED (BioRad)
Micro‐90 cleaning solution (Cole‐Parmer)
10× bromphenol blue/xylene cyanol loading dye (see recipe)
25‐bp DNA ladder (20 ng/µl; Invitrogen) in loading dye (5:1)
SYBR Green (Cambrex Bio Science Inc.)
Elution buffer: 5 parts low‐TE buffer (10 mM Tris⋅Cl, pH 7.4 and 1 mM EDTA; see appendix 2D) plus 1 part 7.5 M ammonium acetate
3 M sodium acetate, pH 5.5 (Ambion, or see appendix 2D)
20 mg/ml mussel glycogen (Roche Scientific)
70% and 100% ethanol (anhydrous ethyl alcohol; Commercial Alcohol Inc., http://www.comalc.com/)
EB buffer (from Qiagen PCR Purification Kit)

Textured nitrile gloves (Fisher)
RNase‐free 1.5‐ml, 0.5‐ml, and 2‐ml microcentrifuge tubes (Ambion)
0.2‐ml thin‐walled, RNase‐free PCR tubes (Ambion)
Peltier Thermal Cycler (MJ Research)
Glass plates for PAGE gels (Owl Scientific)
Bags for gel pouring (Fisher Scientific)
Casting tray (Owl Scientific)
Colored tape (assorted; VWR Scientific)
Combs (15 well, 1.5 mm; Owl Scientific, cat. no. P1‐15D)
Spacers (1.5 mm; Owl Scientific, cat. no. P1‐SD)
Gel pouches (Owl Scientific, cat. no. GP2‐25)
50‐ml conical polypropylene tubes (e.g., BD Falcon)
Penguin Owl Electrophoresis System (Owl Scientific)
Power supply (LVC2kW, 48VDCV; Tyco Electronics)
18‐G needle
Typhoon Gel Scanner (GE Healthcare)
Dark Reader (UV transilluminator) (InterScience; http://www.interscience.com/)
Bench Coat (bench protection paper; Fisher)
Heating block
SpinX columns (0.22‐µm; Costar)
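The elution buffer in the list above is specified by parts (5 parts low‐TE buffer plus 1 part 7.5 M ammonium acetate) rather than by final concentration. The sketch below works out the final concentrations in Python, assuming the volumes are simply additive. The helper name `mix_by_parts` is hypothetical.

```python
def mix_by_parts(components):
    """Final solute concentrations after mixing solutions by parts.

    components: list of (parts, {solute_name: concentration}) tuples.
    Assumes volumes are additive (an approximation).
    """
    total_parts = sum(parts for parts, _ in components)
    final = {}
    for parts, solutes in components:
        for name, conc in solutes.items():
            final[name] = final.get(name, 0.0) + conc * parts / total_parts
    return final


elution = mix_by_parts([
    (5, {"Tris-Cl (mM)": 10.0, "EDTA (mM)": 1.0}),  # low-TE buffer
    (1, {"ammonium acetate (M)": 7.5}),             # 7.5 M stock
])

for solute, conc in elution.items():
    print(f"{solute}: {conc:.3g}")
```

Under these assumptions the mixed buffer contains ammonium acetate at one‐sixth of the stock concentration and the TE components at five‐sixths of theirs.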

Basic Protocol 4: Preparing the Library for Illumina Sequencing

Materials

Sample: PCR product (library; see protocol 3, step 43)
Agilent DNA 1000 Kit (Agilent)
EB buffer (from Qiagen PCR Purification Kit) supplemented with 0.1% (v/v) Tween‐20

Agilent 2100 Bioanalyzer (Agilent)
Chip Priming Station (Agilent)
IKA Vortex Mixer (Agilent)
Illumina Genome Analyzer (Illumina)
Cluster Station (Illumina)

Alternate Protocol 1: Amplified Tag‐seq Library Construction (Tag‐seqLite)

Total RNA sample: typically isolated using TRIzol (Invitrogen), AllPrep Mini Kit (Qiagen), or RiboPure Kit (Ambion), and DNase I treated
LITE1/LITE TS primer mix (20 µM each; Integrated DNA Technologies, http://www.idtdna.com/) including:
- Biotin‐AAG CAG TGG TAA CAA CGC AGA GTA CTT TTT TTT TTT TTT TTT TTT TTT TTT TTT‐TVN
- Lite TS primer, 10 µM: AAG CAG TGG TAA CAA CGC AGA GTA CGC GGG
Nuclease‐free H₂O (Ambion; also see recipe for DEPC‐treated H₂O in appendix 2D)
20 mM DTT
SMARTScribe Reverse Transcriptase (Clontech)
TE buffer, pH 8.0 (Invitrogen; also see appendix 2D)
Advantage 2 PCR Kit (Clontech) including:
- 10× Advantage 2 Buffer
- 50× Advantage 2 Polymerase Mix
Buffer PB (Qiagen)
Buffer PE (Qiagen)
Buffer EB (Qiagen)
M280 Streptavidin beads (Invitrogen)
2× and 1× B+W buffer (see recipe)

Thin‐walled, frosted‐lid, RNase‐free PCR tubes (Ambion)
Microcentrifuge adapters for the PCR tubes
Peltier Thermal Cycler (MJ Research)
QIAquick spin columns and collection tubes (Qiagen)
NanoDrop spectrophotometer (appendix 3D)
Agilent DNA 7500 Kit (Agilent) including:
- DNA 7500 Gel Matrix
- DNA 7500 Marker
- DNA 7500 Ladder
- Agilent DNA 7500 Chips
- Agilent Chip Priming Station
- IKA Works Vortexer
- Agilent Electrode Cleaner

Additional reagents and equipment for quantitating DNA using NanoDrop spectrophotometer (appendix 3D)

Figures

Figure 11.11.1 Tag‐seq library generation. Polyadenylated mRNAs (open rectangles) are captured using oligo(dT) beads, and double‐stranded cDNA is subsequently synthesized. The cDNA (double clear rectangles) is digested with the Nla III anchoring restriction enzyme (vertical arrows), leaving a 4‐bp overhang (GTAC). Only cDNA fragments anchored to oligo(dT) beads are retained. Adapter A (light gray rectangle) is ligated to the overhang, and adds a recognition site for the Type IIS tagging enzyme Mme I. Following Mme I digestion (gray vertical arrow), a second adapter is ligated (Adapter B, light gray rectangle) to the resulting 2‐bp overhang. PCR primers (horizontal gray arrows) annealing to adapters A and B are used to enrich tags (dark gray rectangles). Cluster generation and sequencing (horizontal black arrow) are performed on the Illumina cluster station and analyzer. The resulting image files are processed to extract the read sequences, and 21‐bp SAGE tags are further extracted from the reads. Tags consist of the 4‐bp Nla III recognition site and 17 bp of unique sequence, and constitute a total of 21 bases that can be mapped back to the original mRNA.
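The final step of the caption, extracting 21‐base tags from reads, can be illustrated with a short Python sketch. It is not the authors' pipeline. It assumes by default that each read begins at the first base after the Nla III site, so that CATG is prepended computationally; set `site_in_read=True` if the reads already start with CATG. The function names and the example reads are made up for illustration.

```python
from collections import Counter

NLA3_SITE = "CATG"  # Nla III recognition sequence
UNIQUE_LEN = 17     # bases of unique sequence per tag (from the caption)


def extract_tag(read, site_in_read=False):
    """Return the 21-base tag for one read, or None if the read is unusable."""
    read = read.upper()
    if site_in_read:
        if not read.startswith(NLA3_SITE):
            return None
        start = len(NLA3_SITE)
        unique = read[start:start + UNIQUE_LEN]
    else:
        unique = read[:UNIQUE_LEN]
    # Reject reads that are too short or contain ambiguous bases
    if len(unique) < UNIQUE_LEN or set(unique) - set("ACGT"):
        return None
    return NLA3_SITE + unique


def count_tags(reads, site_in_read=False):
    """Tally tags across reads, skipping unusable reads."""
    tags = (extract_tag(r, site_in_read) for r in reads)
    return Counter(t for t in tags if t is not None)


# Made-up reads: two identical, one with an ambiguous base, one too short
reads = [
    "ACGTACGTACGTACGTATCGTATGCC",
    "ACGTACGTACGTACGTATCGTATGCC",
    "ACGTNCGTACGTACGTATCG",
    "ACGT",
]
counts = count_tags(reads)
for tag, n in counts.items():
    print(tag, n)
```

The resulting tag counts are the "digital" expression values; mapping each tag back to a transcript is a separate step not shown here.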

Figure 11.11.2 Overview of the flow of analysis (A) and output of three analysis scripts. (B) SdCompare: the number of tag sequences (y axis) with expression counts above 20 in each of two compared libraries is binned by the log‐ratio of their expression (x axis). This provides a measure of the similarity between two libraries (ce0068 and ce0069). (C) CorrelatePlot: a scatterplot of the (log) expression levels of tags sequenced in two libraries (ce0068 and ce0069; x axis and y axis, respectively) is shown along with the Pearson correlation coefficient of the two libraries, the linear regression equation (top right), and the linear regression line. (D) SdSageTree: a hierarchical tree representation of the distance matrix calculated for five libraries. The distance matrix is constructed from the standard deviations of the log ratios of the tag expression values in the five libraries. See Table for an overview of the scripts and Table for a summary of the data files.
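The statistics named in this caption can be sketched in a few lines of Python. This is not the code of SdCompare, CorrelatePlot, or SdSageTree; it only illustrates the quantities the caption describes. The choice of log base 2, the use of raw (unnormalized) counts, the function names, and the toy libraries are all assumptions. Building a tree from the distance matrix would need a clustering step that is not shown.

```python
import math

MIN_COUNT = 20  # caption: tags with counts above 20 in both libraries


def shared_log_ratios(lib_a, lib_b, min_count=MIN_COUNT):
    """log2(count_a / count_b) for tags above min_count in both libraries."""
    return [math.log2(lib_a[t] / lib_b[t]) for t in lib_a
            if t in lib_b and lib_a[t] > min_count and lib_b[t] > min_count]


def histogram(values, bin_width=0.5):
    """Count values per bin; keys are the lower edges of the bins."""
    bins = {}
    for v in values:
        edge = math.floor(v / bin_width) * bin_width
        bins[edge] = bins.get(edge, 0) + 1
    return dict(sorted(bins.items()))


def sample_sd(values):
    """Sample standard deviation; NaN if fewer than two values."""
    if len(values) < 2:
        return float("nan")
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def pearson_log(lib_a, lib_b):
    """Pearson correlation of log2 counts over tags present in both libraries."""
    shared = [t for t in lib_a if t in lib_b and lib_a[t] > 0 and lib_b[t] > 0]
    xs = [math.log2(lib_a[t]) for t in shared]
    ys = [math.log2(lib_b[t]) for t in shared]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def distance_matrix(libs):
    """Pairwise SD of log ratios, keyed by (name_a, name_b)."""
    names = sorted(libs)
    return {(a, b): sample_sd(shared_log_ratios(libs[a], libs[b]))
            for i, a in enumerate(names) for b in names[i + 1:]}


# Toy libraries (made-up tag names and counts)
lib_a = {"tag1": 40, "tag2": 80, "tag3": 160, "tag4": 5}
lib_b = {"tag1": 40, "tag2": 160, "tag3": 80, "tag4": 50}

ratios = shared_log_ratios(lib_a, lib_b)   # tag4 is excluded (count <= 20)
spread = sample_sd(ratios)
r = pearson_log(lib_a, lib_b)
dist = distance_matrix({"a": lib_a, "b": lib_b})
print(histogram(ratios), spread, r, dist)
```

A small standard deviation of the log ratios indicates that two libraries are similar, which is why it can serve as a pairwise distance.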

Figure 11.11.3 Purified 85‐bp PCR product on the Agilent DNA 1000 chip is seen as a 96‐bp peak.

Figure 11.11.4 Gel image of PCR products along with the no‐template control. Both 13‐cycle and 15‐cycle 85‐bp PCR product bands were excised, gel purified, and ethanol precipitated. In general, if the purity of both PCR products is similar and both yield enough product for sequencing, use the one with fewer PCR cycles.

Literature Cited

	Boon, K., Osorio, E.C., Greenhut, S.F., Schaefer, C.F., Shoemaker, J., Polyak, K., Morin, P.J., Buetow, K.H., Strausberg, R.L., De Souza, S.J., and Riggins, G.J. 2002. An anatomy of normal and malignant gene expression. Proc. Natl. Acad. Sci. U.S.A. 99:11287‐11292.
	Gerhard, D.S., Wagner, L., Feingold, E.A., Shenmen, C.M., Grouse, L.H., Schuler, G., Klein, S.L., Old, S., Rasooly, R., Good, P., Guyer, M., Peck, A.M., Derge, J.G., Lipman, D., Collins, F.S., Jang, W., Sherry, S., Feolo, M., Misquitta, L., Lee, E., Rotmistrovsky, K., Greenhut, S.F., Schaefer, C.F., Buetow, K., Bonner, T.I., Haussler, D., Kent, J., Kiekhaus, M., Furey, T., Brent, M., Prange, C., Schreiber, K., Shapiro, N., Bhat, N.K., Hopkins, R.F., Hsie, F., Driscoll, T., Soares, M.B., Casavant, T.L., Scheetz, T.E., Brownstein, M.J., Usdin, T.B., Toshiyuki, S., Carninci, P., Piao, Y., Dudekula, D.B., Ko, M.S., Kawakami, K., Suzuki, Y., Sugano, S., Gruber, C.E., Smith, M.R., Simmons, B., Moore, T., Waterman, R., Johnson, S.L., Ruan, Y., Wei, C.L., Mathavan, S., Gunaratne, P.H., Wu, J., Garcia, A.M., Hulyk, S.W., Fuh, E., Yuan, Y., Sneed, A., Kowis, C., Hodgson, A., Muzny, D.M., McPherson, J., Gibbs, R.A., Fahey, J., Helton, E., Ketteman, M., Madan, A., Rodrigues, S., Sanchez, A., Whiting, M., Madari, A., Young, A.C., Wetherby, K.D., Granite, S.J., Kwong, P.N., Brinkley, C.P., Pearson, R.L., Bouffard, G.G., Blakesly, R.W., Green, E.D., Dickson, M.C., Rodriguez, A.C., Grimwood, J., Schmutz, J., Myers, R.M., Butterfield, Y.S., Griffith, M., Griffith, O.L., Krzywinski, M.I., Liao, N., Morin, R., Palmquist, D., Petrescu, A.S., Skalska, U., Smailus, D.E., Stott, J.M., Schnerch, A., Schein, J.E., Jones, S.J., Holt, R.A., Baross, A., Marra, M.A., Clifton, S., Makowski, K.A., Bosak, S., Malek, J.; MGC Project Team. 2004. The Status, Quality, and Expansion of the NIH Full‐Length cDNA Project: The Mammalian Gene Collection (MGC). Genome Res. 14:2121‐2127.
	Gowda, M., Jantasuriyarat, C., Dean, R.A., and Wang, G.L. 2004. Robust‐LongSAGE (RL‐SAGE): A substantially improved LongSAGE method for gene discovery and transcriptome analysis. Plant Physiol. 134:890‐897.
	Heidenblut, A.M., Luttges, J., Buchholz, M., Heinitz, C., Emmersen, J., Nielsen, K.L., Schreiter, P., Souquet, M., Nowacki, S., Herbrand, U., Klöppel, G., Schmiegel, W., Gress, T., and Hahn, S.A. 2004. aRNA‐longSAGE: A new approach to generate SAGE libraries from microdissected cells. Nucleic Acids Res. 32:E131.
	Khattra, J., Delaney, A.D., Zhao, Y., Siddiqui, A.S., Asano, J., McDonald, H., Pandoh, P., Dhalla, N., Prabhu, A., Ma, K., Lee, S., Ally, A., Tam, A., Sa, D., Rogers, S., Charest, D., Stott, J., Zuyderduyn, S., Varhol, R., Eaves, C., Jones, S., Holt, R.A., Hirst, M., Hoodless, P.A., and Marra, M.A. 2007. Large‐scale production of SAGE libraries from microdissected tissues, flow‐sorted cells, and cell lines. Genome Res. 17:108‐116.
	Kodzius, R., Kojima, M., Nishiyori, H., Nakamura, M., Fukuda, S., Tagami, M., Sasaki, D., Imamura, K., Kai, C., Harbers, M., Hayashizaki, Y., and Carninci, P. 2006. CAGE: Cap Analysis of Gene Expression. Nat. Methods 3:211‐222.
	Marioni, J.C., Mason, C.E., Mane, S.M., Stephens, M., and Gilad, Y. 2008. RNA‐seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 18:1509‐1517.
	Matsumura, H., Reich, S., Ito, A., Saitoh, H., Kamoun, S., Winter, P., Kahl, G., Reuter, M., Kruger, D.H., and Terauchi, R. 2003. Gene expression analysis of plant host‐pathogen interactions by SuperSAGE. Proc. Natl. Acad. Sci. U.S.A. 100:15718‐15723.
	Morrissy, A.S., Morin, R.D., Delaney, A., Zeng, T., McDonald, H., Jones, S., Zhao, Y., Hirst, M., and Marra, M.A. 2009. Next‐generation tag sequencing for cancer gene expression profiling. Genome Res. 19:1825‐1835.
	Peters, D.G., Kassam, A.B., Yonas, H., O'Hare, E.H., Ferrell, R.E., and Brufsky, A.M. 1999. Comprehensive transcript analysis in small quantities of mRNA by SAGE‐lite. Nucleic Acids Res. 27:e39.
	Pontius, J.U., Wagner, L., and Schuler, G.D. 2003. UniGene: A unified view of the transcriptome. In The NCBI Handbook, National Center for Biotechnology Information, Bethesda, Md. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books.
	Rosenkranz, R., Borodina, T., Lehrach, H., and Himmelbauer, H. 2008. Characterizing the mouse ES cell transcriptome with Illumina sequencing. Genomics 92:187‐194.
	Saha, S., Sparks, A.B., Rago, C., Akmaev, V., Wang, C.J., Vogelstein, B., Kinzler, K.W., and Velculescu, V.E. 2002. Using the transcriptome to annotate the genome. Nat. Biotechnol. 20:508‐512.
	Siddiqui, A.S., Khattra, J., Delaney, A.D., Zhao, Y., Astell, C., Asano, J., Babakaiff, R., Barber, S., Beland, J., Bohacec, S., Brown‐John, M., Chand, S., Charest, D., Charters, A.M., Cullum, R., Dhalla, N., Featherstone, R., Gerhard, D.S., Hoffman, B., Holt, R.A., Hou, J., Kuo, B.Y.‐L., Lee, L.L.C., Lee, S., Leung, D., Ma, K., Matsuo, C., Mayo, M., McDonald, H., Prabhu, A., Pandoh, P., Riggins, G.J., Ruiz de Algara, T., Rupert, J.L., Smailus, D., Stott, J., Tsai, M., Varhol, R., Vrljicak, P., Wong, D., Wu, M.K., Xie, Y., Yang, G., Zhang, I., Hirst, M., Jones, S.J.M., Helgason, C.D., Simpson, E.M., Hoodless, P.A., and Marra, M.A. 2005. A mouse atlas of gene expression: Large‐scale digital gene‐expression profiles from precisely defined developing C57BL/6J mouse tissues and cells. Proc. Natl. Acad. Sci. U.S.A. 102:18485‐18490.
	Velculescu, V.E., Zhang, L., Vogelstein, B., and Kinzler, K.W. 1995. Serial analysis of gene expression. Science 270:484‐487.
	Wei, C.L., Ng, P., Chiu, K.P., Wong, C.H., Ang, C.C., Lipovich, L., Liu, E.T., and Ruan, Y. 2004. 5′ Long serial analysis of gene expression (LongSAGE) and 3′ LongSAGE for transcriptome characterization and genome annotation. Proc. Natl. Acad. Sci. U.S.A. 101:11701‐11706.
Internet Resources
	http://bioinfo.au.tsinghua.edu.cn/micrornadb/
	MicroRNAdb: A Comprehensive Database for MicroRNAs. MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing.
	http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books
	The Reference Sequence (RefSeq) Project. 2002. Chapter 18, The NCBI Handbook. National Library of Medicine (US), National Center for Biotechnology Information. Bethesda, Md.
