SNP Discovery by Transcriptome Pyrosequencing
Single nucleotide polymorphisms (SNPs) are single base differences between haplotypes. SNPs are abundant in many species and valuable as markers for genetic map construction, modern molecular breeding programs, and quantitative genetic studies. SNPs are readily mined from genomic DNA or cDNA sequence obtained from individuals having two or more distinct genotypes. While automated Sanger sequencing has become less expensive over time, it is still costly to acquire deep Sanger sequence from several genotypes. “Next-generation” DNA sequencing technologies that use new chemistries and massively parallel approaches have enabled DNA sequences to be acquired at extremely high depths of coverage, faster and at lower cost than traditional sequencing. One such method is represented by the Roche/454 Life Sciences GS-FLX Titanium Series, which currently uses pyrosequencing to produce 400–600 million bases of DNA sequence per run (>1 million reads, ∼400 bp/read). This chapter discusses the use of high-throughput pyrosequencing for SNP discovery by focusing on 454 sequencing of maize cDNA, the development of a computational pipeline for polymorphism detection, and the subsequent identification of over 7,000 putative SNPs between Mo17 and B73 maize. The chapter also discusses alternative alignment and polymorphism detection strategies that use Illumina short reads, data processing and visualization tools, and reduced representation techniques that limit the sequencing of repetitive DNA and so enable efficient analysis of genome sequence.
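The core idea behind mining SNPs from reads of one genotype aligned to sequence from another can be illustrated with a minimal Python sketch. It is not the chapter's pipeline: the function name `call_putative_snps`, the gap-free `(start, sequence)` read format, and the `min_depth` and `min_fraction` thresholds are all assumptions chosen for illustration, and the toy sequences are invented.

```python
from collections import Counter


def call_putative_snps(reference, aligned_reads, min_depth=3, min_fraction=0.9):
    """Report sites where aligned reads consistently differ from the reference.

    reference     -- reference sequence string (e.g. from one genotype such as B73)
    aligned_reads -- iterable of (start, sequence) pairs, 0-based and gap-free
                     (an illustrative assumption; real aligners emit gapped alignments)
    min_depth     -- minimum number of reads covering a site (hypothetical threshold)
    min_fraction  -- minimum share of reads carrying the alternative base
                     (hypothetical threshold)
    """
    # Build a per-position tally of observed bases (a simple pileup).
    pileup = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference) and base in "ACGT":
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to trust a call at this site
        top_base, top_count = counts.most_common(1)[0]
        if top_base != reference[pos] and top_count / depth >= min_fraction:
            snps.append((pos, reference[pos], top_base, depth))
    return snps


# Toy example: three overlapping reads from a second genotype (e.g. Mo17)
# that all carry T where the reference has A at position 4.
reference = "ACGTACGTAC"
reads = [(0, "ACGTTCGT"), (2, "GTTCGTAC"), (1, "CGTTCGTA")]
snps = call_putative_snps(reference, reads)
print(snps)  # [(4, 'A', 'T', 3)]
```

Requiring both a minimum depth and a high alternative-allele fraction is one simple way to separate likely polymorphisms from isolated sequencing errors; a production pipeline would also need to account for base quality, indels, and paralogous alignments.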