
Using the Velvet de novo Assembler for Short‐Read Sequencing Technologies


Abstract

 

The Velvet de novo assembler was designed to build contigs and eventually scaffolds from short‐read sequencing data. This protocol describes how to use Velvet, interpret its output, and tune its parameters for optimal results. It also covers practical issues such as configuration, using the VelvetOptimiser routine, and processing colorspace data. Curr. Protoc. Bioinform. 31:11.5.1‐11.5.12. © 2010 by John Wiley & Sons, Inc.

Keywords: Genome assembly; Next‐Generation Sequencing; de Bruijn Graphs

     
 
GO TO THE FULL PROTOCOL:
PDF or HTML at Wiley Online Library

Table of Contents

  • Introduction
  • Basic Protocol 1: Assembling a Set of Reads with Velvet
  • Support Protocol 1: Installing VELVET
  • Support Protocol 2: Using VelvetOptimiser
  • Support Protocol 3: Processing Colorspace Data
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
     
 


Figures

  •   Figure 11.5.1 Schematic representation of the coverage distribution. The solid curve represents the expected k‐mer coverage distribution in an idealized sequencing project of a repeat‐free genome, i.e., a Poisson distribution. The dashed curve schematically represents the k‐mer coverage distribution in a typical experimental situation. The variance of the main peak is increased, and new peaks are created by erroneous k‐mers (left‐most peak with minimal coverage) and repeated coverage (smaller peaks on the right). Finally, the dotted curve schematically represents the distribution after a successful assembly. The width of the peaks is tightened because of the averaging over whole contigs.
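
The k‐mer coverage distribution described in the caption of Figure 11.5.1 can be illustrated with a short Python sketch. This is not Velvet's implementation: the function names (kmer_counts, coverage_histogram, poisson_pmf) and the toy reads are hypothetical, and for simplicity the sketch ignores reverse complements and sequencing errors. It counts how often each k‐mer occurs, tabulates the resulting coverage histogram, and evaluates the Poisson probability that serves as the idealized reference curve.

```python
from collections import Counter
from math import exp, factorial


def kmer_counts(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def coverage_histogram(counts):
    """Map each coverage value to the number of distinct k-mers with that coverage."""
    return Counter(counts.values())


def poisson_pmf(c, mean):
    """Poisson probability of observing coverage c for a given mean coverage."""
    return exp(-mean) * mean ** c / factorial(c)


# Toy reads (assumed data, far too small for a real assembly).
reads = ["ACGTAC", "CGTACG", "GTACGT"]
counts = kmer_counts(reads, k=4)
hist = coverage_histogram(counts)

# Mean coverage over distinct k-mers, used as the Poisson parameter.
mean_cov = sum(counts.values()) / len(counts)

for cov in sorted(hist):
    print(cov, hist[cov], round(poisson_pmf(cov, mean_cov), 3))
```

On real data, the observed histogram would be compared against the Poisson curve to spot the low‐coverage peak of erroneous k‐mers and the higher‐coverage peaks caused by repeats.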


Literature Cited

   Butler, J., MacCallum, I., Kleber, M., Shlyakhter, I.A., Belmonte, M.K., Lander, E.S., Nusbaum, C., and Jaffe, D.B. 2008. ALLPATHS: De novo assembly of whole‐genome shotgun microreads. Genome Res. 18:810‐820.
   Cock, P.J., Fields, C.J., Goto, N., Heuer, M.L., and Rice, P.M. 2010. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 38:1767‐1771.
   Chaisson, M.J., Brinza, D., and Pevzner, P.A. 2009. De novo fragment assembly with short mate‐paired reads: Does the read length matter? Genome Res. 19:336‐346.
   Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and the 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078‐2079.
   Li, R., Zhu, H., and Wang, J. 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 20:265‐272.
   Kurtz, S., Narechania, A., Stein, J., and Ware, D. 2008. A new method to compute K‐mer frequencies and its application to annotate large repetitive plant genomes. BMC Genomics 9:517.
   Pevzner, P.A., Tang, H., and Waterman, M.S. 2001. An Eulerian path approach to DNA fragment assembly. Proc. Natl. Acad. Sci. U.S.A. 98:9748‐9753.
   Simpson, J.T., Wong, K., Jackman, S.D., Schein, J.E., Jones, S.J., and Birol, I. 2009. ABySS: A parallel assembler for short read sequence data. Genome Res. 19:1117‐1123.
   Zerbino, D.R. and Birney, E. 2008. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18:821‐829.
   Zerbino, D.R., McEwen, G.K., Margulies, E.H., and Birney, E. 2010. Pebble and rock band: Heuristic resolution of repeats and scaffolding in the Velvet short‐read de novo assembler. PLoS ONE 4:e8407.
Key References
   Zerbino and Birney, 2008. See above.
   This first publication mainly described the implementation of de Bruijn graphs within Velvet and the error‐correction algorithm, TourBus.
   Zerbino et al., 2010. See above.
   This follow‐up paper describes how Velvet resolves complex repeats using long reads or paired‐end read information.
Internet Resources
   http://www.ebi.ac.uk/~zerbino/velvet
   Velvet Web site, where code and information on Velvet can be downloaded.
   http://bioinformatics.net.au/software.shtml
   VelvetOptimiser by Simon Gladman and Torsten Seemann. This wrapper software scans different parameters of Velvet to produce an optimal assembly, as described in the full protocol.
   http://solidsoftwaretools.com/gf/project/denovotools/
   Colorspace de novo pipeline by Craig Cummings, Vrunda Sheth, and Dima Brinza. These scripts allow you to do all the appropriate colorspace conversions described in the full protocol (registration is required, but the software is free).
   http://solidsoftwaretools.com/gf/project/corona/
   The Corona Lite package can be found on this server.
   http://sourceforge.net/apps/mediawiki/amos/
   AMOS suite by the AMOS Consortium. This suite of tools allows the user to manipulate, convert or analyze AFG assembly files.
   http://tools.invitrogen.com/content/sfs/manuals/SOLiD_SAGE_SoftwareGuide.pdf
   Colorspace documentation by Applied Biosystems. This document describes colorspace, and the csfasta format in particular.