Some Phenotype Association Tools in Galaxy: Looking for Disease SNPs in a Full Genome

Abstract

 

This unit focuses on some of the tools available on the public Galaxy server that are useful for exploring possible associations between human genetic variants and phenotypes. We trace step-by-step through an example illustrating several methods for examining a single full-coverage genome to look for single-nucleotide polymorphisms (SNPs) that are either known to be associated with disease or suspected to have an impact for other reasons. The example makes use of public genomic data, tools designed specifically for working with variants, and also some general tools for text manipulation and operations on genomic coordinates. Curr. Protoc. Bioinform. 39:15.2.1-15.2.27. © 2012 by John Wiley & Sons, Inc.

Keywords: disease; SNP; genome variation; coding; non-coding; gene-based analysis; Web application
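
The "operations on genomic coordinates" mentioned in the abstract can be pictured with a small standalone sketch. It is not the Galaxy implementation and does not use any Galaxy API; it only illustrates the general idea of keeping the SNPs that fall inside a set of regions of interest. The function names, the toy coordinates, and the choice of 0-based, half-open (BED-style) intervals are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_regions(regions):
    """Merge overlapping regions per chromosome.

    regions: iterable of (chrom, start, end), 0-based, half-open [start, end).
    Returns {chrom: (sorted_starts, matching_ends)} with no overlaps left.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions:
        by_chrom[chrom].append((start, end))

    merged = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1]:
                # Overlapping or adjacent: extend the current merged span.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def snps_in_regions(snps, regions):
    """Return the (chrom, pos) SNPs whose position lies inside any region."""
    merged = merge_regions(regions)
    hits = []
    for chrom, pos in snps:
        if chrom not in merged:
            continue
        starts, ends = merged[chrom]
        # Index of the last merged span that starts at or before pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < ends[i]:
            hits.append((chrom, pos))
    return hits


# Hypothetical toy data, not taken from the protocol.
snps = [("chr1", 100), ("chr1", 250), ("chr2", 50), ("chr3", 10)]
regions = [("chr1", 90, 120), ("chr1", 110, 130), ("chr2", 60, 80)]
hits = snps_in_regions(snps, regions)
```

Merging the regions first keeps the lookup to one binary search per SNP; for a full genome with millions of variants, a dedicated interval tool would normally be a better fit than this sketch.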

     
 
GO TO THE FULL PROTOCOL:
PDF or HTML at Wiley Online Library

Table of Contents

  • Introduction
  • Basic Protocol 1: Using Galaxy to Look for Disease SNPs in a Full Genome: Preparing Input Data
  • Basic Protocol 2: Selecting Known Coding SNPs Predicted to be Damaging, then Finding Their Genes and Associated Pathways
  • Basic Protocol 3: Running New Predictions of Coding SNPs Likely to be Detrimental
  • Basic Protocol 4: Finding SNPs that Fall in Suspected Functional Regions
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 


Figures

  • Figure 15.2.1 Uploading a data file. See text for details.
  • Figure 15.2.2 Converting to pgSnp format. See text for details.
  • Figure 15.2.3 Putative SNP Phenotypes library. See text for details.
  • Figure 15.2.4 Removing SNPs found in healthy individuals. See text for details.
  • Figure 15.2.5 Completed input dataset. See text for details.
  • Figure 15.2.6 Details about the PolyPhen‐2 dataset. See text for details.
  • Figure 15.2.7 Joining on genomic intervals. See text for details.
  • Figure 15.2.8 Selecting damaging results. See text for details.
  • Figure 15.2.9 PolyPhen‐2 results. See text for details.
  • Figure 15.2.10 Mapping between identifiers. See text for details.
  • Figure 15.2.11 Choosing the identifier fields. See text for details.
  • Figure 15.2.12 Joining on identifiers. See text for details.
  • Figure 15.2.13 CTD. See text for details.
  • Figure 15.2.14 CTD results. See text for details.
  • Figure 15.2.15 Input for SIFT. See text for details.
  • Figure 15.2.16 Viewing the workflow. See text for details.
  • Figure 15.2.17 Running the workflow. See text for details.
  • Figure 15.2.18 SIFT. See text for details.
  • Figure 15.2.19 Selecting damaging SNPs. See text for details.
  • Figure 15.2.20 SIFT results. See text for details.
  • Figure 15.2.21 Intersecting with the PRPs. See text for details.
  • Figure 15.2.22 SNPs in PRPs. See text for details.
  • Figure 15.2.23 DNase hypersensitive sites (HSSs) from ENCODE. See text for details.
  • Figure 15.2.24 Intersecting with the HSSs. See text for details.
  • Figure 15.2.25 SNPs in HSSs. See text for details.
  • Figure 15.2.26 PhyloP. See text for details.
  • Figure 15.2.27 Distribution of phyloP scores. See text for details.
  • Figure 15.2.28 Histogram. See text for details.
  • Figure 15.2.29 Filtering the SNPs based on phyloP score. See text for details.
  • Figure 15.2.30 Highly conserved SNPs. See text for details.

Literature Cited

   Adzhubei, I.A., Schmidt, S., Peshkin, L., Ramensky, V.E., Gerasimova, A., Bork, P., Kondrashov, A.S., and Sunyaev, S.R. 2010. A method and server for predicting damaging missense mutations. Nat. Methods 7:248‐249.
   Blankenberg, D., Von Kuster, G., Coraor, N., Ananda, G., Lazarus, R., Mangan, M., Nekrutenko, A., and Taylor, J. 2010. Galaxy: A web‐based genome analysis tool for experimentalists. Curr. Protoc. Mol. Biol. 89:19.10.1‐19.10.21.
   Davis, A.P., Murphy, C.G., Saraceni‐Richards, C.A., Rosenstein, M.C., Wiegers, T.C., and Mattingly, C.J. 2009. Comparative Toxicogenomics Database: A knowledgebase and discovery tool for chemical‐gene‐disease networks. Nucleic Acids Res. 37:D786‐D792.
   Drmanac, R., Sparks, A.B., Callow, J.M., Halpern, A.L., Burns, N.L., Kermani, B.G., Carnevali, P., Nazarenko, I., Nilsen, G.B., Yeung, G., Dahl, F., Fernandez, A., Staker, B., Pant, K.P., Baccash, J., Borcherding, A.P., Brownley, A., Cedeno, R., Chen, L., Chernikoff, D., Cheung, A., Chirita, R., Curson, B., Ebert, J.C., Hacker, C.R., Hartlage, R., Hauser, B., Huang, S., Jiang, Y., Karpinchyk, V., Koenig, M., Kong, C., Landers, T., Le, C., Liu, J., McBride, C.E., Morenzoni, M., Morey, R.E., Mutch, K., Perazich, H., Perry, K., Peters, B.A., Peterson, J., Pethiyagoda, C.L., Pothuraju, K., Richter, C., Rosenbaum, A.M., Roy, S., Shafto, J., Sharanhovich, U., Shannon, K.W., Sheppy, C.G., Sun, M., Thakuria, J.V., Tran, A., Vu, D., Zaranek, A.W., Wu, X., Drmanac, S., Oliphant, A.R., Banyai, W.C., Martin, B., Ballinger, D.G., Church, G.M., and Reid, C.A. 2009. Human genome sequencing using unchained base reads on self‐assembling DNA nanoarrays. Science 327:78‐81.
   Ferretti, V., Poitras, C., Bergeron, D., Coulombe, B., Robert, F., and Blanchette, M. 2007. PReMod: A database of genome‐wide mammalian cis‐regulatory module predictions. Nucleic Acids Res. 35:D122‐D126.
   Giardine, B., Riemer, C., Hardison, R.C., Burhans, R., Elnitski, L., Shah, P., Zhang, Y., Blankenberg, D., Albert, I., Taylor, J., Miller, W., Kent, W.J., and Nekrutenko, A. 2005. Galaxy: A platform for interactive large‐scale genome analysis. Genome Res. 15:1451‐1455.
   Giardine, B., Riemer, C., Hefferon, T., Thomas, D., Hsu, F., Zielenski, J., Sang, Y., Elnitski, L., Cutting, G., Trumbower, H., Kern, A., Kuhn, R., Patrinos, G.P., Hughes, J., Higgs, D., Chui, D., Scriver, C., Phommarinh, M., Patnaik, S.K., Blumenfeld, O., Gottlieb, B., Vihinen, M., Väliaho, J., Kent, J., Miller, W., and Hardison, R.C. 2007. PhenCode: Connecting ENCODE data with mutations and phenotype. Hum. Mutat. 28:554‐562.
   Goecks, J., Nekrutenko, A., Taylor, J.; Galaxy Team. 2010. Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol. 11:R86.
   Huang, D.W., Sherman, B.T., and Lempicki, R.A. 2009. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4:44‐57.
   Karolchik, D., Hinrichs, A.S., Furey, T.S., Roskin, K.M., Sugnet, C.W., Haussler, D., and Kent, W.J. 2004. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32:D493‐D496.
   Kumar, P., Henikoff, S., and Ng, P.C. 2009. Predicting the effects of coding non‐synonymous variants on protein function using the SIFT algorithm. Nat. Protoc. 4:1073‐1081.
   Reimand, J., Kull, M., Peterson, H., Hansen, J., and Vilo, J. 2007. g:Profiler: A web‐based toolset for functional profiling of gene lists from large‐scale experiments. Nucleic Acids Res. 35:W193‐W200.
   Seal, R.L., Gordon, S.M., Lush, M.J., Wright, M.W., and Bruford, E.A. 2011. genenames.org: The HGNC resources in 2011. Nucleic Acids Res. 39:D514‐D519.
   Siepel, A., Pollard, K.S., and Haussler, D. 2006. New methods for detecting lineage‐specific selection. In Proceedings of the 10th International Conference on Research in Computational Molecular Biology (RECOMB 2006), pp. 190‐205, Venice, Italy.
   Taylor, J., Tyekucheva, S., King, D.C., Hardison, R.C., Miller, W., and Chiaromonte, F. 2006. ESPERR: Learning strong and weak signals in genomic sequence alignments to identify functional elements. Genome Res. 16:1596‐1604.
Internet Resources
   http://galaxyproject.org
   The main public instance of Galaxy.
   http://phencode.bx.psu.edu
   A collection of human phenotype‐associated SNPs from Locus‐Specific Databases.
   http://www.bx.psu.edu/miller_lab/docs/galaxy_phen_assoc/tutorial/
   A version of this tutorial in HTML format.
   http://genome.ucsc.edu/FAQ/FAQformat.html
   Descriptions of file formats used by the UCSC Table Browser.
Supplementary File
   http://www.currentprotocols.com/protocol/bi1502
   This is an alternate URL to access the file “test.masterVar.gz” cited under Necessary Resources, Files, on page 15.2.3.