Predicting Functional Effect of Human Missense Mutations Using PolyPhen‐2

Source: Internet, 2013-12-31

Abstract

PolyPhen‐2 (Polymorphism Phenotyping v2), available as software and via a Web server, predicts the possible impact of amino acid substitutions on the stability and function of human proteins using structural and comparative evolutionary considerations. It performs functional annotation of single‐nucleotide polymorphisms (SNPs), maps coding SNPs to gene transcripts, extracts protein sequence annotations and structural attributes, and builds conservation profiles. It then estimates the probability of the missense mutation being damaging based on a combination of all these properties. PolyPhen‐2 features include a high‐quality multiple protein sequence alignment pipeline and a prediction method employing machine‐learning classification. The software also integrates the UCSC Genome Browser's human genome annotations and MultiZ multiple alignments of vertebrate genomes with the human genome. PolyPhen‐2 is capable of analyzing large volumes of data produced by next‐generation sequencing projects, thanks to built‐in support for high‐performance computing environments like Grid Engine and Platform LSF. Curr. Protoc. Hum. Genet. 76:7.20.1–7.20.41. © 2013 by John Wiley & Sons, Inc.

Keywords: human genetic variation; single‐nucleotide polymorphism (SNP); mutation effect prediction; computational biology; PolyPhen‐2

GO TO THE FULL PROTOCOL:

PDF or HTML at Wiley Online Library

Introduction
Basic Protocol 1: Predicting the Effect of a Single‐Residue Substitution on Protein Structure and Function Using the PolyPhen‐2 Web Server
Basic Protocol 2: Analyzing a Large Number of SNPs in a Batch Mode Using the PolyPhen‐2 Web Server
Basic Protocol 3: Quick Search in a Database of Precomputed Predictions
Support Protocol 1: Checking the Status of Your Query with the Grid Gateway Interface
Alternate Protocol 1: Automated Batch Submission
Alternate Protocol 2: Installing PolyPhen‐2 Standalone Software
Alternate Protocol 3: Using PolyPhen‐2 Standalone Software
Support Protocol 2: Updating Built‐In Databases
Commentary
Literature Cited
Figures
Tables

Materials

Basic Protocol 1: Predicting the Effect of a Single‐Residue Substitution on Protein Structure and Function Using the PolyPhen‐2 Web Server

Materials

An up‐to‐date Web browser, such as Firefox, Internet Explorer, or Safari. JavaScript support and cookies should be enabled in the browser configuration; the Java browser plug‐in is required for the protein 3‐D structure viewer to function.

Basic Protocol 2: Analyzing a Large Number of SNPs in a Batch Mode Using the PolyPhen‐2 Web Server

Materials

An up‐to‐date Web browser, such as Firefox, Internet Explorer, or Safari. Cookies should be enabled in the browser configuration.

Basic Protocol 3: Quick Search in a Database of Precomputed Predictions

Materials

An up‐to‐date Web browser, such as Firefox, Internet Explorer, or Safari. JavaScript support should be enabled in the browser configuration.

Support Protocol 1: Checking the Status of Your Query with the Grid Gateway Interface

Materials

An up‐to‐date Web browser, such as Firefox, Internet Explorer, or Safari. Cookies should be enabled in the browser configuration.

Alternate Protocol 1: Automated Batch Submission

Materials

A text editor and curl command‐line utility (http://curl.haxx.se/)

Alternate Protocol 2: Installing PolyPhen‐2 Standalone Software

Materials

A Linux computer with the PolyPhen‐2 standalone software installed as described in protocol 6

Alternate Protocol 3: Using PolyPhen‐2 Standalone Software

Materials

A Linux computer with the PolyPhen‐2 standalone software installed as described in protocol 6 (Alternate Protocol 2) and a sufficiently fast Internet connection. Steps 5 and 6 require the Blat tools to be installed (see protocol 6, step 9) and involve substantial amounts of calculation. For these steps to complete within a reasonable time, a powerful multi‐CPU workstation or a Linux cluster is recommended.

Figures

Figure 7.20.1 PolyPhen‐2 home Web page with the input form prepared to submit a single protein substitution query using Swiss‐Prot accession as a protein identifier. Also supported are RefSeq and Ensembl protein identifiers; alternatively, a dbSNP reference SNP identifier can be entered, in which case no other input is required.

Figure 7.20.2 Detailed results of the PolyPhen‐2 analysis for a single‐variant query. This format is used for all PolyPhen‐2 query reports except the Batch Query. The top Query section includes the UniProtKB/Swiss‐Prot description of the query protein, if it was recognized as a known database entry. The large “heatmap” color bar with the black indicator mark dominates the display, illustrating the strength of the putative damaging effect for the variant, assessed using the default HumDiv‐trained predictor. Clicking on the [+] control boxes expands the Prediction/Confidence panel for the HumDiv‐trained predictor, as well as additional panels with protein multiple sequence alignment and 3‐D‐structure viewers.

Figure 7.20.3 Detailed results of the PolyPhen‐2 analysis for a single‐variant query with the multiple sequence alignment and 3‐D‐structure protein viewer panels expanded. The multiple sequence alignment panel displays a fixed 75‐residue‐wide window surrounding the variant's position (the column indicated by a black frame), with the alignment colored using the ClustalX (Thompson et al., 1997) scheme for all columns above the 50% conservation threshold. Clicking on the link at the bottom of the alignment panel opens the Jalview (Waterhouse et al., 2009) alignment viewer applet with the complete multiple alignment loaded. Displayed below is a 3‐D‐structure viewer applet (Jmol; http://www.jmol.org/) with the protein structure loaded and zoomed into the mutation residue using the Zoom into mutation button. The structure viewer window is fully interactive, and the protein structure can be rotated, moved, or zoomed in and out.

Figure 7.20.4 The PolyPhen‐2 Batch Query Web page allows a large number of variants to be submitted for analysis in a single operation. Type or paste your variants into the Batch Query text input area (one variant per line) or upload a text file listing variants using the Upload batch file text box (locate the file using the Browse button). If you enter your e‐mail address into the corresponding text box, you will be notified via e‐mail when your query completes. To analyze protein variants in nonstandard or unannotated proteins, you can upload your own protein sequences in FASTA format using the Upload FASTA file text box. Genomic variants are also supported; see the Sample Batch panel for examples of the various input formats. Do not forget to select the genome assembly version matching your genomic SNP data under Advanced Options; the default assembly version is GRCh37/UCSC hg19.

Figure 7.20.5 Grid Gateway Interface (GGI) Web page showing a PolyPhen‐2 user session with one single‐variant query completed and a Batch Query pending execution. Click on the View link to access results of the single‐variant query (no errors were reported). This Batch Query was queued as a 7‐stage pipeline; the status of each pipeline stage is tracked and displayed separately, with short stage explanations printed in the Description column. The batch will be completed when the last stage finishes. Grid Status shows a high Grid Load and a large number of other Pending jobs, so the waiting time for batch completion is likely to be substantial. Click on the Refresh button periodically to update session status. You can also close your browser and check your session at a later time: go to the PolyPhen‐2 home page, click on the Check Status button, and you will be transferred to your session automatically.

Figure 7.20.6 Grid Gateway Interface (GGI) Web page showing a completed PolyPhen‐2 Batch Query. Right‐click on one of the SNPs, Short, or Full links in the Batches/Results column to download results to your computer; see the full protocol for a description of the three types of report files. Click on the Logs link under Batches/Errors to view all error and warning messages generated by the pipeline. Note that most of the warnings are for your information only and do not indicate failure of the analysis. After downloading results, all batch data can be deleted by checking the corresponding Batches/Delete checkbox and clicking on the Refresh button. Be warned that the delete operation is irreversible and deleted data cannot be restored.
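
The Batch Query input described for Figure 7.20.4 is plain text with one variant per line. The Python sketch below shows one way such a file might be assembled before pasting or uploading it. The whitespace‐separated field layout (protein identifier, position, reference residue, substituted residue) is an assumption made for illustration and should be checked against the Sample Batch panel; the names Substitution and format_batch and the identifier EXAMPLE1 are hypothetical placeholders, not part of PolyPhen‐2.

```python
from typing import Iterable, NamedTuple

# The 20 standard amino acids in one-letter code.
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")


class Substitution(NamedTuple):
    """One protein substitution (hypothetical helper type)."""
    protein_id: str  # protein identifier, e.g. a Swiss-Prot accession
    position: int    # 1-based residue position in the protein
    ref_aa: str      # reference (wild-type) residue
    alt_aa: str      # substituted residue


def format_batch(variants: Iterable[Substitution]) -> str:
    """Return batch text with one variant per line.

    Assumed layout per line: "<id> <position> <ref> <alt>".
    """
    lines = []
    for v in variants:
        if v.position < 1:
            raise ValueError(f"position must be >= 1: {v}")
        if v.ref_aa not in VALID_AA or v.alt_aa not in VALID_AA:
            raise ValueError(f"unknown amino acid code: {v}")
        if v.ref_aa == v.alt_aa:
            raise ValueError(f"not a substitution: {v}")
        lines.append(f"{v.protein_id} {v.position} {v.ref_aa} {v.alt_aa}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # EXAMPLE1 is a placeholder identifier, not a real accession.
    batch = format_batch([
        Substitution("EXAMPLE1", 12, "A", "V"),
        Substitution("EXAMPLE1", 87, "R", "W"),
    ])
    with open("batch.txt", "w", encoding="ascii") as handle:
        handle.write(batch)
```

Validating each line before submission catches malformed entries locally instead of after the batch has waited in the queue.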

Literature Cited
	Adzhubei, I.A., Schmidt, S., Peshkin, L., Ramensky, V.E., Gerasimova, A., Bork, P., Kondrashov, A.S., and Sunyaev, S.R. 2010. A method and server for predicting damaging missense mutations. Nat. Methods 7:248‐249.
	Ashley, E.A., Butte, A.J., Wheeler, M.T., Chen, R., Klein, T.E., Dewey, F.E., Dudley, J.T., Ormond, K.E., Pavlovic, A., Morgan, A.A., Pushkarev, D., Neff, N.F., Hudgins, L., Gong, L., Hodges, L.M., Berlin, D.S., Thorn, C.F., Sangkuhl, K., Hebert, J.M., Woon, M., Sagreiya, H., Whaley, R., Knowles, J.W., Chou, M.F., Thakuria, J.V., Rosenbaum, A.M., Zaranek, A.W., Church, G.M., Greely, H.T., Quake, S.R., and Altman, R.B. 2010. Clinical assessment incorporating a personal genome. Lancet 375:1525‐1535.
	Bamshad, M.J., Ng, S.B., Bigham, A.W., Tabor, H.K., Emond, M.J., Nickerson, D.A., and Shendure, J. 2011. Exome sequencing as a tool for Mendelian disease gene discovery. Nat. Rev. Genet. 12:745‐755.
	Boyko, A.R., Williamson, S.H., Indap, A.R., Degenhardt, J.D., Hernandez, R.D., Lohmueller, K.E., Adams, M.D., Schmidt, S., Sninsky, J.J., Sunyaev, S.R., White, T.J., Nielsen, R., Clark, A.G., and Bustamante, C.D. 2008. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet. 4:e1000083.
	Chasman, D. and Adams, R.M. 2001. Predicting the functional consequences of non‐synonymous single nucleotide polymorphisms: Structure‐based assessment of amino acid variation. J. Mol. Biol. 307:683‐706.
	Hicks, S., Wheeler, D.A., Plon, S.E., and Kimmel, M. 2011. Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Hum. Mutat. 32:661‐668.
	Joosten, R.P., te Beek, T.A.H., Krieger, E., Hekkelman, M.L., Hooft, R.W.W., Schneider, R., Sander, C., and Vriend, G. 2011. A series of PDB related databases for everyday needs. Nucleic Acids Res. 39:D411‐D419.
	Nei, M. and Kumar, S. 2000. Molecular Evolution and Phylogenetics. Oxford University Press, New York.
	Ng, P.C., Henikoff, J.G., and Henikoff, S. 2000. PHAT: A transmembrane‐specific substitution matrix. Predicted hydrophobic and transmembrane. Bioinformatics (Oxford) 16:760‐766.
	Sunyaev, S.R., Eisenhaber, F., Rodchenkov, I.V., Eisenhaber, B., Tumanyan, V.G., and Kuznetsov, E.N. 1999. PSIC: Profile extraction from sequence alignments with position‐specific counts of independent observations. Protein Eng. 12:387‐394.
	Sunyaev, S., Ramensky, V., Koch, I., Lathe, W., Kondrashov, A.S., and Bork, P. 2001. Prediction of deleterious human alleles. Hum. Mol. Genet. 10:591‐597.
	Tennessen, J.A., Bigham, A.W., O'Connor, T.D., Fu, W., Kenny, E.E., Gravel, S., McGee, S., Do, R., Liu, X., Jun, G., Kang, H.M., Jordan, D., Leal, S.M., Gabriel, S., Rieder, M.J., Abecasis, G., Altshuler, D., Nickerson, D.A., Boerwinkle, E., Sunyaev, S., Bustamante, C.D., Bamshad, M.J., Akey, J.M., Broad GO, and Seattle GO, on behalf of the NHLBI Exome Sequencing Project. 2012. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 337:64‐69.
	Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., and Higgins, D.G. 1997. The CLUSTAL_X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876‐4882.
	The UniProt Consortium. 2011. Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res. 40:D71‐D75.
	Waterhouse, A.M., Procter, J.B., Martin, D.M.A., Clamp, M., and Barton, G.J. 2009. Jalview Version 2: A multiple sequence alignment editor and analysis workbench. Bioinformatics (Oxford) 25:1189‐1191.
