PatternLab: From Mass Spectra to Label‐Free Differential Shotgun Proteomics
Abstract
PatternLab for proteomics is a self‐contained computational environment for analyzing shotgun proteomic data. Recent improvements incorporate modules to facilitate the computational analysis, such as FastaDBXtractor for sequence database preparation and ProLuCID runner for simplifying and managing the protein identification search engine; modules for pushing the limits on proteomics standards, such as SEPro, which relies on a semi‐labeled decoy approach for increasing confidence in filtering and organizing peptide spectrum matches; and modules with novel features, such as SEProQ for enabling label‐free quantitation by extracted ion chromatograms according to a distributed normalized ion abundance factor approach (dNIAF). Existing modules were also improved, such as the TFold module for pinpointing differentially expressed proteins. These new modules are integrated into the previously described arsenal of tools for further data analysis. Here we provide detailed instructions for operating and understanding them. Curr. Protoc. Bioinform. 40:13.19.1‐13.19.18. © 2012 by John Wiley & Sons, Inc.
Keywords: semi‐labeled decoy approach; filtering PSMs; dNIAF; quantitative proteomics; protein identification
Table of Contents
- Introduction
- Basic Protocol 1: Preparing a Sequence Database to be Searched by ProLuCID or the Academic SEQUEST
- Basic Protocol 2: Obtaining PSMs with ProLuCID and ProLuCID Runner
- Basic Protocol 3: Filtering Results with the Search Engine Processor (SEPro)
- Basic Protocol 4: Quantitating PSMs by dNIAFs with SEProQ
- Basic Protocol 5: Using Regrouper to Port SEPro Spectral Counting or dNIAF Results to PatternLab
- Basic Protocol 6: Using the Updated TFold Module for Pinpointing Differentially Expressed Proteins
- Guidelines for Understanding Results
- Commentary
- Literature Cited
- Figures
Figures
- Figure 13.19.1 FastaDBXtractor GUI.
- Figure 13.19.2 ProLuCID runner's form for specifying the search engine parameters.
- Figure 13.19.3 Example of an annotated spectrum. The blue, red, and green peaks are those that matched a y‐ion series, a b‐ion series, and neutral loss, respectively.
- Figure 13.19.4 SEProQ main GUI.
- Figure 13.19.5 SEProQ GUI displaying the results after tying PSMs to XICs.
- Figure 13.19.6 SEProQ XIC results. The x axis stands for the chromatography retention time and the y axis for the ion counts. The data points used to draw the XIC are listed in the table to the right. The red dot on the XIC corresponds to the time when an MS2 event occurred that resulted in the peptide sequence identification. The title of the plot (viz., 1241.58472) stands for the mass (MH+) of the peptide ion.
- Figure 13.19.7 Bird's‐eye view of an LC/MS/MS run. The x axis stands for the chromatography retention time and the y axis for the MH+ value of the peptide ion. This plot can be obtained by clicking on the cell indicating the file name of an LC/MS/MS run in the SEProQ result table.
- Figure 13.19.8 Regrouper's Input tab GUI.
- Figure 13.19.9 Regrouper's “Spectral Counting Analysis” tab.
- Figure 13.19.10 TFold GUI during the comparison of the MudPIT data of two biological conditions. Each quantitated protein is mapped as a dot on the plot according to its p‐value (x axis) and fold‐change (y axis). Red dots are proteins that satisfy neither the fold‐change cutoff nor the FDR q‐value. Green dots are those satisfying the fold‐change cutoff but not the q‐value. Orange dots are low‐abundance proteins that satisfy both the fold‐change cutoff and the q‐value; they are highlighted separately from the blue dots, and we recommend further experimentation to ascertain their differential expression. Blue dots are proteins that satisfy all statistical filters.
- Figure 13.19.11 SEPro's Result Browser provides a practical way for sharing protein identifications. By double‐clicking on a protein identification (top table), its FASTA sequence will pop up; identified peptides will be highlighted in blue. By double‐clicking on a row of the PSM table (the lower table), an annotated mass spectrum corresponding to the given PSM will pop up.
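
The color coding described in the Figure 13.19.10 caption amounts to a small decision rule. The following is a minimal Python sketch of that rule, assuming each protein has already been evaluated against the three filters. The class `ProteinCall`, its field names, and the function `dot_color` are hypothetical names chosen for illustration and are not part of PatternLab. The caption does not say how a protein that meets the q‐value but not the fold‐change cutoff is drawn, so the sketch returns "unspecified" for that case rather than guessing.

```python
from dataclasses import dataclass


@dataclass
class ProteinCall:
    """Outcome of the three filters for one protein (hypothetical structure)."""
    passes_fold_change: bool  # meets the fold-change cutoff
    passes_q_value: bool      # meets the FDR q-value
    low_abundance: bool       # flagged as a low-abundance protein


def dot_color(call: ProteinCall) -> str:
    """Map filter outcomes to the dot colors named in the figure caption."""
    if call.passes_fold_change and call.passes_q_value:
        # Both filters met: low-abundance proteins are set apart as orange.
        return "orange" if call.low_abundance else "blue"
    if call.passes_fold_change:
        # Fold-change cutoff met, q-value not met.
        return "green"
    if not call.passes_q_value:
        # Neither filter met.
        return "red"
    # q-value met but fold-change cutoff not met: not covered by the caption.
    return "unspecified"


if __name__ == "__main__":
    examples = [
        ProteinCall(True, True, False),
        ProteinCall(True, True, True),
        ProteinCall(True, False, False),
        ProteinCall(False, False, False),
    ]
    for call in examples:
        print(call, dot_color(call))
```

Keeping the uncovered case explicit makes it easy to adjust the sketch once the behavior is confirmed against the TFold module itself.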