Using the PRIDE Proteomics Identifications Database for Knowledge Discovery and Data Analysis

Internet, 2014-02-13

The PRIDE Proteomics Identifications Database provides users with the ability to explore and compare mass spectrometry-based proteomics experiments that reveal details of the protein expression found in a broad range of taxonomic groups, tissues and disease states. A PRIDE experiment typically includes identifications of proteins, peptides and protein modifications. Many of the submitted experiments also include processed peak lists representing the mass spectra that provide the evidence for these identifications.

Since the inception of the PRIDE project, a number of tools supporting the submission of data to PRIDE have been developed. Of particular note is the “PRIDE Converter”, which at the time of writing has become the tool most frequently used to produce PRIDE submissions.

The PRIDE XML format has been expanded so that submitters can annotate fragment ion information onto peptide identifications and onto the fragmentation spectra that provide the experimental evidence for these peptides. A novel algorithm for annotating fragment ion information onto peptides and their evidential mass spectra has also been developed; it will ultimately provide a route for evaluating the quality of peptide identifications arising from tandem mass spectrometry. This algorithm allows potential fragment ions to be visualised on the identified mass spectra, even where no such information has been submitted.
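To make the idea of fragment ion annotation concrete, the following is a minimal sketch, not the PRIDE algorithm itself, whose details are not given here. It assumes an unmodified peptide, singly charged b- and y-ions only, monoisotopic residue masses, and a fixed matching tolerance in daltons. The function names `fragment_ions` and `annotate` and the parameter `tol` are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: NOT the PRIDE annotation algorithm.
# Assumptions: unmodified peptide, singly charged b/y ions, monoisotopic masses.

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def fragment_ions(peptide):
    """Return theoretical singly charged b- and y-ion m/z values as {label: mz}."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = {}
    for i in range(1, len(peptide)):
        # b-ion: first i residues plus one proton.
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        # y-ion: last i residues plus water plus one proton.
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions


def annotate(peaks, peptide, tol=0.5):
    """Label each (mz, intensity) peak with the nearest theoretical ion within tol Da."""
    ions = fragment_ions(peptide)
    annotated = []
    for mz, intensity in peaks:
        label = min(ions, key=lambda k: abs(ions[k] - mz))
        if abs(ions[label] - mz) <= tol:
            annotated.append((mz, intensity, label))
    return annotated
```

Given a processed peak list and an identified peptide sequence, `annotate` returns only those peaks that fall within the tolerance of a theoretical ion. The fraction of peaks or of summed intensity explained in this way is one simple indicator of how well a spectrum supports an identification.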

In this chapter, we describe how PRIDE can be applied as a research tool and how the experiments in PRIDE can be compared and analysed. We also explore how complex queries can be constructed using the PRIDE BioMart. Finally, we describe how the user can integrate PRIDE data with annotation from other resources using federated BioMart queries.
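As a rough illustration of what a programmatic BioMart query can look like, the sketch below builds a query document in the general BioMart XML query format using Python's standard library. The dataset name, filter name, and attribute names are hypothetical placeholders, not taken from the PRIDE BioMart; the real names would have to be looked up in the mart's own configuration. The helper `build_biomart_query` is likewise an illustrative name.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Build a BioMart-style XML query string for one dataset."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",      # tab-separated results
        "header": "1",           # include a header row
        "uniqueRows": "1",       # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which rows are returned.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes select which columns are returned.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# All names below are hypothetical placeholders for illustration.
xml_query = build_biomart_query(
    dataset="pride",
    filters={"species_filter": "Homo sapiens"},
    attributes=["experiment_ac", "protein_accession"],
)
print(xml_query)
```

The resulting string would typically be submitted to a mart's query service. A federated query is usually expressed in the same format by adding a second `Dataset` element for the linked resource, although the exact linking behaviour depends on how the marts are configured.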