Inferring Protein Function from Homology Using the Princeton Protein Orthology Database (P‐POD)

Abstract

 

Inferring a protein's function by homology is a powerful tool for biologists. The Princeton Protein Orthology Database (P‐POD) offers a simple way to visualize and analyze the relationships between homologous proteins in order to infer function. P‐POD combines computationally generated analyses that distinguish orthologs from paralogs with curated published information on functional complementation and on human diseases. P‐POD also features an applet, Notung, with which users can explore and modify phylogenetic trees and generate their own ortholog/paralog calls. This unit describes how to search P‐POD for precomputed data, how to find and use the associated curated information from the literature, and how to use Notung to analyze and refine the results. Curr. Protoc. Bioinform. 33:6.11.1‐6.11.12. © 2011 by John Wiley & Sons, Inc.

Keywords: functional complementation; disease; conservation; phylogenetic analysis; trees; paralogs; Notung
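
The ortholog/paralog distinction mentioned in the abstract can be illustrated with a small sketch. It applies the standard definition (two genes are orthologs if their last common ancestor in the gene tree is a speciation node, and paralogs if it is a duplication node) to a hand-labeled toy tree. It is not Notung's or P‐POD's implementation; the `Node` class, the `classify_pair` function, and the gene names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A gene-tree node. Leaves have event=None and no children."""
    name: str
    event: Optional[str] = None  # "speciation" or "duplication" for internal nodes
    children: List["Node"] = field(default_factory=list)


def path_to(root: Node, leaf_name: str) -> List[Node]:
    """Return the nodes from root down to the named leaf, or [] if absent."""
    if not root.children:
        return [root] if root.name == leaf_name else []
    for child in root.children:
        sub = path_to(child, leaf_name)
        if sub:
            return [root] + sub
    return []


def classify_pair(root: Node, a: str, b: str) -> str:
    """Label two distinct leaves by the event at their last common ancestor."""
    if a == b:
        raise ValueError("need two different genes")
    path_a, path_b = path_to(root, a), path_to(root, b)
    if not path_a or not path_b:
        raise ValueError("gene not found in tree")
    lca = None
    for x, y in zip(path_a, path_b):
        if x is y:
            lca = x  # still on the shared part of both paths
        else:
            break
    return "orthologs" if lca.event == "speciation" else "paralogs"


# Toy tree (hypothetical genes): an ancient duplication produced copies A and B,
# and each copy was then inherited by both yeast and human through speciation.
tree = Node("root", "duplication", [
    Node("A_clade", "speciation", [Node("yeast_A"), Node("human_A")]),
    Node("B_clade", "speciation", [Node("yeast_B"), Node("human_B")]),
])

print(classify_pair(tree, "yeast_A", "human_A"))
print(classify_pair(tree, "yeast_A", "human_B"))
```

In this toy tree, yeast_A and human_A diverge at a speciation node and are therefore called orthologs, while any pair that spans the A and B clades traces back to the duplication at the root and is called paralogs.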

     
 
GO TO THE FULL PROTOCOL:
PDF or HTML at Wiley Online Library

Table of Contents

  • Introduction
  • Basic Protocol 1: Searching for Homologs
  • Basic Protocol 2: Investigating the Conserved Function and Significance of a Protein
  • Basic Protocol 3: Using the Notung Applet to Examine Homology Relationships in Greater Detail
  • Guidelines for Understanding Results
  • Commentary
  • Literature Cited
  • Figures
  • Tables
     
 

Figures

  •   Figure 6.11.1 The P‐POD search interface at http://ppod.princeton.edu/.
  •   Figure 6.11.2 Sample P‐POD search results. In this example, a search for S. cerevisiae proteins matching “2678” returned the SGD database identifier S000002678, the UniProt identifier P26784, and seven other proteins (not shown). Matching strings are highlighted in pink. The right half of the table shows the compositions of the four types of families: OrthoMCL, MultiParanoid, Jaccard, and Naïve Ensemble. Each family is represented as a 4 × 3 array of circles that use a two‐letter code to show the number of proteins from each of the organisms; e.g., “At 2” means that the family contains two proteins from Arabidopsis thaliana. If a protein is not assigned to a family in a particular analysis, the word “orphan” appears. If the icon for a family does not load properly, the family name (e.g., “OrthoMCL884”) still appears in the relevant table entry.
  •   Figure 6.11.3 The Protein Family Page contains a phylogenetic tree and a table of the members of the family. Blue text and symbols in the table represent linkouts. The four gray tabs near the top of the page allow the user to switch back and forth to other types of data. The Notung Tree Analysis link activates the Notung applet.
  •   Figure 6.11.4 Click the Functional Conservation tab to display a list of curated experiments describing complementation and exogenous expression experiments. The right‐hand column contains curator notes describing the experiment and its results. Linkouts in the middle column connect to the PubMed entry for the paper. This list has been truncated due to space considerations.
  •   Figure 6.11.5 Click the Disease References tab to display linkouts to OMIM diseases associated with human members of the protein family (top panel) and a list of papers from SGD containing information about yeast genes with human homologs involved in disease. This list has been truncated due to space considerations.
  •   Figure 6.11.6 The Notung applet window. The top pane shows the protein family phylogenetic tree and legend. The bottom pane has several tabs, each with its own set of buttons and checkboxes, allowing access to functions to modify the tree. Additional functions are accessible through the drop‐down menus at the top. The Edit Values button in the lower right corner allows the user to change tree parameters.
  •   Figure 6.11.7 The P‐POD pipeline. Protein sequences were assigned to families using several different techniques, and curated information from several sources is displayed with the computational results.

Literature Cited

   Alexeyenko, A., Tamas, I., Liu, G., and Sonnhammer, E.L.L. 2006. Automatic clustering of orthologs and inparalogs shared by multiple proteomes. Bioinformatics 22:e9‐e15.
   Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403‐410.
   Durand, D., Halldórsson, B.V., and Vernot, B. 2006. A hybrid micro‐macroevolutionary approach to gene tree reconstruction. J. Comput. Biol. 13:320‐335.
   Guindon, S. and Gascuel, O. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696‐704.
   Hamosh, A., Scott, A.F., Amberger, J., Bocchini, C., Valle, D., and McKusick, V.A. 2002. Online Mendelian Inheritance in Man (OMIM): A knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 30:52‐55.
   Heinicke, S., Livstone, M.S., Lu, C., Oughtred, R., Kang, F., Angiuoli, S.V., White, O., Botstein, D., and Dolinski, K. 2007. The Princeton Protein Orthology Database (P‐POD): A comparative genomics analysis tool for biologists. PLoS One 2:e766.
   Katoh, K. and Toh, H. 2008. Recent developments in the MAFFT multiple sequence alignment program. Brief. Bioinform. 9:286‐298.
   Li, L., Stoeckert, C.J. Jr., and Roos, D.S. 2003. OrthoMCL: Identification of ortholog groups for eukaryotic genomes. Genome Res. 13:2178‐2189.
   Mi, H., Dong, Q., Muruganujan, A., Gaudet, P., Lewis, S., and Thomas, P.D. 2010. PANTHER version 7: Improved phylogenetic trees, orthologs and collaboration with the Gene Ontology Consortium. Nucleic Acids Res. 38:D204‐D210.
   Reference Genome Group of the Gene Ontology Consortium. 2009. The Gene Ontology's Reference Genome Project: A unified framework for functional annotation across species. PLoS Comput. Biol. 5:e1000431.
Key References
   Heinicke et al., 2007. See above.
   The original 2007 P‐POD paper, with discussion of the reasons for building P‐POD and testing of the literature curation. The pipeline and user interface have changed since 2007; refer to the P‐POD help page (below) for a current technical description of P‐POD.
   Durand et al., 2006. See above.
   Technical description of Notung.
Internet Resources
   http://ppod.princeton.edu/
   The main P‐POD page and search interface.
   http://ppod.princeton.edu/help/
   The P‐POD help page contains an overview of the P‐POD pipeline, a brief tutorial, and links to additional information.
   http://ppod.princeton.edu/help/help_identifiers.html
   Valid identifiers for P‐POD and sample searches.
   http://ppod.princeton.edu/help/help_tech.html
   P‐POD technical information, including version numbers and settings for all software in the P‐POD pipeline.
   http://ppod.princeton.edu/help/help_notung_ortho_para.html
   A more extensive and illustrated explanation of how Notung infers orthologs and paralogs in P‐POD.
   ftp://gen-ftp.princeton.edu/ppod/
   The P‐POD ftp site containing all families, support files, and the 48‐species PANTHER 7.0 dataset. The current release is in the “version4” folder. More detail is available in the README files.
   http://ppod.princeton.edu/help/help_data_archive.html
   Archival technical information for the original 2007 P‐POD release only.
   http://www.cs.cmu.edu/~durand/Notung/
   The Notung application and documentation.
   http://www.ncbi.nlm.nih.gov/omim
   Online Mendelian Inheritance in Man (OMIM).
   http://www.pantherdb.org/
   The PANTHER 7.0 database.
   http://www.yeastgenome.org/
   The Saccharomyces Genome Database.
   http://www.geneontology.org/GO.refgenome.shtml
   Homepage of the Gene Ontology Consortium's Reference Genome project.
   http://amigo.geneontology.org/cgi-bin/amigo/go.cgi
   The Gene Ontology Consortium's AmiGO database.
   http://evolution.genetics.washington.edu/phylip/newicktree.html
   Description of the Newick tree format.
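
The Newick format named in the last entry writes a tree as nested parentheses, with comma-separated children, optional node labels, optional branch lengths after a colon, and a terminating semicolon. As a rough illustration only, the sketch below reads a simple Newick string into nested dictionaries and lists its leaves. It assumes unquoted labels and no whitespace or comments inside the string, so it is not a full parser; the function names `parse_newick` and `leaf_names` and the example labels are hypothetical and are not part of P‐POD or Notung.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts.

    Assumes unquoted labels and no embedded whitespace or comments.
    """
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("a Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this child list
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
        # Optional label (leaf name or internal node label).
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional branch length after a colon.
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return root


def leaf_names(node):
    """Collect leaf labels in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


example = parse_newick("(A:0.1,(B:0.2,C:0.3):0.4);")
print(leaf_names(example))
```

For real analyses an established phylogenetics library is a safer choice, since Newick files in the wild often contain quoted labels, comments, and support values that this sketch does not handle.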