Using the iPlant Collaborative Discovery Environment
Abstract
The iPlant Collaborative is an academic consortium whose mission is to develop an informatics and social infrastructure to address the “grand challenges” in plant biology. Its cyberinfrastructure supports the computational needs of the research community and facilitates solving major challenges in plant science. The Discovery Environment provides a rich graphical interface to the iPlant Collaborative cyberinfrastructure: an accessible virtual workbench that enables users at all levels of expertise, from students to traditional biology researchers and computational experts, to explore, analyze, and share their data. By providing access to iPlant's data‐management system and high‐performance computing resources, the Discovery Environment also creates a unified space in which researchers can access scalable tools. Researchers can use available Applications (Apps) to execute analyses on their data, and can customize or integrate their own tools to better meet the specific needs of their research. These Apps can also be combined into workflows that automate more complicated analyses. This module describes how to use the main features of the Discovery Environment, using bioinformatics workflows for high‐throughput sequence data as examples. Curr. Protoc. Bioinform. 42:1.22.1‐1.22.26. © 2013 by John Wiley & Sons, Inc.
Keywords: cyberinfrastructure; science gateways; bioinformatics; plant sciences; plant biology; computational biology; RNA‐Seq
Table of Contents
- Introduction
- Basic Protocol 1: Entering Data Into the Discovery Environment
- Alternate Protocol 1: Bulk Data Upload Using iDrop Lite
- Basic Protocol 2: Running an Application in the Discovery Environment
- Basic Protocol 3: Processing Bar‐Coded Sequence Read Data with Sabre, Scythe, and Sickle
- Basic Protocol 4: Creating an Automated Workflow
- Basic Protocol 5: RNA‐Seq Analysis with the Tuxedo Workflow
- Commentary
- Literature Cited
- Figures
- Tables
Figures
- Figure 1.22.1 The Discovery Environment Data window from which users may upload or import data.
- Figure 1.22.2 The iDrop Lite applet, which allows users to upload bulk data into the iPlant Data Store via the Discovery Environment.
- Figure 1.22.3 The FastQC 0.10.1 App located within the Apps window.
- Figure 1.22.4 Example of how the Settings section of the Sabre App interface should appear before launching the analysis.
- Figure 1.22.5 Example of how the Settings section of the Scythe App interface should appear before launching the analysis.
- Figure 1.22.6 A window where Apps are added for an automated workflow. The boxes at the lower left corner display how many Apps are currently in the workflow, as well as the last App added to the workflow.
- Figure 1.22.7 In an automated workflow, the output of one App must be mapped as the input of another App so that they may run in sequence.
- Figure 1.22.8 The Data window displaying the outputs of the cp_tophat analysis.
- Figure 1.22.9 It is easier to add many input files for an analysis by opening a separate Data window.
- Figure 1.22.10 Select multiple files in the Data window by clicking each file with the CTRL key pressed on Windows/Linux (or the CMD key on Macs), then drag all the files onto the input field of the Cufflinks2 App window.
- Figure 1.22.11 The Data window displaying the bam folder output of the cp_tophat analysis, from which only the WT BAM files will be used as input for Cuffdiff2.
Literature Cited
Conway, M., Rajasekar, A., and Moore, R. 2010. iDROP: A Graphical User Interface for Community Remote Sensing. 2010 GSA Denver Annual Meeting. Geological Society of America, Boulder, Colo.
Goff, S.A., Vaughn, M., McKay, S., Lyons, E., Stapleton, A.E., Gessler, D., Matasci, N., Wang, L., Hanlon, M., Lenards, A., Muir, A., Merchant, N., Lowry, S., Mock, S., Helmke, M., Kubach, A., Narro, M., Hopkins, N., Micklos, D., Hilgert, U., Gonzales, M., Jordan, C., Skidmore, E., Dooley, R., Cazes, J., McLay, R., Lu, Z., Pasternak, S., Koesterke, L., Piel, W.H., Grene, R., Noutsos, C., Gendler, K., Feng, X., Tang, C., Lent, M., Kim, S.J., Kvilekval, K., Manjunath, B.S., Tannen, V., Stamatakis, A., Sanderson, M., Welch, S.M., Cranston, K.A., Soltis, P., Soltis, D., O'Meara, B., Ane, C., Brutnell, T., Kleibenstein, D.J., White, J.W., Leebens‐Mack, J., Donoghue, M.J., Spalding, E.P., Vision, T.J., Myers, C.R., Lowenthal, D., Enquist, B.J., Boyle, B., Akoglu, A., Andrews, G., Ram, S., Ware, D., Stein, L., and Stanzione, D. 2011. The iPlant Collaborative: Cyberinfrastructure for plant biology. Front. Plant Sci. 2:34.
Langmead, B., Trapnell, C., Pop, M., and Salzberg, S.L. 2009. Ultrafast and memory‐efficient alignment of short DNA sequences to the human genome. Genome Biol. 10:R25.
Lenards, A.J., Merchant, N., and Stanzione, D. 2011. Building an environment to facilitate discoveries for plant sciences. Proceedings from Gateway Computing Environments 2011 at Supercomputing 11, Seattle. IEEE Computer Society, Washington D.C.
Pennisi, E. 2011. Human genome 10th anniversary. Will computers crash genomics? Science 331:666‐668.
Stanzione, D. 2011. The iPlant collaborative: Cyberinfrastructure to feed the world. IEEE Computer 44:44‐52.
Trapnell, C., Pachter, L., and Salzberg, S.L. 2009. TopHat: Discovering splice junctions with RNA‐Seq. Bioinformatics 25:1105‐1111.
Trapnell, C., Williams, B.A., Pertea, G., Mortazavi, A., Kwan, G., van Baren, M.J., Salzberg, S.L., Wold, B.J., and Pachter, L. 2010. Transcript assembly and quantification by RNA‐Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28:511‐515.
Trapnell, C., Roberts, A., Goff, L., Pertea, G., Kim, D., Kelley, D.R., Pimentel, H., Salzberg, S.L., Rinn, J.L., and Pachter, L. 2012. Differential gene and transcript expression analysis of RNA‐seq experiments with TopHat and Cufflinks. Nat. Protoc. 7:562‐578.
Zhang, H., He, H., Wang, X., Wang, X., Yang, X., Li, L., and Deng, X.W. 2011. Genome‐wide mapping of the HY5‐mediated gene networks in Arabidopsis that involve both transcriptional and post‐transcriptional regulation. Plant J. 65:346‐358.
Internet Resources
http://user.iplantcollaborative.org
iPlant Collaborative's user management portal.
http://de.iplantcollaborative.org
iPlant Collaborative's Discovery Environment home page.
http://ask.iplantcollaborative.org
iPlant Collaborative's user help forum.
http://iplantcollaborative.org/CPB2012_apps
iPlant Collaborative's Discovery Environment App documentation pages.
http://www.iplantcollaborative.org/DEManual
iPlant Collaborative's Discovery Environment manual.
http://iplantcollaborative.org/CPB2012_table1
iPlant Collaborative's documentation page describing uploading and importing data.
http://iplant.co/ScytheDoc
iPlant Collaborative's documentation page for the Scythe App.
http://www.java.com/en
Java's home page, where Java downloads and documentation are available.
http://www.bioinformatics.babraham.ac.uk/projects/fastqc
The Babraham Bioinformatics group is the developer of FastQC.
http://cufflinks.cbcb.umd.edu/manual.html#cuffdiff
Cuffdiff's documentation Web site.
http://www.irods.org
iRODS is the core storage infrastructure for the iPlant Collaborative.
http://www.xsede.org
XSEDE is a high‐performance computing resource utilized by the Discovery Environment.
http://www.samtools.sourceforge.net/
Available source code for SAMtools.
http://www.github.com/najoshi/sabre
Available source code page for the Sabre application.
http://github.com/vsbuffalo/scythe
Available source code page for the Scythe application.
http://www.github.com/najoshi/sickle
Available source code page for the Sickle application.
http://hannonlab.cshl.edu/fastx_toolkit
FASTX‐Toolkit documentation Web site.
http://www.tophat.cbcb.umd.edu
TopHat documentation Web site.
http://compbio.mit.edu/cummeRbund/manual_2_0.html
CummeRbund documentation Web site.