TPP SEWW Demo2012

From SPCTools

(Difference between revisions)
Revision as of 00:17, 14 December 2011
Mmiller (Talk | contribs)
(Workflow at Institute for Systems Biology Demo)
Revision as of 23:11, 17 December 2011
Mmiller (Talk | contribs)

Line 1: Line 1:
-= Workflow at Institute for Systems Biology Demo =+= Workflow at Institute for Systems Biology Setup and Demo =
 +== setting up WISB ==
 +=== 1. Download and install the TPP ===
-= Quick Start to data analysis using the TPP =+To install on your Windows system as a localhost, please follow our TPP [[Windows Installation Guide]], making sure that you select to download the latest version of TPP from
-== 1. Download and install the TPP ==+our [http://sourceforge.net/projects/sashimi/files/ Sourceforge download site].
-To install on your Windows system, please follow our [[Windows Installation Guide]], making sure that you select to download the file "TPP_Setup_v4_4_VUVUZELA_rev_1.exe" from our [http://sourceforge.net/projects/sashimi/files/ Sourceforge download site].+As a way to verify that the installation was successful, log into ''Petunia'' by double-clicking on the '''Trans-Proteomic Pipeline''' flower icon on your Desktop or through
-=== Log into Petunia, the TPP GUI ===+the '''Start''' menu. Alternatively, you can open a browser window into the following URL: http://localhost/tpp-bin/tpp_gui.pl . You can use the credentials ''guest'' and
-As a way to verify that the installation was successful, log into ''Petunia'' by double-clicking on the '''Trans-Proteomic Pipeline''' flower icon on your Desktop or through the '''Start''' menu. Alternatively, you can open a browser window into the following URL: http://localhost/tpp-bin/tpp_gui.pl . You can use the credentials ''guest'' and ''guest'' as user name and password to log in.+''guest'' as user name and password to log in.
-Once you are in the '''Home''' page, please select '''Tandem''' as the analysis pipeline, just below the ''Welcome'' message.+Make sure that the WEBSERVER_ROOT environment variable is set. From the start menu check ''Control Panel.System.Advanced system settings.Environment Variables...''. If you
-== 2. Download and install the test data and database ==+chose the default location for Petunia, WEBSERVER_ROOT should be ''c:/Inetpub/wwwroot''
-For this demo, we will be using a SILAC-labeled Yeast dataset, comprised of 2 runs on a high mass-accuracy Orbitrap instrument, along with a Yeast database appended with decoys. We also include a search parameters file.+=== 2. Install Jetty and WISB ===
-* If you would like to start the pipeline with the conversion of the vendor's raw data format to the open ''mzML'' format, you will need to install the free [http://www.microsoft.com/downloads/details.aspx?familyid=A5C84275-3B97-4AB7-A40D-3802B2AF5FC2&displaylang=en Microsoft Visual C++ 2008 SP1 Redistributable Package] and the [http://www.microsoft.com/downloads/details.aspx?FamilyID=ab99342f-5d1a-413d-8319-81da479ab0d7&displaylang=en Microsoft .NET Framework 3.5 SP1]. You can then [ftp://ftp:a@ftp.peptideatlas.org/pub/PeptideAtlas/Repository/TPP_Demo2009/TPP_Demo2009_RAW_data.zip download the raw demo data as a zip file] (309Mb) and unzip (you can obtain a free unzip utility, such as 7zip, from the web). You should find 2 files.+To install on your Windows systems, please go to [http://sourceforge.net/projects/sashimi/files/ Sourceforge download site] and select ''jetty-distribution-
-* If you would rather skip this conversion step, please [ftp://ftp:a@ftp.peptideatlas.org/pub/PeptideAtlas/Repository/TPP_Demo2009/TPP_Demo2009_mzML_data.zip download the pre-converted mzML files] (768Mb) and unzip.+7.3.0.v20110203wWISB.zip'' for download into the directory you wish to install Jetty (''c:\Jetty'' would be the standard directory). Right click on the zip file and select
-* Lastly, [ftp://ftp:a@ftp.peptideatlas.org/pub/PeptideAtlas/Repository/TPP_Demo2009/TPP_Demo2009_db_and_tandemParams.zip download the parameters and database files] (2.1Mb) and unzip.+extract all. This will create the subdirectory ''jetty-distribution-7.3.0.v20110203''. Now create the '''JETTY_HOME''' environment variable by going to ''Control
 + 
 +Panel.System.Advanced system settings.Environment Variables...'' from the start menu. In the ''Systems variables'' field click on ''New ...'' and set ''Variable name:'' to
 + 
 +'JETTY_HOST' and ''Variable value:'' to the directory Jetty was installed to (''c:\Jetty\jetty-distribution-7.3.0.v20110203'' by default).
 + 
 +The Jetty distribution in sashimi has been updated to contain the necessary WISB related files
 + 
 +=== 3. Configure WISB ===
 + 
 +To configure your Windows systems, please go to [http://sourceforge.net/projects/sashimi/files/ Sourceforge download site] and select ''wisb-svc-1.0.config.zip'' for download
 + 
 +into some temporary directory. Right click on it and select ''Extract all'' into the temp directory. The following two files should be present:
 + 
 +* createparamfile-1.0-SNAPSHOT.jar
 +* initial-modules-workflows.config.xml
 + 
 +Move ''createparamfile-1.0-SNAPSHOT.jar'' into $WEBSERVER_ROOT$\tpp-bin
 + 
 +Go to $JETTY_HOST$/ and double-click on start.bat to start the Jetty server (double-clicking on stop.bat will stop the service). Open Mozilla Firefox or Google Chrome and
 + 
 +copy the following URL into the address window: [http://localhost:8888/wisb-svc-1.0-SNAPSHOT/html/bootstrap.html]. Fill in a user name (no validation is done in this
 + 
 +version) and click on the ''bootstrap'' button. This will take you to the WISB start page. Click the ''Load config(s)'' button and in the window select ''initial-modules-
 + 
 +workflows.config.xml''. From the start page click on the ''Templates and workflows'' button and you should see, as a template, ''Tandem To Protein Prophet Template''.
 + 
 +== WISB Demo ==
 + 
 +=== 1. Download and install the test data and database ===
 + 
 +For this demo, we will be using a SILAC-labeled Yeast dataset, comprised of 2 runs on a high mass-accuracy Orbitrap instrument, along with a Yeast database appended with
 + 
 +decoys. We also include a search parameters file. This is the same as for the Petunia demo.
 + 
 +* [ftp://ftp:a@ftp.peptideatlas.org/pub/PeptideAtlas/Repository/TPP_Demo2009/TPP_Demo2009_mzML_data.zip download the mzML files] (768Mb) and unzip.
 + 
 +* Then, [ftp://ftp:a@ftp.peptideatlas.org/pub/PeptideAtlas/Repository/TPP_Demo2009/TPP_Demo2009_db_and_tandemParams.zip download the parameters and database files] (2.1Mb)
 + 
 +and unzip.
* Copy or move the ''yeast_orfs_all_REV.20060126.short.fasta'' file into the folder ''C:\Inetpub\wwwroot\ISB\data\dbase'' * Copy or move the ''yeast_orfs_all_REV.20060126.short.fasta'' file into the folder ''C:\Inetpub\wwwroot\ISB\data\dbase''
-* Copy or move the two data files (''OR20080317_S_SILAC-LH_1-1_01.raw'' and ''OR20080317_S_SILAC-LH_1-1_11.raw'') -- or the .mzML files if that is what you downloaded -- as well as the tandem parameters file ''tandem.xml'' into the folder ''C:\Inetpub\wwwroot\ISB\data\demo2009\tandem'' . Create this last folder if necessary.+* Copy or move the two data files (''OR20080317_S_SILAC-LH_1-1_01.mzML'' and ''OR20080317_S_SILAC-LH_1-1_11.mzML'') as well as the tandem parameters file ''tandem.xml'' into
-== 3. Convert raw data to the mzML format ==+the folder ''C:\Inetpub\wwwroot\ISB\data\demo2009\tandem''. Create this last folder if necessary.
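The file placement described above could also be scripted. The following is a minimal sketch in Python, assuming the downloaded archives were unzipped into a single folder; the function name `stage_demo_files` and its `download_dir` argument are hypothetical, while the file names and target folders are the ones listed above.

```python
import shutil
from pathlib import Path

# Default data root from this guide; pass another root to try the sketch elsewhere.
DATA_ROOT = Path(r"C:\Inetpub\wwwroot\ISB\data")
FASTA = "yeast_orfs_all_REV.20060126.short.fasta"
TANDEM_FILES = (
    "OR20080317_S_SILAC-LH_1-1_01.mzML",
    "OR20080317_S_SILAC-LH_1-1_11.mzML",
    "tandem.xml",
)


def stage_demo_files(download_dir, data_root=DATA_ROOT):
    """Copy the demo files from download_dir into the dbase and tandem folders."""
    download_dir = Path(download_dir)
    data_root = Path(data_root)
    dbase = data_root / "dbase"
    tandem = data_root / "demo2009" / "tandem"
    # Create the target folders if necessary, as the steps above require.
    dbase.mkdir(parents=True, exist_ok=True)
    tandem.mkdir(parents=True, exist_ok=True)

    plan = [(FASTA, dbase)] + [(name, tandem) for name in TANDEM_FILES]
    copied = []
    for name, target in plan:
        destination = target / name
        shutil.copy2(download_dir / name, destination)
        copied.append(destination)
    return copied
```

Calling `stage_demo_files(r"C:\Temp\wisb_downloads")` (a hypothetical download folder) would copy the database into ''dbase'' and the two mzML files plus ''tandem.xml'' into ''demo2009\tandem''.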
-We have developed the TPP (and dozens of related tools) to read mass-spec data from a common, open data format. We must therefore first convert the proprietary raw data to this format, called '''mzML'''.+=== 2. Set up the template ===
-::''If you downloaded the mzML files directly in step 2, skip to step 4.''+From the start page click on the ''Templates and workflows'' button.
-* Mouse-over the '''Analysis Pipeline (Tandem)''' portion of the navigation links near the top of the Petunia page; a pop-up menu should appear. Select the '''mzML''' item in this menu. 
-* Make sure the option '''Thermo RAW''' is selected as the instrument type you want to convert 
-* Click on the '''Add Files''' button in the first section; the File Chooser window will open. 
-* Click on the '''demo2009''' directory link on the right portion of the page. Then select '''tandem'''. 
-* Select both raw data files by clicking on the checkbox next to each, then on the '''Select''' button at the bottom. This should return you to the mzML page along with a confirmation of the files that you just selected. 
-* Leave the ''Conversion Options'' unchecked. 
-* Click on '''Convert to mzML'''; a wait page should appear. It takes about 30mins to convert two files. 
-* The ''Command Status'' box should automatically change color to orange when the conversions are done. 
-== 4. Search data with X!Tandem ==+''Tandem To Protein Prophet Template'' is owned by public and is not editable. You should select it and do a ''Save as'', choosing whatever name you would like and, for this
-A custom version of the popular open-source search engine X!Tandem is bundled and installed with the TPP. It has been modified from the original distribution by adding the ''K-Score'' scoring function, developed by a team at the Fred Hutchinson Cancer Research Center.+demo, as a template. Selecting the template brings up the graphical representation of the workflow diagram. Notice that the ASAPratio module is included since these are
-* First, make sure that ''Tandem'' is selected as the analysis pipeline.+SILAC based samples. Clicking on a component will bring up the parameter form for it; a right click will bring up a description.
-* Click the '''Database Search''' tab under Analysis Pipeline to access the X!Tandem search interface.+ 
-* Under '''Specify mzXML Input Files''', click '''Add Files''' and select the two ''mzML'' files present in the ''demo2009\tandem'' directory as input files for database searching.+As best practice, the template should have parameters set that do not vary for the particular MS/MS device that produced the files.
-* Similarly, under '''Specify Tandem Parameters File''' choose the Tandem parameters file called '''tandem.xml''' located in the same directory.+ 
 +=== Explore the template ===
 +By clicking on the different nodes, you can see how the components are linked together. In all the components that represent the TPP executables there will be an input file
 + 
 +that is linked to the previous component's output file and an output file that will be linked to by the next component's input file. Both these parameters are marked as
 + 
 +''value required by module'' since they always must be specified. The input value specifies what value it is linked to in the previous component. The output value has a
 + 
 +check box to ''lock value''. If this is checked for any parameter, then that parameter's value will not be able to be changed when the workflow is run.
 + 
 +The ASAP Ratio component has a few other parameters set to run XInteract as ASAP Ratio.
 + 
 +''ASAP Ratio'' parameters
 +* Click on the component and in the parameter form, choose the ''General Parameters'' tab
 +* Click on the arrow at the bottom of the form to bring up the ''Advanced parameters''
 +* Notice that the field in the upper left is set to not run ''PeptideProphet''
 +* Click on ''Back'' to return to the workflow diagram
 + 
 +* Click on the ''ASAP Parameters'' tab
 +* Notice that in the upper left, the ''ASAP Options'' parameter is specified.
 +* Click on ''Back'' to return to the workflow diagram
 + 
 +We will come back to this form later to set more parameters
 + 
 +''Pepxml'' parameters
 +* Click on Pepxml
 +* Notice that ''The output file'' has a default
 +* ''regex:|:{0}|:|PepXML Input File:|:(.+).tandem:|:$1.pep.xml'': because the value begins with ''regex'', this tells the workflow to use the ''PepXML Input File'' name and replace the
 + 
 +extension of ''tandem'' with ''pep.xml''
 +* Since this is not marked as locked by the workflow, you can change this if you want.
 +* Click on ''Back'' to return to the workflow diagram
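The effect of the default shown above can be illustrated outside WISB. The following is a minimal sketch in Python, assuming the last two fields of the default are a regular expression and a replacement; how WISB itself parses the ''regex:|:...'' string is not documented here, the function name `derive_output_name` is hypothetical, and the Java-style `$1` is written as `\1` for Python.

```python
import re


def derive_output_name(input_name, pattern=r"(.+).tandem", replacement=r"\1.pep.xml"):
    """Apply the pattern/replacement pair from the default to an input file name."""
    match = re.fullmatch(pattern, input_name)
    if match is None:
        # The name does not end in ".tandem"; leave it unchanged.
        return input_name
    # Substitute the captured base name into the replacement template.
    return match.expand(replacement)


print(derive_output_name("OR20080317_S_SILAC-LH_1-1_01.tandem"))
# OR20080317_S_SILAC-LH_1-1_01.pep.xml
```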
 + 
 +=== Setup the initial input ===
 + 
 +''Template Params File''
:''This file defines the database search parameters that override the full set of default settings referenced in the file isb_default_input.'' :''This file defines the database search parameters that override the full set of default settings referenced in the file isb_default_input.''
 +:''In this example, the mass tolerance is set to -2.1 Da to 4.1 Da in the template parameter file, and the residue modification mass is set to 57.021464@C. A wide mass
-:''In this example, the mass tolerance is set to -2.1 Da to 4.1 Da, and the residue modification mass is set to 57.021464@C. A wide mass tolerance is used to include all the spectra with precursor m/z off by one or more isotopic separations; the high accuracy achieved by the instrument is then modeled by PeptideProphet with the accurate mass model.''+tolerance is used to include all the spectra with precursor m/z off by one or more isotopic separations; the high accuracy achieved by the instrument is then modeled by
 +PeptideProphet with the accurate mass model.''
:''For more information, please go to [http://www.thegpm.org/TANDEM/api/ TANDEM]'' :''For more information, please go to [http://www.thegpm.org/TANDEM/api/ TANDEM]''
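To see why a window of -2.1 Da to 4.1 Da captures precursors that are off by whole isotopic separations, the arithmetic can be sketched in Python. This is an illustration only: it assumes one isotopic separation is approximately 1.00335 Da (the 13C/12C mass difference) and that the tolerance is applied to the signed precursor mass error; the function name is hypothetical.

```python
ISOTOPE_SPACING_DA = 1.00335  # approximate 13C - 12C mass difference, in Da


def isotope_offsets_in_window(lower_da=-2.1, upper_da=4.1, max_offset=10):
    """List the whole-isotope offsets whose mass shift falls inside the window."""
    return [
        k
        for k in range(-max_offset, max_offset + 1)
        if lower_da <= k * ISOTOPE_SPACING_DA <= upper_da
    ]


print(isotope_offsets_in_window())  # [-2, -1, 0, 1, 2, 3, 4]
```

Under these assumptions the window admits precursors selected up to two isotopic peaks below and four above the monoisotopic mass.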
-* Lastly, select a sequence database to search against. Navigate '''up''' to the '''dbase''' directory in the ''File Chooser'', and select the database file '''yeast_orfs_all_REV.20060126.short.fasta'''. 
-* Start the search by clicking on '''Run Tandem Search'''. The search needs about 25mins for two files. 
-=== Convert results to PepXML ===+* Click on ''Template Params File''
 +* In the form, click on the ''Browse'' button
 +* In the file browser, open the ''tandem'' folder and choose ''tandem.xml'' then click on the ''Select file'' button
 +* Back in the form, click on 'Save' and then 'Back' to return to the workflow diagram page
-Since each search engine provides results in different ways, the TPP requires that they be converted to a common format for downstream processing. This is the PepXML format, and can the conversion can be effected via the '''pepXML''' tab of the ''Analysis Pipeline''.+:''Note that the form states that the value is required to run the workflow''
-* Choose the two OR2008*.tandem files in the '''demo2009\tandem''' directory; these are the X!Tandem search results.+''Seq Database''
-* Click on '''Convert to PepXML'''.+:''This fasta file contains the Yeast protein sequences''
-== 5. Search data with SpectraST ==+* Click on ''Seq Database''
 +* In the form, click on the ''Browse'' button
 +* In the file browser, open the ''dbase'' folder and then the ''speclibs'' folder.
 +* Choose ''yeast_orfs_all_REV.20060126.short.fasta'' then click on the ''Select file'' button
 +* Back in the form, click on 'Save' and then 'Back' to return to the workflow diagram page
-''SpectraST'' is a search engine that compares acquired spectra against a library of pre-identified spectra to which peptide sequences have been assigned. In order to conduct the search, we must first download the appropriate spectral library.+''Runpath''
 +* Click on ''Runpath''
 +* Note that the form is grayed out and that the value is locked by the workflow.
 +* For every run generated from this template, a directory will be created for the run files. The initial files and the files generated by X!Tandem will go in
-* Go to the '''Home''' page, and switch the pipeline type to '''SpectraST'''.+this subdirectory.
-* Under the ''SpectraST Tools'' section of the navigation menu, select the '''Download Spectral Libraries''' menu item.+* Click on 'Back' to return to the workflow diagram page
-* You are now at at page that shows a list of spectral libraries available at ''PeptideAtlas'', along with locally-installed/downloaded ones. Select the '''NIST_yeast_IT_v2.0_2008-07-11.splib.zip''' (yeast ion trap) library on the right pane, and click on '''Download Selected Libraries'''.+
 +''Mz X M L File'' will not be set yet; this will be the final input before the workflow is run.
-We also need to copy the ''mzML'' data files we converted in step 4 into the SpectraST data area. While this can be accomplished within Petunia, it is easier to use Windows file copy. ''Copy'' the two ''mzML'' files located at '''C:\Inetpub\wwwroot\ISB\data\demo2009\tandem''' into the directory '''C:\Inetpub\wwwroot\ISB\data\demo2009\spectrast''' (which you will need to create). Now we can move on to searching these data:+=== Setting ''Peptide Prophet'' parameters ===
-* Mouse-over the ''Analysis Pipeline'' menu title in ''Petunia'', and then click on the '''SpectraST Search''' menu item to access the SpectraST search interface.+One more set of parameters, reflecting the high accuracy of the MS/MS data, needs to be set; then the template can be saved as a workflow.
-* In section 1, select the two mzML data files under ''demo2009\spectrast'' and click Add Files.+
-* For section 2, select the '''NIST_yeast_IT_v2.0_2008-07-11.splib.splib''' spectral library file located under '''dbase\speclibs'''. This is the file you downloaded from ''PeptideAtlas'' above.+
-* Finally, for section 3, select the '''yeast_orfs_all_REV.20060126.short.fasta''' sequence database, located under '''dbase'''.+
-* Leave the rest of the options on the page at their default values, and click on '''Run SpectraST''' to initiate the search.+
-== 6. Validation of Peptide-Spectrum assignments with PeptideProphet ==+''Peptide Prophet'' parameters
 +* Click on the ''Peptide Prophet'' module
 +* Click on the ''Peptide Prophet'' tab
 +* Find the ''PeptideProphet options'' parameter in the upper left
 +* Check the ''specify options?'' check box
 +* Since this template is being set up for high accuracy MS/MS, also check the ''lock value'' check box
 +* Find the ''use accurate mass binning in PeptideProphet'' parameter in the third column
 +* Check the ''specify options?'' check box
 +* Also check the ''lock value'' check box
 +* Click on 'Back' to return to the workflow diagram page
-''PeptideProphet'' provides statistical validation of search engine results by assigning a probability to each peptide-spectrum match. 
-* Click on the '''Analyze Peptides''' tab under the ''Analysis Pipeline'' section in ''Petunia'' to access the ''xinteract'' interface.+=== Saving as a workflow ===
-''xinteract'' is a general utility that is able to launch+
-several components of the TPP, including ''PeptideProphet''.+
-* Select the two '''OR2008*.pep.xml''' files in the directory '''demo2009\tandem'''. Make sure that there are only two files selected for analysis; you can edit the selections using the checkboxes and ''Remove'' button on the right-hand side.+
-* Under ''PeptideProphet Options'', find and select the option to '''Use accurate mass binning''' since this is a high-accuracy data.+
-* Leave all other options set to their defaults, and click on '''Run XInteract''' at the bottom of the page to run ''PeptideProphet''.+
-* Once the command finishes running, you can click on the '''view results''' link that appears in the ''Command Status'' box to view and analyze the results. [http://tools.proteomecenter.org/wiki/index.php?title=Image:PeptideProphet.JPG IMG:PepProphet] On this page, sort the list in descending order based on Probabilities. The identifications at the top of the resulting list are most likely to be correct. Click on the hypertext link for any probability. This brings up a details page [http://tools.proteomecenter.org/wiki/index.php?title=Image:PlotModel.JPG IMG:PlotModel] which shows graphically how successful the modeling was. In the upper pane, it is desirable for the red curve (sensitivity) to hug the upper right corner, and for the green curve (error) to hug the lower left corner. The lower pane shows how well the data (black line) follows the PeptideProphet modeling for each charge state. The blue curve describes the modeling of the negative results, and the purple one, the positive results. If these two curves are well separated and fit the black line well, then the analysis for that charge state was successful.+
-* You can now go back run this analysis on the ''SpectraST'' results. Again, make sure you are only analyzing two input files.+
-== 7. Visualize LC-MS/MS data using Pep3D ==+Now save the template as a workflow. The template is now set up for high accuracy MS/MS. For this workflow we will set up the parameters for how the SILAC samples were
-''Pep3D'' is a tool for visualizing LC MS data, along with results from ''PeptideProphet''.+prepared. If different isotopes are used for other SILAC runs, then the template can be saved as another workflow to capture that difference.
-* Under the ''Utilities -> Browse Files'' section in ''Petunia'', navigate to the '''demo2009\tandem''' directory. (nB. you may already be in that directory.) You may also select the ''interact.pep.xml'' file under the ''spectrast'' folder.+* Choose ''Save as''
-* Open the ''PeptideProphet'' results file by clicking on the '''[ PepXML ]''' link next to the file named '''interact.pep.xml'''. This will launch the ''PepXMLViewer'' application.+* Fill in ''MyWorkflow'' for the Name
-* Click on the '''Other Actions''' top-level tab, and then on the '''Generate Pep3D''' button. A new window will launch the ''Pep3D'' viewer.+* Choose the ''Workflow'' button
-* Leave the default options (or change to taste) and click on the '''Generate Pep3D Image''' button.+* Click the ''Save'' button
-* After a few moments, you should see two images displayed on the page, one per ''mzML'' input file. [http://tools.proteomecenter.org/wiki/index.php?title=Image:Pep3D.JPG IMG:Pep3D]+
-== 8. Further peptide-level validation iProphet ==+=== ''ASAP Ratio'' parameters ===
-:''iProphet'' (or ''InterProphet'') is a tool that provides statistical refinement of ''PeptidePropet'' results.+The sample specific parameters can now be set for ''ASAP Ratio''
-* Click on the '''Combine Analyses''' tab under the ''Analysis Pipeline'' section in ''Petunia'' to access the ''iProphet'' interface.+* Click on the ''ASAP Ratio'' module
-* Select the '''interact.pep.xml''' file in the directory '''demo2009\tandem''', as well as the file of the same name under the '''demo2009\spectrast''' directory. Make sure that there are two files selected for analysis; you can edit the selections using the checkboxes and ''Remove'' button on the right-hand side.+* Click on the ''ASAP Parameters'' tab
-* Under ''Output File and Location'', make sure that the ''File path (folder)'' is set to '''c:/Inetpub/wwwroot/ISB/data/demo2009'''. You may have to edit out part of the default value that is first shown.+
-* Leave all other options set to their defaults, and click on '''Run InterProphet''' at the bottom of the page to run ''iProphet''.+''change labeled residues'' parameter
-* Once the command finishes running, you can click on the '''view results''' link that appears in the ''Command Status'' box to view and analyze the results. [http://tools.proteomecenter.org/wiki/index.php?title=Image:iprophet.JPG IMG:iprophet]+* Find the ''change labeled residues'' parameter at the top of the second column
 +* Check the ''value needed?'' check box to let the workflow know this value will need to be set in order to run the workflow
 +* Select ''K'' and ''R'' from the dropdown
 +* Leave ''lock value'' unchecked to allow this to be changed for a run if desired
-== 9. Peptide Quantitation with ASAPRatio ==+''range around precusor m/z to search for peak'' parameter
 +* Find the ''range around precusor m/z to search for peak'' parameter in the middle of the first column
 +* Check the ''value needed?'' check box
 +* Set the value to '''0.05'''
-:''ASAPRatio'' is a tool for measuring relative expression levels of peptides and proteins from isotopically-labeled samples (e.g. ICAT, SILAC, etc).+''specified label masses'' parameter
 +* Find the ''specified label masses (e.g. M74.325Y125.864), only relevant for static modification quantification'' parameter at the bottom of the third column
 +* Check the ''value needed?'' check box
 +* Set the following three amino acid values:
 +** Pick ''M'' from the ''Select one:'' dropdown and set a value of '''147.035''' and click ''Add choice''
 +** Pick ''K'' from the ''Select one:'' dropdown and set a value of '''136.10916''' and click ''Add choice''
 +** Pick ''R'' from the ''Select one:'' dropdown and set a value of '''166.10941''' and click ''Add choice''
 + 
 +* Click the ''Save'' button
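The parameter description gives the expected format by example (''M74.325Y125.864''): each residue letter is followed directly by its mass. The following Python sketch composes the string for the three values entered above; whether WISB assembles exactly this string internally is an assumption, and the function name `format_label_masses` is hypothetical.

```python
def format_label_masses(masses):
    """Concatenate (residue, mass) pairs as <residue><mass>, e.g. M74.325Y125.864."""
    parts = []
    for residue, mass in masses:
        # Each residue is expected to be a single upper-case amino acid letter.
        if len(residue) != 1 or not residue.isalpha() or not residue.isupper():
            raise ValueError(f"not a single-letter residue code: {residue!r}")
        parts.append(f"{residue}{mass}")
    return "".join(parts)


# Masses are kept as strings so the digits are reproduced exactly as entered.
demo_masses = [("M", "147.035"), ("K", "136.10916"), ("R", "166.10941")]
print(format_label_masses(demo_masses))  # M147.035K136.10916R166.10941
```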
 + 
 +=== Save as a compiled workflow ===
 + 
 +Now save the workflow as a compiled workflow. This will lock in the choices made and allow the compiled workflow to be run as many times as desired.
 + 
 +* Click the 'Compile workflow' button
 +* Name the workflow ''MyCompiledWorkflow''
 +* Click the ''Compile'' button
 + 
 +This will take you to the ''Compiled Workflows'' form and allow you to set up and run a workflow.
 + 
 +=== Run a workflow ===
 + 
 +Setup and run a workflow
 + 
 +* Highlight ''MyCompiledWorkflow'' in the ''Compiled Workflows'' page
 +* Click the ''Setup run'' button
 +* In the ''Setup to run'' dialog, name the run ''MyFirstRun''
 +* Click the ''Setup'' button
 + 
 +Because the number of required parameters is drastically reduced at this point, the tabs reflect the components of the workflow.
 + 
 +''ASAP Ratio''
 +* Click on the ''ASAP Ratio'' tab
 +* Notice that we see the parameters that were required but not locked. These parameters can be changed.
 +* Click the arrow at the bottom of the form
 +* Notice that this brings up the form that shows the values that are locked for this module.
 + 
 +''MzXML File''
 +* Now click on the ''MzXML File'' tab to add the final parameter value needed to run the workflow
 +* Click the ''Browse'' button
 +* Open the ''tandem'' folder
 +* Highlight ''OR20080317_S_SILAC-LH_1-1_01.mzML''
 +* Click the ''Select files'' button
 +* Back on the parameter form, click ''Save''
 + 
 +''Run the Workflow''
 +* Click the ''Run'' button
 + 
 +This brings up the workflow diagram which updates to show a graphical depiction of the workflow running. When it completes it will go to the output files form with a
 + 
 +directory tree of where the output files have been saved.
 + 
 +=== Exploring the output ===
 +'''''Place holder for now'''''
 + 
 +You can run the visualization and analysis tools in the output files form.
 + 
 +''PeptideProphet'' results
 + 
 +* Open the ''peptide_prophet'' directory
 +* Click on ''interact.pep.shtml'' file
 +* This will open a viewer. Sort the list in descending order based on Probabilities. The identifications at the top of the resulting list are most likely to be correct.
 + 
 +Click on the hypertext link for any probability. This brings up a details page [http://tools.proteomecenter.org/wiki/index.php?title=Image:PlotModel.JPG IMG:PlotModel] which
 + 
 +shows graphically how successful the modeling was. In the upper pane, it is desirable for the red curve (sensitivity) to hug the upper right corner, and for the green curve
 + 
 +(error) to hug the lower left corner. The lower pane shows how well the data (black line) follows the PeptideProphet modeling for each charge state. The blue curve describes
 + 
 +the modeling of the negative results, and the purple one, the positive results. If these two curves are well separated and fit the black line well, then the analysis for that
 + 
 +charge state was successful.
 +* Click the ''Close'' button
 + 
 +* Click on the ''interact.pep.xml'' file
 +* This will open the file using the ''PepXML viewer'' application.
 +* Click on the '''Other Actions''' top-level tab, and then on the '''Generate Pep3D''' button. A new window will launch the ''Pep3D'' viewer.
 +* Leave the default options (or change to taste) and click on the '''Generate Pep3D Image''' button.
 +* After a few moments, you should see an image displayed on the page for each ''mzML'' input file. [http://tools.proteomecenter.org/wiki/index.php?title=Image:Pep3D.JPG
-* Click on the '''Analyze Peptides''' tab under the ''Analysis Pipeline'' section in ''Petunia'' to access the ''xinteract'' interface again.+''Inter Prophet'' results
-* Select the '''interact.iproph.pep.xml''' file in the directory '''demo2009'''. Make sure that there is only one file selected for analysis; you can edit the selections using the checkboxes and ''Remove'' button on the right-hand side.+
-* '''Important''': Make sure that you '''uncheck''' the option to ''RUN PeptideProphet'' under ''PeptideProphet Options'', as this file already contains results from PeptideProphet.+
-* Under ''ASAPRatio Options'', select to '''RUN ASAPRatio''', change ''Labeled Residues'' to '''K''' and '''R''', set ''m/z range to include in summation of peak'' to '''0.05''', set ''Specified masses'' to '''M 147.035, K 136.10916,''' and '''R 166.10941'''.+
-* Leave all other options set to their defaults, and click on '''Run XInteract''' at the bottom of the page to run ''ASAPRatio''.+* Open the ''inter_prophet'' directory
-* Once the command finishes running (about 4 hrs), you can click on the '''view results''' link that appears in the ''Command Status'' box to view and analyze the results. The “asapratio” column contains quantitation results with a link to the ASAPRatio ion trace. The number listed in the “asapratio” column is the light to heavy ratio. [http://tools.proteomecenter.org/wiki/index.php?title=Image:ASAPRatioProfiles.png IMG:ASAPRatioProfiles]+* Click on the ''interact.pep.shtml'' file to view and analyze the results. [http://tools.proteomecenter.org/wiki/index.php?title=Image:iprophet.JPG IMG:iprophet]
 +* Click the ''Close'' button
-== 10. Protein-level validation with ProteinProphet ==+''ASAP Ratio'' results
-''ProteinProphet'' is a tool that provides statistical validation of Protein identifications, and is based on ''PeptideProphet'' results.+* Open the ''asap_ratio'' directory
 +* Click on the ''asap.pep.shtml'' file to view and analyze the results. The “asapratio” column contains quantitation results with a link to the ASAPRatio ion trace. The number listed in the “asapratio” column is the light to heavy ratio. [http://tools.proteomecenter.org/wiki/index.php?title=Image:ASAPRatioProfiles.png IMG:ASAPRatioProfiles]
 +* Click the ''Close'' button
-* Click on the '''Analyze Proteins''' tab under the ''Analysis Pipeline'' section in ''Petunia'' to access the ''ProteinProphet'' interface.+''Protein Prophet'' results
-* Select the '''interact.pep.xml''' file in the directory '''demo2009'''. Make sure that there is only one file selected for analysis; you can edit the selections using the checkboxes and ''Remove'' button on the right-hand side.+
-* Leave all other options set to their defaults, and click on '''Run ProteinProphet''' at the bottom of the page to run ''ProteinProphet''.+
-* once the command finishes running, you can click on the view results link that appears in the Command Status box to view and analyze the results. Protein groups are sorted in descending order by Probability so that the groups at the top of the page are the most confident identifications. The protein probabilities are the red numbers listed next to each protein group. [http://tools.proteomecenter.org/wiki/index.php?title=Image:Protxml.JPG IMG:Protxml]+* Open the ''protein_prophet'' directory
 +* Click on the ''asap.pep.shtml'' file to view and analyze the results. Protein groups are sorted in descending order by Probability so that the groups at the top of the page are the most confident identifications. The protein probabilities are the red numbers listed next to each protein group. [http://tools.proteomecenter.org/wiki/index.php?title=Image:Protxml.JPG IMG:Protxml]
 +* Click the ''Close'' button

Revision as of 23:11, 17 December 2011


Workflow at Institute for Systems Biology Setup and Demo

Setting up WISB

1. Download and install the TPP

To install on your Windows system as a localhost, please follow our TPP Windows Installation Guide, making sure that you select to download the latest version of TPP from our Sourceforge download site.

As a way to verify that the installation was successful, log into Petunia by double-clicking on the Trans-Proteomic Pipeline flower icon on your Desktop or through the Start menu. Alternatively, you can open a browser window into the following URL: http://localhost/tpp-bin/tpp_gui.pl . You can use the credentials guest and guest as user name and password to log in.

Make sure that the WEBSERVER_ROOT environment variable is set. From the start menu check Control Panel.System.Advanced system settings.Environment Variables.... If you chose the default location for Petunia, WEBSERVER_ROOT should be c:/Inetpub/wwwroot
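As a rough illustration of the check described above, the following Python sketch reports whether WEBSERVER_ROOT is set and whether it matches the default location. The function name and the messages are made up for this example; only the variable name and the default path come from the text.

```python
import os


def check_webserver_root(env, expected="c:/Inetpub/wwwroot"):
    """Return a short status message about WEBSERVER_ROOT in the mapping `env`."""
    value = env.get("WEBSERVER_ROOT")
    if not value:
        return "WEBSERVER_ROOT is not set"
    # Compare case-insensitively and treat / and \ alike, as Windows does.
    normalized = value.replace("\\", "/").rstrip("/").lower()
    if normalized == expected.lower():
        return "WEBSERVER_ROOT matches the default location"
    return "WEBSERVER_ROOT is set to a non-default location: " + value


# Check the environment of the current process.
print(check_webserver_root(os.environ))
```

A non-default location is not an error; it only means Petunia was installed somewhere other than the default.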

2. Install Jetty and WISB

To install on your Windows system, please go to the Sourceforge download site and select jetty-distribution-7.3.0.v20110203wWISB.zip for download into the directory where you wish to install Jetty (c:\Jetty would be the standard directory). Right-click on the zip file and select Extract all. This will create the subdirectory jetty-distribution-7.3.0.v20110203. Now create the JETTY_HOST environment variable by going to Control Panel.System.Advanced system settings.Environment Variables... from the start menu. In the System variables field, click on New ... and set Variable name: to 'JETTY_HOST' and Variable value: to the directory Jetty was installed to (c:\Jetty\jetty-distribution-7.3.0.v20110203 by default).

The Jetty distribution in sashimi has been updated to contain the necessary WISB-related files.

3. Configure WISB

To configure your Windows system, please go to the Sourceforge download site and select wisb-svc-1.0.config.zip for download into some temporary directory. Right-click on it and select Extract all into the temp directory. The following two files should be present:

  • createparamfile-1.0-SNAPSHOT.jar
  • initial-modules-workflows.config.xml

Move createparamfile-1.0-SNAPSHOT.jar into $WEBSERVER_ROOT$\tpp-bin
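For readers who prefer to script this step, here is a minimal Python sketch of how the destination path could be built from WEBSERVER_ROOT. The helper name is hypothetical; the jar name and the tpp-bin folder are the ones given above.

```python
import ntpath


def jar_destination(webserver_root, jar_name="createparamfile-1.0-SNAPSHOT.jar"):
    """Build the Windows path the jar should end up at under tpp-bin."""
    # ntpath applies Windows path rules even when this runs on another OS.
    return ntpath.normpath(ntpath.join(webserver_root, "tpp-bin", jar_name))


print(jar_destination("c:/Inetpub/wwwroot"))
```

The resulting path could then be passed to shutil.move together with the location of the extracted jar in the temporary directory.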

Go to $JETTY_HOST$/ and double-click on start.bat to start the Jetty server (double-clicking on stop.bat will stop the service). Open Mozilla Firefox or Google Chrome and copy the following URL into the address window: [1]. Fill in a user name (no validation is done in this version) and click on the bootstrap button. This will take you to the WISB start page. Click the Load config(s) button and in the window select initial-modules-workflows.config.xml. From the start page, click on the Templates and workflows button and you should see, as a template, Tandem To Protein Prophet Template.

WISB Demo

1. Download and install the test data and database

For this demo, we will be using a SILAC-labeled Yeast dataset, comprising 2 runs on a high mass-accuracy Orbitrap instrument, along with a Yeast database appended with decoys. We also include a search parameters file. This is the same data as for the Petunia demo.

Download the files and unzip them.

  • Copy or move the yeast_orfs_all_REV.20060126.short.fasta file into the folder C:\Inetpub\wwwroot\ISB\data\dbase
  • Copy or move the two data files (OR20080317_S_SILAC-LH_1-1_01.mzML and OR20080317_S_SILAC-LH_1-1_11.mzML) as well as the tandem parameters file tandem.xml into the folder C:\Inetpub\wwwroot\ISB\data\demo2009\tandem. Create this last folder if necessary.
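The two copy steps above can be sketched in Python as follows. This is only an illustration: the function names are invented here, and the demo function runs against throwaway directories with empty stand-in files rather than the real data. On a default installation, data_root would be C:\Inetpub\wwwroot\ISB\data.

```python
import shutil
import tempfile
from pathlib import Path

DATA_FILES = [
    "OR20080317_S_SILAC-LH_1-1_01.mzML",
    "OR20080317_S_SILAC-LH_1-1_11.mzML",
    "tandem.xml",
]
DATABASE_FILE = "yeast_orfs_all_REV.20060126.short.fasta"


def stage_demo_files(unzipped_dir, data_root):
    """Copy the demo inputs from `unzipped_dir` into the layout under `data_root`."""
    unzipped_dir, data_root = Path(unzipped_dir), Path(data_root)
    dbase_dir = data_root / "dbase"
    tandem_dir = data_root / "demo2009" / "tandem"
    # "Create this last folder if necessary": exist_ok makes this a no-op if present.
    dbase_dir.mkdir(parents=True, exist_ok=True)
    tandem_dir.mkdir(parents=True, exist_ok=True)
    copied = [shutil.copy2(unzipped_dir / DATABASE_FILE, dbase_dir / DATABASE_FILE)]
    for name in DATA_FILES:
        copied.append(shutil.copy2(unzipped_dir / name, tandem_dir / name))
    return [Path(p) for p in copied]


def demo():
    """Run the staging against throwaway directories with empty stand-in files."""
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "unzipped"), Path(tmp, "data")
        src.mkdir()
        for name in DATA_FILES + [DATABASE_FILE]:
            (src / name).touch()
        return sorted(p.relative_to(dst).as_posix() for p in stage_demo_files(src, dst))


print(demo())
```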

2. Set up the template

From the start page click on the Templates and workflows button.


Tandem To Protein Prophet Template is owned by public and is not editable. Select it and do a Save as, choosing whatever name you like and, for this demo, saving it as a template. Selecting the template brings up the graphical representation of the workflow diagram. Notice that the ASAPratio module is included since these are SILAC-based samples. Clicking on a component brings up its parameter form; a right-click brings up a description.

As best practice, the template should have parameters set that do not vary for the particular MS/MS device that produced the files.

Explore the template

By clicking on the different nodes, you can see how the components are linked together. In all the components that represent the TPP executables, there is an input file that is linked to the previous component's output file and an output file that the next component's input file links to. Both these parameters are marked as value required by module since they must always be specified. The input value specifies what value it is linked to in the previous component. The output value has a check box to lock value. If this is checked for any parameter, then that parameter's value cannot be changed when the workflow is run.
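The linking and locking behavior described above can be pictured with a small Python model. This is a hypothetical sketch, not WISB's own classes: the Parameter class, its field names, the "Tandem Output File" name and the run1.tandem value are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parameter:
    """One parameter on a component's form (hypothetical model, not WISB's own classes)."""
    name: str
    value: Optional[str] = None
    required: bool = False                   # "value required by module"
    locked: bool = False                     # the "lock value" check box
    linked_to: Optional["Parameter"] = None  # an input linked to an earlier output

    def resolve(self):
        """Follow the link, if any, to the value this parameter takes at run time."""
        return self.linked_to.resolve() if self.linked_to else self.value

    def set_at_run_time(self, new_value):
        """Change the value for a run; locked parameters refuse the change."""
        if self.locked:
            raise ValueError(f"{self.name} is locked by the workflow")
        self.value = new_value


tandem_output = Parameter("Tandem Output File", "run1.tandem", required=True, locked=True)
pepxml_input = Parameter("PepXML Input File", required=True, linked_to=tandem_output)
print(pepxml_input.resolve())  # the input follows the previous component's output
```

Calling set_at_run_time on tandem_output raises an error in this model, which mirrors a locked value being fixed when the workflow is run.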

The ASAP Ratio component has a few other parameters set to run XInteract as ASAP Ratio.

ASAP Ratio parameters

  • Click on the component and, in the parameter form, choose the General Parameters tab
  • Click on the arrow at the bottom of the form to bring up the Advanced parameters
  • Notice that the field in the upper left is set to not run PeptideProphet
  • Click on Back to return to the workflow diagram
  • Click on the ASAP Parameters tab
  • Notice that in the upper left, the ASAP Options parameter is specified.
  • Click on Back to return to the workflow diagram

We will come back to this form later to set more parameters

Pepxml parameters

  • Click on Pepxml
  • Notice that the output file has a default: regex:|:{0}|:|PepXML Input File:|:(.+).tandem:|:$1.pep.xml
  • Because the value begins with regex, the workflow takes the PepXML Input File name and replaces the extension tandem with pep.xml

  • Since this is not marked as locked by the workflow, you can change this if you want.
  • Click on Back to return to the workflow diagram
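To see what the pattern and replacement in that default do to a file name, here is a small Python sketch. It only applies the (.+).tandem pattern and the $1.pep.xml replacement from the default value; how WISB itself parses the rest of the regex:|:...:|: string is not shown here, and the function name is an assumption.

```python
import re

# Pattern and replacement copied from the default value; "$1" becomes "\1" in Python.
PATTERN = r"(.+).tandem"
REPLACEMENT = r"\1.pep.xml"


def default_output_name(input_name):
    """Apply the default's pattern and replacement to an input file name."""
    return re.sub(PATTERN, REPLACEMENT, input_name)


print(default_output_name("OR20080317_S_SILAC-LH_1-1_01.tandem"))
```

A name that does not end in tandem is left unchanged by this substitution.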

Set up the initial input

Template Params File

This file defines the database search parameters that override the full set of default settings referenced in the file isb_default_input. In this example, the mass tolerance is set to -2.1 Da to 4.1 Da in the template parameter file, and the residue modification mass is set to 57.021464@C. A wide mass tolerance is used to include all the spectra with precursor m/z off by one or more isotopic separations; the high accuracy achieved by the instrument is then modeled by PeptideProphet with the accurate mass model.

For more information, please go to TANDEM
  • Click on Template Params File
  • In the form, click on the Browse button
  • In the file browser, open the tandem folder and choose tandem.xml then click on the Select file button
  • Back in the form, click on 'Save' and then 'Back' to return to the workflow diagram page
Note that the form states that the value is required to run the workflow
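The settings mentioned for the template parameter file can be inspected with a short Python sketch like the one below. The XML fragment is illustrative only: it uses standard X!Tandem note labels with the values quoted above, the isb_default_input.xml file name is an assumption, and the demo's real tandem.xml may contain more or different entries. The sign convention in within_tolerance (observed minus theoretical mass) is also an assumption.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the demo's real tandem.xml may differ.
EXAMPLE_TANDEM_XML = """<bioml>
  <note type="input" label="list path, default parameters">isb_default_input.xml</note>
  <note type="input" label="spectrum, parent monoisotopic mass error minus">2.1</note>
  <note type="input" label="spectrum, parent monoisotopic mass error plus">4.1</note>
  <note type="input" label="spectrum, parent monoisotopic mass error units">Daltons</note>
  <note type="input" label="residue, modification mass">57.021464@C</note>
</bioml>
"""


def read_input_notes(xml_text):
    """Return {label: value} for every <note type="input"> in a parameter file."""
    root = ET.fromstring(xml_text)
    return {
        note.get("label"): (note.text or "").strip()
        for note in root.iter("note")
        if note.get("type") == "input"
    }


def within_tolerance(mass_error_da, params):
    """True if a precursor mass error (in Da) falls inside the search window."""
    minus = float(params["spectrum, parent monoisotopic mass error minus"])
    plus = float(params["spectrum, parent monoisotopic mass error plus"])
    return -minus <= mass_error_da <= plus


params = read_input_notes(EXAMPLE_TANDEM_XML)
print(params["residue, modification mass"])
# A precursor picked one isotope peak too high (about 1.00335 Da) is still searched.
print(within_tolerance(1.00335, params))
```

The asymmetric window is what lets spectra whose precursor m/z is off by one or more isotopic separations stay in the search.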

Seq Database

This fasta file contains the Yeast protein sequences
  • Click on Seq Database
  • In the form, click on the Browse button
  • In the file browser, open the dbase folder and then the speclibs folder.
  • Choose yeast_orfs_all_REV.20060126.short.fasta then click on the Select file button
  • Back in the form, click on 'Save' and then 'Back' to return to the workflow diagram page

Runpath

  • Click on Runpath
  • Note that the form is grayed out and that the value is locked by the workflow.
  • For every run initiated from this template, a directory is created for the run files. The initial files and the files generated by X!Tandem go in this subdirectory.

  • Click on 'Back' to return to the workflow diagram page

MzXML File is not set yet; it will be the final input before the workflow is run.

Setting Peptide Prophet parameters

There is one more set of parameters to set, based on the high accuracy of the MS/MS; then the template can be saved as a workflow.

Peptide Prophet parameters

  • Click on the Peptide Prophet module
  • Click on the Peptide Prophet tab
  • Find the PeptideProphet options parameter in the upper left
  • Check the specify options? check box
  • Since this template is being set up for high accuracy MS/MS, also check the lock value check box
  • Find the use accurate mass binning in PeptideProphet parameter in the third column
  • Check the specify options? check box
  • Also check the lock value check box
  • Click on 'Back' to return to the workflow diagram page


Saving as a workflow

Now save the template as a workflow. The template is now set up for high accuracy MS/MS. For this workflow we will set up the parameters for how the SILAC samples were prepared. If different isotopes are used for other SILAC runs, then the template can be saved as another workflow to capture that difference.

  • Choose Save as
  • Fill in MyWorkflow for the Name
  • Choose the Workflow button
  • Click the Save button

ASAP Ratio parameters

The sample specific parameters can now be set for ASAP Ratio

  • Click on the ASAP Ratio module
  • Click on the ASAP Parameters tab

change labeled residues parameter

  • Find the change labeled residues parameter at the top of the second column
  • Check the value needed? check box to let the workflow know this value will need to be set in order to run the workflow
  • Select K and R from the dropdown
  • Leave lock value unchecked to allow this to be changed for a run if desired

range around precusor m/z to search for peak parameter

  • Find the range around precusor m/z to search for peak parameter in the middle of the first column
  • Check the value needed? check box
  • Select K and R from the dropdown

specified label masses parameter

  • Find the specified label masses (e.g. M74.325Y125.864), only relevant for static modification quantification parameter at the bottom of the third column
  • Check the value needed? check box
  • Set the following three amino acid values:
    • Pick M from the Select one: dropdown and set a value of 147.035 and click Add choice
    • Pick K from the Select one: dropdown and set a value of 136.10916 and click Add choice
    • Pick R from the Select one: dropdown and set a value of 166.10941 and click Add choice
  • Click the Save button
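As a sanity check on the three values entered above, the Python sketch below joins them in the form shown in the parameter's own example (M74.325Y125.864) and compares each one with the standard monoisotopic mass of the unmodified residue. The function names are invented for this illustration, and how WISB passes the resulting string to ASAPRatio is not shown.

```python
# Label masses entered on the form, kept as strings so they print exactly as typed.
LABEL_MASSES = {"M": "147.035", "K": "136.10916", "R": "166.10941"}

# Standard monoisotopic residue masses of the unmodified amino acids, in Da.
UNMODIFIED_RESIDUE_MASS = {"M": 131.040485, "K": 128.094963, "R": 156.101111}


def label_mass_string(label_masses):
    """Concatenate residue letters and masses, in the form M74.325Y125.864."""
    return "".join(residue + mass for residue, mass in label_masses.items())


def mass_shifts(label_masses):
    """Difference between each specified mass and the unmodified residue mass."""
    return {
        residue: round(float(mass) - UNMODIFIED_RESIDUE_MASS[residue], 4)
        for residue, mass in label_masses.items()
    }


print(label_mass_string(LABEL_MASSES))
print(mass_shifts(LABEL_MASSES))
```

The shifts come out near +15.99 Da for M, +8.01 Da for K and +10.01 Da for R, which is consistent with oxidized methionine and with heavy lysine and arginine SILAC labels.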

Save as a compiled workflow

Now save the workflow as a compiled workflow. This will lock in the choices made and allow the compiled workflow to be run as many times as desired.

  • Click the 'Compile workflow' button
  • Name the workflow MyCompiledWorkflow
  • Click the Compile button

This will take you to the Compiled Workflows form and allow you to setup and run a workflow.

Run a workflow

Set up and run a workflow

  • Highlight MyCompiledWorkflow in the Compiled Workflows page
  • Click the Setup run button
  • In the Setup to run dialog, name the run MyFirstRun
  • Click the Setup button

Because the number of required parameters is drastically reduced at this point, the tabs reflect the components of the workflow.

ASAP Ratio

  • Click on the ASAP Ratio tab
  • Notice that we see the parameters that were required but not locked. These parameters can be changed
  • Click the arrow at the bottom of the form
  • Notice that this brings up the form that shows the values that are locked for this module.

MzXML File

  • Now click on the MzXML File tab to add the final parameter value to run the workflow
  • Click the Browse button
  • Open the tandem folder
  • Highlight OR20080317_S_SILAC-LH_1-1_01.mzML
  • Click the Select files button
  • Back on the parameter form, click Save
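The idea that a run can start only once every required value is filled in can be sketched in Python as below. The dictionary layout and the function name are hypothetical and do not reflect how WISB stores a run; the parameter names and values are taken from this demo.

```python
def missing_required(parameters):
    """Return the names of required parameters that still have no value."""
    return [
        name
        for name, spec in parameters.items()
        if spec["required"] and not spec.get("value")
    ]


# Hypothetical snapshot of a run set up from a compiled workflow.
run_parameters = {
    "MzXML File": {"required": True, "value": None},
    "change labeled residues": {"required": True, "value": "KR"},
    "Runpath": {"required": True, "value": "MyFirstRun"},
}
print(missing_required(run_parameters))  # MzXML File is still missing
run_parameters["MzXML File"]["value"] = "OR20080317_S_SILAC-LH_1-1_01.mzML"
print(missing_required(run_parameters))  # nothing left to fill in
```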

Run the Workflow

  • Click the Run button

This brings up the workflow diagram, which updates to show a graphical depiction of the workflow running. When it completes, it goes to the output files form with a directory tree showing where the output files have been saved.

Exploring the output

Place holder for now

You can run the visualization and analysis tools in the output files form.

PeptideProphet results

  • Open the peptide_prophet directory
  • Click on interact.pep.shtml file
  • This will open a viewer. Sort the list in descending order based on Probabilities. The identifications at the top of the resulting list are most likely to be correct.

Click on the hypertext link for any probability. This brings up a details page IMG:PlotModel which shows graphically how successful the modeling was. In the upper pane, it is desirable for the red curve (sensitivity) to hug the upper right corner, and for the green curve (error) to hug the lower left corner. The lower pane shows how well the data (black line) follows the PeptideProphet modeling for each charge state. The blue curve describes the modeling of the negative results, and the purple one, the positive results. If these two curves are well separated and fit the black line well, then the analysis for that charge state was successful.
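To make the sensitivity and error curves more concrete, the Python sketch below estimates both quantities from a list of assigned probabilities at a chosen minimum probability. It follows the usual model-based reading, in which each probability is the expected chance that the identification is correct. The function name and the example probabilities are invented, and this is not the code that draws the plot.

```python
def sensitivity_and_error(probabilities, min_probability):
    """Model-based sensitivity and error rate at a minimum probability cutoff."""
    accepted = [p for p in probabilities if p >= min_probability]
    expected_correct_total = sum(probabilities)
    if not accepted or expected_correct_total == 0:
        return 0.0, 0.0
    # Sensitivity: expected correct IDs kept, over expected correct IDs overall.
    sensitivity = sum(accepted) / expected_correct_total
    # Error: expected fraction of incorrect IDs among those kept.
    error = sum(1.0 - p for p in accepted) / len(accepted)
    return sensitivity, error


example = [0.99, 0.9, 0.5, 0.1]
for cutoff in (0.9, 0.5, 0.0):
    print(cutoff, sensitivity_and_error(example, cutoff))
```

Lowering the cutoff raises sensitivity and raises the error rate along with it, which is the trade-off the two curves in the upper pane display.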

  • Click the Close button
  • Click on the interact.pep.xml file
  • This will open the file using the PepXML viewer application.
  • Click on the Other Actions top-level tab, and then on the Generate Pep3D button. A new window will launch the Pep3D viewer.
  • Leave the default options (or change to taste) and click on the Generate Pep3D Image button.
  • After a few moments, you should see an image displayed on the page for each mzML input file. [http://tools.proteomecenter.org/wiki/index.php?title=Image:Pep3D.JPG IMG:Pep3D]

Inter Prophet results

  • Open the inter_prophet directory
  • Click on the interact.pep.shtml file to view and analyze the results. IMG:iprophet
  • Click the Close button

ASAP Ratio results

  • Open the asap_ratio directory
  • Click on the asap.pep.shtml file to view and analyze the results. The “asapratio” column contains quantitation results with a link to the ASAPRatio ion trace. The number listed in the “asapratio” column is the light to heavy ratio. IMG:ASAPRatioProfiles
  • Click the Close button
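As a simplified picture of what a light to heavy ratio means, the Python sketch below integrates a light and a heavy ion trace over the same retention times and divides the two areas. ASAPRatio itself does considerably more when it processes the ion traces, so this is only an illustration with made-up numbers and function names.

```python
def trapezoid_area(times, intensities):
    """Integrate an ion trace over time with the trapezoid rule."""
    return sum(
        (t1 - t0) * (i0 + i1) / 2.0
        for t0, t1, i0, i1 in zip(times, times[1:], intensities, intensities[1:])
    )


def light_to_heavy_ratio(times, light, heavy):
    """Light area divided by heavy area; None when the heavy trace has no area."""
    heavy_area = trapezoid_area(times, heavy)
    if heavy_area == 0:
        return None
    return trapezoid_area(times, light) / heavy_area


# Made-up traces: the light peak is twice as intense as the heavy one.
print(light_to_heavy_ratio([0, 1, 2], [0, 10, 0], [0, 5, 0]))
```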

Protein Prophet results

  • Open the protein_prophet directory
  • Click on the asap.pep.shtml file to view and analyze the results. Protein groups are sorted in descending order by Probability so that the groups at the top of the page are the most confident identifications. The protein probabilities are the red numbers listed next to each protein group. IMG:Protxml
  • Click the Close button