TPP SEWW Demo2012

From SPCTools

Revision as of 22:55, 8 March 2013

Simple Executable Workflow Webapp Setup and Demo

Setting up SEWW

1. Download and install the TPP

If TPP is not already installed on your Windows system, follow our TPP Windows Installation Guide to install it as a localhost, making sure that you download the latest version of TPP from our Sourceforge download site (111 MB).

To verify that the installation was successful, log into Petunia by double-clicking on the Trans-Proteomic Pipeline flower icon on your Desktop or through the Start menu. Alternatively, you can open the following URL in a browser window: http://localhost/tpp-bin/tpp_gui.pl. You can use the credentials guest and guest as user name and password to log in.

Make sure that the WEBSERVER_ROOT environment variable is set. From the start menu check Control Panel.System.Advanced system settings.Environment Variables.... If you chose the default location for Petunia, WEBSERVER_ROOT should be c:/Inetpub/wwwroot.

2. Install Jetty and SEWW

To install on your Windows system, go to the Sourceforge download site and, from the SEWW folder, download jetty-distribution-7.3.0.v20110203-SEWW-0.5.zip (36 MB) into the folder where you wish to install Jetty (c:\Jetty is the standard folder). Right-click on the zip file and select Extract all. This will create the subfolder jetty-distribution-7.3.0.v20110203-SEWW-0.5. Now create the JETTY_HOME environment variable by going to Control Panel.System.Advanced system settings.Environment Variables... from the start menu. In the System variables field click on New ... and set Variable name: to 'JETTY_HOME' and Variable value: to the folder Jetty was installed to (c:\Jetty\jetty-distribution-7.3.0.v20110203-SEWW-0.5 by default).

The Jetty distribution in sashimi has been updated to contain the necessary SEWW related files.

You can decide to manually start Jetty when needed (which includes after rebooting your machine) or you can set up Jetty as a Windows service.

To start manually, open the JETTY_HOME folder (c:\jetty\jetty-distribution-7.3.0.v20110203-SEWW-0.5 by default) and double click on the file named start of type Windows Batch file. Double clicking on the Windows Batch file stop will stop the service. On Vista or Windows 7 you may need to right click on start, stop, and the start ini file and click on Unblock.

To set up Jetty as a service, go to Windows Service Wrapper and follow the instructions there. Make sure to follow the instructions at Tanuki Software to download the executable Jetty-Service.exe.

Note that Java is required for Jetty to run and the JAVA_HOME environment variable must be set. SEWW has been extensively tested with Java 6.27. To verify that Java is installed, on Windows 7, go to Control Panel.Programs and Features and look for Java(TM) Update .... The version is listed in the last column. If there are multiple versions, use the latest. To find the installed location, look for the Location column. If you don't see the Location column, click on the column headers, click More and select Location. Write the location down and go to Control Panel.System.Advanced system settings.Environment Variables... from the start menu. If JAVA_HOME is not listed, create it and set its value to the location shown in Control Panel.Programs and Features. If you need to install Java, go to Free Java Download, follow the instructions there, then set the JAVA_HOME environment variable.
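The three environment variables used so far (WEBSERVER_ROOT, JETTY_HOME and JAVA_HOME) can also be checked from a script instead of the Control Panel. The following is a minimal sketch, not part of TPP or SEWW; the function name check_env is hypothetical, and the only check it makes is that each variable is set and points to an existing folder.

```python
import os
from pathlib import Path

# Variables named in the setup steps above.
REQUIRED_VARS = ("WEBSERVER_ROOT", "JETTY_HOME", "JAVA_HOME")


def check_env(environ=os.environ, names=REQUIRED_VARS):
    """Return {name: problem} for each variable that is unset or not a folder."""
    problems = {}
    for name in names:
        value = environ.get(name)
        if not value:
            problems[name] = "not set"
        elif not Path(value).is_dir():
            problems[name] = f"points to a missing folder: {value}"
    return problems


if __name__ == "__main__":
    # Run from a newly opened command window so that recent changes are visible.
    for name, problem in check_env().items():
        print(f"{name}: {problem}")
```

An empty result means all three variables are set and resolve to folders; it does not prove that the folders contain a working installation.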

3. Configure SEWW

To configure your Windows system, go to the Sourceforge download site and, from the SEWW folder, download seww-svc-0.5.config.zip (5 MB) into $WEBSERVER_ROOT$ ("C:\Inetpub\wwwroot" by default). Right-click on it and select Extract all, extracting into a temp folder under $WEBSERVER_ROOT$ (by default, create the folder "C:\Inetpub\wwwroot\temp" to extract to). The following two files should be present:

  • createparamfile-0.5-SNAPSHOT.jar
  • simple_template_complete.xml

Move createparamfile-0.5-SNAPSHOT.jar into $WEBSERVER_ROOT$\..\tpp-bin (by default C:\Inetpub\tpp-bin).
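The target folder is written relative to WEBSERVER_ROOT, which can be confusing when the variable uses forward slashes. As a small illustration (the helper name tpp_bin_folder is hypothetical), the relative path can be resolved with Python's ntpath module, which applies Windows path rules on any operating system:

```python
import ntpath  # Windows path rules, usable on any operating system


def tpp_bin_folder(webserver_root):
    """Resolve the tpp-bin folder that sits next to WEBSERVER_ROOT."""
    return ntpath.normpath(ntpath.join(webserver_root, "..", "tpp-bin"))


print(tpp_bin_folder(r"C:\Inetpub\wwwroot"))  # C:\Inetpub\tpp-bin
```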

Make sure that you have started the Jetty server. Open Internet Explorer, Mozilla Firefox or Google Chrome and copy the following URL into the address window: http://localhost:8888/seww-svc-0.5-SNAPSHOT/html/bootstrap.html. Fill in a user name (no validation is done in this version) and click on the Start SEWW button. This will take you to the SEWW start page. Click the Load config(s) button and click the Select configuration file... button. In the window tree view, navigate to the temporary folder and select simple_template_complete.xml. Now in the Specify Java bean name: field type register then click the Upload button.
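If the browser cannot open the bootstrap URL, a quick first check is whether anything is listening on port 8888 at all. The sketch below is an illustrative helper, not part of SEWW; the name port_is_open is hypothetical. A True result only shows that some process accepts connections on that port, not that the SEWW web application itself was deployed correctly.

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    print("Jetty port reachable:", port_is_open())
```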

From the start page click on the Templates and workflows button and you should now see the template Tandem To Protein Prophet Template. Loading this simple_template_complete.xml configuration file only needs to be done once.

4. Advanced SEWW Configuration

For advanced users.

SEWW has two environment variables that can be set to maximize the use of the server's resources.

Since SEWW makes lightweight use of threads, it automatically sets the number of available threads to five times the number of CPUs or thirty-two, whichever is smaller. If that number is too high and causes the server to become CPU-bound, it can be lowered with the environment variable SEWW_NUM_THREADS. It can also be raised, up to a maximum of five times the number of processors. To set it, click on Control Panel.System.Advanced system settings.Environment Variables... from the start menu. Choose the New... button under System variables and set SEWW_NUM_THREADS as the Variable name and the number of desired threads as the Variable value. While the server is running workflows, CPU usage can be monitored by right-clicking the taskbar, choosing Start Task Manager and then the Performance tab.
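The thread limits described above can be written out as a short calculation. This is a sketch of the stated rule only, not SEWW's actual code; in particular, the assumption that an out-of-range SEWW_NUM_THREADS value is clamped (rather than rejected or ignored) is a guess, and both function names are hypothetical.

```python
def default_threads(cpu_count):
    """Default described above: five threads per CPU, capped at thirty-two."""
    return min(5 * cpu_count, 32)


def effective_threads(cpu_count, requested=None):
    """Thread count for an optional SEWW_NUM_THREADS value.

    Assumption: values outside 1..5*cpu_count are clamped into that range.
    The text only states the upper bound; the clamping itself is a guess.
    """
    if requested is None:
        return default_threads(cpu_count)
    return max(1, min(requested, 5 * cpu_count))


# Columns: CPUs, default thread count, stated maximum.
for cpus in (2, 4, 8, 16):
    print(cpus, default_threads(cpus), 5 * cpus)
```

Under this reading, raising the value above the default only has room to help on machines with seven or more CPUs, where five times the CPU count exceeds thirty-two.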

The other resource that influences the server's responsiveness is its ability to handle input/output operations (IO). If too many processes attempt to access the disk simultaneously, the server's performance can slow to a crawl. If the server can handle more than one heavily IO-bound process in a workflow at a time (one is the server default), a speed-up can be obtained by setting the SEWW_MAX_SIMULTANEOUS_INTENSIVE_THREADS environment variable to the highest number that doesn't cause the server's performance to degrade. To set SEWW_MAX_SIMULTANEOUS_INTENSIVE_THREADS, follow the same steps as above for SEWW_NUM_THREADS. To observe the IO load on the server, go to the Task Manager's Performance tab, click on the Resource Monitor... button and look at the Disk statistics.

Note that in order for these settings to take effect, you must either restart Jetty from a new application or open up a new instance of the browser, go to the SEWW home page, and click the Reset server button. The new environment variables will not be visible to a currently running application.

Another value that can be set is in the start configuration file at the JETTY_HOME location (c:\Jetty\jetty-distribution-7.3.0.v20110203-SEWW-0.5 by default). If you are running 64-bit Windows and have 4 GB of RAM or more, the amount of memory allocated for the server can be increased. Modify the value of the Java -Xmx parameter to 3000m or whatever value allows Jetty and SEWW to run on the server without interfering with other processes.

SEWW Tutorial

1. Download and install the test data and database

For this tutorial, we will be using a SILAC-labeled Yeast dataset, comprised of 2 runs on a high mass-accuracy Orbitrap instrument, along with a Yeast database appended with decoys. We also include a search parameters file. This is the same as for the Petunia tutorial.

Download the test data and unzip it.

  • Copy or move the yeast_orfs_all_REV.20060126.short.fasta file into the folder C:\Inetpub\wwwroot\ISB\data\dbase\speclibs creating the folders if necessary
  • Copy or move the two data files (OR20080317_S_SILAC-LH_1-1_01.mzML and OR20080317_S_SILAC-LH_1-1_11.mzML) as well as the tandem parameters file tandem.xml into the folder C:\Inetpub\wwwroot\ISB\data\demo2009\tandem. Create this last folder if necessary.

2. Set up the template

TandemToProteinProphetTemplate in the Templates grid is owned by public and is not editable. Click the Select button and do a Save as from the graphical view of the workflow, choosing whatever name you would like and, for this tutorial, saving it as a template. Selecting the template brings up the graphical representation of the workflow diagram. Notice that the name of the template on the title bar has changed to the name of your new template. Also notice that the ASAPRatio module is included since these are SILAC-based samples. Clicking on a component brings up its parameter form; a right click brings up a description.

As best practice, the template should have parameters set that do not vary for the particular MS/MS device that produced the files.

3. Explore the template

By clicking on the different nodes, you can see how the components are linked together to run. In all the components that represent the TPP executables there will be an input file that is linked to the previous component's output file and an output file that will be linked to by the next component's input file. Both these parameters are marked as value required by module since they always must be specified. The input value specifies what value it is linked to in the previous component. The output value has a lock value check box. If this is checked for any parameter, that parameter's value cannot be changed when the workflow is set up to run.

The ASAPRatio component has a few other parameters set to run XInteract as ASAPRatio.

ASAPRatio parameters

  • Click on the component and in the parameter form, choose the General Parameters tab
  • Click on the arrow at the bottom of the form to bring up the Advanced parameters
  • Notice that the field in the upper left is set to not run PeptideProphet
  • Click on the ASAP Parameters tab
  • Notice that in the upper left, the ASAP Options parameter is specified
  • Click on Back to return to the workflow diagram

We will come back to this form later to set more parameters.

Pepxml parameters

  • Click on NestedTandem
  • This will bring up the graphical representation of the nested workflow for running multiple instances of X!Tandem
  • Click on Pepxml
  • Notice that the output file has a default value: regex:|:{0}|:|PepXML Input File:|:(.+).tandem:|:$1.pep.xml
  • Because the value begins with regex, it tells the workflow to take the PepXML Input File name and replace the extension tandem with pep.xml
  • Since this is not marked as locked by the workflow, you can change this if you want.
  • Click on Back to return to the nested workflow diagram
  • Click on Parent workflow to return to the top-level workflow
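The last two fields of the default output rule for Pepxml are a regular expression and a replacement. The sketch below applies just those two fields with Python's re module to show the effect on a file name. It is not how SEWW evaluates the rule: the rest of the rule syntax is not interpreted here, the input file name is an assumed example, and the regular expression dialect SEWW uses may differ from Python's in details.

```python
import re

# Pattern and replacement copied from the rule; "$1" is written "\1" in Python.
PATTERN = r"(.+).tandem"
REPLACEMENT = r"\1.pep.xml"


def pepxml_output_name(tandem_file):
    """Return the default output name for a given PepXML input file name."""
    return re.sub(PATTERN, REPLACEMENT, tandem_file)


# Hypothetical input name, assuming X!Tandem output ends in ".tandem".
print(pepxml_output_name("OR20080317_S_SILAC-LH_1-1_01.tandem"))
# OR20080317_S_SILAC-LH_1-1_01.pep.xml
```

A name that does not contain the tandem extension is returned unchanged, because the pattern finds nothing to replace.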

4. Set up the initial input

Template Params File

This file defines the database search parameters that override the full set of default settings referenced in the file isb_default_input.
In this example, the mass tolerance is set to -2.1 Da to 4.1 Da in the template parameter file, and the residue modification mass is set to 57.021464@C. A wide mass tolerance is used to include all the spectra with precursor m/z off by one or more isotopic separations; the high accuracy achieved by the instrument is then modeled by PeptideProphet with the accurate mass model.
For more information, please go to TANDEM
  • Click on TemplateParamsFile
  • In the form, click on the Browse button
  • In the file browser, open the demo2009 folder and then the tandem folder and choose tandem.xml then click on the Select file button
  • Note that the form states that the value is required to run the workflow
  • Back in the form, click 'Save' and then 'Back' to return to the workflow diagram page
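To see why the asymmetric tolerance of -2.1 Da to 4.1 Da captures precursors that are off by whole isotopic separations, the window can be compared against multiples of the isotope spacing. This is an illustrative sketch only: the spacing of about 1.00335 Da is the approximate mass difference between 13C and 12C, the sign convention of the mass error is an assumption, and the function name within_tolerance is hypothetical.

```python
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference in Da


def within_tolerance(mass_error, lower=-2.1, upper=4.1):
    """True if a precursor mass error (Da) lies inside the search window."""
    return lower <= mass_error <= upper


# Check which whole-isotope offsets fall inside the window.
for n in range(-3, 6):
    error = n * ISOTOPE_SPACING
    print(f"{n:+d} isotope(s): {error:+.3f} Da -> {within_tolerance(error)}")
```

With these numbers, offsets from two isotopic separations below to four above fall inside the window, while three below and five above fall outside.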

SeqDatabase

This fasta file contains the Yeast protein sequences
  • Click on SeqDatabase
  • In the form, click on the Browse button
  • In the file browser, open the dbase folder and then the speclibs folder.
  • Choose yeast_orfs_all_REV.20060126.short.fasta then click on the Select file button
  • Click 'Save' and then 'Back' to return to the workflow diagram page

Runpath

  • Click on Runpath
  • Note that the form is grayed out and that the value is locked by the workflow.
  • For every run generated from this template, a folder will be created for the run files. The initial files and the files generated by X!Tandem will go in this subfolder.
  • Click on 'Back' to return to the workflow diagram page

MzXMLFile will not be set yet; it will be the final input before the workflow is run.

5. Setting Peptide Prophet parameters

There is one more set of parameters to set, based on the high accuracy of the MS/MS. Then the template can be saved as a workflow.

PeptideProphet parameters

  • Click on the PeptideProphet module
  • Click on the PeptideProphet Parameters tab
  • Find the PeptideProphet options parameter in the upper left
  • Check the specify options? check box
  • Since this template is being set up for high accuracy MS/MS, also check the lock value check box
  • Find the use accurate mass binning in PeptideProphet parameter in the third column
  • Check the specify options? check box
  • Also check the lock value check box
  • Click on Save then Back to return to the workflow diagram page

6. Saving as a workflow

Now save the template as a workflow. The template is now set up for high accuracy MS/MS. For this workflow we will set up the parameters for how the SILAC samples were prepared. If different isotopes are used for other SILAC runs, then the template can be saved as another workflow to capture that difference.

  • Choose Save as
  • Fill in MyWorkflow for the Name
  • Choose the Workflow button
  • Click the Save button

ASAPRatio parameters

The sample-specific parameters can now be set for ASAPRatio.

  • Click on the ASAPRatio module
  • Click on the ASAP Parameters tab

change labeled residues parameter

  • Find the change labeled residues parameter at the top of the second column
  • Check the value needed? check box to let the workflow know this value will need to be set in order to run the workflow
  • Select K and R from the dropdown
  • Leave lock value unchecked to allow this to be changed for a run if desired

range around precusor m/z to search for peak parameter

  • Find the range around precusor m/z to search for peak parameter in the middle of the first column
  • Check the value needed? check box
  • Set the value to 0.05

specified label masses parameter

  • Find the specified label masses (i.e. M74.325Y125.864), only relevant for static modification quantification parameter at the bottom of the third column
  • Check the value needed? check box
  • Set the following three amino acid values:
    • Pick M from the Select one: dropdown and set a value of 147.035 and click Add choice
    • Pick K from the Select one: dropdown and set a value of 136.10916 and click Add choice
    • Pick R from the Select one: dropdown and set a value of 166.10941 and click Add choice
  • Click the Save button
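The parameter label gives M74.325Y125.864 as an example of the specified label masses format: a residue letter followed directly by its mass, repeated for each residue. Assuming the three choices entered above are combined in the same way, the resulting value can be sketched as follows. Whether SEWW builds exactly this string from the form is an assumption, and the helper name label_mass_string is hypothetical.

```python
# Values entered in the steps above, kept as text so no digits are altered.
LABEL_MASSES = [("M", "147.035"), ("K", "136.10916"), ("R", "166.10941")]


def label_mass_string(pairs):
    """Concatenate residue letters and masses, e.g. M74.325Y125.864."""
    return "".join(f"{residue}{mass}" for residue, mass in pairs)


print(label_mass_string(LABEL_MASSES))  # M147.035K136.10916R166.10941
```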

7. Save as a compiled workflow

Now save the workflow as a compiled workflow. This will lock in the choices made and allow the compiled workflow to be run as many times as desired.

  • Click the 'Compile workflow' button
  • Name the workflow MyCompiledWorkflow
  • Click the Compile button

This will take you to the Compiled Workflows form and allow you to set up and run a workflow.

8. Run a workflow

Set up and run a workflow

  • Highlight MyCompiledWorkflow in the Compiled Workflows page
  • Click the Setup run button
  • In the Setup to run dialog, name the run MyFirstRun
  • Click the Setup button

Because the number of required parameters is drastically reduced at this point, the tabs reflect the components of the workflow.

ASAPRatio

  • Click on the ASAPRatio tab
  • Notice that we see the parameters that were required but not locked. These parameters can be changed
  • Click the arrow at the bottom of the form
  • Notice that this brings up the form that shows the values that are locked for this module and can't be changed

MzXMLFile

  • Now click on the MzXMLFile tab to add the final parameter value to run the workflow
  • Click the Browse button
  • Open the demo2009 folder, then the tandem folder
  • Select OR20080317_S_SILAC-LH_1-1_01.mzML and OR20080317_S_SILAC-LH_1-1_11.mzML files (hold the Ctrl key for multiple select)
  • Click the Select files button
  • Back on the parameter form click Save

Run the Workflow

  • Click the Run button

This brings up the workflow diagram which updates to show a graphical depiction of the workflow running. When it completes it will go to the output files form with a folder tree of where the output files have been saved.

9. View the summary information on the run

Summary

  • Click on the Summary button
  • Note the overall run time and the run time of each of the modules
  • Click on the icon in the fifth column for the PeptideProphet row
  • The window that opens will contain the log information for that run
  • Close the window
  • Click on the icon in the sixth column of the NestedTandem row
  • Move the window that appears and note that the top window and the window beneath represent the processing of each of the mzML files
  • Close these two windows and the top-level summary window

10. Exploring the output

You can run the visualization and analysis tools in the output files tree form.

  • Note: when closing a viewer, pick the appropriate method to close the viewer depending on whether the viewer came up in a new tab in the same browser, in the same tab in the same browser, or whether it came up in a new browser instance.

PeptideProphet results

  • Open the peptide_prophet folder
  • Right click on interact.pep.shtml file and select View. This will open a viewer.
  • Using the sorter at the top left, sort the list in descending order based on Probabilities. The identifications at the top of the resulting list are most likely to be correct. Click on the hypertext link for any probability. This brings up a details page (image: PlotModel) which shows graphically how successful the modeling was. In the upper pane, it is desirable for the red curve (sensitivity) to hug the upper right corner, and for the green curve (error) to hug the lower left corner. The lower pane shows how well the data (black line) follows the PeptideProphet modeling for each charge state. The blue curve describes the modeling of the negative results, and the purple one, the positive results. If these two curves are well separated and fit the black line well, then the analysis for that charge state was successful.
  • Click the Close button for the PlotModel and the Close button for the viewer to go back to SEWW result files
  • Right click on the interact.pep.xml file and select PepXML. This will open the file using the PepXML viewer application.
  • Click on the Other Actions top-level tab, and then on the Generate Pep3D button. A new window will launch the Pep3D viewer.
  • Leave the default options (or change to taste) and click on the Generate Pep3D Image button.
  • After a few moments, you should see one image per mzML input file displayed on the page (image: http://tools.proteomecenter.org/wiki/index.php?title=Image:Pep3D.JPG).
  • Click the Close button for the Pep3D Image and the Close button for the PepXML viewer to go back to SEWW result files
  • Close the peptide_prophet folder

InterProphet results

  • Open the inter_prophet folder
  • Right click on the interact.iproph.pep.shtml file to view and analyze the results. Notice that there is a new column, IPROB, which is a refinement of the PROB column (image: iprophet).
  • Click the Close button for the PepXML viewer
  • Close the inter_prophet folder

ASAPRatio results

  • Open the asap_ratio folder
  • Right click on the asap.pep.shtml file to view and analyze the results. The “ASAPRATIO” column that has been added contains quantitation results with a link to the ASAPRatio ion trace. The number listed in the “ASAPRATIO” column is the light to heavy ratio (image: ASAPRatioProfiles).
  • Click the Close button on the PepXML viewer
  • Close the asap_ratio folder

ProteinProphet results

  • Open the protein_prophet folder
  • Right click on the interact.prot.shtml file to view and analyze the results. Protein groups are sorted in descending order by Probability so that the groups at the top of the page are the most confident identifications. The protein probabilities are the red numbers listed next to each protein group (image: Protxml).
  • Click the Close button on the PepXML viewer
  • Close the protein_prophet folder

SEWW Troubleshooting

Configuration

Trans-Proteomic Pipeline

To ensure that the components for TPP are properly installed, check them using the older GUI, Petunia. Log into Petunia by double-clicking on the Trans-Proteomic Pipeline flower icon on your Desktop or through the Start menu. Alternatively, you can open the following URL in a browser window: http://localhost/tpp-bin/tpp_gui.pl. You can use the credentials guest and guest as user name and password to log in.

Make sure that the WEBSERVER_ROOT environment variable is set. From the start menu check Control Panel.System.Advanced system settings.Environment Variables.... If you chose the default location for Petunia, WEBSERVER_ROOT should be c:/Inetpub/wwwroot.

Click through the menus to see that everything looks set up correctly. You can also go to the TPP Demo and follow the instructions there to see if the TPP is set up correctly. The test files for that tutorial are the same as for the SEWW tutorial.

Jetty

If you have Jetty set up to start as a service, after a reboot you can check whether it is running by opening the task manager, clicking on the Services tab and checking that Jetty is listed in the name column and that its status is Running.

If you start Jetty by going to the JETTY_HOME folder (c:\jetty\jetty-distribution-7.3.0.v20110203-SEWW-0.5 by default) then clicking or double-clicking on the file named start of type Windows Batch file, you should see a command window pop up and, after a series of commands are executed, a final line of 2013-01-14 12:14:40.604:INFO::Started SelectChannelConnector@0.0.0.0:8888 appear. There should be no exceptions thrown. You can also go to the logs folder and double-click on the seww.log file to see the information that was output into the command window.
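The same check can be made against the text of the seww.log file. The sketch below scans log text for the Started SelectChannelConnector line quoted above and collects any lines that mention an exception. It is an illustrative helper with a hypothetical name, jetty_status, and it assumes the log uses the same line format as the command window.

```python
import re

# Matches the final startup line quoted above and captures host and port.
STARTED = re.compile(r"INFO::Started SelectChannelConnector@([\d.]+):(\d+)")


def jetty_status(log_text):
    """Return (listening_port or None, list of lines mentioning an exception)."""
    port = None
    exceptions = []
    for line in log_text.splitlines():
        match = STARTED.search(line)
        if match:
            port = int(match.group(2))
        if "Exception" in line:
            exceptions.append(line.strip())
    return port, exceptions


sample = "2013-01-14 12:14:40.604:INFO::Started SelectChannelConnector@0.0.0.0:8888"
print(jetty_status(sample))  # (8888, [])
```

To apply it to a real log, read the file contents as text and pass them to jetty_status; a port of None or a non-empty exception list points to a startup problem.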

If you see that an exception was thrown, make sure that the Jetty zip file properly unzipped into the folder pointed to by the JETTY_HOME environment variable. If the command window briefly flickers then disappears, the log file should show the problem. Check that both the JETTY_HOME and JAVA_HOME environment variables are set and point to the correct folders.

SEWW

After unzipping seww-svc-0.5.config.zip, make sure that createparamfile-0.5-SNAPSHOT.jar has been moved into WEBSERVER_ROOT\..\tpp-bin (by default C:\Inetpub\tpp-bin). That is the same folder that should also include tpp_gui.pl.

If the URL http://localhost:8888/seww-svc-0.5-SNAPSHOT/html/bootstrap.html is pasted into the web browser and the web browser can't connect to the website, check that Jetty is running and no exceptions were thrown when it started up. If Jetty is running, check that the JETTY_HOME subfolder webapps has the following files:

  • seww-svc-0.5-SNAPSHOT.war
  • fs-workspaces-svc.war

and populated subfolders:

  • ext-4.0.7-gpl
  • jit-2.0.1

If the following line appears in the command window or in the log, java.io.FileNotFoundException: class path resource [fs-workspaces-seww-svc.config] cannot be opened because it does not exist then check that a file called fs-workspaces-seww-svc.config exists in the resources subfolder of JETTY_HOME folder. If it does, its contents should look like:

{
  server:"localhost",
  port:8888,
  locals:[{
    uri:"/user",
    comment:"This points to the SEWW user directory in the TPP file folder",
    rootPath:"c:/Inetpub/wwwroot/ISB/data",
    uriType:'USER',
    useURIInDirListing:false
  },{
    uri:"/ISB/data",
    comment:"This points to the TPP file folder",
    rootPath:"c:/Inetpub/wwwroot",
    uriType:'TOP_FOLDER',
    useURIInDirListing:false
  }],
  properties:{
    comment:'this abstracts what application specific javascript modules need to know',
    webserver_domain_port:"http://localhost:80",
    top_folder2user:"/ISB/data/user"
  }
}

If none of the above looks right, then make sure the Jetty zip file properly unzipped into the correct directory.

When loading the configuration file simple_template_complete.xml, if the following error occurs, Error loading configuration file: Line 1 in XML document from resource loaded from byte array is invalid; nested exception is org.xml.sax.SAXParseException: Content is not allowed in prolog., make sure the correct file was selected and that the first line of the file is:

<?xml version="1.0" encoding="UTF-8"?>
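A "Content is not allowed in prolog" error generally means that something precedes the XML declaration, for example a byte order mark, stray characters, or a non-XML file selected by mistake. The sketch below inspects the first bytes of a file to tell these cases apart. It is a diagnostic aid with a hypothetical name, prolog_problem, and it does not validate the rest of the document.

```python
import codecs


def prolog_problem(path):
    """Return a description of what precedes the XML declaration, or None."""
    with open(path, "rb") as handle:
        head = handle.read(64)
    if head.startswith(codecs.BOM_UTF8):
        return "file starts with a UTF-8 byte order mark"
    if not head.startswith(b"<?xml"):
        return f"file starts with {head[:10]!r} instead of '<?xml'"
    return None
```

A result of None means the file begins with the XML declaration; any other result describes what was found instead.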

When loading the configuration file simple_template_complete.xml, if the following error occurs, Error loading configuration file: No bean named 'rregister' is defined, make sure the Java bean name was spelled correctly.

SEWW Tutorial

Exploring the Template

On a slower machine, it might take a moment for the ASAPRatio form to come up as it loads all the parameters.

Setting input

On a slower machine, it might also take a moment for the PeptideProphet form to come up.

If at any time the web application seems to become unresponsive, try hitting F5 and relogging on. Then navigate to your template, workflow, compiled workflow or run. Any changes that had been saved in your template, workflow or compiled workflow should still be saved.

Run a workflow

When the Run button is clicked, the graphical workflow depiction should have the module that is running lightly blinking. If it seems frozen or it has stopped responding, click on the To output files button. This will bring you to a tree view of the current output from the workflow.

To check the status, click on the Summary button. The Overall status will be on the top right.

If the workflow is not complete, then close the Summary window to return to the main run form. As long as the workflow is not hung, you can continue to check the Summary page for changes.

If the workflow still seems to be hung, check the log to see if an Error or an Exception is noted; if so, the workflow has failed. You may discover it is a legitimate error that you can go back and fix by defining a new workflow.

General

SEWW persists templates, workflows, compiled workflows and runs for a user to the file system at WEBSERVER_ROOT/ISB/data/user/<user_name>. The results for a run can be found at WEBSERVER_ROOT/ISB/data/user/<user_name>/compiled/<compiled_workflow_name>/<run_name>. The results can be viewed in Petunia by clicking on the Utilities tab then clicking on the Browse Files tab and then navigating to the appropriate subfolder for the run.
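The results location for a run follows directly from the folder layout above. As a sketch, the path can be assembled like this. The helper name run_folder is hypothetical, and the user name guest is only an example, since SEWW uses whatever user name was entered on the bootstrap page.

```python
import posixpath


def run_folder(webserver_root, user_name, compiled_workflow_name, run_name):
    """Build the results folder for a run from the layout described above."""
    return posixpath.join(webserver_root, "ISB", "data", "user", user_name,
                          "compiled", compiled_workflow_name, run_name)


# Names from the tutorial; "guest" is an assumed user name.
print(run_folder("c:/Inetpub/wwwroot", "guest", "MyCompiledWorkflow", "MyFirstRun"))
# c:/Inetpub/wwwroot/ISB/data/user/guest/compiled/MyCompiledWorkflow/MyFirstRun
```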

To request help with a specific issue, go to the SEWW user mailing list. You may simply post a message to the list; there is no need to subscribe unless you wish to.
