TPP AMZTPP:PetuniaTutorial

From SPCTools

Revision as of 23:07, 16 December 2013

Introduction

About Tutorial

This tutorial is written for anyone interested in extending the computational resources available to the Trans-Proteomic Pipeline (TPP) via Amazon Web Services (AWS). It walks the user through the steps of using TPP's Petunia web interface to set up Amazon credentials and submit multiple searches on AWS. Readers should already be familiar with the TPP's web interface and usage.

System Requirements and TPP Versions

In order to execute this tutorial you must have already installed TPP (version 4.6.3 or greater) on a Windows system. You may also need to install amztpp, the command line tool for managing Amazon Web Services resources used by TPP. Guides for installing both are available at:

Readers should also be aware that executing this tutorial will incur some AWS charges. The exact amount will vary based on usage but should be roughly $1-$4 USD or less.

Tutorial

Step 1: Getting Your Amazon Credentials

For the TPP to access Amazon Web Services, you must provide your AWS credentials, which confirm that you are who you say you are and that you have permission to do what you are trying to do. These credentials are known as Amazon's access key and secret key. They are used to make secure REST or Query protocol requests to any AWS service API.

You can create new keys using the Amazon Security Credentials page. Your access keys are displayed under the Access Keys section in the Credentials section of the page. Secret keys are no longer displayed. If you previously created an access/secret key pair and have forgotten the secret key, you will need to generate a new key pair. For more information about setting up your AWS credentials, please see Where's my secret access key?
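To illustrate why the secret key must stay private, here is a minimal sketch of how a shared secret can sign a request so that the receiver can verify who sent it. It is a simplified illustration using HMAC-SHA256, not the complete AWS signing procedure (real requests carry additional fields such as a timestamp). The function names are hypothetical, and the key values are placeholders rather than real credentials.

```python
import base64
import hashlib
import hmac


def sign_request(secret_key: str, string_to_sign: str) -> str:
    """Return a base64-encoded HMAC-SHA256 signature of string_to_sign."""
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_signature(secret_key: str, string_to_sign: str, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_request(secret_key, string_to_sign)
    return hmac.compare_digest(expected, signature)


# Placeholder credentials for illustration only; never hard-code real keys.
ACCESS_KEY = "AKIAIOSFODNN7EXAMPLE"
SECRET_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

# The access key travels with the request; the secret key is only used to sign.
request = (
    "GET\nec2.amazonaws.com\n/\n"
    "AWSAccessKeyId=" + ACCESS_KEY + "&Action=DescribeInstances"
)
signature = sign_request(SECRET_KEY, request)
```

Only the access key and the signature are sent with the request. Anyone who does not hold the matching secret key cannot produce a signature that verifies, which is why a forgotten secret key has to be replaced by a new key pair rather than recovered.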

Step 2: Registering Your Amazon Credentials in TPP

Log in to the Petunia web interface of the TPP by double-clicking the Trans-Proteomic Pipeline flower icon on your Desktop or by launching it from the Start menu. Alternatively, open the following URL in a browser: http://localhost/tpp-bin/tpp_gui.pl . If this is a new TPP installation, you can log in with guest as both the user name and the password.

Once logged in, click Account > Cloud in the menu bar at the top of the page. This should take you to the cluster details form. Open the "Register Amazon EC2 Account" section if it isn't already open. It contains two form fields, one for your access key and one for your secret key. Copy and paste both values into the fields provided and click the "Verify and Use Keys" button. If all goes well, the page should refresh and show an additional status section containing details on your current Amazon Web Services.

Step 3: Download and install the tutorial data

For this tutorial we'll be using the same dataset used by the TPP tutorial. This is a SILAC-labeled yeast dataset consisting of 2 runs on a high mass-accuracy Orbitrap instrument, along with a yeast database appended with decoys. We also include parameter files for inspect, tandem, myrimatch and omssa for MS/MS identification. You can install it by:

  • Downloading the mzML files, parameter files and database from Sourceforge (92.2 MB).
  • Unpacking the demo archive using 7zip, Stuffit, unzip or a similar program, with the destination directory set to C:\Inetpub\wwwroot\ISB\data

If you've successfully installed the demo set you should have a new folder at C:\Inetpub\wwwroot\ISB\data\demoAMZTPP.
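If you prefer to script the unpacking step, the following is a minimal sketch using only the Python standard library. It assumes the download is a zip archive whose top-level folder is demoAMZTPP; the function names and the archive file name used in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path


def install_demo(archive: Path, data_dir: Path) -> Path:
    """Extract the demo archive into data_dir and return the demoAMZTPP folder."""
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(data_dir)
    demo_dir = data_dir / "demoAMZTPP"
    if not demo_dir.is_dir():
        # The archive layout differed from the assumption stated above.
        raise FileNotFoundError(f"expected folder not found: {demo_dir}")
    return demo_dir


def list_inputs(demo_dir: Path) -> list:
    """Return the names of the mzML files found in the demo folder, sorted."""
    return sorted(p.name for p in demo_dir.glob("*.mzML"))
```

For a default Windows installation, a call would look like `install_demo(Path("demoAMZTPP.zip"), Path(r"C:\Inetpub\wwwroot\ISB\data"))`, where the archive name is a placeholder for whatever file you downloaded. Calling `list_inputs` on the returned folder should then report the two mzML files used in the next step.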

Please note that this tutorial assumes that you are running a default TPP installation on a Windows system; if you are using a different system, please adjust the parameters files and file locations accordingly.

Step 4: Search data with X!Tandem

A custom version of the popular open-source search engine X!Tandem is bundled and installed with the TPP. It has been modified from the original distribution by adding the K-Score scoring function, developed by a team at the Fred Hutchinson Cancer Research Center.

  • First, make sure that Tandem is selected as the analysis pipeline.
  • Click the Database Search tab under Analysis Pipeline to access the X!Tandem search interface.
  • Under Specify mzXML Input Files, click Add Files and select the two mzML files present in the demoAMZTPP directory as input files for database searching.
  • Similarly, under Specify Tandem Parameters File choose the Tandem parameters file called tandem.xml located in the same directory.
This file defines the database search parameters that override the full set of default settings referenced in the file isb_default_input.
In this example, the precursor mass tolerance is set to -2.1 Da to 4.1 Da, and the residue modification mass is set to 57.021464@C, that is, 57.021464 Da added to every cysteine. A wide mass tolerance is used to include all spectra whose precursor m/z is off by one or more isotopic separations; the high accuracy achieved by the instrument is then modeled by PeptideProphet with the accurate mass model.
For more information, please go to TANDEM.
  • Next, select a sequence database to search against. Navigate up to the dbase directory in the File Chooser, and select the database file yeast_orfs_all_REV.20060126.short.fasta.
  • Lastly, select the "on_Amazon_cloud" option next to the Run Tandem Search button and click the button to launch the searches.
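To check which settings a Tandem parameters file overrides before launching a search, you can read its note elements with a short script. The sketch below embeds a sample fragment written to match the values described in this step; it is not a copy of the tandem.xml shipped with the demo, and the label names follow the usual X!Tandem convention, so verify them against your own file. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the real tandem.xml contains more settings.
SAMPLE = """<bioml>
  <note type="input" label="spectrum, parent monoisotopic mass error minus">2.1</note>
  <note type="input" label="spectrum, parent monoisotopic mass error plus">4.1</note>
  <note type="input" label="spectrum, parent monoisotopic mass error units">Daltons</note>
  <note type="input" label="residue, modification mass">57.021464@C</note>
</bioml>"""


def read_overrides(xml_text: str) -> dict:
    """Map each input note's label to its text value."""
    root = ET.fromstring(xml_text)
    return {
        note.get("label"): (note.text or "").strip()
        for note in root.iter("note")
        if note.get("type") == "input" and note.get("label")
    }


def precursor_window(params: dict) -> tuple:
    """Return the precursor tolerance as (lower, upper) offsets."""
    minus = float(params["spectrum, parent monoisotopic mass error minus"])
    plus = float(params["spectrum, parent monoisotopic mass error plus"])
    return (-minus, plus)


def parse_modification(spec: str) -> tuple:
    """Split a 'mass@residue' specification into (residue, mass)."""
    mass, residue = spec.split("@")
    return (residue, float(mass))


params = read_overrides(SAMPLE)
window = precursor_window(params)
modification = parse_modification(params["residue, modification mass"])
```

To inspect a real file, pass its contents to `read_overrides`, for example `read_overrides(open("tandem.xml").read())`. Any setting that does not appear in the result is taken from the default parameters file.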