TPP TWA:Tutorial
Introduction
The TPP Web-launcher for Amazon Web Services (TWA) is a cloud computing web-based application for launching your own instance of TPP in the cloud. Based on Amazon Web Services, it simplifies the process of starting an Amazon Elastic Compute Cloud (EC2) instance with TPP already installed and ready to use.
About Tutorial
This tutorial is written for anyone interested in using a completely cloud-based instance of the Trans-Proteomics Pipeline on Amazon Web Services. It will walk the user through the steps of setting up an Amazon Web Services account and then using TWA to launch a compute node in the cloud running the TPP. Users will then use TPP's Petunia web interface to submit multiple searches on AWS and view the results. Readers may find it easier to follow the tutorial if they are already familiar with the TPP's web interface and usage.
Requirements
All that is required to execute this tutorial is an Amazon Web Services account, an Internet connection, and a current version of either the Mozilla Firefox or Internet Explorer web browser. Older versions of these browsers, or browsers from other vendors, may not support all functionality.
Tutorial
Readers should be aware that executing this tutorial will incur some AWS charges. The exact amount of these charges will vary based on a number of factors but should be on the order of $1-$4 USD.
Step 1: Starting TWA
Launch the TWA application by navigating with your web browser to http://tools.proteomecenter.org/twa. The TWA web-based application will appear as a toolbar across the top of the main page, which shows details about the TWA application. Within this toolbar are two fields for entering an Access Key ID and a Secret Access Key; these are covered in Step 3. Next in the toolbar are two drop-down menus, the first labeled "Tools" and the second labeled "AWS Shortcuts". The Tools menu provides control over options along with several useful operations and is only enabled once you've successfully authenticated your access and secret keys. The AWS Shortcuts menu provides shortcut links to web forms at Amazon Web Services commonly used by TPP cloud services. The last element in the toolbar is a control button that allows you to either start or stop an EC2 instance. The button will only be enabled once you have successfully authenticated, and its label will reflect the current state (start or stop instance).
Step 2: Creating Your Amazon Web Services Account
If you already have an Amazon Web Services account you can skip to the next step. (Please note that an AWS account is different from an account on the normal Amazon web site.) To create a new account, select "Amazon Sign In/Register" under the AWS Shortcuts menu in the TWA toolbar. A new window should open to Amazon Web Services with details on creating a new account or signing in to an existing account. Alternatively you can navigate with your browser to http://aws.amazon.com/ and click on "Create a Free Account".
Step 3: Getting Your Amazon Credentials
In order for TWA to interact with Amazon Web Services on your behalf you must provide your security credentials to confirm that you are who you say you are and that you have permission to do what you are trying to do. These security credentials are known as an access key, which is composed of an access key ID and a secret access key. This key can then be used to make secure REST or Query protocol requests to any AWS service API. And much like keys in real life, your Amazon account may have multiple access keys associated with it.
All AWS accounts have what is known as root account credentials. These credentials allow full access to all resources in the account. You'll want to make sure you store your root credentials in a safe place and never share them with anyone, particularly a third-party AWS application, as they are the "keys to the kingdom". Instead, AWS provides a web service known as Identity and Access Management (IAM), which allows you to create user credentials for day-to-day interactions with AWS. It's strongly recommended that you only use these user access keys when using TWA.
You create and manage user access keys using the IAM console at Amazon Web Services. Under AWS shortcuts in the TWA toolbar there is a menu item that when selected will open a new tab/window to the console. Alternatively you can access the console at https://console.aws.amazon.com/iam/home.
Now go ahead and create a new user by selecting "Users" in the left side menu and then clicking on "Create New User". Name the user something suggestive of TWA, such as "TPP-TWA". Make sure you leave "Create Access Key" checked. Click "ok" to create the user. A window should appear asking you to download your access and secret key. You can choose either to view your keys or to download them to your desktop. If you don't save these keys you will have to delete them and generate new ones the next time you need them.
Next you'll have to grant permissions to the "TPP-TWA" user to enable the keys to work. Select the user you just created, click permissions, and then click "Attach User Policy". The window should display a list of policy templates and allow you to create your own policies. The easiest approach is to just select "Power User Access" and click apply. This grants the TPP-TWA user access to all web services through its access key, and you are ready for the next step in the tutorial. Alternatively, if you want to be a little more security minded, instead of adding the power user template you can scroll through the list of templates and add the following templates: Amazon EC2 Full Access, Amazon S3 Full Access, and Amazon SQS Full Access.
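For readers who prefer to write their own policy rather than attach templates, the following is a minimal sketch of a single custom policy document covering the same three services. It uses the standard IAM JSON policy layout; the function name build_twa_policy is hypothetical, and the exact contents of Amazon's own templates may differ from what is generated here.

```python
import json


def build_twa_policy(services=("ec2", "s3", "sqs")):
    """Build an IAM policy document allowing every action of the given services.

    Hypothetical helper for illustration; it is not part of TWA.
    """
    return {
        "Version": "2012-10-17",  # current IAM policy language version
        "Statement": [
            {
                "Effect": "Allow",
                # One wildcard action per service, e.g. "ec2:*"
                "Action": [f"{name}:*" for name in services],
                "Resource": "*",
            }
        ],
    }


policy = build_twa_policy()
print(json.dumps(policy, indent=2))
```

The printed JSON could be pasted into the custom policy editor of the IAM console. Restricting the Resource field further is possible but is outside the scope of this tutorial.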
For more information on Amazon Web Services security credentials see:
Step 4: Signing into TWA
Enter the access and secret keys from the previous step into the fields in the TWA toolbar and click on the small submission icon to sign into Amazon Web Services. Once you are signed in, both the Tools menu and the Start Instance button should become active. If you encounter an error, check that you entered your access key and secret key correctly and try again.
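Many sign-in errors come from copy and paste mistakes, such as stray whitespace or swapping the two fields. The sketch below is a rough local sanity check, assuming the commonly seen shapes of a 20-character uppercase alphanumeric access key ID and a 40-character secret key. These shapes are an assumption, not something TWA enforces, and passing the check does not mean the keys are valid. The sample values are the placeholder keys used in Amazon's own documentation.

```python
import re

# Assumed shapes (not checked by TWA itself): 20 uppercase letters/digits
# for the ID, 40 base64-style characters for the secret.
ACCESS_KEY_ID_RE = re.compile(r"[A-Z0-9]{20}")
SECRET_KEY_RE = re.compile(r"[A-Za-z0-9/+=]{40}")


def credential_problems(access_key_id, secret_key):
    """Return a list of likely typing problems; an empty list means none found."""
    problems = []
    if access_key_id != access_key_id.strip() or secret_key != secret_key.strip():
        problems.append("leading or trailing whitespace")
    if not ACCESS_KEY_ID_RE.fullmatch(access_key_id.strip()):
        problems.append("access key ID does not look like 20 uppercase letters/digits")
    if not SECRET_KEY_RE.fullmatch(secret_key.strip()):
        problems.append("secret key does not look like 40 characters")
    return problems


# Placeholder credentials from the AWS documentation, not real keys.
EXAMPLE_ID = "AKIAIOSFODNN7EXAMPLE"
EXAMPLE_SECRET = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

print(credential_problems(EXAMPLE_ID, EXAMPLE_SECRET))
print(credential_problems(EXAMPLE_SECRET, EXAMPLE_ID))  # fields swapped
```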
Step 5: Launch an EC2 instance
Once signed in launching a new EC2 instance is as simple as clicking on the start instance button in the toolbar. But before you do this you can make your instance a little more secure by changing the default password used in the TPP user interface of your instance. To do this click on the Tools menu and choose Start Options to open the options window. Under general settings change the guest password to something other than the default ("guest").
Now go ahead and click "Start Instance". You should see a message dialog stating the instance is launching. It can take up to a few minutes for the instance to start, and you should be automatically connected to the instance when it is ready. The default EC2 instance type is the m3.large. This is a 64-bit system with 2 virtual CPUs with 3.25 EC2 Compute Units each, 7.5 GiB of memory, and 32 GB of SSD storage. As of 7/2014 Amazon charges $0.140 per hour in the US West availability zone for this instance type. It's always possible to start a different type of EC2 instance by choosing a different instance type under the EC2 options section of the options dialog.
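To get a feel for what a session will cost, the hourly rate quoted above can be turned into a rough estimate. The sketch below assumes that each partial hour is billed as a full hour, which is an assumption added here, and it ignores storage and data transfer charges. The function name estimate_ec2_cost is hypothetical.

```python
import math

# Rate quoted in the tutorial for an m3.large in US West as of 7/2014.
HOURLY_RATE_USD = 0.140


def estimate_ec2_cost(hours_running, hourly_rate=HOURLY_RATE_USD):
    """Estimate the instance charge in USD, rounding partial hours up.

    Rounding up is an assumption about per-hour billing; check current
    AWS pricing before relying on the number.
    """
    if hours_running < 0:
        raise ValueError("hours_running must not be negative")
    billed_hours = math.ceil(hours_running)
    return round(billed_hours * hourly_rate, 2)


print(estimate_ec2_cost(2.5))  # 2.5 hours is billed as 3 hours
```

Under these assumptions, a session of two and a half hours costs 3 x $0.140 = $0.42 in instance charges.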
Step 6: Search data with X!Tandem
For the next step we'll use the X!Tandem MS/MS peptide identification program on a set of data automatically downloaded and installed in your instance.
- When the login page of the TPP web interface appears, log in as user guest using either the password "guest" or the password you set in the earlier step.
- Next change the analysis pipeline from "Comet" to "Tandem" using the dropdown menu found in the middle of the home (welcome) page. This configures TPP to use the X!Tandem open source software for peptide and protein identification produced by the Global Proteome Machine (theGPM).
- Click on the "Analysis pipeline" tab, then the "Database search" tab to get to the X!Tandem search submission form.
- Select "Add Files" in the first section to add one or more data files. In the rightmost directory tree navigate to the folder containing the tutorial data by clicking on the "local" link. If you see one or more files with a zip extension in this directory, the demo/tutorial data hasn't finished loading. Simply wait a few moments and then click on the "Go!" button to refresh the directory contents. Once all the zip files have been processed navigate to the "demo / AWS" folder and select the files OR20080317_S_SILAC-LH_1-1_01-trimmed.mzXML and OR20080320_S_SILAC-LH_1-1_11-trimmed.mzXML.
- Next specify the tandem parameters file by clicking “Add Files” in section 2 and selecting the tandem_params.xml file.
- Lastly specify a sequence database by clicking “Add Files” in section 3 and selecting the file tppdata/local/demo/dbase/yeast_orfs_all_REV.20060126.fix.fasta.
- Click "Run Tandem Search" to launch X!Tandem on the two input files.
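The check described in the list above (zip files still present means the demo data has not finished loading) can be sketched in a few lines. This is only an illustration of the rule, assuming the archives disappear once they have been unpacked; the function names are hypothetical and the demonstration runs against a throwaway temporary directory rather than a real TPP instance.

```python
import tempfile
from pathlib import Path


def pending_archives(directory):
    """Return the names of .zip files still present in `directory`."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and entry.suffix.lower() == ".zip"
    )


def demo_data_ready(directory):
    """The data counts as loaded once no .zip files remain."""
    return not pending_archives(directory)


# Self-contained demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "demo.zip").touch()           # archive not yet unpacked
    (folder / "tandem_params.xml").touch()  # unrelated file, ignored
    before = pending_archives(folder)
    (folder / "demo.zip").unlink()          # simulate unpacking finishing
    after = demo_data_ready(folder)

print(before, after)
```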
Step 7: View the results in the PepXML Viewer
The job details page will refresh periodically, showing you an updated status of the search jobs. When they have completed, right-click on the PepXML link in the job page, or navigate to the results in the file browser and open one of the resulting files with a pep.xml extension.
This display lists the peptide identifications found by X!Tandem along with pertinent statistical information for the identifications. Controls in the display allow you to expand/constrict the view, sort, and filter the identifications.
Step 8: Download the results
Any work done on an EC2 instance goes away when the instance goes away. So it's important to either download your results or store them back in the cloud if you don't want to lose any work.
Using the web interface, view the directory containing results by selecting Utilities > Browse files in the menu bar and navigating to the folder tppdata/local/class/AWS. Select the two files with the .pep.xml extension and click the download action. This will download a zip file containing your results that can be saved on your computer. You can use any of the popular zip utilities to extract the result files from the archive.
Step 9: Save the data in S3
Another option for saving your data is to store it in Amazon's Simple Storage Service (S3). Amazon S3 provides a simple web-services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web.
Again navigate to the tppdata/local/class/AWS folder using the file browsing utility as you did in the previous step. With nothing selected, click the S3 Sync button in the actions bar. This will display a form that can be used to "sync" the files and folders in a local directory with a "bucket" in Amazon S3. The first section of the form shows the local and S3 paths that will be synced. The second section allows you to select options such as whether or not you want to do a "dry-run" or whether you want files that no longer exist deleted. The third section allows you to choose the direction of the sync, either mirroring the local copy in S3 or mirroring what is in S3 locally. This allows you to store your current results in S3 and later retrieve them locally.
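The three groups of options on the sync form (paths, dry-run and delete, direction) map onto a simple one-way mirroring scheme. The following sketch models that scheme with plain dictionaries mapping a relative path to a file size. It is an illustration of the general idea only and does not reflect how TWA or TPP actually performs the sync; the function names and the sample file names are hypothetical.

```python
def plan_sync(source, destination, delete=False):
    """Work out which paths to copy to, and delete from, the destination.

    `source` and `destination` map relative paths to a size (or checksum).
    """
    to_copy = sorted(
        path for path, sig in source.items() if destination.get(path) != sig
    )
    to_delete = (
        sorted(path for path in destination if path not in source)
        if delete
        else []
    )
    return {"copy": to_copy, "delete": to_delete}


def run_sync(source, destination, delete=False, dry_run=True):
    """Apply the plan to `destination` unless this is a dry run."""
    plan = plan_sync(source, destination, delete)
    if not dry_run:
        for path in plan["copy"]:
            destination[path] = source[path]
        for path in plan["delete"]:
            del destination[path]
    return plan


# Direction "to S3": local is the source, the bucket is the destination.
local = {"run1.pep.xml": 1200, "run2.pep.xml": 1350}
bucket = {"run1.pep.xml": 1200, "stale.txt": 10}

preview = run_sync(local, bucket, delete=True, dry_run=True)
print(preview)  # nothing has been changed yet
run_sync(local, bucket, delete=True, dry_run=False)
print(bucket == local)
```

Swapping the first two arguments models the opposite direction, mirroring the bucket's contents into the local folder.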
Go ahead and click the button "Sync to S3". The jobs page should then be displayed with a new running job and the results. When the job completes all of your files and folders should now be stored in Amazon S3. To confirm this open the AWS S3 console using the shortcut found in the TWA toolbar. A new window in your browser should appear. If not already signed in, sign into Amazon Web Services. By default TWA/TPP creates a bucket named "tpp-<user id>" to store your data in. Click on this bucket in the console interface and navigate to the folder local/class/AWS and confirm that your data has been saved.
Step 10: Finishing and Cleanup
Stop the instance by clicking “Stop Instance”. A dialog should appear notifying you that the request to shut down the instance has been sent. Shutting down the instance usually takes just a minute or two. Once the instance is shut down you will not incur any additional charges for the EC2 instance (which is billed per hour of usage). Forgetting to shut down an instance after you are done using it can incur unwanted extra charges. Fortunately TWA comes with a feature to protect against this. Once the instance is stopped, click on Tools > Start Options in the TWA toolbar. Under EC2 options there is a field labeled "Auto shutdown". The instances started by TWA have a service that will automatically shut them down after a given period of time. You can use this field to enter the number of hours after which the instance should shut down.
You may also want to delete the data that you stored in step 9 of this tutorial. Amazon S3 is charged on a GB/month metric, so leaving any unnecessary data in S3 may incur unwanted charges. To delete this data, open the AWS S3 console as described at the end of step 9, select the bucket you uploaded your data to (by default it will be named "tpp-<user id>"), and click "delete".
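The effect of the "Auto shutdown" setting can be pictured as a deadline computed from the launch time. The sketch below shows only that arithmetic; it is not the service that runs on TWA instances, and the function names and the example launch time are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def shutdown_deadline(launch_time, auto_shutdown_hours):
    """Return the moment at which the instance should shut itself down."""
    return launch_time + timedelta(hours=auto_shutdown_hours)


def should_shut_down(launch_time, auto_shutdown_hours, now):
    """True once `now` has reached or passed the deadline."""
    return now >= shutdown_deadline(launch_time, auto_shutdown_hours)


# Example: an instance launched at 09:00 UTC with a 4 hour limit.
launched = datetime(2014, 7, 1, 9, 0, tzinfo=timezone.utc)
deadline = shutdown_deadline(launched, 4)
print(deadline.isoformat())
print(should_shut_down(launched, 4, launched + timedelta(hours=3, minutes=59)))
print(should_shut_down(launched, 4, launched + timedelta(hours=4)))
```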
Next Steps
For further information on using Amazon cloud services, see the other tutorials and documentation on this website or post your questions to the TPP mailing list, the spctools-discuss discussion group.