TPP Amazon Machine Images

From SPCTools

Revision as of 21:02, 6 April 2011 by JoeS

Starting with TPP 4.4.1, the TPP group provides pre-built Amazon Machine Images (AMIs) with the latest TPP software installed, to make proteomics data analysis even easier. These images are configured for use with the TPP Web Application (TWA), with the TPP AWS high performance computing tools, with your own in-house applications, or as a base for your own EC2 images. The images are based on the latest Ubuntu EC2 public images and include features such as persistent storage in S3 or EBS-backed filesystems and Wine-based conversion of MS/MS files.

Details

This is a Unix/Linux instance-store-backed 64-bit image based on the Ubuntu 10.10 "Maverick" public image (ami-08f40561). It also contains the following open source software:

  • OMSSA Version 2.1.9
  • InsPecT Version 20101012
  • MyriMatch Version 2.0.85
  • ProteoWizard's msconvert (Windows version)

Versions

User's Guide

Many good guides have already been written on how to use Amazon Machine Images (AMIs) in the EC2 product. Here are just a few:

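Official Guide: http://docs.amazonwebservices.com/AWSEC2/latest/GettingStartedGuide/
Getting Started With Amazon EC2: http://www.youtube.com/watch?v=RkVSkL76U-M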

Notes

Filesystem Layout

The Ubuntu images were created with a small root partition, and all remaining available disk space is mounted as /mnt. The following directories were therefore created for TPP data:

/mnt/tppdata/local - Used for local data storage. Anything placed here will be lost when the instance is stopped.
/mnt/tppdata/s3 - Mount point for S3. See "S3 Persistence" below for more information.
/mnt/tppdata/ebs - Mount point for Elastic Block Store (EBS). See "EBS Persistence" below for more information.

The Petunia web interface is configured to use /mnt/tppdata as its top level so that users can browse and manipulate data on the instance.
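
After logging in to an instance, the layout described above can be checked from a shell. The following is a minimal sketch: the function name check_tpp_dirs is hypothetical and is not part of the TPP image, and the optional argument exists only so that the check can be tried against a different directory.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the TPP image): report which of the
# TPP data directories described above exist under a given root directory.
check_tpp_dirs() {
    root=${1:-/mnt/tppdata}
    missing=0
    for name in local s3 ebs; do
        if [ -d "$root/$name" ]; then
            echo "ok: $root/$name"
        else
            echo "missing: $root/$name"
            missing=1
        fi
    done
    # Non-zero exit status if any directory was not found.
    return "$missing"
}
```

Running check_tpp_dirs with no argument inspects /mnt/tppdata; df -h /mnt shows how much of the instance's disk space is available there.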

S3 Persistence

Because the TPP image comes with s3fs pre-installed, an S3 bucket can be mounted as a local filesystem to provide persistent storage of data in S3. To use this feature, provide the bucket name and your AWS credentials in the user data field when starting the instance.
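
For orientation, a manual s3fs mount looks roughly like the sketch below. This is an illustration under stated assumptions, not the mechanism the image itself uses: the format expected in the user data field is not documented here, the helper names are hypothetical, and the passwd_file option and the ACCESS_KEY_ID:SECRET_ACCESS_KEY file format are those of s3fs as commonly distributed and may differ in the version installed on the image. The mount command is only printed, never executed.

```shell
#!/bin/sh
# Hypothetical helpers; they only prepare and print an s3fs mount command.

# Write the AWS credentials to FILE in the ACCESS_KEY_ID:SECRET_ACCESS_KEY
# form read by s3fs, readable by the owner only.
write_s3fs_passwd() {
    passwd_file=$1
    (
        umask 077
        printf '%s:%s\n' "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" > "$passwd_file"
    )
    chmod 600 "$passwd_file"
}

# Print (do not run) the command that would mount BUCKET at MOUNTPOINT.
s3fs_mount_command() {
    bucket=$1
    mountpoint=$2
    passwd_file=$3
    printf 's3fs %s %s -o passwd_file=%s\n' "$bucket" "$mountpoint" "$passwd_file"
}
```

For example, write_s3fs_passwd "$HOME/.passwd-s3fs" followed by s3fs_mount_command my-bucket /mnt/tppdata/s3 "$HOME/.passwd-s3fs" prints a command that can be reviewed before it is run; my-bucket is a placeholder.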

EBS Persistence

TBD

Developer's Guide

Building

The easiest way to build an image is to start from an existing image ("rebundling"). TPP images are therefore built from the official public images provided and supported by the Ubuntu community. The process is fairly straightforward:

  1. Find the AMI-ID of the latest Ubuntu community image for the region you want to use. The simplest way is to use the search tool at http://cloud.ubuntu.com/ami. For TPP images, filter the AMI list by amd64 architecture (64-bit) and instance-store block store, then choose the release you want to use and note the AMI-ID.
  2. Start a new EC2 instance with the AMI-ID from the previous step. You can do this using either the AWS console web application at http://console.aws.amazon.com or the command line tool ec2-run-instances, if you happen to have installed the EC2 API tools. When you start the instance, make sure that you use a security group that has port 22 and port 80 open and that you specify a key pair so that you can actually log in to the instance. Once the instance is running, its public domain name can be found using either the console or the command ec2-describe-instances.
  3. Copy your certificate and private key to the /tmp directory of your instance. (scp is your friend here.)
  4. Using either ssh or PuTTY and your key, log in to your EC2 instance. You'll then need to set up a few environment variables that are used by various scripts and AWS tools:
    export AWS_USER_ID=<your-value>
    export AWS_ACCESS_KEY_ID=<your-value>
    export AWS_SECRET_ACCESS_KEY=<your-value>
    export EC2_CERT=/tmp/<your-value>
    export EC2_PRIVATE_KEY=/tmp/<your-value>
    export TPP_VERSION=4.4.1
  5. Download and run the provided scripts to install, configure, and publish the new TPP image.
    svn export --force https://sashimi.svn.sourceforge.net/svnroot/sashimi/trunk/trans_proteomic_pipeline/extern/hpctools/ec2 /tmp
    sudo bash /tmp/setup_ec2_image.sh
    sudo bash /tmp/bundle_ec2_image.sh
    sudo bash /tmp/publish_ec2_image.sh
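
Because the scripts in step 5 depend on the variables from step 4, it can help to confirm that everything is in place first. The following pre-flight check is a minimal sketch; check_build_env is a hypothetical name and is not one of the provided scripts, and it only assumes the variable names listed in step 4.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not one of the provided scripts): verify
# that the variables from step 4 are set and that the certificate and
# private key files exist before running the setup/bundle/publish scripts.
check_build_env() {
    status=0
    for var in AWS_USER_ID AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
               EC2_CERT EC2_PRIVATE_KEY TPP_VERSION; do
        # Look up the value of the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "not set: $var" >&2
            status=1
        fi
    done
    for var in EC2_CERT EC2_PRIVATE_KEY; do
        eval "value=\${$var:-}"
        if [ -n "$value" ] && [ ! -f "$value" ]; then
            echo "file not found: $var=$value" >&2
            status=1
        fi
    done
    return "$status"
}
```

Running check_build_env before step 5 prints one line per problem to standard error and returns a non-zero status if anything is missing.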

Naming Conventions

Manifest Naming

The currently suggested scheme for naming manifests is to use the default prefix/names assigned by the EC2 tools and place them in a "folder" named following the pattern "TPP-<version>-<date>", where <version> is the version of TPP and <date> is a date indicator in the format YYYYMMDD. An optional serial number [.1,.2,...] can be appended to the YYYYMMDD date if necessary. These "folders" should be placed in the correct S3 bucket by region (see next section).

For example, the name spctools-images-us/TPP-4.4.1-20110403/manifest.xml references an image with TPP 4.4.1 installed, built on April 3, 2011.
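
The naming scheme above can be sketched as a small shell function. The name tpp_manifest_folder is hypothetical and is not part of the provided scripts; the function only encodes the TPP-<version>-<YYYYMMDD>[.<serial>] pattern described in this section.

```shell
#!/bin/sh
# Hypothetical helper: build the manifest "folder" name
# TPP-<version>-<YYYYMMDD>[.<serial>] described above.
tpp_manifest_folder() {
    version=$1
    date_stamp=${2:-$(date +%Y%m%d)}   # defaults to today's date
    serial=${3:-}                      # optional serial number, e.g. 1
    case $date_stamp in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "date must be YYYYMMDD: $date_stamp" >&2; return 1 ;;
    esac
    if [ -n "$serial" ]; then
        echo "TPP-$version-$date_stamp.$serial"
    else
        echo "TPP-$version-$date_stamp"
    fi
}
```

For instance, tpp_manifest_folder 4.4.1 20110403 yields the folder name used in the example above, and a third argument appends the optional serial number.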

S3 Buckets

The following buckets have been (or will be) created in each region for storing SPCTools TPP images. Each bucket should have a suffix indicating which region the bucket is in:

  • spctools-images-us
  • spctools-images-us-west-1
  • spctools-images-eu
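
Scripts that publish or look up images need to pick the bucket that matches the region. The lookup below is a minimal sketch; tpp_image_bucket is a hypothetical name, and it assumes that every bucket follows the pattern spctools-images-<region suffix> for the three suffixes listed above.

```shell
#!/bin/sh
# Hypothetical helper: map a region suffix to the SPCTools image bucket,
# assuming the naming pattern spctools-images-<suffix>.
tpp_image_bucket() {
    case $1 in
        us|us-west-1|eu)
            echo "spctools-images-$1"
            ;;
        *)
            echo "no SPCTools image bucket listed for region: $1" >&2
            return 1
            ;;
    esac
}
```

Rejecting unknown suffixes, rather than building a name for them, avoids uploading a bundle to a bucket that was never created.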

The following additional buckets have been created, primarily to reserve them:

  • spctools

External Links
