Building Peptide Atlas

From SPCTools

Revision as of 18:02, 17 October 2008, by Tfarrah

Here is how I built a human urine PeptideAtlas in October 2008. The details of how to do each step are found in regis.systemsbiology.net:~tfarrah/PeptideAtlasBuild/HumanUrine_2008-09.notes.

  • Start with one or more pepXML files for each experiment in each project. A project is a set of related experiments. For example, a project may study proteins found in normal and diseased liver, and may include four experiments: tissue from two normal patients and from two diseased patients. The pepXML files should be created by searching the spectra with a search engine such as SEQUEST, X!Tandem, or SpectraST, then validating the hits using PeptideProphet.
  • Register projects and experiments in PeptideAtlas using the SBEAMS interface.
  • Obtain search batch IDs for each experiment and create an experiments list.
  • Run ProteinProphet on all of the pepXML files together to create a single protXML file.
  • Run the PeptideAtlas build "pipeline". The scripts are in /net/db/projects/PeptideAtlas/pipeline/run_scripts. Each script ultimately calls /net/db/projects/PeptideAtlas/pipeline/run_scripts/run_Master_current.csh, which is where the meat of the pipeline resides. It performs these steps:
    • Gather all peptides
    • Download the latest FASTA files from the web for the reference DB (also called the biosequence set)
    • Map peptides to proteins in reference DB
    • Get chromosomal coordinates
    • Compile statistics on the peptides and proteins in the build
    • Build a SpectraST library from the build (optional)
  • Load the reference DB (biosequence set) if a new one is needed
  • Define the Atlas build via SBEAMS
  • Load data into build
  • Build search key
  • Update empirical proteotypic scores
  • Load spectra and spectrum IDs
  • Update statistics
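
To make the "map peptides to proteins" and "compile statistics" steps more concrete, here is a minimal Python sketch. It is not the code in run_Master_current.csh, which is not reproduced on this page. It assumes that mapping means finding exact substring matches of each peptide in the reference FASTA sequences. The function names (read_fasta, map_peptides, summarize) and the toy sequences are made up for illustration.

```python
from collections import defaultdict
from io import StringIO


def read_fasta(handle):
    """Yield (accession, sequence) pairs from a FASTA-formatted text stream."""
    accession, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if accession is not None:
                yield accession, "".join(chunks)
            # Assumption: the first token after '>' is the protein accession.
            tokens = line[1:].split()
            accession, chunks = (tokens[0] if tokens else ""), []
        else:
            chunks.append(line)
    if accession is not None:
        yield accession, "".join(chunks)


def map_peptides(peptides, proteins):
    """Map each peptide to (accession, 1-based start) for every exact match."""
    mapping = defaultdict(list)
    for accession, sequence in proteins:
        for peptide in peptides:
            start = sequence.find(peptide)
            while start != -1:
                mapping[peptide].append((accession, start + 1))
                # Keep searching so repeated occurrences are also recorded.
                start = sequence.find(peptide, start + 1)
    return dict(mapping)


def summarize(peptides, mapping):
    """Count distinct peptides, mapped peptides, and proteins with a hit."""
    distinct = set(peptides)
    proteins_hit = {acc for hits in mapping.values() for acc, _ in hits}
    return {
        "peptides": len(distinct),
        "mapped_peptides": len(distinct & set(mapping)),
        "proteins_with_peptides": len(proteins_hit),
    }


if __name__ == "__main__":
    # Toy reference DB and peptide list, invented for this example.
    fasta = StringIO(
        ">P1 toy protein\nMKTAYIAKQR\nQISFVK\n"
        ">P2 toy protein\nAAKQRQISAKQR\n"
    )
    peptides = ["AKQR", "QISFVK", "GGGG"]
    mapping = map_peptides(peptides, list(read_fasta(fasta)))
    print(mapping)
    print(summarize(peptides, mapping))
```

Exact substring search is the simplest possible choice and is slow for a real build with many peptides and a full proteome. It is shown here only to illustrate what the mapping and statistics steps produce.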