Building Peptide Atlas

From SPCTools


Revision as of 18:01, 20 October 2008

This page describes how I built a human urine PeptideAtlas in October 2008. Detailed notes on each step are in regis.systemsbiology.net:~tfarrah/PeptideAtlasBuild/HumanUrine_2008-09.notes.

1 Start with one or more pepXML files for each experiment in each project.
2 Register projects and experiments in PeptideAtlas using the SBEAMS interface.
3 Obtain search batch IDs for each experiment and create an experiments list.
4 Run ProteinProphet on all pepXML files combined to create a single protXML file
5 Run PeptideAtlas build "pipeline".
6 Load the reference DB (biosequence set) if a new one is needed
7 Define Atlas build via SBEAMS
8 Load data into build
9 Build search key
10 Update empirical proteotypic scores
11 Load spectra and spectrum IDs
12 Update statistics

Start with one or more pepXML files for each experiment in each project.

A project is a set of related experiments. For example, a project studying proteins found in normal and diseased liver might include four experiments: tissue from two normal patients and from two diseased patients. Create the pepXML files by searching the spectra with a search engine such as SEQUEST, X!Tandem, or SpectraST, then validating the hits with PeptideProphet.
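Before starting a build, it can be useful to confirm that each pepXML file really carries PeptideProphet validation. The following is a minimal sketch, not part of the PeptideAtlas pipeline: it assumes that validated hits appear as `peptideprophet_result` elements, and the function name and the tiny sample document are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def count_peptideprophet_results(pepxml_source):
    """Count peptideprophet_result elements in a pepXML file or file-like object."""
    count = 0
    for _event, elem in ET.iterparse(pepxml_source, events=("end",)):
        # Strip any XML namespace prefix of the form "{uri}name".
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "peptideprophet_result":
            count += 1
        elem.clear()  # release memory; real pepXML files can be large
    return count


# Hand-written fragment for illustration only; real files are much larger.
SAMPLE = b"""<?xml version="1.0"?>
<msms_pipeline_analysis xmlns="http://regis-web.systemsbiology.net/pepXML">
  <msms_run_summary>
    <spectrum_query spectrum="run1.00001.00001.2">
      <search_result>
        <search_hit hit_rank="1" peptide="ELVISLIVESK">
          <analysis_result analysis="peptideprophet">
            <peptideprophet_result probability="0.98"/>
          </analysis_result>
        </search_hit>
      </search_result>
    </spectrum_query>
  </msms_run_summary>
</msms_pipeline_analysis>
"""

validated_hits = count_peptideprophet_results(io.BytesIO(SAMPLE))
print(validated_hits)
```

A count of zero for a file suggests that PeptideProphet has not yet been run on it. To check a file on disk, pass its path instead of the `io.BytesIO` object.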

Register projects and experiments in PeptideAtlas using the SBEAMS interface.

Obtain search batch IDs for each experiment and create an experiments list.

Run ProteinProphet on all pepXML files combined to create a single protXML file
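As a rough sketch of this step, the snippet below assembles a ProteinProphet command line that lists every input pepXML file followed by the single output protXML file. It assumes that `ProteinProphet` is on the PATH and takes its arguments in that order (inputs first, output last); check the usage message of your installed version. The file names and the helper function are hypothetical, and the actual commands used for the urine build are in the notes file cited above.

```python
import shlex


def build_proteinprophet_command(pepxml_files, output_protxml, executable="ProteinProphet"):
    """Return an argument list: executable, all input pepXML files, then the output protXML."""
    if not pepxml_files:
        raise ValueError("at least one pepXML file is required")
    return [executable, *pepxml_files, output_protxml]


# Hypothetical file names, for illustration only.
inputs = [
    "expt1/interact-prob.pep.xml",
    "expt2/interact-prob.pep.xml",
]
command = build_proteinprophet_command(inputs, "interact-combined.prot.xml")

# Print the command for inspection rather than running it.
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.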

Run PeptideAtlas build "pipeline".

The scripts are in /net/db/projects/PeptideAtlas/pipeline/run_scripts. Each script ultimately calls /net/db/projects/PeptideAtlas/pipeline/run_scripts/run_Master_current.csh, which contains the core of the pipeline.

Gather all peptides

Download the latest FASTA files from the web for the reference DB (also called the biosequence set)

Map peptides to proteins in reference DB
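To illustrate the idea behind this step, here is a naive sketch that maps peptides to proteins by exact substring search against a FASTA file. It is not the pipeline's own implementation, which may handle details such as isobaric residues or identifier conventions differently. The function names, protein identifiers, and sequences are invented for the example.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    identifier, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the identifier.
            header = line[1:].split()
            identifier, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


def map_peptides_to_proteins(peptides, proteins):
    """Map each peptide to the sorted list of protein identifiers that contain it."""
    peptides = list(peptides)
    mapping = {peptide: [] for peptide in peptides}
    for identifier, sequence in proteins:
        for peptide in peptides:
            if peptide in sequence:  # exact substring match only
                mapping[peptide].append(identifier)
    return {peptide: sorted(ids) for peptide, ids in mapping.items()}


# Made-up reference sequences, for illustration only.
FASTA = """>P1 made-up protein one
MKTAYIAKQR
QISFVK
>P2 made-up protein two
GGSFVKAAA
"""

mapping = map_peptides_to_proteins(
    ["AYIAK", "SFVK", "WWWW"],
    read_fasta(FASTA.splitlines()),
)
print(mapping)
```

A peptide that maps to more than one protein (here `SFVK`) is shared between them, and a peptide with an empty list is absent from the reference DB.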

Get chromosomal coordinates

Compile statistics on the peptides and proteins in the build

Build a SpectraST library from the build (optional)

Load the reference DB (biosequence set) if a new one is needed

Define Atlas build via SBEAMS

Load data into build

Build search key

Update empirical proteotypic scores

Load spectra and spectrum IDs

Update statistics

Retrieved from "http://tools.proteomecenter.org/wiki/index.php?title=Building_Peptide_Atlas"
