ComputationalPredictionofProteotypicPeptides

From SPCTools


PeptideSieve

Reference: Nat Biotechnol. 2007 Jan;25(1):125-31. Computational prediction of proteotypic peptides for quantitative proteomics (http://www.nature.com/nbt/journal/v25/n1/abs/nbt1275.html). Mallick P, Schirle M, Chen SS, Flory MR, Lee H, Martin D, Ranish J, Raught B, Schmitt R, Werner T, Kuster B, Aebersold R.

Getting the software: A new native C++ version (0.51) was released in 5/2008. Download the peptideSieve files (http://sourceforge.net/project/showfiles.php?group_id=69281&package_id=287807) from the Sashimi project at SourceForge. Linux, OS X, and Windows binaries (PeptideSieve.linux.i386, PeptideSieve.osx.i386, PeptideSieve.exe) are available.

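A Windows GUI version (http://tools.proteomecenter.org/software/PeptideSieve/PeptideSieve_v0.51.static.with.GUI.setup.exe) is available from our collaborator Chee-Hong. It has been updated to PeptideSieve version 0.51.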

Running the software:

PeptideSieve is a command-line utility. Running it without arguments prints the usage instructions:

PeptideSieve: Identify Proteotypic Peptides from a FASTA or TXT file.
Version - 0.6
Options:
  -O [ --outputDirectory ] arg           : set output directory
  -e [ --outputExtension ] arg           : set extension for output files
  -o [ --outputFile ] arg                : output file name if not input.extension
  -P [ --propertyFile ] arg (=properties.txt) : set property file
  -f [ --inputFormat ] arg (=FASTA)      : FASTA or TXT, specifying input format
  -l [ --minSeqLength ] arg (=6)         : minimum sequence length to consider
  -L [ --maxSeqLength ] arg (=40)        : maximum sequence length to consider
  -m [ --minMass ] arg (=400)            : minimum mass to consider
  -M [ --maxMass ] arg (=3000)           : maximum mass to consider
  -c [ --numAllowedMisCleavages ] arg (=0): maximum number of
                                                    miscleavages to consider
  -s [ --saveConvertedFile ]              : save the converted
                                                    propertyFile
  -h [ --help ]                           : display usage information
  -d [ --experimentalDesign ] arg (=PAGE_MALDI.txt): which design to return,
                                                    any of the following, in
                                                    quotes, comma separated
                                                    "PAGE_MALDI.txt,PAGE_ESI.tx
                                                    t,MUDPIT_ESI.txt,MUDPIT_ICA
                                                    T.txt"
  -p [ --pValue ] arg (=0.80000000000000004): only return peptides with p values greater than X
example usages:
        Simple Run with Fasta : PeptideSieve shortExample.tfa
        Simple Run with txt: PeptideSieve -f TXT example.txt
        Specify Classifiers: PeptideSieve -d "MUDPIT_ESI,PAGE_MALDI" -f TXT example.txt
        Make Properties File and Quit: PeptideSieve -d  -s -f TXT example.txt

It is CRITICAL to either place the properties.txt file in the directory from which PeptideSieve is executed or specify its location with the -P flag. Otherwise PeptideSieve will behave very strangely.
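
Because a missing properties.txt is easy to overlook, a small wrapper can check for the file before PeptideSieve is started. The following is a minimal sketch, not part of PeptideSieve itself: the function name `build_peptidesieve_command` and its parameters are hypothetical, and only the flags shown in the usage text above (-P, -f, -d, -p) are used. Design names are passed through exactly as given.

```python
from pathlib import Path
from typing import List, Optional, Sequence


def build_peptidesieve_command(
    input_file: str,
    properties_file: str,
    input_format: str = "FASTA",
    designs: Optional[Sequence[str]] = None,
    p_value: Optional[float] = None,
    executable: str = "PeptideSieve",  # assumed to be on PATH; adjust per platform
) -> List[str]:
    """Return an argument list that always passes an explicit -P path.

    Hypothetical helper: it only assembles the command and never runs it.
    """
    props = Path(properties_file)
    if not props.is_file():
        # Fail early instead of letting PeptideSieve run without its properties.
        raise FileNotFoundError(f"properties file not found: {props}")
    if input_format not in ("FASTA", "TXT"):
        raise ValueError("input_format must be 'FASTA' or 'TXT'")

    cmd = [executable, "-P", str(props), "-f", input_format]
    if designs:
        # -d takes a single comma-separated value.
        cmd += ["-d", ",".join(designs)]
    if p_value is not None:
        cmd += ["-p", str(p_value)]
    cmd.append(input_file)  # input file goes last, as in the example usages
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the binary for the local platform is installed.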

ESPPredictor

Reference: Vincent A. Fusaro, D. R. Mani, Jill P. Mesirov & Steven A. Carr. Prediction of high-responding peptides for targeted protein assays by mass spectrometry. Nature Biotechnology (2009) 27:190-198.

Classifier: Random forest

How to run the module:

There are two ways of running this:

  • Using the GenePattern web service tool hosted by the Broad Institute (http://www.broadinstitute.org/). Detailed instructions on how to run it are available at http://www.broadinstitute.org/cancer/software/genepattern/modules/ESPPredictor.html. The tool accepts a peptide list only. Invalid amino acid codes, such as B, J, U, O, Z and X, are not allowed.
  • Through the command line:
      • System requirements: R, MATLAB, Java.
      • Follow the first two steps of "How to run the module" on the instruction page.
      • Click "export" (to the right of the "reset" button, between the "properties" and "help" text) to export a zip file containing the program source files.
You will need to modify the ESPPredictor.java file slightly so that it parses the command-line parameters correctly, because the class CmdSplitter does not exist. After a simple modification, my local ESPPredictor runs with the following command line. The "zzz" token separates the input parameters of the MATLAB program from those of the R program.
java -classpath <libdir>/../ ESPPredictor.ESPPredictor <libdir> peptideFeatureSet <input.file> zzz \
<R2.5_HOME> <libdir>/ESP_Predictor.R Predict <libdir>PeptideFeatureSet.csv <libdir>ESP_Predictor_Model_020708.RData

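Since the ESPPredictor tool accepts only a peptide list and does not allow codes such as B, J, U, O, Z and X, it can help to screen the list before submitting it. The sketch below is an illustration only and is not part of ESPPredictor: the name `split_valid_peptides` is hypothetical, and the rejected set is assumed to be exactly the six letters named above.

```python
from typing import Iterable, List, Tuple

# Assumption: exactly the six codes listed in the text are treated as invalid.
INVALID_RESIDUES = frozenset("BJUOZX")


def split_valid_peptides(peptides: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split peptides into (valid, rejected) by the invalid residue codes."""
    valid: List[str] = []
    rejected: List[str] = []
    for raw in peptides:
        seq = raw.strip().upper()  # normalise whitespace and case
        if not seq:
            continue  # skip blank lines
        if INVALID_RESIDUES.intersection(seq):
            rejected.append(seq)
        else:
            valid.append(seq)
    return valid, rejected
```

For example, `split_valid_peptides(open("peptides.txt"))` reads one peptide per line from a hypothetical input file; only the first returned list would be submitted.
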
Detectability Predictor

Reference: H. Tang, R. J. Arnold, P. Alves, Z. Xun, D. E. Clemmer, M. V. Novotny, J. P. Reilly, P. Radivojac. A computational approach toward label-free protein quantification using predicted peptide detectability. Bioinformatics, (2006) 22 (14): e481-e488.

Classifier: 30 two-layer feed-forward neural networks trained using the resilient back propagation algorithm

How to run: There are also two ways of running this:

  • Through the Detectability Predictor web service tool (http://darwin.informatics.indiana.edu/applications/PeptideDetectabilityPredictor/) hosted by Indiana University (http://www.iub.edu/).
  • Through the command line: to get the standalone version, send a request to hatang@indiana.edu.

APEX

This tool is still under development. If you want more information, please contact lars@imsb.biol.ethz.ch

STEPP

Reference: Webb-Robertson BJ, Cannon WR, Oehmen CS, Shah AR, Gurumoorthi V, Lipton MS, Waters KM. A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics. Bioinformatics, 2008 Jul 1;24(13):1503-9. Epub 2008 May 3.

Classifier: SVM (Support Vector Machine)

Getting the software: The software is available from the STEPP page (http://omics.pnl.gov/software/STEPP.php).
