TPP:X!Tandem and the TPP

From SPCTools

Revision as of 17:48, 28 June 2007
Jeng (Talk | contribs)
(What is the K-Score?)
Current revision
Mhoopman (Talk | contribs)

What is X!Tandem?

X!Tandem is a free, open-source search engine that matches tandem mass spectra to peptide sequences, developed by the GPM (http://www.thegpm.org/).

What is the k-score?

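The k-score is a Tandem score plug-in (http://bioinformatics.oxfordjournals.org/cgi/content/abstract/22/22/2830) that implements the COMET score function described in the Keller et al. manuscript (http://www.nature.com/msb/journal/v1/n1/full/msb4100024.html). The pluggable scoring framework and this specific score plug-in were implemented by Brendan MacLean of LabKey Software (http://www.labkey.com), working with the McIntosh group (http://proteomics.fhcrc.org) at the Fred Hutchinson Cancer Research Center.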

Effectively, the k-score is a dot product (which reduces to summing the matched peak intensities) computed after significant intensity manipulation of the query spectrum, as described in the manuscript. The k-score plug-in and documentation on the general Tandem pluggable scoring API are available from the CPAS site (https://proteomics.fhcrc.org/CPAS/Project/Published%20Experiments/Tandem%20Pluggable%20Scoring/begin.view?).

To use the k-score plug-in, these three Tandem parameters should be set as follows:

  <note label="scoring, algorithm" type="input">k-score</note>
  <note label="spectrum, use conditioning" type="input">no</note>
  <note label="scoring, minimum ion count" type="input">1</note>

Also set the following Tandem parameter when searching mzXML files directly. It allows Tandem2XML (a Tandem to pepXML converter) to assign scan numbers correctly to the pepXML scan attributes, which is important for the quantitation tools:

  <note type="input" label="output, spectra">yes</note>

This parameter is not k-score specific but applies to native Tandem as well.

Using X!Tandem and the TPP

This page is intended to give instructions for various ways of working with X!Tandem and the TPP. Currently, we continue to refine the PeptideProphet models for X!Tandem-Kscore.

Current state

While X!Tandem data will run through the pipeline without error, results are non-optimal. Also, we do not yet support X!Tandem in the GUI.

  • Native Tandem is supposed to work with the TPP, but its performance appears worse than that of other search engines in the datasets that we've tested.
  • The k-score plug-in (included in the TPP cygwin distribution as tandem version Apr-1-2007 and k-score version Mar-27-2007, and also available from the CPAS site at https://proteomics.fhcrc.org/CPAS/Project/Published%20Experiments/Tandem%20Pluggable%20Scoring/begin.view?) performs better in the few datasets that we've tested.
  • Proceed at your own risk if you use the more sophisticated Tandem features (e.g. some refinement options), because PeptideProphet may not work with them.

Running X!Tandem

Someone could contribute information on setting up the params file for Tandem here.
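As a starting point, the sketch below collects the k-score and scan-number settings described earlier on this page into one Tandem input file. It is a minimal example under stated assumptions, not a tested configuration: the file names (default_input.xml, taxonomy.xml, sample.mzXML, sample.tandem.xml) and the taxon label "yeast" are hypothetical placeholders, and the path and taxon parameters are the general ones from the GPM documentation rather than anything specific to this page.

```xml
<?xml version="1.0"?>
<bioml>
  <!-- Hypothetical paths: replace with your own defaults and taxonomy files -->
  <note type="input" label="list path, default parameters">default_input.xml</note>
  <note type="input" label="list path, taxonomy information">taxonomy.xml</note>

  <!-- Hypothetical taxon label; it must match an entry in the taxonomy file -->
  <note type="input" label="protein, taxon">yeast</note>

  <!-- Hypothetical input spectra and output file names -->
  <note type="input" label="spectrum, path">sample.mzXML</note>
  <note type="input" label="output, path">sample.tandem.xml</note>

  <!-- The three k-score settings given on this page -->
  <note type="input" label="scoring, algorithm">k-score</note>
  <note type="input" label="spectrum, use conditioning">no</note>
  <note type="input" label="scoring, minimum ion count">1</note>

  <!-- Lets Tandem2XML assign scan numbers when searching mzXML directly -->
  <note type="input" label="output, spectra">yes</note>
</bioml>
```

Settings in this file are expected to override those in the default parameters file, so search tolerances, enzyme and modifications can stay in the defaults file.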

Converting X!Tandem data for the TPP

Please see the warnings in the "current state" section above.

Tandem2XML is used to convert Tandem results into pepXML. Please see that page for instructions.

X!Tandem default modifications

Tandem applies default variable modifications to every search:

If a peptide has an N-terminal Q or E, it will spontaneously lose 17 Da (NH3) or 18 Da (H2O), respectively, to form an N-terminal pyrrolidone carboxylic acid. If the N-terminal residue is C and it has been modified by iodoacetamide, it undergoes an analogous reaction (loss of 17 Da, NH3). Because these modifications are rather special (they always occur to some degree), they are the only potential modifications that are built into X! Tandem.
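The arithmetic behind these built-in modifications can be worked through as follows. The residue and neutral-loss masses used here are standard monoisotopic values, not figures taken from this page, and the cysteine mass assumes the usual carbamidomethyl adduct from iodoacetamide (+57.02146 Da).

```latex
\begin{align*}
\text{N-terminal Q, loss of NH}_3: \quad & 128.05858 - 17.02655 = 111.03203\ \text{Da} \\
\text{N-terminal E, loss of H}_2\text{O}: \quad & 129.04259 - 18.01056 = 111.03203\ \text{Da} \\
\text{N-terminal carbamidomethyl C, loss of NH}_3: \quad & (103.00919 + 57.02146) - 17.02655 = 143.00410\ \text{Da}
\end{align*}
```

Q and E arrive at the same modified residue mass because both cyclize to the same product, which is why the two cases are usually described together.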

X!Tandem parameters not in the GPM documentation

Defined X!Tandem parameters can be found on the X! Series API Documentation page (http://www.thegpm.org/TANDEM/api). These parameters represent the features currently supported by the authors of X!Tandem. However, many additional parameters have been removed from that documentation for various reasons (no longer a feature, generally never changed, etc.). Although they are no longer documented, setting or changing these parameters will still change the analysis and its results. Below is a list of parameters still known to function, with their effects where known. This list represents the Advanced X! Series API and is recommended only for advanced users of X!Tandem.

  GROUP, output
  1. http - Function unknown.
  2. sort best scores by - Function unknown.
  3. title - Function unknown.

  GROUP, protein
  1. cleavage N-terminal limit - Function unknown.
  2. homolog management - Function unknown.
  3. use minimal annotations - Function unknown.

  GROUP, refine
  1. full unanticipated cleavage - Function unknown.
  2. potential C-terminus modifications - Function unknown.
  3. potential N-terminus modification position limit - Function unknown.

  GROUP, residue
  1. NG deamidation - Function unknown.

  GROUP, scoring
  1. algorithm - Specific to pluggable scoring implementations (e.g. TPP).
  2. pluggable scoring - Must match the compilation settings (in TPP, should always be yes).

  GROUP, spectrum
  1. allowed neutral losses - Function unknown.
  2. check all charges - Function unknown.
  3. homolog error - Function unknown.
  4. maximum parent charge - Do not search spectra with known parent charge exceeding this value.
  5. use conditioning - Modify spectrum using X!Tandem's internal rules prior to analysis.
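In a params file, each entry in the list above is written as a label of the form "GROUP, parameter". The sketch below shows this only for the parameters whose effect is described above. The values are illustrative assumptions: "k-score" and "no" follow the k-score settings on this page, "yes" follows the note for TPP builds, and the charge limit of 4 is an arbitrary example rather than a recommendation.

```xml
<?xml version="1.0"?>
<bioml>
  <!-- GROUP "scoring": must match how Tandem was compiled (yes for TPP builds) -->
  <note type="input" label="scoring, pluggable scoring">yes</note>
  <note type="input" label="scoring, algorithm">k-score</note>

  <!-- GROUP "spectrum": skip spectra whose known parent charge exceeds this value -->
  <!-- The value 4 is an arbitrary example -->
  <note type="input" label="spectrum, maximum parent charge">4</note>

  <!-- GROUP "spectrum": turn off X!Tandem's internal spectrum conditioning -->
  <note type="input" label="spectrum, use conditioning">no</note>
</bioml>
```

Parameters marked "Function unknown" are left out on purpose, since there is no documented value to suggest for them.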