Software:SuperHirn


News

SuperHirn v0.3 has been officially available since the end of this January. The software is still maintained by the group of Prof. Ruedi Aebersold at the Institute of Molecular Systems Biology (ETHZ, Switzerland). We would like to thank everyone who helped to test the new version, provided us with ideas and input to improve SuperHirn, and reported bugs. The following new features are available in version 0.3:

  • higher detection coverage of MS/MS sampled peptides (i.e. a higher number of MS/MS ids that map to extracted MS/MS features)
  • automated merging of similar MS1 features
  • automatic signal-to-noise calculation
  • parsing and annotation of XQuest results
  • optimized feature extraction parameters for the current FT data
  • locked MS1 peak extraction of MS/MS sequenced peptides
  • parsing of protXML data to update MS/MS ids with ProteinProphet probabilities


Description

SuperHirn is a tool for the label-free quantitative analysis of multidimensional LC-MS data. It was developed by the group of Prof. Ruedi Aebersold at the Institute of Molecular Systems Biology (ETHZ, Switzerland). The software is written in C++ and is compatible with Unix platforms (tested on Linux and OS X). LC-MS data are preprocessed by an MS1 feature extraction routine, and the different LC-MS runs are then combined by a multidimensional LC-MS alignment into a general repository called MasterMap. SuperHirn then offers several modules for downstream analysis of the MasterMap:

  • LC-MS similarity analysis: Binary similarity analysis of LC-MS runs (intensity reproducibility, feature overlap)
  • Feature intensity normalization: global MS1 feature intensity normalization across LC-MS runs
  • Unsupervised feature profiling: k-means cluster analysis of MS1 features
  • Targeted peptide/protein profiling: Correlate peptide/protein profile vs. a given target profile
  • MS1 feature annotation: Annotation of MS1 features in the MasterMap (inclusion list etc.)
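
The targeted profiling module is described above only as correlating a peptide/protein profile with a given target profile. The following is a minimal sketch of that idea, assuming Pearson correlation as the similarity measure (the text does not state which measure SuperHirn uses). The function name and the intensity values are hypothetical and serve only as illustration.

```python
import math


def pearson(profile, target):
    """Pearson correlation between two equally long intensity profiles."""
    if len(profile) != len(target) or len(profile) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    n = len(profile)
    mean_p = sum(profile) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(profile, target))
    var_p = sum((p - mean_p) ** 2 for p in profile)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_p == 0 or var_t == 0:
        # A flat profile has no defined correlation; treat it as no match.
        return 0.0
    return cov / math.sqrt(var_p * var_t)


# Hypothetical feature intensities across four aligned LC-MS runs.
target_profile = [1.0, 2.0, 4.0, 8.0]
feature_profile = [10.0, 20.0, 40.0, 80.0]
score = pearson(feature_profile, target_profile)
```

A feature whose intensities rise and fall in step with the target profile scores close to 1, while an opposite trend scores close to -1, so a score cutoff could be used to select features that follow the target.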


Availability

The source code of SuperHirn can now be downloaded from the download page.

Supporting material to this software:

  • For questions, suggestions and general comments, visit the "SuperHirn" Google Group: http://groups.google.com/group/superhirn?hl=en
  • The benchmark Latin Square profiling data from the SuperHirn technical manuscript (Mueller et al.) is available at: http://prottools.ethz.ch/muellelu/web/Latin_Square_Data.php
  • For more details about SuperHirn, please read the corresponding publication (Mueller et al.) or download the SuperHirn User Manual: http://prottools.ethz.ch/muellelu/web/SuperHirn/superhirn_user_manual.pdf
  • An example data set for SuperHirn can be downloaded here: http://prottools.ethz.ch/muellelu/web/SuperHirn/example_data_set.zip
  • Additional reading on experimental wet-lab procedures in combination with SuperHirn data processing: http://tools.proteomecenter.org/doc/experimental_tips.pdf


Reference

Software Article:

  • Mueller, LN, Rinner, O, Schmidt, A, Letarte, S, Bodenmiller, B, Brusniak, MY, Vitek, O, Aebersold, R and Muller, M, SuperHirn - a novel tool for high resolution LC-MS based peptide/protein profiling, Proteomics, accepted for publication (2007)

Applications of SuperHirn:

  • Mueller, LN and Rinner, O, Hubálek, M, Müller, M, Gstaiger, M and Aebersold, R, An integrated mass spectrometric and computational framework for the comprehensive analysis of protein interaction networks. Nature Biotechnology 25, 345-352 (2007). http://www.nature.com/nbt/journal/v25/n3/abs/nbt1289.html
  • Bodenmiller, B, Mueller, LN, Mueller, M, Domon, B and Aebersold, R, Reproducible Isolation of Distinct, Overlapping Segments of the Phospho-Proteome. Nature Methods 4, 231-237 (2007). http://www.nature.com/nmeth/journal/v4/n3/full/nmeth1005.html
  • Rinner, O, Seebacher, J, Walzthoeni, T, Mueller, LN, Beck, M, Schmidt, A, Mueller, M, Aebersold, R, Identification of cross-linked peptides from large sequence databases. Nature Methods 5, 315-323 (2008). http://www.nature.com/doifinder/10.1038/nmeth.1192
  • Schmidt A, Gehlenborg N, Bodenmiller B, Mueller LN, Campbell D, Mueller M, Aebersold R, Domon B, An integrated, directed mass spectrometric approach for in-depth characterization of complex peptide mixtures. MCP, 2008 Nov;7(11):2138-5. http://www.mcponline.org/cgi/reprint/M700498-MCP200v1
  • Schiess R, Mueller LN, Mueller M, Wollscheid B, Aebersold R, Analysis of cell surface proteome changes via label-free, quantitative mass spectrometry. MCP, 2008 Nov;7(11):2138-5. http://www.mcponline.org/cgi/reprint/M800172-MCP200v1
  • Mueller LN, Brusniak M, Mani DR, Aebersold R, An assessment of software solutions for the analysis of mass spectrometry based quantitative proteomics data. J Proteome Res. 2008 Jan;7(1):51-61. http://pubs.acs.org/doi/abs/10.1021/pr700758r
  • Urwyler S, Nyfeler Y, Ragaz C, Lee H, Mueller LN, Aebersold R, Hilbi H, Proteome analysis of Legionella vacuoles purified by magnetic immuno-separation reveals secretory and endosomal GTPases. Traffic, 2008 Oct 29, AOP. http://www3.interscience.wiley.com/journal/121494837/abstract?CRETRY=1&SRETRY=0
  • Letarte S, Brusniak M, Campbell D, Eddes J, Kemp C, Lau H, Mueller LN, Schmidt A, Shannon P, Kelly-Spratt, Vitek O, Zhang H, Aebersold R, Watts J, Differential Plasma Glycoproteome of p19ARF Skin Cancer Mouse Model Using the Corra Label-Free LC-MS Proteomics Platform. Clinical Proteomics, Volume 4, Numbers 3-4, December 2008. http://www.springerlink.com/content/cq2p4vg658475312/


Developers

  • Lukas N. Mueller (Lukas.Mueller@imsb.biol.ethz.ch)

Other Stuff

Das SuperHirn: http://www.youtube.com/watch?v=LPj6cfX_U9o


About SuperHirn Parameters

Description of the most important SuperHirn parameters:

The following table describes the most important SuperHirn processing parameters. They are stored in the Root-Parameter file and are mostly optimized for FT profile data. To modify a parameter, proceed as follows:

1.) Copy the parameter in this format to your param.def file:

MS1 retention time tolerance=1.0

2.) Adjust the value as needed:

MS1 retention time tolerance=2.0

3.) Run SuperHirn.

4.) If you no longer need the parameter, delete it or comment it out with a leading //, e.g. //MS1 retention time tolerance ...
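
To make the override format concrete, here is a minimal sketch of how such a file could be read. It assumes one `name=value` pair per line and `//` as the comment prefix, as in the examples above. The helper `read_overrides` is hypothetical and is not part of SuperHirn; it only illustrates which lines count as active parameters.

```python
def read_overrides(text):
    """Parse 'name=value' lines; skip blank lines and '//' comments."""
    overrides = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # blank line or commented-out parameter
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'name=value' line; ignore it
        overrides[name.strip()] = value.strip()
    return overrides


# Hypothetical param.def content: one active and one commented-out parameter.
example = """\
MS1 retention time tolerance=2.0
//MS1 m/z tolerance=10
"""
params = read_overrides(example)
```

In this sketch only the retention time tolerance ends up in `params`; the commented-out line is ignored, which mirrors step 4 above.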


Parameters:

{| class="wikitable" style="text-align:left;" border="1" cellpadding="1"
|-
! Parameter name !! Description !! Suggested value !! Comment
|-
! General:
|-
| MS1 retention time tolerance || RT tolerance between MS1 features used for the alignment (min) || 1.0 || -
|-
| MS1 m/z tolerance || Mass tolerance between MS1 features used for the alignment (ppm) || 10 || -
|-
| MS2 PPM m/z tolerance || Mass tolerance for annotation of MS1 features with MS/MS identifications (ppm) || 20 || -
|-
| MS2 mass matching modus || Whether the theoretical peptide mass (1) or the precursor mass (0) is used to map MS/MS ids to MS1 features || 0 || -
|-
| Peptide Prophet Threshold || PeptideProphet probability threshold, see the PeptideProphet paper || 0.9 || -
|-
| MS2 SCAN tolerance || Scan tolerance for annotation of MS1 features with MS/MS identifications (# scans) || 100 || -
|-
| MS2 retention time tolerance || RT tolerance for annotation of MS1 features with MS/MS identifications (min); if set to -1.0, the value of the MS1 retention time tolerance parameter is used || -1.0 || -
|-
| INCLUSIONS LIST MS2 SCAN tolerance || Scan tolerance for annotation of MS1 features with inclusion list MS/MS identifications (# scans) || 100000 || -
|-
! LC-MS Alignment:
|-
| retention time window || RT window to search for common MS1 features across LC-MS runs before the alignment, i.e. the maximal possible RT shift (min) || 5.0 || -
|-
| mass / charge window || Mass window to search for common MS1 features across LC-MS runs before the alignment, i.e. the maximal possible mass shift (ppm) || 20 || -
|-
! Peak Detection:
|-
| MS1 external isotopic distribution file || Path to an XML file containing external isotopic peptide distributions; use only if abnormal peptide distributions are expected || "" || -
|-
| MS1 data centroid data || Whether the MS1 data is in centroid (1) or raw (0) format || 0 || -
|-
| Save MS/MS sequenced MS1 monoisotopic peaks || Option to keep detected monoisotopic peaks which have been selected for MS/MS but do NOT fulfill the LC elution peak criteria (i.e. times detected, ∆RT between the detected peaks): on (1) or off (0) || 1 || -
|-
| Precursor detection scan levels || MS scan levels that should be submitted for peak extraction; example for scan levels 1 and 2: 1,2 || 1 || -
|-
| FT peak detect MS1 m/z tolerance || Mass tolerance to cluster detected monoisotopic peaks into RT elution clusters, i.e. MS1 features (ppm) || 10 || -
|-
| MS1-to-MS2 precursor tolerance || Mass tolerance to associate MS/MS precursors with extracted monoisotopic peaks (ppm) || 15 || -
|-
| FT peak detect MS1 min nb peak members || Minimal number of detected monoisotopic peaks for an MS1 feature || 4 || -
|-
| MS1 max inter scan distance || Maximal allowed RT distance between detected monoisotopic peaks (min) || 0.2 || -
|-
| FT peak detect MS1 intensity min threshold || Minimal intensity of a detected monoisotopic peak (counts) || 1000 || -
|-
| Absolute isotope mass precision || Mass precision used in the detection of isotopic distributions (Da) || 0.01 || -
|-
| Relative isotope mass precision || Mass precision used in the detection of isotopic distributions (ppm) || 10 || -
|-
| IntensityCV || Coefficient of variance used to correlate a theoretical isotopic distribution with the observed one, i.e. the closer to one, the better the two distributions need to correlate || 0.9 || -
|-
| Detectable isotope factor || At which % of the highest isotope other isotopes do not need to be detected anymore || 0.1 || -
|-
| Min. RAW MS Signal Intensity || Minimal intensity of a raw signal (before centroiding) || 10 || -
|-
| Minimal peak height || Minimal intensity of a peak signal (after centroiding) || 0 || -
|-
! MS1 Feature Merging:
|-
| Activation of MS1 feature merging post processing || Turn peak merging on (1) or off (0) || 1 || -
|-
| PPM value for the m/z clustering of merging candidates || Mass tolerance to merge MS1 features (ppm) || 10 || -
|-
| Initial Apex Tr tolerance || RT window to search for MS1 feature candidates which should be merged (min) || 5.0 || -
|-
| MS1 feature Tr merging tolerance || Final RT tolerance for MS1 features to be merged (min), measured from their start/end elution points || 1.0 || -
|-
| Percentage of intensity variation between LC border peaks || Maximal log10 intensity variation between start/end elution points of 2 features which should be merged || 50.0 || -
|-
! KMeans Profile Clustering Parameters:
|-
| number of clusters || Number of initial start clusters; depends on how many profile trends (or biological groups) you expect in your data || || -
|-
| min. nb. of profile data points || How many profile points an MS1 feature needs (same as how many times it was aligned) to be included in the clustering analysis || || -
|-
| min. nb. of cluster members || How many members a built cluster needs at minimum to survive the next clustering iteration || || -
|}
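
Several of the tolerances above are given in ppm or in minutes. The sketch below shows how such tolerances can be applied when comparing two MS1 features, and how the -1.0 fallback of the MS2 retention time tolerance can be resolved. All function names are hypothetical, and measuring the ppm difference relative to the first m/z value is an assumption; the text does not say how SuperHirn computes it internally.

```python
def ppm_difference(mz_a, mz_b):
    """Relative m/z difference in parts per million, measured against mz_a."""
    return abs(mz_a - mz_b) / mz_a * 1e6


def effective_ms2_rt_tolerance(ms2_rt_tol, ms1_rt_tol):
    """Fall back to the MS1 RT tolerance when the MS2 value is set to -1.0."""
    return ms1_rt_tol if ms2_rt_tol == -1.0 else ms2_rt_tol


def within_tolerance(mz_a, rt_a, mz_b, rt_b, ppm_tol=10.0, rt_tol=1.0):
    """True if two features agree in m/z (ppm) and in retention time (min)."""
    mass_ok = ppm_difference(mz_a, mz_b) <= ppm_tol
    rt_ok = abs(rt_a - rt_b) <= rt_tol
    return mass_ok and rt_ok


# Hypothetical features: 4 ppm apart in m/z and 0.5 min apart in RT.
match = within_tolerance(500.0, 30.0, 500.002, 30.5)
# With the suggested values, the MS2 RT tolerance resolves to the MS1 value.
ms2_rt_tol = effective_ms2_rt_tolerance(-1.0, 1.0)
```

At m/z 500, the suggested 10 ppm tolerance corresponds to an absolute window of about 0.005 m/z units, which gives a sense of the mass accuracy these defaults expect from FT data.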