ReleasePlanning

From SPCTools

Revision as of 17:57, 3 June 2014
JoeS (Talk | contribs)
(Proposed)
Current revision
JoeS (Talk | contribs)
(5.0.0 Release)
Current revision

This page is intended to assist in release planning by tracking new feature requests/bugs/improvements etc for TPP.


5.0.0 Release

Proposed

  • Would it be easy to adjust the idconvert to:
    • 1) If the referenced pepXML cannot be found at the path specified in the protXML, check whether there’s a file of the same name in the CWD, and if so use that, PLUS emit a warning that this assumption was made?
    • 2) If the correct pepXML cannot be located/opened, generate an ERROR that the matched pepXML cannot be found?
  • Redo Apache configuration file setup.
    • It's very out of date
    • Incompatible with Apache 2.4 (as found in Ubuntu)
    • Hard to deploy/maintain
    • Doesn't use web application conventions (single file in sites-available)
  • (Joe) Include documentation on how to configure sudo/SGE for petunia
  • (Joe) NEW FEATURE: Add ability to include CPAN modules in TPP installer for AMZTPP
    • (Joe) NEW FEATURE: Include amztpp in the distribution using new CPAN module installer
    • (Joe) IMPROVEMENT/BUG: Have the TPP installer fill in the correct paths in tpp_gui.pl when building, and warn if a program can't be found. On Windows it seems to have a hardcoded path to perl, so the programs/cgi scripts fail if perl is installed in another location (e.g. 64-bit perl, which goes to C:\Perl64 instead of C:\Perl). A recent try with the 64-bit version showed the installer throwing ugly perl errors during installation; the cgi scripts, however, seem to work.
  • (Luis) NEW FEATURE: Support of OMSSA and hrk-score in Petunia, prophets, quant, viewers
  • (Joe+Luis) NEW FEATURE: Create a centralized path furnishing system
  • (Luis) NEW FEATURE: Provide a db fmt/verifier that gets run as an optional first step in petunia
  • (Joe) BUG: new decoy generator doesn't know about NCBI "|" so creates funny sequence ids
  • Rewrite QualScore in portable C++ to retire Java
  • Joe would love to rewrite the Makefile
  • Luis proposes to replace the current [Analyze Peptides] etc. tabs with a simpler interface where the user chooses which steps to perform and in what order. This not only simplifies the page, but could also double as a very simple workflow manager. Selected options would be saved to a simple params file of key-value pairs, which would be read by a new runtpp program that launches the tools with the appropriate parameters.
  • Update to very latest comet
  • We should really distribute some modern fasta databases with TPP data instead of ancient IPI dinosaurs
  • Revisit the data warmstart strategy. course. tutorial. end-to-end. more!
  • We also discussed the idea of completely generalizing Joe’s mod of bgzip and calling it igzip (for indexed gzip) and giving it the potential to be widely used apart from mass spectrometry/mzML.
  • We should synch as closely as possible how data for the example datasets looks on the EC2 nodes and local installations
  • SpectraST appears to have a 2 GB library limit on Windows? Linux not tested.
    • Some attempts to build a library larger than 2 GB fail. Some attempts succeed, but then a search yields an error that the index is bad. Perhaps it works fine on 64-bit Linux and/or 64-bit Windows??
  • Remove unneeded executables (add to this list!) :
    • AnalysisSummaryParser.cgi
    • more_pepanal.pl
    • show_search_params.pl
    • show_sens_err.pl
    • more_anal.pl
    • prot_wt_xml.pl
    • show_nspbin.pl
    • findsubsets.pl(?)
    • peptidexml_html2.pl
    • "Additional Analysis Info" button on PepXMLViewer
    • pepxml2html.pl


  • Integrate calctpp_stats into (new) models page; fix "WARNNING" typo.
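
The pepXML lookup proposed for idconvert at the top of this list (use the path recorded in the protXML, fall back to a same-named file in the CWD with a warning, otherwise raise an error) could be sketched roughly as follows. This is only an illustration in Python, not idconvert code; the names `resolve_pepxml` and `PepXMLNotFoundError` are hypothetical.

```python
import os
import warnings


class PepXMLNotFoundError(FileNotFoundError):
    """Hypothetical error: neither the recorded path nor the CWD fallback exists."""


def resolve_pepxml(recorded_path, cwd=None):
    """Return a usable pepXML path, falling back to a same-named file in cwd."""
    cwd = os.getcwd() if cwd is None else cwd

    # The path stored in the protXML is valid: use it unchanged.
    if os.path.isfile(recorded_path):
        return recorded_path

    # Proposal 1: look for a file of the same name in the working directory.
    # Split on both separators so Windows-style paths also work on Linux.
    base = recorded_path.replace("\\", "/").rsplit("/", 1)[-1]
    candidate = os.path.join(cwd, base)
    if os.path.isfile(candidate):
        warnings.warn(
            f"{recorded_path} not found; assuming {candidate} is the matching pepXML"
        )
        return candidate

    # Proposal 2: nothing usable, so fail loudly instead of continuing.
    raise PepXMLNotFoundError(
        f"ERROR: matching pepXML for {recorded_path} cannot be found"
    )
```

Emitting a warning rather than silently substituting the file keeps the assumption visible in the logs, which is the point of the first proposal.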

TWA Related

  • Joe proposed a new idea where if the code is running on Amazon, then in addition to a [Log Off] button, also provide a [Save data and shutdown] button within Petunia. The current TWA [Stop Instance] button would become more of an [Immediate Kill] button with a JS warning that you may lose work because of this.
  • Add all modern instance types to TWA
  • Add TWA tutorial for paper!

To Do

  • (Joe) Fix Proteowizard builds on Linux so that they support HDF5 (or not, perhaps it should be optional).
  • (Joe) Clean up the TPP README. Future READMEs should just be the release notes with the URLs to the wikis.
  • reSpect
    • Simple reSpect GUI
    • (David) Make sure all reSpect code is fully checked in
    • (David) will make reSpect tutorial on wiki for course data
    • reSpect bugs: "There are none left!" David Shteynberg 2014-05-27
  • Data downloader system
  • Make all tools be happy with mzMLs in a different folder
    • Luis will add "output folder" option for Petunia Search, and test from there
  • Fix comet-pepxml.cgi to point at correct location for (/tpp/html/css/)PepXMLViewer.css in Windows
  • 64-bit build
  • Nightly Windows build
  • Update the ProtXML schema for ASAPRatio change AND for n_distinct_sequences
  • Update ProtXMLViewer.pl to display n_distinct_sequences if present

Completed

  • GUI for idconvert
    • In Petunia
  • In ProteinProphet, make it write total number of distinct sequences corresponding to the total number of PSMs (already there)
    • Done
  • Final new protXML viewer
    • Checked in
  • (Luis/Joe) BUG: lorikeet uses hardcoded paths to the html and javascript directories on the Apache server (e.g. ISB/js/lorikeet.js) instead of using the TPP_WEB Makefile variable.
    • Joe fixed this
  • Luis suggests having ASAPRatio write out P-value information into the protXML instead of as a PNG
    • Now writing pvalue model data into protXML; will leave png for a trial period
  • Do not write out FPKM model data if it was not run/used
    • Done
  • (Joe) Fixed Windows updatepaths.pl to fix the pepXML reference IN ProtXML
  • (Luis) Streamline Petunia's directory listing (can be very slow) -- rely on file extensions? Implement AJAX?
    • The culprit was calls to tpp_hostname (via tpplib_perl); removed those and now it is lightning fast!
  • (Eric+Joe+Steve+Luis) NEW FEATURE: Bundle idConvert, Petunia page, Steven is fixing it and working with Matt
    • Bundled, Petunia page added
  • (Luis) Write out ASAPRatio P-value information into the protXML instead of as a PNG
    • Done, but still need to update protXML schema (new elements). PNG will stay for a transitionary period.
  • Review the state of and get ready to submit the TPP AWS paper!

Miscellaneous Ideas

List of suggested TPP new features, improvements, fixes, etc.. to be cherry picked for upcoming releases.

  • MzXML2Search is used to convert mzML files to mgf for searching with OMSSA. Do we even need this program, given the capabilities of msconvert? I only ask because I have a set of files it's crashing on right now, and they work with msconvert...
  • Drop CompactParser. Doesn't seem to work.
  • Breakout apache configuration into its own file and handle it properly on Ubuntu systems
  • (Luis?) IMPROVEMENT: This one goes to "11", or in this case 10. Charge states above +9 will no doubt cause trouble in TPP.
  • NEW FEATURE: PEFF support in TPP. (Need to define what this means).
  • Better support for viewing/outputting N15 experiments. Sequences are hard to read (a modification on every base), and this makes the XML huge.
  • Add ASAPRatio option to use WaveletQuant in Petunia
    • Verify that xinteract can deal with this option (-w)
  • Petunia: option to run xinteract jobs as separate (instead of combining into a single analysis)
  • Make sure new installation of QualScore works on Windows
  • Output more significant figures for ProteinProphet weight (rounding error on 0.49 and 0.50)
  • Fix precursor charge column (only 1st letter displayed) and miscalculation of the # of unique peptides in EXCELPEPS
    • EXCELPEPS appears to be broken; just remove it?
  • Resolve Andreas issue with Tandem2XML
  • Write (TPP) software version used in pepXML, protXML, etc
  • Silence/catch errors on the *Prophet models generation when no data (e.g. "empty y range")
  • Zhi has been experiencing a bus error on a large operation with SpectraST
  • There was a bug reported by Henry with mzParser
  • Add commandline option to generate EXCEL output of pepXML files
  • Issue with MzXML2Search converting older mzXML files to mgf. Randomly fails on the same file.
  • Missing javascript file for table filtering in Petunia (only affects Linux installations)
  • I encountered a small problem when using the new TPP. In the prot.shtml view, pink color coding of NxS/T motifs does not work anymore when checking the box. Also, I always liked the "464 peptides with NXS/T motif, 690 total peptides, 658 unique stripped peptides" info at the bottom of the page (as it gives us an immediate idea of how nicely a glyco experiment worked), but this is not displayed correctly anymore.
  • Included Terry's latest version of Inspect2PepXML.py for patching inspect installations
  • Bug: Petunia will (re-)submit a new job if the "back" button is clicked after the initial job launch (Windows only)

Legacy Items

  • Fix iProphet models page ("performance" models are missing)
  • Add "Bruker" files option to msconvert in Petunia -- done
  • Fix pepxml2html (View Peptide hits) peptide link for SpectraST results
  • peptidexmlhtml2 : does not work with AA highlighting option (mark_aa=)
  • Add some filtering options to Pep3D (specifically: view only +1 ions)
  • Add QuickMod to speclibs page?
  • retire getSpectrum -> use readMzXML
  • incorporate psm2pdf
  • include t2d support(?)
  • Petunia: option to convert to mzXML using msconvert -- done
  • Petunia: launch all external links in blank/target window -- done
  • Reindex mzXML: new file with old name; rename old to: old-index or some such
  • Petunia: option to run xinteract jobs as separate (instead of combining into a single analysis) -- done
  • Petunia: select pep.xml files only for input to Decoy Models pages -- done
  • Petunia: auto cd to dir and strip out full path if all files in same dir (shorten command line) -- done
  • Petunia: rename Mascot file on download -- done
  • Move /tmp/ directory under tpp-bin (Windows)
  • Petunia: Add TPP version check (ping server) --> usage stats -- done
  • Petunia: add full filters to msconvert(??) -- added free text box for filters and other options
  • Petunia: change tab graphic when jobs are done?
  • PepXMLViewer: implement "preferences" tab: choose ions series, spectrum viewer, etc... ?
  • PepXMLViewer: no pep-pro probability link in iProphet results
  • Libra also needs some love
    • Add ability to curate ratios
    • Pick first/main protein out of a set of indistinguishable ones when writing quantitation file (and not a random one)


From Henry (perhaps he already checked this in?) :

Following up on this discussion, I would like to propose the following. (This is not urgent since our course doesn't use OMSSA or any of the newly-supported engines, so it doesn't need to be in the release soon.)

  • Change xinteract to run RefreshParser before PeptideProphet.
    • This takes care of the protein name issues for OMSSA, and makes sure that the NTT and NMC fields are there and calculated correctly for PeptideProphet. Some search engines don't fill in those fields and cause PeptideProphet to crash.
    • The flip-side is mainly inefficiency. If the user decides not to keep all search hits regardless of probability (i.e. not specifying -p0), then RefreshParser unnecessarily refreshes a bunch of wrong identifications that won't be in the final pepXML file anyway. But then again, given that nowadays PeptideProphet often relies on the protein name (whether it's a DECOY or not), it seems safer to ensure that the protein names are right -- even for the wrong hits -- before we run PeptideProphet. Also, with the new implementation, for most files RefreshParser is so fast that I don't think it will make much difference in running time anyway.
    • If the user doesn't want to refresh, he/she can always turn off RefreshParser (-nR option).
  • Have InteractParser detect if the search engine is OMSSA, and if so, apply the fix_pyro_mods_ option automatically.
    • This goes without saying. The only thing is the user doesn't have to turn on the option manually. I don't know if other engines have the same problem. If so, it's easy to add them to the list.
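
The two changes Henry proposes above amount to a reordering rule plus an engine check, which might be sketched like this. The snippet is an assumption-laden Python illustration, not xinteract code: `plan_steps`, `PYRO_FIX_ENGINES`, and the `fix_pyro_mods` key are hypothetical names, and `refresh=False` merely stands in for the user passing -nR.

```python
# Engines whose results need the pyro-mod fix. Only OMSSA is named in the
# proposal; others could be added to this (hypothetical) set later.
PYRO_FIX_ENGINES = {"OMSSA"}


def plan_steps(search_engine, refresh=True):
    """Return the ordered (tool, options) steps for one run."""
    # Change 2: turn the pyro-mod fix on automatically for known engines.
    needs_fix = search_engine.strip().upper() in PYRO_FIX_ENGINES
    steps = [("InteractParser", {"fix_pyro_mods": needs_fix})]

    # Change 1: RefreshParser runs before PeptideProphet, so protein names and
    # the NTT/NMC fields are in place when the model is built.
    if refresh:  # refresh=False models the user turning RefreshParser off
        steps.append(("RefreshParser", {}))

    steps.append(("PeptideProphet", {}))
    return steps


if __name__ == "__main__":
    for tool, options in plan_steps("OMSSA"):
        print(tool, options)
```

Keeping the engine list in one set reflects the remark that it is easy to add further engines if they turn out to have the same problem.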