PeptideAtlas Pipeline Retool 2012/13
10/29/12:
Why is the build pipeline so complex and time consuming? What are we trying to do?
1. Start with iProphet (iPro) and ProteinProphet (ProtPro) results, refreshed to the ref DB
=> We can skip the refresh if the ref DB is the same as the search DB
2. For each experiment, create a PAidentlist template file with all PSMs above P=0.4; the template can be reused for multiple builds on the same data.
=> Why does this take so long?
3. Make a combined, sorted PAidentlist file
=> Can we speed up the sorting? Currently we use Unix sort.
=> What do we need APD files for?
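One possible angle on the sorting question in step 3, as a minimal sketch only: if each per-experiment template from step 2 were already sorted on the same key, the combined file could be produced by a streaming k-way merge instead of re-sorting everything. The column positions, the row representation (lists of strings), the inclusive reading of the P=0.4 cutoff, and the function names below are all assumptions for illustration, not the actual PAidentlist layout or pipeline code.

```python
import heapq
from typing import Iterable, Iterator, List

MIN_PROB = 0.4  # cutoff from step 2; treating it as inclusive is an assumption

Row = List[str]


def filter_psms(rows: Iterable[Row], prob_col: int,
                min_prob: float = MIN_PROB) -> Iterator[Row]:
    """Yield only rows whose probability column meets the cutoff.

    Row order is preserved, so a sorted input stays sorted.
    """
    for row in rows:
        if float(row[prob_col]) >= min_prob:
            yield row


def merge_experiments(experiments: Iterable[Iterable[Row]], key_col: int,
                      prob_col: int) -> Iterator[Row]:
    """Merge per-experiment row streams into one stream sorted on key_col.

    Each input stream must already be sorted on key_col (hypothetical
    precondition); heapq.merge then needs only one row per stream in memory.
    """
    filtered = [filter_psms(rows, prob_col) for rows in experiments]
    return heapq.merge(*filtered, key=lambda row: row[key_col])


# Toy data: column 0 is a sort key (e.g. peptide), column 1 a probability.
expt_a = [["AAK", "0.95"], ["CDE", "0.30"], ["MMK", "0.80"]]
expt_b = [["BBR", "0.50"], ["ZZK", "0.41"]]

combined = list(merge_experiments([expt_a, expt_b], key_col=0, prob_col=1))
for row in combined:
    print("\t".join(row))
```

The merge only helps if the per-experiment files are written out in sorted order in step 2; otherwise each file would still need its own sort first.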