Processing glycopeptide data

From SPCTools

Revision as of 22:23, 20 February 2009

Raw notes from Dave Campbell email to tfarrah on Jan. 20, 2009

I think the database conversion script assumes each sequence is all on one line; if yours are wrapped across several lines, the regex might need tweaking.
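
If the database is line-wrapped, one way around the single-line assumption is to unwrap it before conversion. The following is a minimal sketch, not part of the attached scripts; the function name unwrap_fasta and the example records are hypothetical.

```python
def unwrap_fasta(lines):
    """Yield (header, sequence) pairs, joining each wrapped sequence onto one line."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    # Emit the final record; sequence text with no header at all is dropped.
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical wrapped input, for illustration only.
example = [">protein_1", "MKNV", "TAAN", "", ">protein_2", "GGNPS"]
records = list(unwrap_fasta(example))
```

Writing each pair back out as a header line followed by one sequence line should give the conversion script the layout it expects.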

You can look here for a search that was done with both sequest and xtandem. I get the params for the latter from Eric, who can answer questions about them better than I can:

/regis/sbeams/archive/jwatts/HsGlycoPlasma35indiv/HsGlycoPlasma35indiv

I've attached a tgz file that has the pertinent scripts and other useful files. I took these out of context, so there might be some unforeseen issues. If they don't work out of the box just let me know and I'll help you troubleshoot.

The basic method entails searching against a modified db with a static modification on B set in sequest.params, then back-converting the results and processing as normal (including refresh-parsing against the original db). We modified the method a little to use 'B' (avg of D and N) because our version of Sequest was limited in its ability to accept non-standard amino acids. If you are running xtandem, you might want to consider whether there is a better way to do this. I've outlined the files in the archive below; let me know if you have questions.
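
The forward and back conversions described above can be sketched roughly as follows. This is an illustration, not the code in make_nxst_db.pl or changeback.pl. It assumes the conversion replaces the N of every N-X-S/T motif with B, and that X may be any residue except proline; neither detail is stated in the email, and the function names are hypothetical.

```python
import re

# Assumed motif: N, then any residue except P, then S or T.
# The lookahead leaves X and S/T unconsumed, so adjacent motifs are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def to_search_db(sequence):
    """Replace the N of each assumed N-X-S/T motif with B."""
    return SEQUON.sub("B", sequence)


def to_original(peptide):
    """Back-substitute B with N in an identified peptide.

    Caveat: this also rewrites any B that was already in the original
    database as an ambiguity code.
    """
    return peptide.replace("B", "N")


original = "MKNVTAANPSNNSTK"
converted = to_search_db(original)
restored = to_original(converted)
```

Here the N followed by P is left alone, while the three motif asparagines become B and are restored by the back-substitution.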

Atwood-York_GlycopeptideSearchStrategy.pdf - original paper this is based on

nxst_conversion_recipe.txt - README file for this process. Note the perl -pi -e step; it must be done.

sequest.params.ft - modified sequest.params file; the salient line is shown below:

add_B_avg_NandD = 0.4920               ; added to B - avg. 114.5962, mono. 114.53494
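
The numbers on that line can be checked with a little arithmetic. The residue masses below are commonly tabulated values and do not come from the email. Reading the 0.4920 offset as bringing B up to roughly the aspartic acid residue mass is an interpretation, not something the email states.

```python
# Commonly tabulated residue masses in Da (assumed, not taken from the email).
ASN_AVG, ASP_AVG = 114.1038, 115.0886
ASN_MONO, ASP_MONO = 114.04293, 115.02694

# Base mass of B as the mean of N and D, as in the params comment.
b_avg = (ASN_AVG + ASP_AVG) / 2
b_mono = (ASN_MONO + ASP_MONO) / 2

# Average mass of B once the static modification is applied.
b_modified = b_avg + 0.4920

print(f"B avg {b_avg:.4f}, B mono {b_mono:.5f}, B modified {b_modified:.4f}")
```

Under these values the mean masses agree with the comment in the params line, and the modified B lands within about 0.001 Da of the aspartic acid average residue mass.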

make_nxst_db.pl - script to convert database

batch_convert.sh - script to translate a batch of xml files

changeback.pl - perl script called by the batch script above to back-substitute files
