Processing glycopeptide data

From SPCTools

Revision as of 22:54, 20 February 2009
Tfarrah (Talk | contribs)

Raw notes from Dave Campbell email to tfarrah on Jan. 20, 2009

You can look here for a search that was done with sequest and xtandem. Xtandem params are the same as usual except that the target database is ipi.HUMAN.v3.38_forwdecoy_nxst.fasta:

/regis/sbeams/archive/jwatts/HsGlycoPlasma35indiv/HsGlycoPlasma35indiv

Pertinent scripts and other useful files are in ~tfarrah/alt_nxst. They were taken out of context, so there might be some unforeseen issues. If they don't work out of the box just let me know and I'll help you troubleshoot.

The basic method entails searching against a modified db in which every NXS/T is replaced by BXS/T (except for NPS/T and a few other exceptions). B stands for a D that's been substituted in. We then run the search with a static modification on B (to make it the same weight as a D) in the sequest.params, back-convert the results (substituting Ns for all Bs?), and process as normal (including refresh-parsing against the original db). We modified the method a little to use 'B' (avg of D and N) because our version of Sequest was limited in its ability to accept non-standard amino acids. If you are running xtandem you might want to consider whether there is a better way to do this. I've outlined the files in the archive below; let me know if you have questions.
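As a rough illustration of the database substitution described above, here is a minimal Python sketch. It is not make_nxst_db.pl; the function name is hypothetical, and it handles only the NPS/T exception, since the notes do not spell out the "few other exceptions".

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead leaves X and S/T unconsumed, so overlapping motifs are all caught.
SEQUON = re.compile(r"N(?=[^P][ST])")

def to_nxst(sequence):
    """Replace the N of every N-X-S/T motif (X != P) with B.

    Assumption: only the proline exception is applied; any other
    exceptions used by the real script are not reproduced here.
    """
    return SEQUON.sub("B", sequence)

if __name__ == "__main__":
    for seq in ("ANGSK", "ANPTK", "NNSS"):
        print(seq, "->", to_nxst(seq))
```

The lookahead is a design choice for this sketch: a pattern that consumed the whole three-residue motif would miss a second N sitting inside the first match.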

Atwood-York_GlycopeptideSearchStrategy.pdf - original paper this is based on

nxst_conversion_recipe.txt - README file for this process. Note the perl -pi -e step; this must be done.

sequest.params.ft - modified sequest.params file, the salient line is shown below:

add_B_avg_NandD = 0.4920               ; added to B - avg. 114.5962, mono. 114.53494
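The 0.4920 value can be checked with a little arithmetic. The sketch below assumes the standard residue masses for N and D (these constants are not from the notes) and shows that B, taken as the mean of N and D, needs about 0.4920 added to weigh the same as a D on the monoisotopic scale.

```python
# Standard residue masses (assumed here; not quoted in the notes).
N_MONO, D_MONO = 114.04293, 115.02694
N_AVG, D_AVG = 114.1038, 115.0886

# B as the mean of N and D, matching the comment on the params line.
b_mono = (N_MONO + D_MONO) / 2
b_avg = (N_AVG + D_AVG) / 2

# Mass that must be added to B so that it weighs the same as D.
delta_mono = D_MONO - b_mono
delta_avg = D_AVG - b_avg

if __name__ == "__main__":
    print(f"B mono {b_mono:.5f}, B avg {b_avg:.4f}")
    print(f"add to B: mono {delta_mono:.4f}, avg {delta_avg:.4f}")
```

With these constants the monoisotopic difference rounds to the 0.4920 in the params line, while the average-mass difference is slightly larger (about 0.4924).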

make_nxst_db.pl - script to convert database; assumes the sequence is all on one line, otherwise the regex might need tweaking.
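If the FASTA file wraps sequences over several lines, one way to satisfy the one-line assumption is to unwrap it before conversion. This is a hypothetical helper sketched in Python, not part of the archive.

```python
def unwrap_fasta(lines):
    """Yield each header line followed by its sequence joined onto one line.

    Blank lines are skipped; sequence text before the first header is dropped.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header
                yield "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    if header is not None:
        yield header
        yield "".join(chunks)

if __name__ == "__main__":
    wrapped = [">p1 example", "MKN", "GSA", ">p2", "NPT"]
    print("\n".join(unwrap_fasta(wrapped)))
```

Unwrapping first also means a motif split across a line break (N at the end of one line, X-S/T on the next) is not missed by a per-line regex.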

batch_convert.sh - script to translate batch of xml files.

changeback.pl - perl script called by batch script above to back-substitute files.
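The notes leave open exactly what the back-substitution does. As a sketch only, assuming it amounts to turning every B back into N in the peptide attribute of pepXML search hits, it might look like the following in Python; the function name is hypothetical and this is not changeback.pl.

```python
import re

# Match peptide="..." but not modified_peptide="..." (the \b fails after "_").
PEPTIDE_ATTR = re.compile(r'\bpeptide="([A-Z]+)"')

def change_back(line):
    """Replace B with N inside peptide="..." attributes only.

    Assumption: a plain B -> N swap on the peptide string is all that is
    needed; other attributes (e.g. protein names) are left untouched.
    """
    return PEPTIDE_ATTR.sub(
        lambda m: 'peptide="' + m.group(1).replace("B", "N") + '"', line
    )

if __name__ == "__main__":
    hit = '<search_hit hit_rank="1" peptide="ABGSK" protein="DECOY_B1">'
    print(change_back(hit))
```

Restricting the swap to one attribute avoids touching a B that appears elsewhere in the file, such as in protein identifiers.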
