GetTransitionsAPI
From SPCTools
(Difference between revisions)
Revision as of 18:09, 27 April 2010 Dcampbel (Talk | contribs) ← Previous diff |
Current revision Dcampbel (Talk | contribs) |
||
Line 1: | Line 1: | ||
<PRE> | <PRE> | ||
- | GetTransitions is a CGI script that allows users to query the peptide and transition information stored in the Peptide Atlas/MRM Atlas. The transitions retrieved | + | GetTransitions is a CGI script that allows users to query the peptide and transition information stored in the Peptide Atlas/MRM Atlas. |
- | are constrained by various parameters set by the user. In a web browser, these can be set interactively as needed, but the page can also be accessed in an automated | + | The transitions retrieved are constrained by various parameters set by the user. In a web browser, these can be set interactively as |
- | fashion using command-line utilities such as wget or curl, or directly from a program using an appropriate URL fetching mechanism. This page is meant to describe the | + | needed, but the page can also be accessed in an automated fashion using command-line utilities such as wget or curl, or directly from a |
- | various parameters that a remote user can use to obtain transitions. | + | program using an appropriate URL fetching mechanism. This page is meant to describe the various parameters that a remote user can use |
+ | to obtain transitions. | ||
Line 9: | Line 10: | ||
[http://tools.proteomecenter.org/wiki/index.php?title=PABST [Return]] to main PABST page. | [http://tools.proteomecenter.org/wiki/index.php?title=PABST [Return]] to main PABST page. | ||
<PRE> | <PRE> | ||
- | This section defines the parameters that can be used to refine the transitions retrieved from the Atlas, some required and some optional, with allowed values following | + | This section defines the parameters that can be used to refine the transitions retrieved from the Atlas, some required and some optional, |
- | the field name where applicable. The following section shows the the descriptive text provided in the web UI to help further explain these options. | + | with allowed values following the field name where applicable. The following section shows the the descriptive text provided in the |
+ | web UI to help further explain these options. | ||
## Required parameters: | ## Required parameters: | ||
Line 24: | Line 26: | ||
protein_name_constraint | protein_name_constraint | ||
upload_file | upload_file | ||
+ | |||
+ | # This param ensures that certain params generally set with user interaction do not limit search results | ||
+ | default_search=1 | ||
# not strictly required, but page with return dense HTML otherwise. | # not strictly required, but page with return dense HTML otherwise. | ||
Line 86: | Line 91: | ||
<PRE> | <PRE> | ||
Example invocation using wget, and the resulting transitions - output mode tsv. | Example invocation using wget, and the resulting transitions - output mode tsv. | ||
- | wget 'https://db.systemsbiology.net/devDC/sbeams/cgi/PeptideAtlas/GetTransitions?protein_name_constraint=YAL003W;n_highest_intensity_fragment_ions=3;n_peptides_per_protein=3&action=QUERY;output_mode=tsv;organism_name=Yeast' -O YAL003W_transitions.tsv | + | wget 'https://db.systemsbiology.net/devDC/sbeams/cgi/PeptideAtlas/GetTransitions?default_search=1;protein_name_constraint=YAL003W;n_highest_intensity_fragment_ions=3;n_peptides_per_protein=3&action=QUERY;output_mode=tsv;organism_name=Yeast' -O YAL003W_transitions.tsv |
Line 101: | Line 106: | ||
</PRE> | </PRE> | ||
+ | <pre> | ||
+ | A second example, this time fetching transitions from the Human Complete SRMAtlas: | ||
+ | |||
+ | wget 'https://db.systemsbiology.net/sbeams/cgi/PeptideAtlas/GetTransitions?protein_name_constraint=P01258;n_highest_intensity_fragment_ions=3;n_peptides_per_protein=3;apply_action_hidden=&action=QUERY;output_mode=tsv;default_search=1;organism_name=Human' -O Human_transitions.tsv | ||
+ | |||
+ | This output currently includes some HTML; the workaround is to use a function like cut (or excel) to prune the noisy columns, e.g. | ||
+ | |||
+ | cut -f1,3-16 Human_transitions.tsv > Human_transitions_noHTML.tsv | ||
+ | |||
+ | |||
+ | </pre> | ||
<PRE> | <PRE> | ||
- | A second example, using xml output mode and demonstrating how to specify multiple protein names, by using the HTML escape code for semicolon, %3B: YBR002C%3BYAL003W | + | A third example, using xml output mode and demonstrating how to specify multiple protein names, by using the HTML escape code for semicolon, %3B: YBR002C%3BYAL003W |
- | wget 'https://db.systemsbiology.net/devDC/sbeams/cgi/PeptideAtlas/GetTransitions?protein_name_constraint=YBR002C%3BYAL003W;n_highest_intensity_fragment_ions=3;n_peptides_per_protein=3;apply_action_hidden=&action=QUERY;output_mode=xml;organism_name=Yeast' -O Yeast_transitions.xml | + | wget 'https://db.systemsbiology.net/devDC/sbeams/cgi/PeptideAtlas/GetTransitions?protein_name_constraint=YBR002C%3BYAL003W;default_search=1;n_highest_intensity_fragment_ions=3;n_peptides_per_protein=3;apply_action_hidden=&action=QUERY;output_mode=xml;organism_name=Yeast' -O Yeast_transitions.xml |
<?xml version="1.0" standalone="yes"?> | <?xml version="1.0" standalone="yes"?> | ||
<resultset identifier="unknown"> | <resultset identifier="unknown"> |
Current revision
GetTransitions is a CGI script that lets users query the peptide and transition information stored in the Peptide Atlas/MRM Atlas. The transitions retrieved are constrained by parameters set by the user. In a web browser these can be set interactively, but the page can also be accessed in an automated fashion using command-line utilities such as wget or curl, or directly from a program using an appropriate URL fetching mechanism. This page describes the parameters that a remote user can set to obtain transitions.
[Return] to main PABST page.
This section defines the parameters that can be used to refine the transitions retrieved from the Atlas, some required and some optional, with allowed values following the field name where applicable. The following section shows the descriptive text provided in the web UI to help further explain these options.

## Required parameters:
############################################
action [ QUERY ]

# One of the following is required, typical remote use will involve specifying the organism.
organism_name [ yeast, mouse, human ]
pabst_build_id [ any accessible build_id ]

# One of the following must be set, upload file requires POST method and file encoding
protein_name_constraint
upload_file

# This param ensures that certain params generally set with user interaction do not limit search results
default_search=1

# not strictly required, but page will return dense HTML otherwise.
output_mode [ tsv xml ]

## Optional parameters
############################################
peptide_sequence_constraint
peptide_length
empirical_proteotypic_constraint
n_protein_mappings_constraint
n_genome_locations_constraint
n_highest_intensity_fragment_ions
n_peptides_per_protein

# Options below affect the PABST peptide scoring algorithm, explained [here]. Current default values are shown.
4H = '1'
5H = '1'
BA = '1'
C = '0.95'
D = '1'
DG = '1'
DP = '1'
Hper = '1'
M = '0.95'
NG = '1'
NxST = '1'
P = '0.95'
QG = '1'
R = '1'
S = '1'
W = '1'
Xc = '1'
max_l = '25'
max_p = '0.2'
min_l = '7'
min_p = '0.2'
nE = '1'
nGPG = '1'
nM = '1'
nQ = '1'
nX = '1'
nxxG = '1'
obs = '2'
ssr_p = '0.5'
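The required parameters listed above can be assembled into a query URL mechanically. The following is a minimal sketch, not part of SBEAMS: the function name build_query_url is hypothetical, the base URL is the one used in the Human example on this page, and values are assumed to be URL-safe already (nothing is encoded here).

```shell
#!/bin/sh
# Hypothetical helper: join key=value pairs into a GetTransitions query URL,
# using ';' as the separator, as the example URLs on this page do.
# Assumption: every value passed in is already URL-safe.
BASE_URL='https://db.systemsbiology.net/sbeams/cgi/PeptideAtlas/GetTransitions'

build_query_url() {
    query=''
    for pair in "$@"; do
        if [ -z "$query" ]; then
            query=$pair
        else
            query="$query;$pair"
        fi
    done
    printf '%s?%s\n' "$BASE_URL" "$query"
}

# Required parameters, plus output_mode to avoid the dense HTML response.
url=$(build_query_url \
    action=QUERY \
    organism_name=Yeast \
    protein_name_constraint=YAL003W \
    default_search=1 \
    output_mode=tsv)
printf '%s\n' "$url"

# To fetch the result (requires network access):
#   wget "$url" -O YAL003W_transitions.tsv
```

Optional parameters such as n_peptides_per_protein=3 can be appended as further arguments in the same way.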
This section shows the help text for the various parameters.
protein_name_constraint
    Constraint for the Protein Name. '%' is wildcard character; '_' is single character wildcard; character range is like '[a-m]'; multiple entries may be separated with a semicolon; Use ! for NOT.

upload_file
    Path to file with list of Protein Names to be uploaded via the web interface (NOTE: if proteins are not found, search defaults to printing all proteins of the selected Atlas build)

peptide_sequence_constraint
    Constraint for the Peptide Sequence. '%' is wildcard character; '_' is single character wildcard; character range is like '[a-m]'; multiple entries may be separated with a semicolon; Use ! for NOT.

peptide_length
    Constraint for the num amino acids in seq
    Allowed syntax: "n", "> n", "< n", "between n and n", "n +- n"

empirical_proteotypic_constraint
    Constraint for the empirical proteotypic score for a peptide.
    Allowed syntax: "n.n", "> n.n", "< n.n", "between n.n and n.n", "n.n +- n.n"

n_protein_mappings_constraint
    Constraint for number of distinct proteins for this peptide ( >=0 )

n_genome_locations_constraint
    Constraint for number of genome locations for this peptide ( >=0 )

n_highest_intensity_fragment_ions
    Number highest inten frag ions per spec to keep, default 3

n_peptides_per_protein
    Number of peptides to return per protein, default 3

pabst_build_id
    Select desired PABST Build to search, required.
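Several characters in the constraint syntax above ('%', ';', '<', '>', '!', '+', brackets and spaces) have special meaning in a URL, or are not allowed in one, so they need percent-encoding when a constraint is placed in a query string. The following is a rough sketch of such an encoder in portable shell; the function name urlencode is hypothetical, and the code assumes single-byte (ASCII) input.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode one constraint value for a query string.
# Letters, digits, '-', '_', '.' and '~' pass through; everything else
# becomes %XX. Assumption: the input is plain ASCII.
urlencode() {
    input=$1
    output=''
    while [ -n "$input" ]; do
        rest=${input#?}            # everything after the first character
        char=${input%"$rest"}      # the first character itself
        case "$char" in
            [A-Za-z0-9._~-]) output="$output$char" ;;
            *) output="$output$(printf '%%%02X' "'$char")" ;;
        esac
        input=$rest
    done
    printf '%s\n' "$output"
}

urlencode 'YBR002C;YAL003W'     # two protein names separated by a semicolon
urlencode 'between 7 and 20'    # a peptide_length range
urlencode 'YAL%'                # a wildcard match on the protein name
```

The first call yields YBR002C%3BYAL003W, the second between%207%20and%2020, and the third YAL%25.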
Example invocation using wget, and the resulting transitions - output mode tsv.

wget 'https://db.systemsbiology.net/devDC/sbeams/cgi/PeptideAtlas/GetTransitions?default_search=1;protein_name_constraint=YAL003W;n_highest_intensity_fragment_ions=3;n_peptides_per_protein=3&action=QUERY;output_mode=tsv;organism_name=Yeast' -O YAL003W_transitions.tsv

Protein  Pre  Sequence                  Fol  Score  Src  Q1_mz    Q1_chg  Q3_mz    Q3_chg  Label  Rank  RI     SSR
YAL003W  K    SYIEGTAVSQADVTVFK         A    1.60   IT   907.96   2       994.52   1       y9     1     10000  33.2
YAL003W  K    SYIEGTAVSQADVTVFK         A    1.60   IT   907.96   2       1093.59  1       y10    2     4129   33.2
YAL003W  K    SYIEGTAVSQADVTVFK         A    1.60   IT   907.96   2       494.30   1       y4     3     2497   33.2
YAL003W  K    SIVTLDVKPWDDETNLEEMVANVK  A    1.56   IT   1373.19  2       1059.01  2       y18    1     3874   44.6
YAL003W  K    SIVTLDVKPWDDETNLEEMVANVK  A    1.56   IT   1373.19  2       945.43   2       y16    2     1152   44.6
YAL003W  K    SIVTLDVKPWDDETNLEEMVANVK  A    1.56   IT   1373.19  2       1009.48  2       y17    3     800    44.6
YAL003W  R    WFNHIASK                  A    1.43   IT   1002.52  1       555.32   1       y5     1     730    22.9
YAL003W  R    WFNHIASK                  A    1.43   IT   501.76   2       669.37   1       y6     2     2955   22.9
YAL003W  R    WFNHIASK                  A    1.43   IT   501.76   2       816.44   1       y7     3     827    22.9
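Once the tsv file is on disk, a downstream script usually needs only a few of its columns. The following sketch picks columns by header name rather than by position, so it keeps working if the column order differs between builds. It assumes the file is tab-separated with a single header row, as the output mode name suggests. The function names are hypothetical, and the sample rows are copied from the output above.

```shell
#!/bin/sh
# Sample rows copied from the tsv output above.
# Assumption: the real file separates fields with tabs.
sample_tsv() {
    printf 'Protein\tPre\tSequence\tFol\tScore\tSrc\tQ1_mz\tQ1_chg\tQ3_mz\tQ3_chg\tLabel\tRank\tRI\tSSR\n'
    printf 'YAL003W\tK\tSYIEGTAVSQADVTVFK\tA\t1.60\tIT\t907.96\t2\t994.52\t1\ty9\t1\t10000\t33.2\n'
    printf 'YAL003W\tK\tSYIEGTAVSQADVTVFK\tA\t1.60\tIT\t907.96\t2\t1093.59\t1\ty10\t2\t4129\t33.2\n'
    printf 'YAL003W\tR\tWFNHIASK\tA\t1.43\tIT\t501.76\t2\t669.37\t1\ty6\t2\t2955\t22.9\n'
}

# Hypothetical helper: print Sequence, Q1_mz, Q3_mz and Label for each
# transition, locating each column through the header row.
transition_list() {
    awk -F '\t' '
        BEGIN { OFS = "\t" }
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        { print $col["Sequence"], $col["Q1_mz"], $col["Q3_mz"], $col["Label"] }
    '
}

sample_tsv | transition_list
```

With a downloaded file, the same function can be run as transition_list < YAL003W_transitions.tsv.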
A second example, this time fetching transitions from the Human Complete SRMAtlas:

wget 'https://db.systemsbiology.net/sbeams/cgi/PeptideAtlas/GetTransitions?protein_name_constraint=P01258;n_highest_intensity_fragment_ions=3;n_peptides_per_protein=3;apply_action_hidden=&action=QUERY;output_mode=tsv;default_search=1;organism_name=Human' -O Human_transitions.tsv

This output currently includes some HTML; the workaround is to prune the noisy columns with a tool such as cut (or Excel), e.g.

cut -f1,3-16 Human_transitions.tsv > Human_transitions_noHTML.tsv
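Which columns carry HTML may differ between builds, so it can help to check before choosing the field list for cut. The following sketch reports the numbers of the tab-separated columns in which any row contains something that looks like an HTML tag. The function name html_columns is hypothetical, and the input rows are invented purely to exercise it; they are not real GetTransitions output.

```shell
#!/bin/sh
# Hypothetical helper: print the 1-based number of every tab-separated column
# in which at least one row contains '<' followed by a letter or '/'.
html_columns() {
    awk -F '\t' '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /<\/?[A-Za-z]/) seen[i] = 1
            if (NF > max) max = NF
        }
        END { for (i = 1; i <= max; i++) if (i in seen) printf "%d\n", i }
    '
}

# Invented rows for illustration only: the second column holds markup.
fake_rows() {
    printf 'P01258\t<a href="x">link</a>\tPEPTIDE\n'
    printf 'P01258\tplain\tPEPTIDE\n'
}

fake_rows | html_columns
```

Run against a downloaded file (html_columns < Human_transitions.tsv), the reported numbers are the columns to leave out of the cut -f list.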
A third example, using xml output mode and demonstrating how to specify multiple protein names. Because an unescaped semicolon in the URL would be read as a parameter separator, the semicolon between the names is written in its URL-encoded (percent-encoded) form, %3B: YBR002C%3BYAL003W

wget 'https://db.systemsbiology.net/devDC/sbeams/cgi/PeptideAtlas/GetTransitions?protein_name_constraint=YBR002C%3BYAL003W;default_search=1;n_highest_intensity_fragment_ions=3;n_peptides_per_protein=3;apply_action_hidden=&action=QUERY;output_mode=xml;organism_name=Yeast' -O Yeast_transitions.xml

<?xml version="1.0" standalone="yes"?>
<resultset identifier="unknown">
<row identifier="0" Protein="YAL003W" Pre="K" Sequence="SYIEGTAVSQADVTVFK" Fol="A" Score="1.60" Src="IT" Q1_mz="907.96" Q1_chg="2" Q3_mz="994.52" Q3_chg="1" Label="y9" Rank="1" RI="10000" SSR="33.2" />
<row identifier="1" Protein="YAL003W" Pre="K" Sequence="SYIEGTAVSQADVTVFK" Fol="A" Score="1.60" Src="IT" Q1_mz="907.96" Q1_chg="2" Q3_mz="1093.59" Q3_chg="1" Label="y10" Rank="2" RI="4129" SSR="33.2" />

The rest of the file is not shown due to space considerations.
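In the xml output each transition is a row element whose attributes mirror the tsv columns. The following sketch pulls a few attributes out of each row with awk. It assumes every row element sits on a single line, as in the fragment shown here; a real XML parser would be more robust. The function names are hypothetical, and the sample is the fragment from this page, so it has no closing resultset tag.

```shell
#!/bin/sh
# Fragment copied from the xml output on this page (truncated there as well).
sample_xml() {
    cat <<'EOF'
<?xml version="1.0" standalone="yes"?>
<resultset identifier="unknown">
<row identifier="0" Protein="YAL003W" Pre="K" Sequence="SYIEGTAVSQADVTVFK" Fol="A" Score="1.60" Src="IT" Q1_mz="907.96" Q1_chg="2" Q3_mz="994.52" Q3_chg="1" Label="y9" Rank="1" RI="10000" SSR="33.2" />
<row identifier="1" Protein="YAL003W" Pre="K" Sequence="SYIEGTAVSQADVTVFK" Fol="A" Score="1.60" Src="IT" Q1_mz="907.96" Q1_chg="2" Q3_mz="1093.59" Q3_chg="1" Label="y10" Rank="2" RI="4129" SSR="33.2" />
EOF
}

# Hypothetical helper: print Protein, Sequence, Q1_mz, Q3_mz and Label for
# each row element, tab-separated.
# Assumption: one row element per line.
xml_rows_to_tsv() {
    awk '
        # Return the value of attribute "name" on this line, or "" if absent.
        function attr(line, name,    re, s) {
            re = " " name "=\"[^\"]*\""
            if (!match(line, re)) return ""
            s = substr(line, RSTART, RLENGTH)
            sub(/^[^"]*"/, "", s)    # drop everything up to the opening quote
            sub(/"$/, "", s)         # drop the closing quote
            return s
        }
        BEGIN { OFS = "\t" }
        /<row / {
            print attr($0, "Protein"), attr($0, "Sequence"), attr($0, "Q1_mz"), attr($0, "Q3_mz"), attr($0, "Label")
        }
    '
}

sample_xml | xml_rows_to_tsv
```

With a downloaded file, the same function can be run as xml_rows_to_tsv < Yeast_transitions.xml.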