SearchProteins

From SPCTools


Create a file of protein accessions, one per line. Then go to the SearchProteins tab in PeptideAtlas (available since October 2009), select the Atlas you want to search, upload your file, and press "Query". You can download the results in CSV or Excel format.
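The input file can be written by hand, but if your accessions already live in a script or spreadsheet export, a small helper keeps the file clean. The following is a minimal sketch, not part of PeptideAtlas: the function name `write_accession_file` is hypothetical and the IPI identifier in the usage note is a placeholder.

```python
from pathlib import Path


def write_accession_file(accessions, path):
    """Write unique, non-empty accessions to `path`, one per line.

    Returns the cleaned list in the order the accessions first appeared.
    """
    seen = set()
    cleaned = []
    for acc in accessions:
        acc = acc.strip()  # drop stray whitespace and newlines
        if acc and acc not in seen:
            seen.add(acc)
            cleaned.append(acc)
    # One accession per line, with a trailing newline at the end of the file.
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned
```

For example, `write_accession_file(["P01234", "IPI00000001"], "accessions.txt")` produces a two-line file ready to upload.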

Your protein accessions must come either from one of the databases used to create the Atlas (for human, that would be Swiss-Prot, IPI, or Ensembl) or from one of the databases cross-referenced by GOA. Looking at my notes, these seem to be:

SP (Swiss-Prot)
TR (TrEMBL)
ENSEMBL
REFSEQ_VALIDATED
REFSEQ_REVIEWED
REFSEQ_PROVISIONAL
REFSEQ_PREDICTED
REFSEQ_MODEL
REFSEQ_INFERRED
HINV
VEGA

I'm not sure that's a complete list. I don't know what REFSEQ is. And I know that often (but maybe not always) you can enter a gene name, like BRCA2, and I don't know if that's covered by one of the databases above. Basically, I don't understand the GOA cross-referencing, except that it does allow you to enter some identifiers that would otherwise be reported as UNKNOWN.

Also, there is variation even among what will be recognized by the different Atlas builds for a single species. For example, the Human Urine Atlas was built with IPI+Ensembl, but the Human (all) and Human Plasma Atlases were built with IPI+Ensembl+Swiss-Prot -- and the versions of IPI and Ensembl are more recent than they are for the Human Urine Atlas.

If the PeptideAtlas build you have selected doesn't recognize your protein identifier, it will report UNKNOWN. Reasons for UNKNOWN could include:

  • an identifier that expired before creation of the Atlas (common for IPI)
  • an identifier that is new since the creation of the Atlas
  • an identifier from a database that was not used to build the Atlas and that is not in the GOA cross-reference
  • incomplete GOA cross-referencing
  • unrecognized identifier format (for example, Swiss-Prot identifiers must be given in the form P01234; Swiss-Prot identifiers like TNFR_HUMAN are not recognized)
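The last cause in this list can be caught before uploading. The sketch below separates accession-style Swiss-Prot identifiers (like P01234) from entry names (like TNFR_HUMAN). The accession pattern is the one UniProt publishes. The entry-name pattern and the function name `classify_swissprot_id` are assumptions made for illustration. A result of "accession" only means the format looks right, not that a given Atlas build will recognize the identifier.

```python
import re

# UniProt's published accession pattern (Swiss-Prot and TrEMBL accessions).
ACCESSION_RE = re.compile(
    r"(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})"
)

# Assumed shape of an entry name: mnemonic, underscore, species code
# (for example TNFR_HUMAN).
ENTRY_NAME_RE = re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")


def classify_swissprot_id(identifier):
    """Return 'accession', 'entry_name', or 'other' for one identifier."""
    ident = identifier.strip()
    if ACCESSION_RE.fullmatch(ident):
        return "accession"
    if ENTRY_NAME_RE.fullmatch(ident):
        return "entry_name"
    return "other"
```

Identifiers classified as "entry_name" would need to be converted to accessions with an external resource before searching.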

If your identifier is reported UNKNOWN, you should use an external web resource to find another identifier for it -- your protein could still be in that Atlas.
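To collect the identifiers that need a second look, you can scan the downloaded CSV for UNKNOWN. This is a rough sketch under assumptions that are not confirmed here: that the first column of each row holds the identifier you submitted, and that UNKNOWN appears as the whole content of some cell in that row. Check these against your actual download. The function name `unknown_accessions` is hypothetical.

```python
import csv


def unknown_accessions(csv_path, marker="UNKNOWN"):
    """Return first-column values of rows that contain a cell equal to `marker`.

    Assumes (unverified) that column 0 is the submitted identifier.
    """
    unknown = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            # Skip blank lines; flag rows where any cell is exactly the marker.
            if row and any(field.strip() == marker for field in row):
                unknown.append(row[0].strip())
    return unknown
```

The returned list can then be looked up in an external resource to find alternative identifiers.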
