Formats:mzXML

From SPCTools

Revision as of 19:24, 23 January 2008

mzXML is an open data format for storage and exchange of mass spectrometry data, developed at the SPC/Institute for Systems Biology. mzXML provides a standard container for MS and MS/MS proteomics data and is the foundation of our proteomic pipelines. Raw, proprietary file formats from most vendors can be converted to the open mzXML format.

Patrick Pedrioli was the primary original author; please see the references below for the often-cited Nature Biotech publication.

Several versions of this format exist. Currently these are 1.0 (also called "msXML"), 2.0, 2.1 and 3.0. 2.1 is the most common version currently in use.
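Because several versions are in circulation, it can help to check which one a given file declares before processing it. The following is a minimal Python sketch, not part of any SPC tool. It assumes that the root element carries a namespace URI ending in "mzXML_<version>" (as in the sample string below); files that declare their schema differently, such as 1.0 "msXML" files, may simply yield None. The function name mzxml_version is hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def mzxml_version(xml_text):
    """Return the version string found in the root namespace, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a namespaced tag as "{namespace-uri}localname".
    namespace = re.match(r"\{(.*)\}", root.tag)
    if namespace is None:
        return None
    # Assumption: the namespace URI ends in "mzXML_<major>.<minor>".
    version = re.search(r"mzXML_(\d+\.\d+)", namespace.group(1))
    return version.group(1) if version else None


# Tiny synthetic document, for illustration only.
sample = (
    '<mzXML xmlns="http://sashimi.sourceforge.net/schema_revision/mzXML_2.1">'
    '<msRun scanCount="0"/></mzXML>'
)
print(mzxml_version(sample))
```

For real files, the same check can be applied to the file contents read as text; only the root element is inspected.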




Converter Overview

The first step in processing MS or MS/MS data with SPC proteomics software is conversion of raw data to our open mzXML format.

Direct vendor support

Bruker

Bruker directly supports the mzXML format, using their CompassXport program:

  • CompassXport is the recommended converter for Bruker (.baf) files. For historical reasons only, we also provide a page on mzBruker, our own retired, SPC-created open-source Bruker converter. See also Additional Resources, below.


SPC/TPP supported proprietary formats

The TPP project supports a wide variety of mass-spec instrument formats. Currently, we provide software tools (converters) for the following vendor formats. Follow the links for more specific information for each converter:

Thermo/XCalibur

  • ReAdW: Thermo/XCalibur raw data (.RAW files) to mzXML converter

Analyst (ABI/MDS Sciex)

  • mzStar: Analyst (ABI/MDS Sciex) raw data (.wiff files) to mzXML converter. Please note that mzStar is about to be retired in favor of the much-improved mzWiff converter.

Waters Masslynx

  • massWolf: MassLynx (Waters) raw data (.raw directories) to mzXML converter


Other external converter projects

These projects are not supported by the SPC, but may be of use.

SCIEX/ABI 4700/4800 MALDI TOFTOF

  • T2Extractor - SCIEX/ABI 4700/4800 MALDI TOFTOF Data to mzXML: a Java application that converts data from SCIEX/ABI 4000-series MALDI TOFTOF instruments into mzXML format. This application is provided courtesy of Phil Andrews' lab at the University of Michigan. Please see this link: [1]

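ABI Mariner, Voyager

  • PyMsXML: ABI Mariner and Voyager data to mzXML converter (external project; see the summary table below)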


Converters: status and summary table

All known formats and their converters are summarized in the table below:

Compatible Mass-Spectrometer Instrument-Specific Software and File Formats:

Compatible Mass-Spec file formats

raw file vendor | instrument acquisition software | raw file type | raw-to-mzXML converter | SPC maintained? | converter status/notes | download
Bruker | (?) | .baf | CompassXport | no (Bruker supported) | refer to vendor | see CompassXport page
Thermo | XCalibur | .RAW file | ReAdW | yes, SPC | known issues: does not centroid orbi/ft data correctly | official release: Nov 2006; see ReAdW page; contact spctools-discuss newsgroup if interested in beta ReAdW version
Waters | MassLynx | .RAW directory | wolf | yes, SPC | no centroiding | official release: ???; see wolf page; contact spctools-discuss newsgroup if interested in beta wolf version
MDS/Sciex for ABI and Agilent | Analyst, AnalystQS | .wiff file | mzStar (official release); mzWiff (beta, not yet released) | yes, SPC | mzStar is known to have many bugs which are fixed in mzWiff | mzStar page; contact spctools-discuss newsgroup if interested in beta mzWiff
Agilent | MassHunter | .d directory | trapper (beta, not yet released) | yes, SPC | currently internal only for development purposes |
ABI | (?) | (?) | T2DExtractor | no (Andrews Lab) | 4700/4800 MALDI TOFTOF converter; known issues: correct instrument name is not recorded in mzXML; charge state info is not recorded in mzXML | link
ABI | (?) | (?) | PyMsXML | no | ABI Mariner, Voyager converter | external link
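
As an illustration of how the table above might be used in a script, the following Python sketch maps a raw data path to the converter the table names for it. It is only a lookup aid under stated assumptions: the function suggest_converter and the CONVERTERS dictionary are hypothetical names, the mapping covers only the rows that list a file type, and the sketch does not invoke any converter or assume anything about converter command lines. Note that Thermo .RAW files and Waters .RAW directories share a suffix, so the file-or-directory distinction matters.

```python
import os

# (lower-cased suffix, is_directory) -> converter named in the table above.
CONVERTERS = {
    (".baf", False): "CompassXport",
    (".raw", False): "ReAdW",
    (".raw", True): "wolf",
    (".wiff", False): "mzStar (or the beta mzWiff)",
    (".d", True): "trapper (beta)",
}


def suggest_converter(path, is_directory=None):
    """Return the converter name for a raw data path, or None if unknown."""
    if is_directory is None:
        # Fall back to the filesystem when the caller does not say.
        is_directory = os.path.isdir(path)
    # Drop any trailing separator so "run.raw/" still yields ".raw".
    suffix = os.path.splitext(path.rstrip("/\\"))[1].lower()
    return CONVERTERS.get((suffix, is_directory))


print(suggest_converter("run01.RAW", is_directory=False))
print(suggest_converter("run01.raw", is_directory=True))
```

Passing is_directory explicitly keeps the lookup usable on path names alone; omitting it makes the function inspect the filesystem instead.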

Viewing mzXML files

Pep3D

The Pep3D program, included with the TPP, produces a gel-like image, which is very useful for getting an overview of an entire run. It can also link to CID and peptide probability information.

Insilicos viewer

We also recommend the free Insilicos Viewer, which can display mzXML, mzData, and other formats.


Additional Resources

  • out2linux.pl: working with Bioworks Sequest output. See Software:Out2XML page.
  • working with Bruker ion trap and DataAnalysis software. See Software:da2tpp page.

Parsing mzXML data

RAMP

RAMP: C/C++ API, contained in TPP and basis for TPP tools. Also parses mzData.

JRAP: Java API
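
For readers who only want to see what parsing involves, the following Python sketch reads scans from a small mzXML string using the standard library alone. It does not use or imitate the RAMP or JRAP APIs. It assumes peak lists stored as base64-encoded, network-byte-order (big-endian) floats in m/z, intensity pairs, with a precision attribute of 32 or 64; compressed peak lists and other variations are not handled. The function names and the sample document are hypothetical and synthetic.

```python
import base64
import struct
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the "{namespace}" prefix that ElementTree adds to tags."""
    return tag.rsplit("}", 1)[-1]


def decode_peaks(peaks_elem):
    """Decode a <peaks> element into a list of (m/z, intensity) tuples."""
    precision = int(peaks_elem.get("precision", "32"))
    code = "f" if precision == 32 else "d"
    raw = base64.b64decode(peaks_elem.text.strip())
    count = len(raw) // (precision // 8)
    # ">" selects big-endian (network) byte order.
    values = struct.unpack(">%d%s" % (count, code), raw)
    return list(zip(values[0::2], values[1::2]))


def read_scans(xml_text):
    """Return (scan number, MS level, peaks) for every scan, nested or not."""
    scans = []
    for elem in ET.fromstring(xml_text).iter():
        if local_name(elem.tag) != "scan":
            continue
        for child in elem:
            if local_name(child.tag) == "peaks" and child.text:
                scans.append((int(elem.get("num")),
                              int(elem.get("msLevel")),
                              decode_peaks(child)))
    return scans


# Synthetic one-scan document with two peaks, for illustration only.
payload = base64.b64encode(
    struct.pack(">4f", 100.0, 10.0, 200.5, 20.0)).decode("ascii")
sample = (
    '<mzXML xmlns="http://sashimi.sourceforge.net/schema_revision/mzXML_2.1">'
    '<msRun scanCount="1">'
    '<scan num="1" msLevel="1" peaksCount="2">'
    '<peaks precision="32" byteOrder="network" pairOrder="m/z-int">%s</peaks>'
    '</scan></msRun></mzXML>' % payload
)
print(read_scans(sample))
```

Loading the whole document at once is fine for a small example; for full-size runs, the dedicated parsers listed above are the better choice.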

Related Formats

Please note that, while widely accepted in the proteomics community, mzXML is a format, not a standard. There are two other open MS/MS proteomic formats: mzData and mzML. mzData was the first attempt, created by the HUPO/PSI committee process. Many vendors wanted to wait until the format was finalized (a process which took two years); in the meantime, mzXML was developed to fill the need. mzML is an upcoming format which aims to marry the best elements of mzXML and mzData, and represents a joint effort of the HUPO/PSI committee, SPC/ISB, instrument vendors, and other proteomics software groups.

Reference

  • Pedrioli PGA, Eng JK, Hubley R, Vogelzang M, Deutsch EW, Raught B, Pratt B, Nilsson E, Angeletti R, Apweiler R, Cheung K, Costello CE, Hermjakob H, Huang S, Julian RK Jr, Kapp E, McComb ME, Oliver SG, Omenn G, Paton NW, Simpson R, Smith R, Taylor CF, Zhu W, Aebersold R. (2004) "A Common Open Representation of Mass Spectrometry Data and its Application in a Proteomics Research Environment." Nature Biotechnology 22(11):1459-1466.
  • Learn more at the Sashimi project website, the original website for the mzXML format.