Journal article
An Efficient Data Format for Mass Spectrometry-Based Proteomics
Journal of the American Society for Mass Spectrometry, Vol.21(10), pp.1784-1788
2010
Handle:
https://hdl.handle.net/2376/106299
PMID: 20674389
Abstract
The diverse range of mass spectrometry (MS) instrumentation, along with corresponding proprietary and nonproprietary data formats, has generated a proteomics community-driven call for a standardized format to facilitate management, processing, storing, visualization, and exchange of both experimental and processed data. To date, significant efforts have been directed toward standardizing XML-based formats for mass spectrometry data representation, despite the recognized inefficiencies associated with storing large numeric datasets in XML. The proteomics community has periodically entertained alternative strategies for data exchange, e.g., using a common application programming interface or a database-derived format. However, these efforts have yet to gain significant attention, mostly because they have not demonstrated significant performance benefits over existing standards, but also due to issues such as extensibility to multidimensional separation systems, robustness of operation, and incomplete or mismatched vocabulary. Here, we describe a format based on standard database principles that offers multiple benefits over existing formats in terms of storage size, ease of processing, data retrieval times, and extensibility to accommodate multidimensional separation systems.
File size comparison for different data file formats. The YAFMS file sizes are comparable to those of the RAW data formats and always smaller than those of competing formats.
Details
- Creators
- Anuj R Shah - Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Jennifer Davidson - Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon, USA
- Matthew E Monroe - Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Anoop M Mayampurath - School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
- William F Danielson - Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Yan Shi - Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Aaron C Robinson - Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Brian H Clowers - National Security Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Mikhail E Belov - Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Gordon A Anderson - Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Richard D Smith - Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA
- Academic Unit
- Chemistry, Department of
- Publisher
- Elsevier Inc
- Identifiers
- 99900546859301842
- Language
- English