A multi-species benchmark for training and validating mass spectrometry proteomics machine learning models

Wen, Bo (ORCID: 0000-0003-2261-3150); Noble, William Stafford (ORCID: 0000-0001-7283-4715)

doi:10.1038/s41597-024-04068-4

Citation Details

Abstract: Training machine learning models for tasks such as de novo sequencing or spectral clustering requires large collections of confidently identified spectra. Here we describe a dataset of 2.8 million high-confidence peptide-spectrum matches derived from nine different species. The dataset is based on a previously described benchmark but has been re-processed to ensure consistent data quality and enforce separation of training and test peptides.

Award ID(s): 2245300

PAR ID: 10554332

Author(s) / Creator(s): Wen, Bo; Noble, William Stafford

Publisher / Repository: Nature Publishing Group

Date Published: 2024-11-08

Journal Name: Scientific Data

Volume: 11

Issue: 1

ISSN: 2052-4463

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Journal Article:
https://doi.org/10.1038/s41597-024-04068-4
