<?xml version="1.0" encoding="UTF-8"?><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcq="http://purl.org/dc/terms/"><records count="1" morepages="false" start="1" end="1"><record rownumber="1"><dc:product_type>Journal Article</dc:product_type><dc:title>SpecEncoder: deep metric learning for accurate peptide identification in proteomics</dc:title><dc:creator>Liu, Kaiyuan; Tao, Chenghua; Ye, Yuzhen; Tang, Haixu</dc:creator><dc:corporate_author/><dc:editor/><dc:description>&lt;title&gt;Abstract&lt;/title&gt; &lt;sec&gt;&lt;title&gt;Motivation&lt;/title&gt;&lt;p&gt;Tandem mass spectrometry (MS/MS) is a crucial technology for large-scale proteomic analysis. Protein database search and spectral library search are commonly used for peptide identification from MS/MS spectra; both, however, face challenges due to experimental variations between replicated spectra and similar fragmentation patterns among distinct peptides. We present SpecEncoder, a deep metric learning approach that addresses these challenges by transforming MS/MS spectra into robust and sensitive embedding vectors in a latent space. The SpecEncoder model can also embed predicted MS/MS spectra of peptides, enabling a hybrid search approach that combines spectral library and protein database searches for peptide identification.&lt;/p&gt;&lt;/sec&gt; &lt;sec&gt;&lt;title&gt;Results&lt;/title&gt;&lt;p&gt;We evaluated SpecEncoder on three large human proteomics datasets, and the results showed a consistent improvement in peptide identification. For spectral library search, SpecEncoder identifies 1%–2% more unique peptides (and PSMs) than SpectraST. For protein database search, it identifies 6%–15% more unique peptides than MSGF+ enhanced by Percolator. Furthermore, SpecEncoder identifies 6%–12% additional unique peptides when utilizing a combined library of experimental and predicted spectra.
SpecEncoder can also identify more peptides when compared to deep-learning enhanced methods (MSFragger boosted by MSBooster). These results demonstrate SpecEncoder’s potential to enhance peptide identification for proteomic data analyses.&lt;/p&gt;&lt;/sec&gt; &lt;sec&gt;&lt;title&gt;Availability and Implementation&lt;/title&gt;&lt;p&gt;The source code and scripts for SpecEncoder and peptide identification are available on GitHub at https://github.com/lkytal/SpecEncoder. Contact: hatang@iu.edu.&lt;/p&gt;&lt;/sec&gt;</dc:description><dc:publisher>Oxford University Press</dc:publisher><dc:date>2024-06-28</dc:date><dc:nsf_par_id>10521530</dc:nsf_par_id><dc:journal_name>Bioinformatics</dc:journal_name><dc:journal_volume>40</dc:journal_volume><dc:journal_issue>Supplement_1</dc:journal_issue><dc:page_range_or_elocation>i257 to i265</dc:page_range_or_elocation><dc:issn>1367-4803</dc:issn><dc:isbn/><dc:doi>https://doi.org/10.1093/bioinformatics/btae220</dc:doi><dcq:identifierAwardId>2011271</dcq:identifierAwardId><dc:subject>Proteomics, Deep Metric Learning, Mass Spectrometry, Peptide Identification, Spectral Library Searching</dc:subject><dc:version_number/><dc:location/><dc:rights/><dc:institution/><dc:sponsoring_org>National Science Foundation</dc:sponsoring_org></record></records></rdf:RDF>