NSF PAR Search | NSF Public Access Repository


Evolution-Inspired Loss Functions for Protein Representation Learning

Gong, C; Klivans, A; Loy, J M; Chen, T; Liu, Q; Diaz, D J (April 2024, ICLR 2024)

AI-based frameworks for protein engineering use self-supervised learning (SSL) to obtain representations for downstream biological predictions. The most common training objective for these methods is wildtype accuracy: given a sequence or structure where a wildtype residue has been masked, predict the missing amino acid. Wildtype accuracy, however, does not align with the primary goal of protein engineering, which is to suggest a mutation rather than to identify what already appears in nature. Here we present Evolutionary Ranking (EvoRank), a training objective that incorporates evolutionary information derived from multiple sequence alignments (MSAs) to learn more diverse protein representations. EvoRank corresponds to ranking amino-acid likelihoods in the probability distribution induced by an MSA. This objective forces models to learn the underlying evolutionary dynamics of a protein. Across a variety of phenotypes and datasets, we demonstrate that EvoRank leads to dramatic improvements in zero-shot performance and can compete with models fine-tuned on experimental data. This is particularly important in protein engineering, where it is expensive to obtain data for fine-tuning.
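As a rough illustration of what "the probability distribution induced by an MSA" and "ranking amino-acid likelihoods" can mean, the sketch below computes smoothed per-column amino-acid frequencies from a toy alignment and sorts the residues by them. It is not the EvoRank loss from the paper: the function names, the pseudocount smoothing, the gap handling, and the toy alignment are all assumptions made for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(msa, position, pseudocount=1.0):
    """Smoothed amino-acid probabilities at one alignment column.

    Gaps and non-standard symbols are ignored (an assumption of this sketch).
    """
    counts = Counter(seq[position] for seq in msa if seq[position] in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}


def rank_amino_acids(msa, position):
    """Amino acids sorted from most to least likely at a column.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    dist = column_distribution(msa, position)
    return sorted(AMINO_ACIDS, key=lambda aa: (-dist[aa], aa))


# Hypothetical four-sequence alignment; "-" marks a gap.
toy_msa = ["MKVL", "MRVL", "MKIL", "MK-L"]
ranking = rank_amino_acids(toy_msa, 1)
distribution = column_distribution(toy_msa, 1)
```

A ranking of this kind could serve as a training target in place of the single wildtype residue, which is the contrast the abstract draws; how the paper turns the ranking into a loss is not specified here.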
Full Text Available
Predicting a Protein's Stability under a Million Mutations

Ouyang-Zhang, J; Diaz, D J; Klivans, A; Krähenbühl, P (October 2023, https://doi.org/10.48550/arXiv.2310.12979)

Stabilizing proteins is a foundational step in protein engineering. However, the evolutionary pressure of all extant proteins makes identifying the scarce number of mutations that will improve thermodynamic stability challenging. Deep learning has recently emerged as a powerful tool for identifying promising mutations. Existing approaches, however, are computationally expensive, as the number of model inferences scales with the number of mutations queried. Our main contribution is a simple, parallel decoding algorithm. Our method, Mutate Everything, can predict the effect of all single and double mutations in one forward pass. It is even versatile enough to predict higher-order mutations with minimal computational overhead. We build Mutate Everything on top of ESM2 and AlphaFold, neither of which was trained to predict thermodynamic stability. We trained on the Mega-Scale cDNA proteolysis dataset and achieved state-of-the-art performance on single and higher-order mutations on the S669, ProTherm, and ProteinGym datasets.
Full Text Available
Early Research Scholars Program Update and Reflection Study

Revelo, R. A.; Lichauco, A. B.; Rozhkova, A.; Diaz, D. (June 2023, 2023 ASEE Annual Conference & Exposition)

This conference paper provides an update on the Early Research Scholars Program (ERSP) background, structure, and implementation at the University of Illinois Chicago (UIC), developed at the University of California San Diego and funded by the National Science Foundation Improving Undergraduate STEM Education program. The program aims to support retention of students from marginalized backgrounds in the fields of computing as well as electrical and computer engineering. This paper provides program updates, including data from the 2022-2023 academic year and preliminary results from a reflection study that began in spring 2020. The reflection study examined the impact of the ERSP on a student's computing and engineering identity development based on student reflection responses. In this paper, we also discuss student demographics, retention rates, and changes made to the program's curriculum at UIC. The evaluation results from the last three years of the program are also shared, which show how students are impacted by the program, as well as areas for improvement. Preliminary results show that the program has positively impacted students' computing or engineering identity development for at least three identity dimensions: recognition, competence, and community.
Full Text Available
Emulating Turbulence Free Quantum-enhanced Interferometric Telescopy

https://doi.org/10.1364/QUANTUM.2022.QTu2A.31

Diaz, D; Zhang, Y; Lorenz, V O; Kwiat, P G (June 2022, Quantum 2.0)

We demonstrate the underlying mechanism for one version of quantum-enhanced telescopy, using multiple interconnected Hong-Ou-Mandel interferometers to recover the visibility amplitude of the source of light in the presence of arbitrary turbulence.
Full Text Available
Emulating Quantum-enhanced Long-Baseline Interferometric Telescopy

https://doi.org/10.1364/FIO.2021.FTh6D.7

Diaz, D.; Zhang, Y.; Lorenz, Virginia O.; Kwiat, Paul G. (November 2021, Frontiers in Optics 2021)

We demonstrate the underlying mechanism for quantum-enhanced telescopy, using multiple interconnected Hong-Ou-Mandel interferometers to recover the visibility amplitude and relative phase of the source light into multiple simulated telescopes.
Full Text Available
Sea Level and Socioeconomic Uncertainty Drives High‐End Coastal Adaptation Costs

https://doi.org/10.1029/2022EF003061

Wong, T. E.; Ledna, C.; Rennels, L.; Sheets, H.; Errickson, F. C.; Diaz, D.; Anthoff, D. (December 2022, Earth's Future)

Sea‐level rise and associated flood hazards pose severe risks to the millions of people globally living in coastal zones. Models representing coastal adaptation and impacts are important tools to inform the design of strategies to manage these risks. Representing the often deep uncertainties influencing these risks poses nontrivial challenges. A common uncertainty characterization approach is to use a few benchmark cases to represent the range and relative probabilities of the set of possible outcomes. This has been done in coastal adaptation studies, for example, by using low, moderate, and high percentiles of an input of interest, like sea‐level changes. A key consideration is how this simplified characterization of uncertainty influences the distributions of estimated coastal impacts. Here, we show that using only a few benchmark percentiles to represent uncertainty in future sea‐level change can lead to overconfident projections and underestimate high‐end risks as compared to using full ensembles for sea‐level change and socioeconomic parametric uncertainties. When uncertainty in future sea level is characterized by low, moderate, and high percentiles of global mean sea‐level rise, estimates of high‐end (95th percentile) damages are underestimated by between 18% (SSP1‐2.6) and 46% (SSP5‐8.5). Additionally, using the 5th and 95th percentiles of sea‐level scenarios underestimates the 5%–95% width of the distribution of adaptation costs by a factor ranging from about two to four, depending on SSP‐RCP pathway. The resulting underestimation of the uncertainty range in adaptation costs can bias adaptation and mitigation decision‐making.
Elliptic anisotropy measurement of the f0(980) hadron in proton-lead collisions and evidence for its quark-antiquark composition

https://doi.org/10.1038/s41467-025-56200-6

Hayrapetyan, A; Tumasyan, A; Adam, W; Andrejkovic, J W; Bergauer, T; Chatterjee, S; Damanakis, K; Dragicevic, M; Hussain, P S; Jeitler, M; et al (December 2025, Nature Communications)

Despite the f₀(980) hadron having been discovered half a century ago, the question about its quark content has not been settled: it might be an ordinary quark-antiquark ($q\bar{q}$) meson, a tetraquark ($q\bar{q}q\bar{q}$) exotic state, a kaon-antikaon ($K\bar{K}$) molecule, or a quark-antiquark-gluon ($q\bar{q}g$) hybrid. This paper reports strong evidence that the f₀(980) state is an ordinary $q\bar{q}$ meson, inferred from the scaling of elliptic anisotropies (v₂) with the number of constituent quarks (n_q), as empirically established using conventional hadrons in relativistic heavy ion collisions. The f₀(980) state is reconstructed via its dominant decay channel f₀(980) → π⁺π⁻, in proton-lead collisions recorded by the CMS experiment at the LHC, and its v₂ is measured as a function of transverse momentum (p_T). It is found that the n_q = 2 ($q\bar{q}$ state) hypothesis is favored over n_q = 4 ($q\bar{q}q\bar{q}$ or $K\bar{K}$ states) by 7.7, 6.3, or 3.1 standard deviations in the p_T < 10, 8, or 6 GeV/c ranges, respectively, and over n_q = 3 ($q\bar{q}g$ hybrid state) by 3.5 standard deviations in the p_T < 8 GeV/c range. This result represents the first determination of the quark content of the f₀(980) state, made possible by using a novel approach, and paves the way for similar studies of other exotic hadron candidates.
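For orientation, constituent-quark scaling of elliptic anisotropy is often written in the heavy-ion literature in the form below, where the per-quark anisotropy of a hadron h is compared at a per-quark momentum. This is a generic textbook form given as an assumption; the abstract does not state which scaling variable the analysis uses (transverse kinetic energy per quark is a common alternative), and the superscripts h and q are notation introduced here.

```latex
\frac{v_2^{h}(p_T)}{n_q} \approx v_2^{q}\!\left(\frac{p_T}{n_q}\right)
```

Under a relation of this kind, a measured v₂ curve that matches the conventional-hadron trend only for n_q = 2 would favor the quark-antiquark interpretation.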
Free, publicly-accessible full text available December 1, 2026
Search for New Physics in Jet Multiplicity Patterns of Multilepton Events at $\sqrt{s} = 13$ TeV

https://doi.org/10.1103/51fw-klz3

Hayrapetyan, A; Tumasyan, A; Adam, W; Andrejkovic, J W; Bergauer, T; Chatterjee, S; Damanakis, K; Dragicevic, M; Hussain, P S; Jeitler, M; et al (December 2025, Physical Review Letters)

A first search for beyond the standard model physics in jet multiplicity patterns of multilepton events is presented, using a data sample corresponding to an integrated luminosity of $138\ \mathrm{fb}^{-1}$ of 13 TeV proton-proton collisions recorded by the CMS detector at the LHC. The search uses observed jet multiplicity distributions in one-, two-, and four-lepton events to explore possible enhancements in jet production rate in three-lepton events with and without bottom quarks. The data are found to be consistent with the standard model expectation. The results are interpreted in terms of supersymmetric production of electroweak chargino-neutralino superpartners with cascade decays terminating in prompt hadronic $R$-parity violating interactions.
Free, publicly-accessible full text available December 1, 2026
Search for the Rare Decay $D^{0} \to \mu^{+} \mu^{-}$ in Proton-Proton Collisions at $\sqrt{s} = 13.6$ TeV

https://doi.org/10.1103/zc76-rgcp

Chekhovsky, V; Hayrapetyan, A; Makarenko, V; Tumasyan, A; Adam, W; Andrejkovic, J W; Benato, L; Bergauer, T; Chatterjee, S; Damanakis, K; et al (October 2025, Physical Review Letters)

A search for the rare decay $D^{0} \to \mu^{+} \mu^{-}$ is reported using proton-proton collision events at $\sqrt{s} = 13.6$ TeV collected by the CMS detector in 2022–2023, corresponding to an integrated luminosity of $64.5\ \mathrm{fb}^{-1}$. This is the first analysis to use a newly developed inclusive dimuon trigger, expanding the scope of the CMS flavor physics program. The search uses $D^{0}$ mesons obtained from $D^{*+} \to D^{0} \pi^{+}$ decays. No significant excess is observed. A limit on the branching fraction of $B(D^{0} \to \mu^{+} \mu^{-}) < 2.4 \times 10^{-9}$ at 95% confidence level is set. This is the most stringent upper limit set on any flavor-changing neutral current decay in the charm sector.
Free, publicly-accessible full text available October 1, 2026
Search for a heavy pseudoscalar Higgs boson decaying to a 125 GeV Higgs boson and a Z boson in final states with two tau and two light leptons in proton-proton collisions at $\sqrt{s} = 13$ TeV

https://doi.org/10.1007/JHEP10(2025)074

Chekhovsky, V; Hayrapetyan, A; Makarenko, V; Tumasyan, A; Adam, W; Andrejkovic, J W; Benato, L; Bergauer, T; Chatterjee, S; Damanakis, K; et al (October 2025, Journal of High Energy Physics)

A search for a heavy pseudoscalar Higgs boson, A, decaying to a 125 GeV Higgs boson h and a Z boson is presented. The h boson is identified via its decay to a pair of tau leptons, while the Z boson is identified via its decay to a pair of electrons or muons. The search targets the production of the A boson via the gluon-gluon fusion process, gg → A, and in association with bottom quarks, $b\bar{b}A$. The analysis uses a data sample corresponding to an integrated luminosity of 138 fb⁻¹ collected with the CMS detector at the CERN LHC in proton-proton collisions at a centre-of-mass energy of $\sqrt{s} = 13$ TeV. Constraints are set on the product of the cross sections of the A production mechanisms and the A → Zh decay branching fraction. The observed (expected) upper limit at 95% confidence level ranges from 0.049 (0.060) pb to 1.02 (0.79) pb for the gg → A process and from 0.053 (0.059) pb to 0.79 (0.61) pb for the $b\bar{b}A$ process in the probed range of the A boson mass, m_A, from 225 GeV to 1 TeV. The results of the search are used to constrain parameters within the $M_{h,\mathrm{EFT}}^{125}$ benchmark scenario of the minimal supersymmetric extension of the standard model. Values of tan β below 2.2 are excluded in this scenario at 95% confidence level for all m_A values in the range from 225 to 350 GeV.
Free, publicly-accessible full text available October 1, 2026
