

Search for: All records

Creators/Authors contains: "Peters, John"


  1. Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose mutational effect transfer learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure and energetics. We fine-tune METL on experimental sequence–function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL’s ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering. 
    Free, publicly-accessible full text available September 1, 2026
  2. Abstract This study investigates how entrainment’s diluting effect on cumulonimbus updraft buoyancy is affected by the temperature of the troposphere, which is expected to increase by the end of the century. A parcel model framework is constructed that allows for independent variations in the temperature (T), the entrainment rate ε, the free-tropospheric relative humidity (RH), and the convective available potential energy (CAPE). Using this framework, dilution of buoyancy is evaluated with T and RH independently varied and with CAPE either held constant or increased with temperature. When CAPE is held constant, buoyancy decreases as T increases, with parcels in warmer environments realizing substantially smaller fractions of their CAPE as kinetic energy (KE). This occurs because the increased moisture difference between an updraft and its surroundings at warmer temperatures drives greater updraft dilution. Similar results are found in midlatitude and tropical conditions when CAPE is increased with temperature. With the expected 6%–7% increase in CAPE per kelvin of warming, KE only increases at 2%–4% K⁻¹ in narrow updrafts but tracks more closely with CAPE at 4%–6% in wider updrafts. Interestingly, the rate of increase in the KE with T becomes larger than that of CAPE when the latter quantity increases at more than 10% K⁻¹. These findings emphasize the importance of considering entrainment in studies of moist convection’s response to climate change, as the entrainment-driven dilution of buoyancy may partially counteract the influence of increases in CAPE on updraft intensity.
Significance Statement Cumulonimbus clouds mix air with their surrounding environment through a process called entrainment, which controls how efficiently environmental energy is converted into upward speed in thunderstorm updrafts. Our research shows that warmer temperatures will exacerbate the moisture difference between cumulonimbus updrafts and their surroundings, leading to greater mixing and less efficient conversion of environmental energy into updraft speeds. This effect should be considered in future research that investigates how climate change will affect cumulonimbus clouds.
    Free, publicly-accessible full text available November 1, 2025
  3. Abstract Potential temperature and static energy are both useful quantities for understanding our atmosphere, yet static energy receives much less attention in weather science relative to climate science. Bridging this conceptual gap is important, as there is a pressing need for our communities to work together to understand and predict changing weather patterns in a warming world. Here we provide evidence for this gap in usage in American Meteorological Society journal publications and in introductory textbooks. We then describe key benefits of static energy for explaining basic concepts in atmospheric science. We encourage scientists and educators unfamiliar with static energy to familiarize themselves with the concept and consider incorporating it into their science and teaching. 
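For readers comparing the two quantities named in item 3, the standard textbook definitions are collected below. They are not taken from the article itself, and the symbols ($p_0$ as a reference pressure, $q_v$ as water vapor mass fraction, $L_v$ as latent heat of vaporization) are conventional choices assumed here.

```latex
% Potential temperature, dry static energy, and moist static energy
\theta = T \left( \frac{p_0}{p} \right)^{R_d / c_p}, \qquad
s = c_p T + g z, \qquad
h = c_p T + g z + L_v q_v

% Under hydrostatic balance (g\,dz = -R_d T \, dp / p), the two are linked by
ds = c_p \, dT - R_d T \, \frac{dp}{p} = c_p T \, d\ln\theta
```

The last line shows why the two quantities carry the same information for dry adiabatic motion: both are conserved when the other is.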
  4. Introduction Various sequencing-based approaches are used to identify and characterize the activities of cis-regulatory elements in a genome-wide fashion. Some of these techniques rely on indirect markers such as histone modifications (ChIP-seq with histone antibodies) or chromatin accessibility (ATAC-seq, DNase-seq, FAIRE-seq), while other techniques use direct measures such as episomal assays measuring the enhancer properties of DNA sequences (STARR-seq) and direct measurement of the binding of transcription factors (ChIP-seq with transcription factor-specific antibodies). The activities of cis-regulatory elements such as enhancers, promoters, and repressors are determined by their sequence and secondary processes such as chromatin accessibility, DNA methylation, and bound histone markers. Methods Here, machine learning models are employed to evaluate the accuracy with which cis-regulatory elements identified by various commonly used sequencing techniques can be predicted by their underlying sequence alone to distinguish between cis-regulatory activity that is reflective of sequence content versus secondary processes. Results and discussion Models trained and evaluated on D. melanogaster sequences identified through DNase-seq and STARR-seq are significantly more accurate than models trained on sequences identified by H3K4me1, H3K4me3, and H3K27ac ChIP-seq, FAIRE-seq, and ATAC-seq. These results suggest that the activity detected by DNase-seq and STARR-seq can be largely explained by underlying DNA sequence, independent of secondary processes. Experimentally, a subset of DNase-seq and H3K4me1 ChIP-seq sequences were tested for enhancer activity using luciferase assays and compared with previous tests performed on STARR-seq sequences. The experimental data indicated that STARR-seq sequences are substantially enriched for enhancer-specific activity, while the DNase-seq and H3K4me1 ChIP-seq sequences are not.
Taken together, these results indicate that the DNase-seq approach identifies a broad class of regulatory elements of which enhancers are a subset and the associated data are appropriate for training models for detecting regulatory activity from sequence alone, STARR-seq data are best for training enhancer-specific sequence models, and H3K4me1 ChIP-seq data are not well suited for training and evaluating sequence-based models for cis-regulatory element prediction.
  5. Large, polymorphic inversions can contribute to population structure and enable mutually-exclusive adaptations to survive in the same population. Current methods for detecting inversions from single-nucleotide polymorphisms (SNPs) called from population genomics data require an experienced, human user to prepare the data and interpret the results. Ideally, these methods would be completely automated yet robust to allow usage by inexperienced users. Towards this goal, automated approaches for segmentation of inversions and inference of sample genotypes are introduced and evaluated on chromosomes from flies, mosquitoes, and prairie sunflowers. 
  6. Study of α-V70I-substituted nitrogenase MoFe protein identified Fe6 of FeMo-cofactor (Fe₇S₉MoC-homocitrate) as a critical N₂ binding/reduction site. Freeze-trapping this enzyme during Ar turnover captured the key catalytic intermediate in high occupancy, denoted E₄(4H), which has accumulated 4[e⁻/H⁺] as two bridging hydrides, Fe2–H–Fe6 and Fe3–H–Fe7, and protons bound to two sulfurs. E₄(4H) is poised to bind/reduce N₂ as driven by mechanistically-coupled H₂ reductive-elimination of the hydrides. This process must compete with ongoing hydride protonation (HP), which releases H₂ as the enzyme relaxes to state E₂(2H), containing 2[e⁻/H⁺] as a hydride and sulfur-bound proton; accumulation of E₄(4H) in α-V70I is enhanced by HP suppression. EPR and ⁹⁵Mo ENDOR spectroscopies now show that resting-state α-V70I enzyme exists in two conformational states, both in solution and as crystallized, one with wild type (WT)-like FeMo-co and one with perturbed FeMo-co. These reflect two conformations of the Ile residue, as visualized in a reanalysis of the X-ray diffraction data of α-V70I and confirmed by computations. EPR measurements show delivery of 2[e⁻/H⁺] to the E₀ state of the WT MoFe protein and to both α-V70I conformations generating E₂(2H) that contains the Fe3–H–Fe7 bridging hydride; accumulation of another 2[e⁻/H⁺] generates E₄(4H) with Fe2–H–Fe6 as the second hydride. E₄(4H) in WT enzyme and a minority α-V70I E₄(4H) conformation as visualized by QM/MM computations relax to resting-state through two HP steps that reverse the formation process: HP of Fe2–H–Fe6 followed by slower HP of Fe3–H–Fe7, which leads to transient accumulation of E₂(2H) containing Fe3–H–Fe7. In the dominant α-V70I E₄(4H) conformation, HP of Fe2–H–Fe6 is passively suppressed by the positioning of the Ile sidechain; slow HP of Fe3–H–Fe7 occurs first and the resulting E₂(2H) contains Fe2–H–Fe6.
It is this HP suppression in E₄(4H) that enables α-V70I MoFe to accumulate E₄(4H) in high occupancy. In addition, HP suppression in α-V70I E₄(4H) kinetically unmasks hydride reductive-elimination without N₂-binding, a process that is precluded in WT enzyme.
  7. Abstract Sufficient low-level storm-relative flow is a necessary ingredient for sustained supercell thunderstorms and is connected to supercell updraft width. Assuming a supercell exists, the role of low-level storm-relative flow in regulating supercells’ low-level mesocyclone intensity is less clear. One possibility considered in this article is that storm-relative flow controls mesocyclone and tornado width via its modulation of overall updraft extent. This hypothesis relies on a previously postulated positive correspondence between updraft width, mesocyclone width, and tornado width. An alternative hypothesis is that mesocyclone characteristics are primarily regulated by horizontal streamwise vorticity irrespective of storm-relative flow. A matrix of supercell simulations was analyzed to address the aforementioned hypotheses, wherein horizontal streamwise vorticity and storm-relative flow were independently varied. Among these simulations, mesocyclone width and intensity were strongly correlated with horizontal streamwise vorticity, and comparatively weakly correlated with storm-relative flow, supporting the second hypothesis. Accompanying theory and trajectory analysis offers the physical explanation that, when storm-relative flow is large and updrafts are wide, vertically tilted streamwise vorticity is projected over a wider area but with a lesser average magnitude than when these parameters are small. These factors partially offset one another, degrading the correspondence of storm-relative flow with updraft circulation and rotational velocity, which are the mesocyclone attributes most closely tied to tornadoes. These results refute the previously purported connections between updraft width, mesocyclone width, and tornado width, and emphasize horizontal streamwise vorticity as the primary control on low-level mesocyclones in sustained supercells. 
Significance Statement The intensity of a supercell thunderstorm’s low-level rotation, known as the “mesocyclone,” is thought to influence tornado likelihood. Mesocyclone intensity depends on many environmental attributes that are often correlated with one another and difficult to disentangle. This study used a large body of numerical simulations to investigate the influence of the speed of low-level air entering a supercell (storm-relative flow), the horizontal spin of the ambient air entering the thunderstorm (streamwise vorticity), and the width of the storm’s updraft. Our results suggest that the rotation of the mesocyclone in supercells is primarily influenced by streamwise vorticity, with comparatively weaker connections to storm-relative flow and updraft width. These findings provide important clarification in our scientific understanding of how a storm’s environment influences the rate of rotation of its mesocyclone, and the associated tornado threat. 
  8. Understanding how Nature accomplishes the reduction of inert nitrogen gas to form metabolically tractable ammonia at ambient temperature and pressure has challenged scientists for more than a century. Such an understanding is a key aspect toward accomplishing the transfer of the genetic determinants of biological nitrogen fixation to crop plants as well as for the development of improved synthetic catalysts based on the biological mechanism. Over the past 30 years, the free-living nitrogen-fixing bacterium Azotobacter vinelandii emerged as a preferred model organism for mechanistic, structural, genetic, and physiological studies aimed at understanding biological nitrogen fixation. This review provides a contemporary overview of these studies and places them within the context of their historical development. 
  9. Background Large (>1 Mb), polymorphic inversions have substantial impacts on population structure and maintenance of genotypes. These large inversions can be detected from single nucleotide polymorphism (SNP) data using unsupervised learning techniques like PCA. Construction and analysis of a feature matrix from millions of SNPs requires a large amount of memory and limits the sizes of data sets that can be analyzed. Methods We propose using feature hashing to construct a feature matrix from a VCF file of SNPs, reducing memory usage. The matrix is constructed in a streaming fashion such that the entire VCF file is never loaded into memory at one time. Results When evaluated on Anopheles mosquito and Drosophila fly data sets, our approach reduced memory usage by 97% with minimal reductions in accuracy for inversion detection and localization tasks. Conclusion With these changes, inversions in larger data sets can be analyzed easily and efficiently on common laptop and desktop computers. Our method is publicly available through our open-source inversion analysis software, Asaph.
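The streaming feature-hashing idea described in item 9 can be illustrated with a minimal sketch. This is not Asaph's implementation: the function names, the choice of MD5 as the hash, the signed-hashing variant, the `chrom:pos:allele` feature key, and the toy VCF are all assumptions made for illustration.

```python
import hashlib
import io
from typing import Iterable, Tuple

import numpy as np


def hashed_index(key: str, n_buckets: int) -> Tuple[int, int]:
    """Map a feature key to a (column index, sign) pair deterministically."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "little")
    sign = 1 if value & 1 else -1          # low bit picks the sign
    return (value >> 1) % n_buckets, sign  # remaining bits pick the column


def stream_vcf_to_hashed_matrix(lines: Iterable[str], n_buckets: int = 1024) -> np.ndarray:
    """Build a (samples x n_buckets) matrix one VCF record at a time."""
    matrix = None
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Columns 0-8 are fixed VCF fields; the rest are sample names.
            matrix = np.zeros((len(fields) - 9, n_buckets), dtype=np.float32)
            continue
        if matrix is None:
            raise ValueError("#CHROM header line must precede the records")
        chrom, pos = fields[0], fields[1]
        for sample_idx, call in enumerate(fields[9:]):
            genotype = call.split(":")[0].replace("|", "/")
            for allele in genotype.split("/"):
                if allele == ".":
                    continue  # missing call contributes nothing
                col, sign = hashed_index(f"{chrom}:{pos}:{allele}", n_buckets)
                matrix[sample_idx, col] += sign
    if matrix is None:
        raise ValueError("no #CHROM header line found")
    return matrix


# Hypothetical three-sample VCF: s0 and s1 share genotypes, s2 is all missing.
TOY_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
               "FORMAT", "s0", "s1", "s2"]),
    "\t".join(["2L", "100", ".", "A", "G", ".", "PASS", ".", "GT", "0/1", "0/1", "./."]),
    "\t".join(["2L", "200", ".", "C", "T", ".", "PASS", ".", "GT", "1|1", "1/1", "./."]),
])

matrix = stream_vcf_to_hashed_matrix(io.StringIO(TOY_VCF), n_buckets=16)
```

Because the matrix width is fixed by `n_buckets` rather than by the number of SNPs, memory use stays constant as more records are read; the cost is that distinct features can collide in one column. The resulting matrix could then be passed to PCA, as the abstract describes.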
  10. Abstract This article introduces an analytic formula for entraining convective available potential energy (ECAPE) with an entrainment rate that is determined directly from an environmental sounding, rather than prescribed by the formula user. Entrainment is connected to the background environment using an eddy diffusivity approximation for lateral mixing, updraft geometry assumptions, and mass continuity. These approximations result in a direct correspondence between the storm-relative flow and the updraft radius and an inverse scaling between the updraft radius squared and entrainment rate. The aforementioned concepts, combined with the assumption of adiabatic conservation of moist static energy, yield an explicit analytic equation for ECAPE that depends entirely on state variables in an atmospheric profile and a few constant parameters with values that are established in past literature. Using a simplified Bernoulli-like equation, the ECAPE formula is modified to account for updraft enhancement via kinetic energy extracted from the cloud’s background environment. CAPE and ECAPE can be viewed as predictors of the maximum vertical velocity w_max in an updraft. Hence, these formulas are evaluated using w_max from past numerical modeling studies. Both of the new formulas improve predictions of w_max substantially over commonly used diagnostic parameters, including undiluted CAPE and ECAPE with a constant prescribed entrainment rate. The formula that incorporates environmental kinetic energy contribution to the updraft correctly predicts instances of exceedance of […] by w_max, and provides a conceptual explanation for why such exceedance is rare among past simulations. These formulas are potentially useful in nowcasting and forecasting thunderstorms and as thunderstorm proxies in climate change studies.
Significance Statement Substantial mixing occurs between the upward-moving air currents in thunderstorms (updrafts) and the surrounding comparatively dry environmental air, through a process called entrainment. Entrainment controls thunderstorm intensity via its diluting effect on the buoyancy of air within updrafts. A challenge to representing entrainment in forecasting and predictions of the intensity of updrafts in future climates is to determine how much entrainment will occur in a given thunderstorm environment without a computationally expensive high-resolution simulation. To address this gap, this article derives a new formula that computes entrainment from the properties of a single environmental profile. This formula is shown to predict updraft vertical velocity more accurately than past diagnostics, and can be used in forecasting and climate prediction to improve predictions of thunderstorm behavior and impacts.
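Item 10 treats CAPE and ECAPE as predictors of maximum updraft velocity. A minimal sketch of the usual parcel-theory conversion from buoyant energy to vertical velocity follows; it does not reproduce the article's ECAPE formula, and the function name and the two energy values are hypothetical inputs chosen only for illustration.

```python
import math


def updraft_speed_limit(energy_j_per_kg: float) -> float:
    """Vertical velocity (m/s) if all buoyant energy (J/kg) became kinetic energy.

    Standard parcel-theory relation: w = sqrt(2 * E).
    """
    if energy_j_per_kg < 0:
        raise ValueError("energy must be non-negative")
    return math.sqrt(2.0 * energy_j_per_kg)


# Hypothetical values: an undiluted CAPE and a smaller, entrainment-diluted ECAPE.
cape = 2000.0   # J/kg
ecape = 1200.0  # J/kg (assumed for illustration, not computed from a sounding)

w_from_cape = updraft_speed_limit(cape)    # about 63.2 m/s
w_from_ecape = updraft_speed_limit(ecape)  # about 49.0 m/s
```

With these assumed inputs, accounting for dilution lowers the predicted peak updraft speed by roughly a fifth, which is the kind of difference the abstract evaluates against simulated w_max.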