Title: Autoregressive HMM resolves biomolecular transitions from passive optical tweezer force measurements
Optical tweezer (OT) single-molecule force spectroscopy is a powerful method to map out the energy landscape of biological complexes and has found increasing applications in academic and pharmaceutical research. The dominant method to extract molecular conformation transitions from the thermal diffusion-broadened trajectories of the microscopic OT probes attached to the single molecule of interest is through hidden Markov models (HMMs). In standard applications, the HMMs assume a white noise spectrum of the probes superimposed onto the molecular signal. Here, we demonstrate, through theoretical derivation, computer modeling, and experimental measurements, that this standard white noise HMM (wnHMM) misses key features of real OT data. The deviation is most pronounced at higher frequencies because the white noise model does not account for the overdamped nature of particle diffusion in an OT harmonic potential in aqueous environments. To address this, we derive how to incorporate autoregression between consecutive data points into an HMM, and demonstrate through modeling and experiment that such an autoregressive HMM (arHMM) captures real OT data behavior across all frequency ranges. Through analysis of real OT data we recorded on a single DNA hairpin undergoing folding and unfolding transitions, we show that the wnHMM extracts lifetimes that are at least a factor of 2 shorter and less consistent than the arHMM results, which match expectations and prior measurements. Overall, our work suggests that arHMM should be the default model choice for analysis of OT single-molecule transitions and that its use will improve the fidelity and accuracy of single-molecule force spectroscopy measurements.
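The abstract does not reproduce the derivation, so the following is only a minimal sketch of why consecutive OT samples are correlated. It rests on the textbook assumption that an overdamped bead in a harmonic trap, sampled at a fixed interval, follows a first-order autoregressive (AR(1)) process whose coefficient is exp(-2π f_c Δt), with f_c the trap corner frequency. The function names and all numerical values (corner frequency, sampling rate, trace length) are hypothetical and are not taken from the paper.

```python
import math
import random


def simulate_trapped_bead(n, phi, sigma, mean=0.0, seed=0):
    """Return an AR(1) trace: x[t] = mean + phi * (x[t-1] - mean) + noise.

    phi = 0 reduces to the white noise assumption; 0 < phi < 1 mimics an
    overdamped bead relaxing in a harmonic trap (assumed model, see lead-in).
    """
    rng = random.Random(seed)
    # Innovation width chosen so the stationary standard deviation is sigma.
    innovation_sd = sigma * math.sqrt(1.0 - phi * phi)
    x = [rng.gauss(mean, sigma)]
    for _ in range(n - 1):
        x.append(mean + phi * (x[-1] - mean) + rng.gauss(0.0, innovation_sd))
    return x


def lag1_autocorrelation(x):
    """Sample correlation between consecutive data points."""
    m = sum(x) / len(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den


# Hypothetical acquisition settings, for illustration only.
corner_frequency_hz = 2000.0   # f_c = kappa / (2 * pi * gamma)
sampling_rate_hz = 50000.0
phi = math.exp(-2.0 * math.pi * corner_frequency_hz / sampling_rate_hz)

ar_trace = simulate_trapped_bead(20000, phi, sigma=1.0)
white_trace = simulate_trapped_bead(20000, 0.0, sigma=1.0)

print("AR(1) coefficient:", round(phi, 3))
print("lag-1 autocorrelation, trapped bead:", round(lag1_autocorrelation(ar_trace), 3))
print("lag-1 autocorrelation, white noise:", round(lag1_autocorrelation(white_trace), 3))
```

In this sketch the trapped-bead trace keeps a lag-1 autocorrelation close to the AR(1) coefficient, while the white noise trace stays near zero. A white noise emission model ignores exactly this sample-to-sample correlation, which becomes larger as the sampling rate rises relative to the corner frequency.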
Award ID(s):
2117585
PAR ID:
10568567
Author(s) / Creator(s):
;
Publisher / Repository:
CellPress
Date Published:
Journal Name:
Biophysical Journal
ISSN:
0006-3495
Subject(s) / Keyword(s):
biophysics, machine learning, single molecule force measurements, optical tweezers
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we present a novel training method based on the Baum-Welch algorithm for hidden Markov models (HMMs), named Comprehensive HMM (CompHMM), which extends the traditional approach of training HMMs on positive examples only so that both positive and negative examples can be used. In comparisons on both synthetic and real data, our method significantly outperformed the standard Baum-Welch method and another discriminative HMM training method in a membership prediction task.
  2. Hyperbolic metamaterial (HMM) is a unique type of anisotropic material that can exhibit metal and dielectric properties at the same time. This unique characteristic results in it having unbounded isofrequency surface contours, leading to exotic phenomena such as spontaneous emission enhancement and applications such as super-resolution imaging. However, at optical frequencies, HMM must be artificially engineered and always requires a metal constituent, whose intrinsic loss significantly limits the experimentally accessible wave vector values, thus negatively impacting the performance of these applications. The need to reduce loss in HMM stimulated the development of the second-generation HMM, termed active HMM, where gain materials are utilized to compensate for metal’s intrinsic loss. With the advent of topological photonics that allows robust light transportation immune to disorders and defects, research on HMM also entered the topological regime. Tremendous efforts have been dedicated to exploring the topological transition from elliptical to hyperbolic dispersion and topologically protected edge states in HMM, which also prompted the invention of lossless HMM formed by all-dielectric material. Furthermore, emerging twistronics can also provide a route to manipulate topological transitions in HMMs. In this review, we survey recent progress in topological effects in HMMs and provide prospects on possible future research directions.
  3. Single-molecule force spectroscopy is a powerful tool for studying protein folding. Over the last decade, a key question has emerged: how are changes in intrinsic biomolecular dynamics altered by attachment to μm-scale force probes via flexible linkers? Here, we studied the folding/unfolding of α3D using atomic force microscopy (AFM)–based force spectroscopy. α3D offers an unusual opportunity as a prior single-molecule fluorescence resonance energy transfer (smFRET) study showed α3D’s configurational diffusion constant within the context of Kramers theory varies with pH. The resulting pH dependence provides a test for AFM-based force spectroscopy’s ability to track intrinsic changes in protein folding dynamics. Experimentally, however, α3D is challenging. It unfolds at low force (<15 pN) and exhibits fast-folding kinetics. We therefore used focused ion beam–modified cantilevers that combine exceptional force precision, stability, and temporal resolution to detect state occupancies as brief as 1 ms. Notably, equilibrium and nonequilibrium force spectroscopy data recapitulated the pH dependence measured using smFRET, despite differences in destabilization mechanism. We reconstructed a one-dimensional free-energy landscape from dynamic data via an inverse Weierstrass transform. At both neutral and low pH, the resulting constant-force landscapes showed minimal differences (∼0.2 to 0.5kBT) in transition state height. These landscapes were essentially equal to the predicted entropic barrier and symmetric. In contrast, force-dependent rates showed that the distance to the unfolding transition state increased as pH decreased and thereby contributed to the accelerated kinetics at low pH. More broadly, this precise characterization of a fast-folding, mechanically labile protein enables future AFM-based studies of subtle transitions in mechanoresponsive proteins. 
  4. Background: Hidden Markov models (HMMs) are a powerful tool for analyzing biological sequences in a wide variety of applications, from profiling functional protein families to identifying functional domains. The standard method used for HMM training is either maximum likelihood using counting when sequences are labelled, or expectation maximization, such as the Baum–Welch algorithm, when sequences are unlabelled. However, increasingly there are situations where sequences are only partially labelled. In this paper, we designed a new training method based on the Baum–Welch algorithm to train HMMs for situations in which only partial labeling is available for certain biological problems. Results: Compared with a similar method previously reported that is designed for the purpose of active learning in text mining, our method achieves significant improvements in model training, as demonstrated by higher accuracy when the trained models are tested for decoding with both synthetic data and real data. Conclusions: A novel training method is developed to improve the training of hidden Markov models by utilizing partially labelled data. The method will have an impact on detecting de novo motifs and signals in biological sequence data. In particular, the method will be deployed in active learning mode in ongoing research on detecting plasmodesmata targeting signals, and its performance will be assessed with validation from wet-lab experiments.
  5. Accurate multiple sequence alignment is challenging on many data sets, including those that are large, evolve under high rates of evolution, or have sequence length heterogeneity. While substantial progress has been made over the last decade in addressing the first two challenges, sequence length heterogeneity remains a significant issue for many data sets. Sequence length heterogeneity occurs for biological and technological reasons, including large insertions or deletions (indels) that occurred in the evolutionary history relating the sequences, or the inclusion of sequences that are not fully assembled. Ultra-large alignments using Phylogeny-Aware Profiles (UPP) (Nguyen et al. 2015) is one of the most accurate approaches for aligning data sets that exhibit sequence length heterogeneity: it constructs an alignment on the subset of sequences it considers "full-length," represents this "backbone alignment" using an ensemble of hidden Markov models (HMMs), and then adds each remaining sequence into the backbone alignment based on an HMM selected for that sequence from the ensemble. Our new method, WeIghTed Consensus Hmm alignment (WITCH), improves on UPP in three important ways: first, it uses a statistically principled technique to weight and rank the HMMs; second, it uses k > 1 HMMs from the ensemble rather than a single HMM; and third, it combines the alignments for each of the selected HMMs using a consensus algorithm that takes the weights into account. We show that this approach provides improved alignment accuracy compared with UPP and other leading alignment methods, as well as improved accuracy for maximum likelihood trees based on these alignments.