Title: Molecular dynamics and machine learning stratify motion-dependent activity profiles of S-layer destabilizing nanobodies
Abstract Nanobody (Nb)-induced disassembly of surface array protein (Sap) S-layers, a two-dimensional paracrystalline protein lattice from Bacillus anthracis, has been presented as a therapeutic intervention for lethal anthrax infections. However, only a subset of existing Nbs with affinity to Sap exhibit depolymerization activity, suggesting that affinity and epitope recognition are not enough to explain inhibitory activity. In this study, we performed all-atom molecular dynamics simulations of each Nb bound to the Sap binding site and trained a collection of machine learning classifiers to predict whether each Nb induces depolymerization. We used feature importance analysis to filter out unnecessary features and engineered remaining features to regularize the feature landscape and encourage learning of the depolymerization mechanism. We find that, while not enforced in training, a gradient-boosting decision tree is able to reproduce the experimental activities of inhibitory Nbs while maintaining high classification accuracy, whereas neural networks were only able to discriminate between classes. Further feature analysis revealed that inhibitory Nbs restrain Sap motions toward an inhibitory conformational state described by domain–domain clamping and induced twisting of domains normal to the lattice plane. We believe these motions drive Sap lattice depolymerization and can be used as design targets for improved Sap-inhibitory Nbs. Finally, we expect our method of study to apply to S-layers that serve as virulence factors in other pathogens, paving the way forward for Nb therapeutics that target depolymerization mechanisms.
Award ID(s):
2244331
PAR ID:
10559458
Publisher / Repository:
Oxford University Press
Journal Name:
PNAS Nexus
Volume:
3
Issue:
12
ISSN:
2752-6542
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The molecular basis of protein thermal stability is only partially understood and has major significance for drug and vaccine discovery. The lack of datasets and standardized benchmarks considerably limits learning-based discovery methods. We present HotProtein, a large-scale protein dataset with growth temperature annotations of thermostability, containing K amino acid sequences and K folded structures from different species with a wide temperature range. Due to functional domain differences and data scarcity within each species, existing methods fail to generalize well on our dataset. We address this problem through a novel learning framework, consisting of (1) protein structure-aware pre-training (SAP), which leverages 3D information to enhance sequence-based pre-training; and (2) factorized sparse tuning (FST), which utilizes low-rank and sparse priors as an implicit regularization, together with feature augmentations. Extensive empirical studies demonstrate that our framework improves thermostability prediction compared to other deep learning models. Finally, we introduce a novel editing algorithm to efficiently generate positive amino acid mutations that improve thermostability. Code is available at https://github.com/VITA-Group/HotProtein.
  2. The now prevalent Omicron variant and its subvariants/sub-lineages have led to a significant increase in COVID-19 cases and raised serious concerns about increased risk of infectivity, immune evasion, and reinfection. Heparan sulfate (HS), located on the surface of host cells, plays an important role as a co-receptor for virus–host cell interaction. The ability of heparin and HS to compete for binding of the SARS-CoV-2 spike (S) protein to cell surface HS illustrates the therapeutic potential of agents targeting protein–glycan interactions. In the current study, the phylogenetic tree of variants and the mutations in the S protein receptor-binding domain (RBD) of Omicron BA.2.12.1, BA.4 and BA.5 were described. The binding affinity of Omicron S protein RBD to heparin was further investigated by surface plasmon resonance (SPR). Solution competition studies on the inhibitory activity of heparin oligosaccharides and desulfated heparins at different sites on S protein RBD–heparin interactions revealed that different sub-lineages tend to bind heparin with different chain lengths and sulfation patterns. Furthermore, blind docking experiments showed the contribution of basic amino acid residues in RBD and sulfo groups and carboxyl groups on heparin to the interaction. Finally, pentosan polysulfate and mucopolysaccharide polysulfate were evaluated for inhibition of the interaction of heparin and S protein RBD of Omicron BA.2.12.1, BA.4/BA.5, and both showed much stronger inhibition than heparin.
  3. Abstract Nucleotide-binding site (NBS) domain genes are one of the superfamilies of resistance genes involved in plant responses to pathogens. The current study identified 12,820 NBS-domain-containing genes across 34 species ranging from mosses to monocots and dicots. These identified genes are classified into 168 classes with several novel domain architecture patterns encompassing significant diversity among plant species. Several classical (NBS, NBS-LRR, TIR-NBS, TIR-NBS-LRR, etc.) and species-specific structural patterns (TIR-NBS-TIR-Cupin_1-Cupin_1, TIR-NBS-Prenyltransf, Sugar_tr-NBS, etc.) were discovered. We observed 603 orthogroups (OGs) with some core (most common orthogroups; OG0, OG1, OG2, etc.) and unique (highly specific to species; OG80, OG82, etc.) OGs with tandem duplications. The expression profiling presented the putative upregulation of OG2, OG6, and OG15 in different tissues under various biotic and abiotic stresses in susceptible and tolerant plants to cotton leaf curl disease (CLCuD). The genetic variation between susceptible (Coker 312) and tolerant (Mac7) Gossypium hirsutum accessions identified several unique variants in NBS genes of Mac7 (6583 variants) and Coker312 (5173 variants). The protein–ligand and protein–protein interactions showed a strong interaction of some putative NBS proteins with ADP/ATP and different core proteins of the cotton leaf curl disease virus. The silencing of GaNBS (OG2) in resistant cotton through virus-induced gene silencing (VIGS) demonstrated its putative role in virus titering. The presented study will be further helpful in understanding the plant adaptation mechanism.
  4. Liu, Tao (Ed.)
    Neural crest cells (NCC) are multipotent migratory stem cells that originate from the neural tube during early vertebrate embryogenesis. NCCs give rise to a variety of cell types within the developing organism, including neurons and glia of the sympathetic nervous system. It has been suggested that failure in correct NCC differentiation leads to several diseases, including neuroblastoma (NB). During normal NCC development, MYCN is transiently expressed to promote NCC migration, and its downregulation precedes neuronal differentiation. Overexpression of MYCN has been linked to high-risk and aggressive NB progression. For this reason, understanding the effect overexpression of this oncogene has on the development of NCC-derived sympathoadrenal progenitors (SAP), which later give rise to sympathetic nerves, will help elucidate the developmental mechanisms that may prime the onset of NB. Here, we found that overexpressing human EGFP-MYCN within SAP lineage cells in zebrafish led to the transient formation of an abnormal SAP population, which displayed expanded and elevated expression of NCC markers while paradoxically also co-expressing SAP and neuronal differentiation markers. The aberrant NCC signature was corroborated with in vivo time-lapse confocal imaging in zebrafish larvae, which revealed transient expansion of sox10 reporter expression in MYCN overexpressing SAPs during the early stages of SAP development. In these aberrant MYCN overexpressing SAP cells, we also found evidence of dampened BMP signaling activity, indicating that BMP signaling disruption occurs following elevated MYCN expression. Furthermore, we discovered that pharmacological inhibition of BMP signaling was sufficient to create an aberrant NCC gene signature in SAP cells, phenocopying MYCN overexpression. Together, our results suggest that MYCN overexpression in SAPs disrupts their differentiation by eliciting abnormal NCC gene expression programs, and dampening BMP signaling response, having developmental implications for the priming of NB in vivo.
  5. Machine learning is an important tool in the study of phase behavior from molecular simulations. In this work, we use unsupervised machine learning methods to study the phase behavior of two off-lattice models, a binary Lennard-Jones (LJ) mixture and the Widom–Rowlinson (WR) non-additive hard-sphere mixture. The majority of previous work has focused on lattice models, such as the 2D Ising model, where the values of the spins are used as the feature vector that is input into the machine learning algorithm, with considerable success. For these two off-lattice models, we find that the choice of the feature vector is crucial to the ability of the algorithm to predict a phase transition, and this depends on the particular model system being studied. We consider two feature vectors, one where the elements are distances of the particles of a given species from a probe (distance-based feature) and one where the elements are +1 if there is an excess of particles of the same species within a cut-off distance and −1 otherwise (affinity-based feature). We use principal component analysis and t-distributed stochastic neighbor embedding to investigate the phase behavior at a critical composition. We find that the choice of the feature vector is the key to the success of the unsupervised machine learning algorithm in predicting the phase behavior, and the sophistication of the machine learning algorithm is of secondary importance. In the case of the LJ mixture, both feature vectors are adequate to accurately predict the critical point, but in the case of the WR mixture, the affinity-based feature vector provides accurate estimates of the critical point, whereas the distance-based feature vector does not provide a clear signature of the phase transition. The study suggests that physical insight into the choice of input features is an important aspect of implementing machine learning methods.