Abstract Therapeutic antibody development requires selection and engineering of molecules with high affinity and other drug-like biophysical properties. Co-optimization of multiple antibody properties remains a difficult and time-consuming process that impedes drug development. Here we evaluate the use of machine learning to simplify antibody co-optimization for a clinical-stage antibody (emibetuzumab) that displays high levels of both on-target (antigen) and off-target (non-specific) binding. We mutate sites in the antibody complementarity-determining regions, sort the antibody libraries for high and low levels of affinity and non-specific binding, and deep sequence the enriched libraries. Interestingly, machine learning models trained on datasets with binary labels enable predictions of continuous metrics that are strongly correlated with antibody affinity and non-specific binding. These models illustrate strong tradeoffs between these two properties, as increases in affinity along the co-optimal (Pareto) frontier require progressive reductions in specificity. Notably, models trained with deep learning features enable prediction of novel antibody mutations that co-optimize affinity and specificity beyond what is possible for the original antibody library. These findings demonstrate the power of machine learning models to greatly expand the exploration of novel antibody sequence space and accelerate the development of highly potent, drug-like antibodies.
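The co-optimal (Pareto) frontier mentioned in the abstract can be illustrated with a minimal sketch. The scores below are made-up placeholders, not data from the study, and the function name `pareto_front` is hypothetical; the sketch assumes both affinity and specificity are expressed so that higher is better.

```python
from typing import List, Tuple

def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the points not dominated by any other point.

    Each point is (affinity_score, specificity_score); higher is better for both.
    """
    front = []
    best_specificity = float("-inf")
    # Visit points from highest to lowest affinity; a point is on the frontier
    # only if its specificity beats every point with equal or higher affinity.
    for affinity, specificity in sorted(points, key=lambda p: (-p[0], -p[1])):
        if specificity > best_specificity:
            front.append((affinity, specificity))
            best_specificity = specificity
    return front

# Hypothetical predicted scores for five variants (illustration only).
variants = [(0.9, 0.2), (0.7, 0.5), (0.6, 0.4), (0.4, 0.8), (0.3, 0.7)]
front = pareto_front(variants)
print(front)
```

Moving along the returned frontier from high to low affinity, specificity rises at each step, which is the tradeoff the abstract describes.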
Optimization of multi-site nicking mutagenesis for generation of large, user-defined combinatorial libraries
Abstract Generating combinatorial libraries of specific sets of mutations is essential for addressing protein engineering questions involving contingency in molecular evolution, epistatic relationships between mutations, and functional antibody and enzyme engineering. Here we present optimization of a combinatorial mutagenesis method involving template-based nicking mutagenesis, which allows for the generation of libraries with >99% coverage for tens of thousands of user-defined variants. The non-optimized method resulted in low library coverage, which could be rationalized by a model of oligonucleotide annealing bias resulting from the nucleotide mismatch free-energy difference between mutagenic oligo and template. The optimized method mitigated this thermodynamic bias using longer primer sets and faster annealing conditions. Our updated method, applied to two antibody fragments, delivered between 99.0% (32451/32768 library members) and >99.9% (32757/32768) coverage for our desired libraries in 2 days and at an approximate 140-fold sequencing depth of coverage.
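The coverage figures quoted above follow directly from the reported counts. The short sketch below reproduces that arithmetic; the function name `coverage_percent` is hypothetical, and the remark that 32768 equals 2**15 is an observation about the number, not a library design stated in the abstract.

```python
def coverage_percent(observed: int, designed: int) -> float:
    """Percentage of designed library members that were observed."""
    if designed <= 0:
        raise ValueError("designed must be positive")
    return 100.0 * observed / designed

# Counts taken from the abstract; 32768 == 2**15 (an arithmetic observation only).
DESIGNED = 32768
low = coverage_percent(32451, DESIGNED)
high = coverage_percent(32757, DESIGNED)
print(f"{low:.2f}% to {high:.2f}%")
```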
- Award ID(s): 2030221
- PAR ID: 10345156
- Date Published:
- Journal Name: Protein Engineering, Design and Selection
- Volume: 34
- ISSN: 1741-0126
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Abstract Heterologous tRNAs used for noncanonical amino acid (ncAA) mutagenesis in mammalian cells typically show poor activity. We recently introduced a virus‐assisted directed evolution strategy (VADER) that can enrich improved tRNA mutants from naïve libraries in mammalian cells. However, VADER was limited to processing only a few thousand mutants; the inability to screen a larger sequence space precluded the identification of highly active variants with distal synergistic mutations. Here, we report VADER2.0, which can process significantly larger mutant libraries. It also employs a novel library design, which maintains base‐pairing between distant residues in the stem regions, allowing us to pack a higher density of functional mutants within a fixed sequence space. VADER2.0 enabled simultaneous engineering of the entire acceptor stem of M. mazei pyrrolysyl tRNA (tRNAPyl), leading to a remarkably improved variant, which facilitates more efficient incorporation of a wider range of ncAAs and enables facile development of viral vectors and stable cell‐lines for ncAA mutagenesis.
-
Broadly neutralizing antibodies (bnAbs) that neutralize diverse variants of a particular virus are of considerable therapeutic interest. Recent advances have enabled us to isolate and engineer these antibodies as therapeutics, but eliciting them through vaccination remains challenging, in part due to our limited understanding of how antibodies evolve breadth. Here, we analyze the landscape by which an anti-influenza receptor binding site (RBS) bnAb, CH65, evolved broad affinity to diverse H1 influenza strains. We do this by generating an antibody library of all possible evolutionary intermediates between the unmutated common ancestor (UCA) and the affinity-matured CH65 antibody and measuring the affinity of each intermediate to three distinct H1 antigens. We find that affinity to each antigen requires a specific set of mutations – distributed across the variable light and heavy chains – that interact non-additively (i.e., epistatically). These sets of mutations form a hierarchical pattern across the antigens, with increasingly divergent antigens requiring additional epistatic mutations beyond those required to bind less divergent antigens. We investigate the underlying biochemical and structural basis for these hierarchical sets of epistatic mutations and find that epistasis between heavy chain mutations and a mutation in the light chain at the VH-VL interface is essential for binding a divergent H1. Collectively, this is the first work to comprehensively characterize epistasis between heavy and light chain mutations and shows that such interactions are both strong and widespread. Together with our previous study analyzing a different class of anti-influenza antibodies, our results implicate epistasis as a general feature of antibody sequence-affinity landscapes that can potentiate and constrain the evolution of breadth.
-
Antibody therapeutics and vaccines are among our last resort to end the raging COVID-19 pandemic. They, however, are prone to over 5000 mutations on the spike (S) protein uncovered by a Mutation Tracker based on over 200 000 genome isolates. It is imperative to understand how mutations will impact vaccines and antibodies in development. In this work, we first study the mechanism, frequency, and ratio of mutations on the S protein, which is the common target of most COVID-19 vaccines and antibody therapies. Additionally, we build a library of 56 antibody structures and analyze their 2D and 3D characteristics. Moreover, we predict the mutation-induced binding free energy (BFE) changes for the complexes of S protein and antibodies or ACE2. By integrating genetics, biophysics, deep learning, and algebraic topology, we reveal that most of the 462 mutations on the receptor-binding domain (RBD) will weaken the binding of S protein and antibodies and disrupt the efficacy and reliability of antibody therapies and vaccines. A list of 31 antibody disrupting mutants is identified, while many other disruptive mutations are detailed as well. We also unveil that about 65% of the existing RBD mutations, including those variants recently found in the United Kingdom (UK) and South Africa, will strengthen the binding between the S protein and human angiotensin-converting enzyme 2 (ACE2), resulting in more infectious COVID-19 variants. We discover the disparity between the extreme values of RBD mutation-induced BFE strengthening and weakening of the bindings with antibodies and ACE2, suggesting that SARS-CoV-2 is at an advanced stage of evolution for human infection, while the human immune system is able to produce optimized antibodies. This discovery, unfortunately, implies the vulnerability of current vaccines and antibody drugs to new mutations. Our predictions were validated by comparison with more than 1400 deep mutations on the S protein RBD. Our results show the urgent need to develop new mutation-resistant vaccines and antibodies and to prepare for seasonal vaccinations.
-
Covalent DNA-protein crosslinks (DPCs) are common lesions that block replication. We examine here the consequence of DPCs on mutagenesis involving replicational template-switch reactions in Escherichia coli. 5-Azacytidine (5-azaC) is a potent mutagen for template-switching. This effect is dependent on DNA cytosine methylase (Dcm), implicating the Dcm-DNA covalent complex trapped by 5-azaC as the initiator for mutagenesis. The leading strand of replication is more mutable than the lagging strand, which can be explained by blocks to the replicative helicase and/or fork regression. We find that template-switch mutagenesis induced by 5-azaC does not require double strand break repair via RecABCD; the ability to induce the SOS response is anti-mutagenic. Mutants in recB, but not recA, exhibit high constitutive rates of template-switching, and we suggest that RecBCD-mediated DNA degradation prevents template-switching associated with fork regression. A mutation in the DnaB fork helicase also promotes high levels of template-switching. We find that other DPC inducers, formaldehyde (a non-specific crosslinker) and ciprofloxacin (a topoisomerase II poison), are also strong mutagens for template-switching with similar genetic properties. Induction of mutations and genetic rearrangements that occur by template-switching may constitute a previously unrecognized component of the genotoxicity and genetic instability promoted by DPCs.

