EnsembleDesign: messenger RNA design minimizing ensemble free energy via probabilistic lattice parsing

Dai, Ning (ORCID:0009000099257874); Zhou, Tianshuo (ORCID:0009000848040825); Tang, Wei Yu (ORCID:0009000811419479); Mathews, David H (ORCID:0000000229076557); Huang, Liang (ORCID:0000000164447045)

doi:10.1093/bioinformatics/btaf245

Abstract

Motivation: The task of designing optimized messenger RNA (mRNA) sequences has received much attention in recent years, thanks to breakthroughs in mRNA vaccines during the COVID-19 pandemic. Most previous work aimed to minimize the minimum free energy (MFE) of the mRNA in order to improve stability and protein expression. Because the MFE considers only one particular structure per mRNA sequence, this objective neglects millions of alternative conformations in equilibrium. More importantly, we prefer an mRNA to populate multiple stable structures and to be flexible among them during translation, when the ribosome unwinds it.

Results: We therefore consider a new objective: minimizing the ensemble free energy of an mRNA, which accounts for all possible structures in its Boltzmann ensemble. This new problem is much harder to solve than the original MFE optimization. To address the increased complexity, we introduce EnsembleDesign, a novel algorithm that employs continuous relaxation to optimize the expected ensemble free energy over a distribution of candidate sequences. EnsembleDesign extends both the lattice representation of the design space and the dynamic programming algorithm from LinearDesign to their probabilistic counterparts. Our algorithm consistently outperforms LinearDesign in terms of ensemble free energy, especially on long sequences. Interestingly, as byproducts, our designs also have lower average unpaired probabilities (which correlate with degradation) and flatter Boltzmann ensembles (more flexibility between conformations).

Availability and implementation: Our code is available at https://github.com/LinearFold/EnsembleDesign.
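To make the difference between the two objectives concrete, the following is a minimal Python sketch, not the EnsembleDesign implementation. It assumes the standard thermodynamic definition of ensemble free energy, -RT ln Z, where Z sums the Boltzmann weights of all structures. The energy lists, the function names, and the two-candidate distribution are hypothetical toy values chosen for illustration; the actual algorithm works on a lattice with dynamic programming rather than enumerating structures.

```python
import math

# Gas constant (kcal/(mol*K)) times temperature at 37 degrees C.
RT = 0.0019872 * 310.15


def mfe(energies):
    """Minimum free energy: the single most stable structure only."""
    return min(energies)


def ensemble_free_energy(energies, rt=RT):
    """-RT ln Z, with Z = sum over structures of exp(-E / RT).

    The minimum energy is factored out (log-sum-exp) for numerical stability.
    """
    m = min(energies)
    z_scaled = sum(math.exp(-(e - m) / rt) for e in energies)
    return m - rt * math.log(z_scaled)


def expected_ensemble_free_energy(candidates, rt=RT):
    """Expectation over a distribution of candidate sequences.

    `candidates` is a list of (probability, energies) pairs; this brute-force
    sum only illustrates the objective, not how it is optimized.
    """
    return sum(p * ensemble_free_energy(es, rt) for p, es in candidates)


# Hypothetical structure energies (kcal/mol) for two toy sequences.
seq_a = [-10.0, -2.0, 0.0]          # one dominant structure
seq_b = [-9.8, -9.7, -9.6, -9.5]    # several near-optimal structures

# Hypothetical distribution placing equal weight on both sequences.
toy_distribution = [(0.5, seq_a), (0.5, seq_b)]

if __name__ == "__main__":
    for name, es in (("seq_a", seq_a), ("seq_b", seq_b)):
        print(name, "MFE:", mfe(es), "ensemble:", ensemble_free_energy(es))
    print("expected:", expected_ensemble_free_energy(toy_distribution))
```

In this toy setting, seq_a wins under the MFE objective, while seq_b wins under the ensemble objective because several near-optimal structures all contribute to Z. The ensemble free energy can never exceed the MFE, since the MFE structure alone already contributes its full Boltzmann weight.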
