Type 1 polyketides are a major class of natural products used as antiviral, antibiotic, antifungal, antiparasitic, immunosuppressive, and antitumor drugs. Analysis of public microbial genomes leads to the discovery of over sixty thousand type 1 polyketide gene clusters. However, the molecular products of only about a hundred of these clusters are characterized, leaving most metabolites unknown. Characterizing polyketides relies on bioactivity-guided purification, which is expensive and time-consuming. To address this, we present Seq2PKS, a machine learning algorithm that predicts chemical structures derived from Type 1 polyketide synthases. Seq2PKS predicts numerous putative structures for each gene cluster to enhance accuracy. The correct structure is identified using a variable mass spectral database search. Benchmarks show that Seq2PKS outperforms existing methods. Applying Seq2PKS to Actinobacteria datasets, we discover biosynthetic gene clusters for monazomycin, oasomycin A, and 2-aminobenzamide-actiphenol.
- Award ID(s):
- 2117640
- PAR ID:
- 10517357
- Publisher / Repository:
- Nature Publishing Group
- Date Published:
- Journal Name:
- Nature Communications
- Volume:
- 15
- Issue:
- 1
- ISSN:
- 2041-1723
- Sponsoring Org:
- National Science Foundation
More Like this
- Nature serves as a rich source of molecules with immense chemical diversity. Aptly named, these ‘natural products’ boast a wide variety of environmental, medicinal and industrial applications. Type II polyketides, in particular, confer substantial medicinal benefits, including antibacterial, antifungal, anticancer and anti-inflammatory properties. These molecules are produced by enzyme assemblies known as type II polyketide synthases (PKSs), which use domains such as the ketosynthase chain-length factor and acyl carrier protein to produce polyketides with varying lengths, cyclization patterns and oxidation states. In this work, we use a novel bioinformatic workflow to identify biosynthetic gene clusters (BGCs) that code for the core type II PKS enzymes. This method does not rely on annotation and thus was able to unearth previously ‘hidden’ type II PKS BGCs. This work led us to identify over 6000 putative type II PKS BGCs spanning a diverse set of microbial phyla, nearly double those found in most recent studies. Notably, many of these newly identified BGCs were found in non-actinobacteria, which are relatively underexplored as sources of type II polyketides. Results from this work lay an important foundation for future bioprospecting and engineering efforts that will enable sustainable access to diverse and structurally complex molecules with medicinally relevant properties.
- Fungal polyketides display remarkable structural diversity and bioactivity, and therefore the biosynthesis and engineering of this large class of molecules is therapeutically significant. Here, we successfully recode, construct and characterize the biosynthetic pathway of bikaverin, a tetracyclic polyketide with antibiotic, antifungal and anticancer properties, in S. cerevisiae. We use a green fluorescent protein (GFP) mapping strategy to identify the low expression of Bik1 (polyketide synthase) as a major bottleneck step in the pathway, and a promoter exchange strategy is used to increase expression of Bik1 and bikaverin titer. Then, we use an enzyme-fusion strategy to directly couple the monooxygenase (Bik2) and methyltransferase (Bik3) to efficiently channel intermediates between modifying enzymes, leading to an improved titer of bikaverin at 202.75 mg/L with flask fermentation (273-fold higher than the initial titer). This study demonstrates that the biosynthesis of complex fungal polyketides can be established and efficiently engineered in S. cerevisiae, highlighting the potential for natural product synthesis and large-scale fermentation in yeast.
- Bacteria have historically been a rich source of natural products (e.g., polyketides and non-ribosomal peptides) that possess medically relevant activities. Despite extensive discovery programs in both industry and academia, a plethora of biosynthetic pathways remain uncharacterized and the corresponding molecular products untested for potential bioactivities. This knowledge gap comes in part from the fact that many putative natural product producers have not been cultured in conventional laboratory settings in which the corresponding products are produced at detectable levels. Next-generation sequencing technologies are further increasing the knowledge gap by obtaining metagenomic sequence information from complex communities where production of the desired compound cannot be isolated in the laboratory. For these reasons, many groups are turning to synthetic biology to produce putative natural products in heterologous hosts. This strategy depends on the ability to heterologously express putative biosynthetic gene clusters and produce relevant quantities of the corresponding products. Actinobacteria remain the most abundant source of natural products and the most promising heterologous hosts for natural product discovery and production. However, researchers are discovering more natural products from other groups of bacteria, such as myxobacteria and cyanobacteria. Therefore, phylogenetically similar heterologous hosts have become promising candidates for synthesizing these novel molecules. The downside of working with these microbes is the lack of well-characterized genetic tools for optimizing expression of gene clusters and product titers. This review examines heterologous expression of natural product gene clusters in terms of the motivations for this research, the traits desired in an ideal host, tools available to the field, and a survey of recent progress.
- Animal cytoplasmic fatty acid synthase (FAS) represents a unique family of enzymes that are classically thought to be most closely related to fungal polyketide synthase (PKS). Recently, a widespread family of animal lipid metabolic enzymes has been described that bridges the gap between these two ubiquitous and important enzyme classes: the animal FAS–like PKSs (AFPKs). Although very similar in sequence to FAS enzymes that produce saturated lipids widely found in animals, AFPKs instead produce structurally diverse compounds that resemble bioactive polyketides. Little is known about the factors that bridge lipid and polyketide synthesis in animals. Here, we describe the function of EcPKS2 from Elysia chlorotica, which synthesizes a complex polypropionate natural product found in this mollusc. EcPKS2 starter unit promiscuity potentially explains the high diversity of polyketides found in and among molluscan species. Biochemical comparison of EcPKS2 with the previously described EcPKS1 reveals molecular principles governing substrate selectivity that should apply to related enzymes encoded within the genomes of photosynthetic gastropods. Hybridization experiments combining EcPKS1 and EcPKS2 demonstrate the interactions between the ketoreductase and ketosynthase domains in governing the product outcomes. Overall, these findings enable an understanding of the molecular principles of structural diversity underlying the many molluscan polyketides likely produced by the diverse AFPK enzyme family.
- Type I modular polyketide synthases are homodimeric multidomain assembly line enzymes that synthesize a variety of polyketide natural products by performing polyketide chain extension and β-keto group modification reactions. We determined the 2.4-angstrom-resolution X-ray crystal structure and the 3.1-angstrom-resolution cryo–electron microscopy structure of the Lsd14 polyketide synthase, stalled at the transacylation and condensation steps, respectively. These structures revealed how the constituent domains are positioned relative to each other, how they rearrange depending on the step in the reaction cycle, and the specific interactions formed between the domains. Like the evolutionarily related mammalian fatty acid synthase, Lsd14 contains two reaction chambers, but only one chamber in Lsd14 has the full complement of catalytic domains, indicating that only one chamber produces the polyketide product at any given time.