<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Engineering Sequestration-Based Biomolecular Classifiers with Shared Resources</title></titleStmt>
			<publicationStmt>
				<publisher>American Chemical Society</publisher>
				<date>10/18/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10632733</idno>
					<idno type="doi">10.1021/acssynbio.4c00270</idno>
					<title level='j'>ACS Synthetic Biology</title>
<idno>2161-5063</idno>
<biblScope unit="volume">13</biblScope>
<biblScope unit="issue">10</biblScope>					

					<author>Hossein Moghimianavval</author><author>Ignacio Gispert</author><author>Santiago R Castillo</author><author>Olaf_B_W H Corning</author><author>Allen P Liu</author><author>Christian Cuba_Samaniego</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Constructing molecular classifiers that enable cells to recognize linear and nonlinear input patterns would expand the biocomputational capabilities of engineered cells, thereby unlocking their potential in diagnostics and therapeutic applications. While several biomolecular classifier schemes have been designed, the effects of biological constraints such as resource limitation and competitive binding on the function of those classifiers have been left unexplored. Here, we first demonstrate the design of a sigma factor-based perceptron as a molecular classifier working based on the principles of molecular sequestration between the sigma factor and its antisigma molecule. We then investigate how the output of the biomolecular perceptron, i.e., its response pattern or decision boundary, is affected by the competitive binding of sigma factors to a pool of shared and limited resources of core RNA polymerase. Finally, we reveal the influence of sharing limited resources on multilayer perceptron neural networks and outline design principles that enable the construction of nonlinear classifiers using sigma-based biomolecular neural networks in the presence of competitive resource-sharing effects.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">Introduction</head><p>Cellular biocomputation is prevalent in nature with examples including activation of genetic circuits during cell proliferation, decision-making in immune response, and a myriad of phosphorylation-based signaling pathways for determining correct response to exogenous signals <ref type="bibr">(1 -3 )</ref>. The foundation of such computational processes is often laid on molecular interactions such as protein dimerization or ligand-receptor binding. Thus, the inputs of the computational modules in biological systems are typically the concentration of certain monomeric molecules or ligands and, similarly, the outputs are the concentration of specific dimeric or multimeric molecules.</p><p>Drawing inspiration from natural systems, synthetic biologists have been striving to engineer biocomputational schemes in top-down as well as bottom-up synthetic biological systems. While the majority of biocomputational designs rely on utilizing genetic circuits to engineer basic logic gates and simple computational tasks <ref type="bibr">(4 -6 )</ref>, a few studies have demonstrated engineering protein-based circuits that utilize proteolytic or phosphorylation reactions to generate an output <ref type="bibr">(7 -9 )</ref>. Although such biocomputational modules enable simple tasks such as biosensing of chemical species and basic computation, they typically generate digital (0 or 1 or "on or off") responses. Furthermore, encoding more sophisticated processing using logic gates demands intricate architectures with many logic gates and computational layers, rendering them convoluted for practical applications for complex tasks.</p><p>Therefore, constructing simple signal processing units inside living systems that can perform intricate computational tasks such as classification and decision-making is of great interest (Fig. <ref type="figure">1A</ref>, left). 
Implementing molecular classifiers in living cells would enable the creation of ultra-sensitive biosensors, programming accurate cellular responses through molecular circuits, and enhanced discrimination of inputs through combinatorial sensing <ref type="bibr">(10 )</ref>. For example, a simple linear classifier (Fig. <ref type="figure">1A</ref>, middle) equips a cell with a signal processing system that ideally allows output generation only in certain input regimes (where x 1 and x 2 approach 1 in the example). Further, combining different molecular classifiers results in more complex, non-linear computation, thus expanding the capabilities of cellular biocomputation.</p><p>Binary classification using a linear decision boundary was demonstrated in the field of artificial intelligence (AI) in 1958 <ref type="bibr">(11 )</ref>. A simple computational unit called 'perceptron' performs binary classification by computing the linear combination of weighted inputs and passing the summed weighted inputs through an activation function. The most popular activation function in modern perceptrons is a thresholding function called Rectified Linear Unit known as ReLU, in which the output is larger than zero when the input crosses a threshold (Fig. <ref type="figure">1B</ref>). The collective processing of inputs by many layers of multiple perceptrons, known as deep neural networks or artificial neural networks (ANNs), can result in the recapitulation of any continuous function <ref type="bibr">(12 )</ref>, thus making ANNs capable of performing complicated tasks such as non-linear classification and accurate prediction <ref type="bibr">(13 -16 )</ref>.</p><p>The simple architecture of a perceptron has motivated many efforts towards the creation of a biological perceptron as the signal processing unit for linear classification. 
<ref type="bibr">(10 )</ref> The construction of a single biological perceptron can also pave the way for implementing nonlinear input classification in living cells by utilizing multiple perceptrons, thus creating biomolecular neural networks (BNNs). A biological perceptron must demonstrate the principal characteristics of an ANN (or computer-based) perceptron, i.e., the biological perceptron must include controllable molecular elements that determine the perceptron's input weights and activation function.</p><p>However, as opposed to ANN perceptrons, biological perceptrons face challenges in linear classification due to biological constraints such as limited resources, competitive binding, and non-specific binding between molecules. Resource constraints in biological systems have been shown to significantly impact the function of biocomputational modules based on molecular interactions. <ref type="bibr">(17 -21 )</ref> For example, the competition between housekeeping sigma factor RpoD with minor sigma factors such as RpoN, RpoH, and RpoF in Escherichia coli (E. coli) determines the response of E. coli to conditions like nitrogen-deficiency, heat shock, or the need for chemotaxis, respectively, by regulating expression of genes important for growth and survival of bacteria. <ref type="bibr">(22 )</ref> Additionally, competitive binding of only a few promiscuous ligands to various receptors has been demonstrated to accommodate a wide range of signaling activities in multicellular organisms <ref type="bibr">(23 )</ref>. Similarly, it was recently shown that competitive protein dimerization allows small networks made of monomeric proteins to encode an extensive range of homo-or hetero-dimeric outputs through precise adjustments in the concentration of network monomers <ref type="bibr">(24 )</ref>. Likewise, competition between transcription factors that bind to the same RNA polymerase may affect the output of engineered genetic circuits in bacteria. 
<ref type="bibr">(18 , 19 , 25 )</ref> These resource constraints can cause perturbations to the biological perceptron function, thus influencing its decision boundary (Fig. <ref type="figure">1A</ref>, right). Subsequently, BNNs made from the biological perceptron with an altered output will also generate decision boundaries that may not closely follow their ideal design.</p><p>Figure 1E: Schematic design of a multi-layer perceptron in a sigma-based system that poses two major limitations: sharing limited resources and competitive binding. These complexities influence the decision boundary of the multi-layer neural network.</p><p>A few biological perceptrons have been demonstrated by using inducible gene expression networks <ref type="bibr">(26 -29 )</ref>, enzymatic processing of different metabolites <ref type="bibr">(30 )</ref>, principles of DNA strand-displacement <ref type="bibr">(31 -34 )</ref>, and DNA-processing enzymes <ref type="bibr">(35 )</ref>. Relying on the sequestration of two interacting proteins, a biomolecular classifier with tunable positive and negative weights was designed computationally <ref type="bibr">(36 )</ref> and tested experimentally to achieve nonlinear classification in mammalian cells <ref type="bibr">(29 )</ref>. Similarly, a phosphorylation-based neural network with positive and negative weights that performs non-linear classification (i.e., recapitulating XNOR and XOR) was designed by Cuba Samaniego et al. <ref type="bibr">(37 )</ref> Recently, a protein-based neural network that achieves linear classification was implemented by exploiting coiled-coil dimerization of engineered peptides <ref type="bibr">(38 )</ref>. 
Although these studies utilize different approaches to create a biological perceptron, they all rely on competitive interactions of input-processing molecules with a shared pool of limited resources (e.g., RNA polymerases, ribosomes <ref type="bibr">(21 , 25 , 39 )</ref>, co-activators, transcription factors <ref type="bibr">(40 )</ref>, and enzymes like Cas proteins <ref type="bibr">(41 )</ref> or proteases <ref type="bibr">(42 )</ref>). However, the effects of resource constraints on the function of these perceptrons have remained unexplored.</p><p>Here, we develop a mathematical model to simulate a biological perceptron based on sigma factors, transcription factors that bacteria naturally use to regulate gene expression and that outnumber the RNA polymerase and hence compete for binding to it <ref type="bibr">(18 , 22 , 43 -45 )</ref>. Leveraging competitive dimerization between a sigma and either its corresponding anti-sigma molecule or RNA polymerase, we demonstrate the design of a simple perceptron with a non-linear activation function capable of realizing positive and negative weights (Fig. <ref type="figure">1C</ref>). We then impose two physiological requirements on our model to account for both competition between the input sigma factor and other present sigma factors as well as the limited available resources (i.e., core RNA polymerase <ref type="bibr">(18 , 44 , 45 )</ref>). We show that resource sharing suppresses the output of the perceptron while introducing only a slight perturbation to the linear decision boundary. Lastly, since engineering non-linear decision boundaries requires multi-layer perceptron networks (as depicted in Fig. <ref type="figure">1D</ref>), we explore designing sequestration-reliant multi-layer sigma-based perceptron networks in the presence of perturbations caused by sharing limited resources (Fig. <ref type="figure">1E</ref>). 
We demonstrate that resource sharing leads to deviations from ideal design that affect the output of the multi-layer perceptron network. However, despite the non-ideal function of perceptrons due to resource sharing and limited resources, we outline simple design principles for encoding non-linear response patterns such that they closely resemble their ideal design. Our analyses of biological perceptron and BNN function in the presence of resource constraints can also be utilized to model molecular classifiers in silico for more precise prediction or design of outputs of these biocomputational systems.</p></div>
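<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the computer-based perceptron described in the Introduction (a weighted sum of inputs passed through a ReLU activation), the following minimal Python sketch may help. The function names, the weights, and the bias value are hypothetical choices made only for this example; they are not parameters from the manuscript. With these values the unit responds only when both inputs approach 1, which mirrors the linear classifier of Fig. 1A (middle).</p><p><![CDATA[
```python
def relu(z):
    """Rectified linear unit: zero below the threshold, linear above it."""
    return max(0.0, z)


def perceptron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs followed by a ReLU activation.

    `bias` shifts the threshold; a negative bias means the weighted sum
    must exceed -bias before any output is produced.
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return relu(z)


if __name__ == "__main__":
    # Hypothetical weights and bias chosen so that only (1, 1) gives output.
    weights = (1.0, 1.0)
    bias = -1.5
    for x1 in (0.0, 1.0):
        for x2 in (0.0, 1.0):
            print(x1, x2, perceptron((x1, x2), weights, bias))
```
]]></p><p>Changing the weights rotates the decision boundary, and changing the bias shifts it; the biological perceptron discussed in the Results realizes the analogous quantities through production rates and sequestration.</p></div>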
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">Results</head><p>Throughout the manuscript, we indicate chemical species with capital letters (e.g., X) and their concentration with the corresponding lowercase letters (e.g., x). Where possible, reaction and network outputs are normalized to allow dimensionless comparison between different conditions. Table <ref type="table">1</ref> at the end of the Results section summarizes all parameters used in the manuscript for mathematical analyses and computational simulations.</p><p>2.1 Design of a rectified linear activation unit (ReLU) based on sigma factor-anti-sigma factor interaction in the presence of shared limited resources</p><p>Molecular sequestration is the stoichiometric binding between two species that results in the formation of a dimeric complex. An example of molecular sequestration is the interaction between sigma factors and their corresponding anti-sigma proteins that leads to the formation of a complex that is unable to promote gene expression (Fig. <ref type="figure">1B</ref>). Such interaction can be modeled assuming the sigma factor S 1 and anti-sigma factor A 1 are produced from species X 1 and X 2 with rate constants w 1 and w 2 , respectively. Additionally, the produced proteins S 1 and A 1 degrade at rate &#948; and the sequestration occurs with rate constant &#947; 1 (as shown in Fig. <ref type="figure">2A</ref>). We summarize the list of chemical reactions as follows:</p><p>Note that for simplicity, we assume binding reactions are irreversible. This assumption does not affect the steady-state output of the reactions although it cannot be used to analyze the dynamic behavior of the system (see SI section 3 for derivation of the steady-state output of the system assuming reversible reactions). 
Depending on the sequestration rate (&#947; 1 ), the output of the system, s1 , which stands for the steady-state amount of S 1 , follows the input (w 1 x 1 ) in different patterns (Fig. <ref type="figure">2B</ref>). To evaluate two regimes of output at the steady state, we define a dimensionless positive parameter &#958; = &#948;&#178;/(w 1 x 1 &#947; 1 ). In the fast sequestration regime (where the binding affinity of S 1 and A 1 is large) when 0 &lt; &#958; &#8810; 1, modeling the interaction between S 1 and A 1 (see section 1.1 in SI for mathematical derivation) leads to a thresholding function <ref type="bibr">(36 )</ref> (a function that generates positive outputs only when the input is larger than a threshold) between the output s 1 and input x 1 shown in equation <ref type="bibr">(1)</ref> and depicted in Fig. <ref type="figure">2B</ref>.</p><p>Equation ( <ref type="formula">1</ref>) describes a non-linear relationship between s1 and x 1 where s1 is non-zero and proportional to x 1 only when x 1 is larger than a threshold. On the other hand, when &#958; &#8811; 1, due to the slow kinetics of S 1 -A 1 binding, the relationship between s1 and x 1 becomes non-linear for the whole range of x 1 and the thresholding behavior is lost (Fig. <ref type="figure">2B</ref>).</p><p>The assumption of fast sequestration of a sigma factor by its corresponding anti-sigma molecule is valid as previous studies on various anti-sigma molecules have found that they rapidly affect the transcriptional activity of their target sigma factors <ref type="bibr">(46 -48 )</ref>. In addition, molecular controllers based on fast sigma-anti-sigma interaction have been constructed and tested <ref type="bibr">(49 -51 )</ref>, thereby providing evidence for the fast sequestration assumption.</p><p>In equation (1), x 1 and x 2 represent the concentration of input species that generate S 1 and A 1 with production rates of w 1 and w 2 , respectively. 
At steady state and a fast sequestration regime, equation (1) converges asymptotically to a ReLU function. Therefore, we named the relationship between s1 with the inputs x 1 and x 2 the Asymptotic ReLU (AReLU) function (Fig. <ref type="figure">2C</ref>). Thus, the sequestration relationship between S 1 and A 1 resembles a simple perceptron with an AReLU activation function and weights of w 1 and -w 2 for inputs x 1 and x 2 , respectively. The quasi-linear relationship regime between s1 and x 1 (that resembles a soft ReLU function <ref type="bibr">(52 )</ref>) indicates that the steady-state available amount of S 1 is simply the difference between the total steady-state amounts of S 1 and A 1 , similar to other sequestration-based calculators demonstrated previously <ref type="bibr">(53 )</ref>. In other words, the outcome of the sequestration chemical reaction simply calculates the subtraction of total S 1 and A 1 and does not produce any product if S 1 is lower than A 1 . However, in slow sequestration, the amount of s1 depends on the input x 1 through a non-linear relationship without displaying thresholding behavior (Fig. <ref type="figure">2B</ref>).</p><p>In natural systems, sigma factors either bind to their corresponding anti-sigma or the RNA polymerase (RNApol) core. In the latter case, the RNApol-sigma factor complex binds to a specific sigma promoter in the DNA sequence and drives the expression of the downstream genes. When free S 1 is present, it binds to the RNApol to initiate transcription.</p><p>Therefore, when designing sigma-based neural networks, since the amount of RNApol-sigma factor complex directly influences protein expression, the interaction between S 1 and the RNApol core should be considered. 
Assuming that the sigma factor S 1 binds to a limited amount of available RNApol core C, with total concentration denoted as c tot , at rate &#947; 2 , the total amount of RNApol-sigma factor complex, denoted as C 1 , can be calculated by solving the system of ordinary differential equations (ODEs) representing the following chemical reactions which are illustrated in Fig. <ref type="figure">2D</ref>:</p><p>To analyze the input-output relationship of the chemical reactions that consider limited resources, we introduce a dimensionless variable r = &#947; 2 /&#947; 1 , referred to as the competitive binding ratio, that represents the ratio of sigma factor-RNApol affinity to the sequestration rate.</p><p>Solving the ODEs representing the above reactions (see section 1.2 in SI for derivation) reveals that the relationship between steady-state C 1 denoted as c1 (and consequently its normalized value denoted as c1 n = c1 /c tot ) and x 1 depends on the competitive binding ratio.</p><p>Figure 2 caption (panels E to I). E: The ratio of RNApol-S 1 complex formation rate to sequestration rate (referred to as competitive binding ratio r) influences the input-output relationship in the sequestration system. Lower r allows construction of a thresholding function. F: In fast sequestration regime and slow complex formation (r &#8594; 0 referred to as fast competitive binding regime), the output of the sequestration reactions in the presence of limited resources resembles an asymptotic saturated ReLU (ASReLU). G: Schematic representation of chemical reactions depicting molecular sequestration of S 1 and A 1 as well as complex formation of S 1 with a limited amount of core RNA polymerase (C) in the presence of a competing sigma factor (S 2 ). The inputs are concentrations of species X 1 and X 2 that generate S 1 and A 1 with rates w 1 and w 2 , respectively. The output is steady-state normalized concentration of S 1 -C dimer (C 1 ) shown as c1 /c tot . H: The effect of concentration of species &#945; that generates the competing sigma factor S 2 on the activation function of sequestration reaction. The addition of competitive binding reduces the amount of total C available, thus lowering the saturation level of the ASReLU function. I: The effect of concentration of species (&#946;) that generates A 2 that sequesters the competing S 2 on the activation function of sequestration reaction. Higher sequestration of S 2 results in a higher concentration of available C, thus increasing the saturation level of the ASReLU function.</p><p>When r &#8811; 1 (slow competitive binding regime) the input-output relationship deviates from the ideal behavior (thresholding function) observed in Fig. <ref type="figure">2B</ref>. However, when r &#8810; 1 (fast competitive binding regime), c1 follows the thresholding function (Fig. <ref type="figure">2E</ref>). Hence, when input x 1 is lower than the threshold, there is no response, but when input is higher than the threshold, the response is non-zero and ultimately saturates at 1 (c tot ). Equation ( <ref type="formula">2</ref>) describes the input-output function in the fast competitive binding regime</p><p>Further, equation ( <ref type="formula">2</ref>) is closely similar to AReLU, equation <ref type="bibr">(1)</ref>, with the difference of having a limit on the output (c tot ) which is introduced due to the limited resources. Thus, the effect of limited resources, in this case, total RNApol core, causes the deviation of c1 from a linear trend when x 1 and x 2 vary. 
Nevertheless, in the fast competitive binding regime even with a limited RNApol core, the activation function still performs as a subtraction calculator although it is saturated at c tot (Fig. <ref type="figure">2F</ref>). Equation ( <ref type="formula">2</ref>) characterizes this nonlinear relationship of c1 with inputs x 1 and x 2 . We call this activation function asymptotic saturated ReLU (ASReLU).</p><p>For simplicity, here we assume that the competing sigma factors have identical affinities for their corresponding anti-sigma proteins as well as C. The effect of different kinetics will be investigated in the next section. The following chemical reactions represent the model:</p><p>Solving at steady state, the set of ODEs modeling the above reactions yields equation (3) (for mathematical derivation, see SI section 1.3), which describes an expression of c1 as a function of inputs (x 1 and x 2 ) and c tot . Although it is challenging to find a closed-form expression of c1 at the steady state, we can find an expression of c1 as a function of the inputs x 1 , x 2 , and c2 (unknown). This expression is useful to understand the effect of coupling between the two competing sigma factors on their binding to the RNApol. Equation <ref type="bibr">(3)</ref> shows that the available RNApol is depleted by the other sigma factor reflected in the term c tot -c2 . Additionally, the amount of depletion of available resources will depend on the production rates of both the competing sigma factor and its anti-sigma protein. Yet, in the fast sequestration regime (&#958; &#8594; 0), and fast competitive binding (r &#8594; 0), the system converges to an ASReLU function with a lower saturation magnitude captured by c tot -c2 .</p><p>lim r,&#958;&#8594;0 c1 = max(0, min(c tot -c2 , ...))</p><p>Since c2 only influences the ASReLU saturation level, the performance of the ASReLU function in calculating the difference between S 1 and A 1 as a thresholding function is sustained under the effect of a competing sigma factor (Fig. 
<ref type="figure">2H</ref>). However, the limit of output c1 (the saturation level) further decreases as the amount of competing sigma factor S 2 increases which corresponds with higher &#945; (Fig. <ref type="figure">2H</ref>). This trend reverts to the non-competing ASReLU (equation ( <ref type="formula">2</ref>)) when the amount of anti-sigma factor A 2 increases with higher &#946; since less S 2 is available to compete with S 1 (Fig. <ref type="figure">2I</ref>).</p><p>Overall, molecular sequestration in a sigma factor-dependent transcription system models variations of the ReLU function. In ideal conditions where the RNApol core is unlimited, this dependency is quasi-linear and reflects the difference between S 1 and A 1 , enabling an AReLU function in a fast sequestration regime. However, in the presence of limited as well as shared resources, the trend between the sigma factor-RNApol complex and the differential value of S 1 and A 1 deviates from a linear trend and this deviation becomes more intense as the total amount of free S 2 increases. This relationship in fast sequestration and competitive binding regimes is captured by the ASReLU function. However, since ASReLU still represents the difference of S 1 and A 1 , we reasoned that the activation function can be used to create a perceptron, called sigma-based perceptron hereafter, that can pave the way for constructing multi-layer neural networks for generating non-linear outputs.</p></div>
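<div xmlns="http://www.tei-c.org/ns/1.0"><p>The thresholding behavior of the sequestration reactions in Section 2.1 can be checked numerically with a short Python sketch. It assumes the reaction scheme as described in the text (production of S 1 and A 1 at rates w 1 x 1 and w 2 x 2 , degradation of both at rate &#948;, irreversible sequestration at rate &#947; 1 ) and solves the resulting steady-state quadratic in closed form. This closed form is derived here from that scheme and is not copied from the manuscript's equation (1). The helper names, and the saturating form max(0, min(available, difference)) used to illustrate the ASReLU limit, are assumptions made for this example only.</p><p><![CDATA[
```python
import math


def free_sigma_steady_state(x1, x2, w1, w2, delta, gamma1):
    """Steady-state free S1 for the assumed scheme:
    ds/dt = w1*x1 - delta*s - gamma1*s*a
    da/dt = w2*x2 - delta*a - gamma1*s*a
    Subtracting gives s - a = d, so s solves
    gamma1*s**2 - (gamma1*d - delta)*s - w1*x1 = 0.
    """
    u = w1 * x1
    d = (u - w2 * x2) / delta      # difference of total S1 and total A1
    b = gamma1 * d - delta
    root = math.sqrt(b * b + 4.0 * gamma1 * u)
    if b >= 0:
        return (b + root) / (2.0 * gamma1)
    # Algebraically equal form that avoids cancellation when b is negative.
    return 2.0 * u / (root - b)


def ideal_relu_limit(x1, x2, w1, w2, delta):
    """Assumed fast-sequestration limit: max(0, (w1*x1 - w2*x2)/delta)."""
    return max(0.0, (w1 * x1 - w2 * x2) / delta)


def saturated_relu(difference, available):
    """Assumed ASReLU-style limit: thresholded at 0, capped at `available`."""
    return max(0.0, min(available, difference))


if __name__ == "__main__":
    for gamma1 in (0.1, 10.0, 1e4):
        s = free_sigma_steady_state(2.0, 1.0, 1.0, 1.0, 1.0, gamma1)
        print(gamma1, s, ideal_relu_limit(2.0, 1.0, 1.0, 1.0, 1.0))
```
]]></p><p>As &#947; 1 grows, the computed steady state approaches the thresholded difference, which is the subtraction behavior the section attributes to fast sequestration; capping that value at the available polymerase gives the saturating shape described for ASReLU.</p></div>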
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2">Design and analysis of a sigma-based perceptron with ASReLU activation function and sharing limited resources</head><p>After confirming the output of the sigma-based perceptron with an ASReLU activation function, we seek to determine whether this system demonstrates typical characteristics of a single node or perceptron in a neural network in the presence of a competing node. We test various input ranges of the perceptron and its competing counterparts to investigate the perceptron's linear decision boundary as well as weight-tuning for manipulating its decision boundary. Thus, we model the binding of a sigma factor (S 1 ) to RNApol in the presence of its anti-sigma (A 1 ) as well as a competing sigma-anti-sigma pair (S 2 and A 2 ) and look at the steady-state total amount of RNApol bound to S 1 (denoted as c1 ) as the output of the node in response to a wide range of inputs (Figs. <ref type="figure">3A</ref> and <ref type="figure">B</ref>).</p><p>First, we investigate the output of the single node in isolation (without competing factors) to understand the effect of binding kinetics between S 1 and A 1 (&#947; 1 ) as well as S 1 and C (&#947; 2 ) on the perceptron's decision boundary. Our findings show that the competitive binding ratio plays a critical role in determining the response pattern or the decision boundary of the perceptron. Equation ( <ref type="formula">4</ref>) below provides insight into how r and total resources c tot affect the decision boundary (see SI section 1.2 for derivation):</p><p>The last term in equation ( <ref type="formula">4</ref>) simply introduces a bias to the decision boundary which only influences the response amplitude while leaving the response pattern, dictated by weights w 1 and w 2 , intact. Additionally, the coefficient of x 2 depends on both w 2 and c1 /(c1 + r(c tot - c1 )). 
Therefore, the competitive binding ratio, depending on its magnitude, can change the slope of the decision boundary. When r &#8594; 0 (Fig. <ref type="figure">3C</ref>, right), the factor c1 /(c1 + r(c tot - c1 )) in equation ( <ref type="formula">4</ref>) converges to 1. Hence, in this regime, the pattern of decision boundary remains linear across the input ranges although its bias changes depending on the inputs.</p><p>However, as r becomes greater, the effect of limited resources on the decision boundary strengthens (Fig. <ref type="figure">3C</ref>, left). In such a condition, the decision boundary still remains linear, but since c1 is a function of x 1 and x 2 , the slope of the decision boundary varies across the range of the inputs as an inverse function of c tot -c1 . Intuitively, when r &#8594; 0, the sequestration rate &#947; 1 is higher than &#947; 2 . Thus, the available amount of S 1 becomes equal to x 1 w 1 -x 2 w 2 which corresponds to excess amount of S 1 after binding to A 1 . In this case, regardless of the binding rate between S 1 and the RNApol C, the output will be a simple linear function of excess S 1 . On the other hand, when the sequestration rate &#947; 1 is slower than the binding rate between S 1 and C (&#947; 2 ), the dynamics of binding between S 1 and A 1 in conjunction with binding of S 1 and C imposes non-linearity on the output of the sequestration and complex formation reactions (seen in Fig. <ref type="figure">2F</ref>). The decision boundary of the perceptron, consequently, consists of linear decision boundaries with slopes that change as the inputs vary. 
Nevertheless, as long as r &#8810; 1, the perceptron still functions similarly to the ideal condition (when r &#8594; 0) and generates a decision boundary which resembles the ideal linear condition and can be used for the construction of more complicated architectures.</p><p>Importantly, in the construction of our model, we assumed that the sequestration species S 1 and A 1 are generated from an orthogonal RNA polymerase, i.e., production of S 1 and A 1 does not consume the shared resources C. Such a scenario, which we call the uncoupled input-output relationship, corresponds to systems in which input species promote the production of S 1 and A 1 through polymerases such as T7 or SP6 RNA polymerase. In the alternative scenario, which we call the coupled input-output relationship, production relies on a housekeeping sigma factor (e.g., sigma 70). To investigate the effect of resource consumption by input species (i.e., if S 1 and A 1 are generated by complexes of housekeeping sigma factors with core RNApol), we modified our model to include coupling between the resource consumption of S 1 and input species X 1 and X 2 (Fig. <ref type="figure">S2A</ref>). Assuming that housekeeping sigma factor S C is responsible for the production of S 1 and A 1 via S C -C complexes denoted as C S and C A , respectively, our model shows the effect of resource consumption by S C in different input regimes (Figs. <ref type="figure">S2B</ref> and <ref type="figure">C</ref>). Specifically, when [x 1 &#8594; 1, x 2 &#8594; 1], both C S and C A use the limited resource C, thus causing a reduction in the production amount of sigma factor and anti-sigma protein compared to the uncoupled case. Consequently, this effect leaves fewer resources for S 1 . Therefore, the response magnitude of the perceptron with a coupled input-output relationship becomes lower than the uncoupled case while the decision boundary remains similar (Figs. <ref type="figure">S2D</ref> and <ref type="figure">E</ref>). 
Thus, while the perceptron still demonstrates the linear decision boundary when input species consume limited resources, sharing limited resources between the perceptron sigma factor and the perceptron inputs suppresses the output magnitude. In practice, production from input species through orthogonal RNA polymerases like T7 and SP6 would be preferred as it circumvents the limitations of the input-output coupling.</p><p>Next, we seek to investigate how the total amount of RNApol, C, affects the perceptron function. Since C does not play a role in the dynamics of the ASReLU function, its variation reveals itself as a simple increase or decrease in perceptron response amplitude (Fig. <ref type="figure">3D</ref>). This effect occurs because the normalized output of the ASReLU, c1 n , is completely independent of C and varies only by x 1 and x 2 (see equation <ref type="bibr">(23)</ref> in the SI). Therefore, a single perceptron node in the absence of any competing sigma factors and in the presence of limited resources still acts as a linear classifier. However, the dynamic effects of anti-sigma-sigma binding on the system effectively change the weights of the perceptron decision boundary. Such changes, however, are minimal when r &#8810; 1.</p><p>So far, we have focused on the decision boundary of the biological perceptron made by a sigma factor and its corresponding anti-sigma protein. Next, we consider whether the presence of another sigma factor (S 2 ) and its anti-sigma protein (A 2 ) changes the pattern of the perceptron decision boundary. Equation ( <ref type="formula">5</ref>) provides an expression for the output of the perceptron as a function of inputs, c1 , and c2 . (See SI section 1.3 for derivation.)</p><p>Equation ( <ref type="formula">5</ref>) can be interpreted as an alternative form of equation ( <ref type="formula">4</ref>) if c tot in equation ( <ref type="formula">4</ref>) is replaced with c tot -c2 . 
In other words, since the presence of a competing perceptron shrinks the amount of available resources, it amplifies the bias introduced to the perceptron output (by lowering the magnitude of the denominator in the last term in equation ( <ref type="formula">5</ref>)) and strengthens the input-dependent slope variation in decision boundary of the main perceptron (by increasing the coefficient of w 2 x 2 &#948; in equation ( <ref type="formula">5</ref>)). Note that c1 in equation ( <ref type="formula">5</ref>) is a function of inputs x 1 and x 2 . Therefore, the magnitude of introduced bias and change in the slope of the decision boundary will be varied in different input regimes. However, if r &#8810; 1, the effect of competition and resource sharing on tuning decision boundary becomes negligible as the coefficient of w 2 x 2 &#948; in equation ( <ref type="formula">5</ref>) converges to 1.</p><p>Our simulations for a perceptron in a fast competitive binding regime over a range of different input concentrations verify our mathematical analysis by demonstrating that in the presence of another sigma factor (produced by input &#945; 2 ) competing to bind to the RNApol, the perceptron response is suppressed due to the bias introduced by the competing perceptron which lowers the amount of available RNApol (Fig. <ref type="figure">3E</ref>). The response pattern, on the other hand, remains mainly intact due to the low competitive binding ratio.</p><p>We also investigate how anti-sigma A 2 (produced by input &#946; 2 ) influences the response of the perceptron in a resource-sharing system. Binding of A 2 to S 2 disables S 2 from binding to RNApol, thereby increasing the total amount of RNApol available for S 1 to bind. Therefore, we expect that the introduction of A 2 to the system suppresses the bias effect of S 2 on the perceptron response pattern (last term in equation ( <ref type="formula">5</ref>)). 
Indeed, our simulations show that an increase in production of A 2 increases the perceptron output (Fig. <ref type="figure">3F</ref>), thus confirming our hypothesis.</p><p>Figure 3 (caption, panels G and H). G: The linear decision boundary of a single node in the absence of a competing factor can be tuned by adjusting the weight of the input even when r = 0.1. H: Tuning the linear decision boundary of the perceptron in the presence of a competing node can still be realized by changing the input weight. The amplitude of the response, however, is suppressed due to the lower availability of C.</p><p>Our mathematical analysis of the perceptron output in the presence of a competing perceptron leads to an expression for c1 that depends on the inputs and c2 (equation <ref type="bibr">(48)</ref> in the SI), indicating that the kinetics of binding between S 2 and A 2 affect the perceptron output simply by varying its response amplitude while leaving its response pattern intact, as long as the perceptron functions in the fast competitive binding regime (see SI section 1.3 for the mathematical derivation, and Figs. S3 and S4 for simulations over a wide range of S 2 and A 2 concentrations and different kinetics of S 2 -A 2 binding, respectively).</p><p>Lastly, as weight tuning is a fundamental characteristic of nodes in neural networks that defines their individual decision boundaries, we determine the feasibility of tuning the weights applied to our biological perceptron in the presence or absence of resource sharing.</p><p>We first analyze an individual node with a limited amount of C and observe the change in the output pattern as w 1 increases (Fig. <ref type="figure">3G</ref>). As with perceptrons used in ANNs, adjusting the weight applied to the input results in a discernible change in the slope of the response pattern, in agreement with our mathematical analysis (see SI section 1.3).
As expected, introducing a second node, S 2 , that imposes resource sharing on the system preserves the weight-tuning characteristic of the perceptron and only affects its response amplitude by lowering the saturation limit (Fig. <ref type="figure">3H</ref>).</p><p>In addition to steady-state analysis, we investigated the transient response of the sigma-based perceptron to study its function before it reaches its stable decision boundary. We observed that the perceptron output reaches the steady state shortly after the beginning of the simulation regardless of the parameters defining the concentration of species or the kinetics of the sequestration reaction (Fig. <ref type="figure">S5A</ref>). Additionally, the perceptron decision boundary forms even in the transient state (Fig. <ref type="figure">S5B</ref>). In all scenarios, with different values of &#945; and &#946; in either slow or fast sequestration regimes, the transient decision boundary is similar to the steady-state response pattern (i.e., maximum response at [x 1 &#8594; 1, x 2 &#8594; 0]), although the response magnitude is smaller than at steady state.</p><p>Taken together, our simulations elucidate the effect of a competing sigma factor on the perceptron output magnitude in the fast competitive binding regime and also demonstrate the weight-tuning ability of the perceptron with or without resource sharing. We therefore conclude that, by operating in a fast competitive binding regime, sequestration-based perceptrons can demonstrate linear decision boundaries despite sharing limited resources, which in turn allows the construction of multi-layer perceptron networks.</p></div>
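<div xmlns="http://www.tei-c.org/ns/1.0"><p>The behaviour summarized in this section can be illustrated with a minimal mass-action sketch of one sigma-based perceptron that shares core RNApol with a competing sigma factor. This is not the authors' code, and it is not necessarily the exact model of the SI: the reaction scheme (production, sigma/anti-sigma sequestration, reversible binding to a conserved pool of core RNApol, first-order dilution of free species), the function names (rhs, rk4_step, simulate), and all rate constants are assumptions chosen for illustration. A fixed-step Runge-Kutta integrator is used here in place of the odeint function mentioned in the Methods, only to keep the example free of external dependencies.</p>
<ab><![CDATA[
```python
"""Illustrative sketch: sigma-based perceptron sharing core RNApol with a competitor.

Every name and parameter value below is a hypothetical choice for illustration.
"""


def rhs(y, p):
    """Time derivatives of [s1, a1, c1, s2, a2, c2] under mass-action kinetics."""
    s1, a1, c1, s2, a2, c2 = y
    free_c = p["c_tot"] - c1 - c2                        # unbound core RNApol (conserved pool)
    bind1 = p["k_on"] * s1 * free_c - p["k_off"] * c1    # net S1 + C binding flux
    bind2 = p["k_on"] * s2 * free_c - p["k_off"] * c2    # net S2 + C binding flux
    seq1 = p["gamma"] * s1 * a1                          # S1:A1 sequestration flux
    seq2 = p["gamma"] * s2 * a2                          # S2:A2 sequestration flux
    return [
        p["alpha1"] - seq1 - bind1 - p["delta"] * s1,    # free sigma factor S1
        p["beta1"] - seq1 - p["delta"] * a1,             # free anti-sigma A1
        bind1,                                           # output complex c1 = S1:C
        p["alpha2"] - seq2 - bind2 - p["delta"] * s2,    # competing sigma factor S2
        p["beta2"] - seq2 - p["delta"] * a2,             # competing anti-sigma A2
        bind2,                                           # competing complex c2 = S2:C
    ]


def rk4_step(y, dt, p):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(y, p)
    k2 = rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)], p)
    k3 = rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)], p)
    k4 = rhs([yi + dt * ki for yi, ki in zip(y, k3)], p)
    return [
        yi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def simulate(x1, x2, alpha2=0.0, beta2=0.0, w1=1.0, w2=1.0, t_end=20.0, dt=0.005):
    """Return the perceptron output c1 at time t_end, starting from zero concentrations.

    x1 drives sigma factor production (weight w1); x2 drives anti-sigma
    production (weight w2); alpha2 and beta2 are the production rates of the
    competing sigma factor and its anti-sigma.
    """
    p = {
        "alpha1": w1 * x1, "beta1": w2 * x2,
        "alpha2": alpha2, "beta2": beta2,
        "gamma": 50.0,   # sequestration rate (assumed fast)
        "k_on": 5.0,     # sigma + core RNApol association rate
        "k_off": 5.0,    # sigma:core RNApol dissociation rate
        "delta": 1.0,    # dilution/degradation of free species
        "c_tot": 1.0,    # total core RNApol
    }
    y = [0.0] * 6
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(y, dt, p)
    return y[2]
```
]]></ab>
<p>Under these assumed parameters, calling simulate(1.0, 0.0) with and without a nonzero alpha2 shows the output being suppressed by the competing sigma factor, and adding a nonzero beta2 partially restores it, which is qualitatively the trend described for Fig. 3E and Fig. 3F. Shortening t_end gives a smaller output than the steady-state value, in line with the transient behaviour discussed above.</p></div>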
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3">Biomolecular neural networks generate non-linear classifiers in the presence of shared resources</head><p>While a single sigma-based perceptron generates a tunable decision boundary even with sharing limited resources (Fig. <ref type="figure">3</ref>), most biologically relevant biocomputation processes such as competitive ligand binding and protein dimerization generate sophisticated non-linear responses that rely on the protein-protein interactions and the concentration of competing dimerizing proteins or ligands <ref type="bibr">(23 , 24 )</ref>. Therefore, we aim to utilize the sigma-based perceptron as a basic building block of more intricate networks made of multiple perceptrons that are capable of generating non-linear outputs.</p><p>First, we investigate a two-node network where each node representing a sigma-based perceptron receives two inputs with unique weights (Fig. <ref type="figure">4A</ref>). This simple network allows us to study the effect of sharing limited resources on the output of nodes in the same layer (see SI section 2.1 for mathematical representation of the network). We note that the output of each node or perceptron represents the molecular complex made by binding of the perceptron sigma factor to the shared resources RNApol (as depicted in Fig. <ref type="figure">2D</ref>) which in turn can promote expression of another sigma factor (corresponding to positive weight) or another anti-sigma molecule (corresponding to negative weight). Therefore c i stands for the complex between S i and C as the output of each node in the networks illustrated in Fig. <ref type="figure">4</ref>.</p><p>In the absence of resource sharing, each node's output is a linear combination of its inputs (Fig. <ref type="figure">3C</ref>). Therefore, with given weights denoted in Fig. <ref type="figure">4A</ref>, linear patterns for outputs of nodes 1 and 2 are observed (Fig. <ref type="figure">4B</ref>). 
However, when the binding competition between sigma factors is taken into account, limited resources and competition attenuate the output of each node in certain input regimes (Fig. <ref type="figure">4C</ref>), resulting in a non-linear response pattern. Notably, the response of each node is maximally affected where the competing node consumes most of the resources. For instance, in isolation, the output of node 2 (C 2 ) has its highest expression level when [x 1 , x 2 ] &#8594; 1 (Fig. <ref type="figure">4B</ref>, bottom). Consequently, the response pattern of C 1 is highly diminished in that region (Fig. <ref type="figure">4C</ref>) due to the effect of resource sharing with node 2. Similarly, the output pattern of C 2 shows its highest attenuation where C 1 is expressed the most in isolation (x 1 &#8594; 1), which leaves a smaller amount of available RNApol (C) for production of C 2 . Hence, the concerted effect of resource sharing over a wide range of inputs imposes non-linearity on the output of nodes in the same layer due to the emergence of a dominant perceptron that consumes most of the resources. This effect can also be deduced from equation <ref type="bibr">(5)</ref>, knowing that both c1 and c2 are functions of x 1 and x 2 .</p><p>Therefore, in input regimes where either C 1 or C 2 is strongly expressed, the effects of competitive resource sharing detailed in the previous section are strengthened.</p><p>Interestingly, the network output is influenced differently in fast and slow competitive binding regimes. In a fast competitive binding regime, c1 and c2 are linear functions of x 1 and x 2 . Therefore, the outputs C 1 and C 2 display sharper deviations when both depend on the inputs (Fig. <ref type="figure">4D</ref>) due to the stronger effect of the bias term in equation <ref type="bibr">(5)</ref> imposed by the sharp reduction of resources (see Fig.
<ref type="figure">S6</ref> for isolated perceptron responses in the fast competitive binding regime). Overall, we conclude that resource sharing can induce significant non-linearity in the network output pattern, and that the impact of competitive resource sharing is highest where the responses of isolated nodes overlap with each other, corresponding to input regimes with maximum resource sharing.</p><p>Knowing that overlapping responses can cause non-linearity in the network outputs, we next consider whether we can still implement non-linear networks with decision boundaries that are distorted minimally by resource sharing. We reason that if the outputs of nodes in the same layer do not overlap with each other, the overall output of the network will follow the design principles dictated by the input weights and will not have undesired non-linearity.</p><p>To test this hypothesis, we look at a simple network made of three nodes that reconstitutes a dual region classifier (band-stop filter) represented in Fig. <ref type="figure">4E</ref> (see SI section 2.2 for mathematical representation of the network and Fig. <ref type="figure">S7</ref> for the ideal response of this network in the absence of sharing limited resources). In order to avoid non-linearity induced by resource sharing, we design the network such that the node outputs have minimal overlap with each other (Fig. <ref type="figure">4F</ref>, left and middle panels). We expect that the lack of overlap between node outputs would prevent unwanted non-linearity in the network output (C 3 ).</p><p>As expected, our simulations show a network response consisting of two separate linear regions despite resource sharing (Fig. <ref type="figure">4F</ref>, right). We also confirmed that an increase in the amount of available resources (C) decreases the effect of competition for resources and does not alter the decision boundary of first layer nodes (Fig.
<ref type="figure">4G</ref>, left and middle) or of the overall network (Fig. <ref type="figure">4G</ref>, right). Therefore, we conclude that linear responses can be engineered in a network, despite the presence of resource sharing, by tuning input weights such that the outputs of the nodes in the same layer do not overlap.</p><p>Finally, we seek to create a biomolecular neural network that generates a non-linear classifier. Such non-linear outputs are key features of many biological processes and drivers of cell decision-making <ref type="bibr">(56 -58 )</ref>. We design a simple network with specific input weights based on the interaction of multiple sigma factors and their corresponding anti-sigma molecules that generates a non-linear classifier resembling a band-pass filter (Fig. <ref type="figure">4H</ref>, see SI section 2.3 for ODEs describing the network). As in the previous design, we want the outputs of the first layer to have minimal overlap with each other to avoid non-linearity induced by resource sharing. Our simulations show that the first layer outputs, C 1 , C 2 , and C 3 , have minimal overlap with each other given the particular weights and biases (Fig. <ref type="figure">4I</ref>). Consequently, the network output (C 4 ) creates a non-linear classifier (Fig. <ref type="figure">4J</ref>) that is aligned with our ideal design expectations (Fig. <ref type="figure">S8</ref>). Consistent with our findings in the previous section, changing c tot does not change the non-linear pattern of the decision boundary but tunes its amplitude (Fig. <ref type="figure">S9</ref>).</p><p>We also look at the effect of the input bias of a first-layer node on the network output (Fig. <ref type="figure">4J</ref>). Indeed, in accordance with the ideal design (Fig.
<ref type="figure">S8</ref>), increasing the bias of node 1 amplifies its steady-state output (c 1 ) which subsequently further sequesters the amount of available S 4 , thus suppressing the network response in its corresponding input regime ([x 1 , x 2 ] &#8594; 0) without significant deviation from design expectations (Fig. <ref type="figure">S8</ref>). Therefore, we showed that even in the presence of resource sharing, we can develop non-linear biological neural networks to realize dual region and non-linear classifiers. We also demonstrated that we can modulate the network response by tuning the model parameters that are independent of limited resources. </p></div>
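<div xmlns="http://www.tei-c.org/ns/1.0"><p>The attenuation of node outputs in a shared-resource layer can be illustrated with a small steady-state sketch. It assumes a simple mass-action model, which is not necessarily the exact model of the SI: each free sigma factor level is set by the balance of production, sequestration by its anti-sigma, and first-order dilution, and core RNApol is partitioned among sigma factors by competitive equilibrium binding with a common dissociation constant K. The function names (free_sigma, layer_outputs, two_node_layer), the weights, and the parameter values are hypothetical and are not those of Fig. 4A.</p>
<ab><![CDATA[
```python
"""Illustrative steady-state sketch of two sigma-based nodes sharing core RNApol.

Every name, weight, and parameter value below is a hypothetical choice.
"""
import math


def free_sigma(alpha, beta, gamma=50.0, delta=1.0):
    """Steady-state free sigma level for sigma production alpha and anti-sigma production beta.

    From alpha = gamma*s*a + delta*s and beta = gamma*s*a + delta*a it follows
    that a = s - (alpha - beta)/delta, so s is the non-negative root of
    gamma*s**2 + b*s - alpha = 0 with b = delta - gamma*(alpha - beta)/delta.
    """
    b = delta - gamma * (alpha - beta) / delta
    disc = math.sqrt(b * b + 4.0 * gamma * alpha)
    if b >= 0.0:
        return 2.0 * alpha / (b + disc)   # form that avoids cancellation when b is positive
    return (disc - b) / (2.0 * gamma)


def layer_outputs(sigmas, c_tot=1.0, K=1.0):
    """Partition total core RNApol among sigma factors by competitive equilibrium binding."""
    denom = 1.0 + sum(s / K for s in sigmas)
    return [c_tot * (s / K) / denom for s in sigmas]


def two_node_layer(x1, x2, shared=True):
    """Outputs (c1, c2) of two nodes driven by the same inputs x1 and x2.

    Node 1: x1 drives the sigma factor (weight 2), x2 drives the anti-sigma (weight 1).
    Node 2: both inputs drive the sigma factor (weights 1 and 1); a constant
    anti-sigma production of 0.5 acts as a negative bias.
    With shared=False each node binds its own private pool of core RNApol.
    """
    s1 = free_sigma(alpha=2.0 * x1, beta=1.0 * x2)
    s2 = free_sigma(alpha=1.0 * x1 + 1.0 * x2, beta=0.5)
    if shared:
        c1, c2 = layer_outputs([s1, s2])
        return c1, c2
    return layer_outputs([s1])[0], layer_outputs([s2])[0]
```
]]></ab>
<p>Comparing two_node_layer(x1, x2, shared=True) with the shared=False result over a grid of inputs shows, for these assumed weights, that each node is attenuated most strongly where the other node is most active in isolation, which is the qualitative behaviour described for Fig. 4B and Fig. 4C.</p></div>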
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Discussion</head><p>Biological signal processing units that reconstitute molecular linear and non-linear classifiers are powerful tools that enable cellular decision-making, precise cell programming, highly discriminatory input processing, and ultra-sensitive molecular biosensors for applications such as CAR T-cell engineering or in vitro diagnostics. A biological perceptron ideally allows linear classification, while a combination of biological perceptrons can create biomolecular neural networks that compute non-linear classification.</p><p>Among the different approaches used to construct biochemical neural networks and classifiers, sequestration-based networks are of significant interest because of their simplicity and compatibility with in vivo systems <ref type="bibr">(29 , 36 )</ref>. However, the effect of physiological constraints, such as limited resources and competitive binding between elements of molecular classifier networks, on the classification function of these networks, and of sequestration-based networks in particular, has remained unexplored.</p><p>We mathematically modeled sequestration-based biochemical neural networks and investigated how sharing limited resources, a ubiquitous feature of physiological systems, influences the function of the neural network decision boundary. We chose a network of sigma factors, their corresponding anti-sigma proteins, and core RNA polymerase as our model system. Our analyses demonstrated that a single perceptron, the basic building block of neural networks, with a ReLU-like activation function is recapitulated by modeling the interaction of one sigma factor with its anti-sigma molecule. We further showed that modifying the model to include a limited amount of core RNApol in the system changes the activation function of the sigma-based perceptron to an asymptotic saturated ReLU.
Drawing inspiration from natural systems where multiple sigma factors compete to bind to a limited pool of RNApol, we altered the model to include another sigma factor and found that the decision boundary of the sigma-based perceptron remains the same although its output is suppressed.</p><p>While the conditions of in vitro biochemical reactions are largely controlled, in living organisms endogenous factors can cause perturbations and deviations from the ideal design. For instance, in bacteria, although a certain sigma factor might be designed to control gene expression, multiple other sigma factors compete with the engineered sigma factor to bind to a limited amount of RNApol. To include competitive binding to shared resources, we increased the number of perceptrons, controlled by the same inputs, to two. We found that in specific input regimes where the outputs of the perceptrons interfere with each other, one dominant perceptron emerges and consumes most of the resources, whereas the decision boundary of the non-dominant perceptron deviates significantly from its ideal design.</p><p>Given that engineering any kind of non-linear response by neural networks requires multilayer perceptrons, we investigated conditions where, despite resource sharing, non-linear decision boundaries could be designed. Knowing that interference between the outputs of perceptrons in the same layer causes deviations from the linear decision boundary, we engineered particular multi-layer perceptrons in which non-linear decision boundaries were successfully demonstrated.
We showed that despite sharing limited resources, dual region and non-linear classifiers resembling band-stop and band-pass filters can be implemented in sigma-based neural networks using different architectures of 2-layer neural networks with minimal deviation from ideal design.</p><p>Since sharing limited resources is not exclusive to sigma-based neural networks, our findings can be extended to other biocomputational input-processing systems used in living cells. Taking Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) gene editing as an example, the sequestration of a single guide RNA (sgRNA) strand by its complementary RNA, or anti-guide RNA, is analogous to the interaction of a sigma factor with its anti-sigma protein. If the sgRNA is engineered to drive a CRISPR reaction, the Cas protein will consequently be the limited resource that all sgRNAs will compete to bind.</p><p>Exploiting exogenous sigma factors would prevent unwanted binding of perceptron sigma factors to native molecules as well as interference with biological events in bacteria. In fact, sequestration of the Bacillus subtilis sigma and anti-sigma proteins sigW and rsiW, respectively, was previously used in E. coli for construction of a closed-loop biomolecular integral feedback control circuit (<ref type="formula">51</ref>). Interestingly, the output of this circuit resembles the behavior of the sigma-based perceptron presented in this study, which highlights the potential of exogenous sigma factors as building blocks of BNNs.
Further, the challenges of using endogenous sigma factors may be less problematic when BNNs are implemented using in vitro bacterial cell-free expression systems due to the absence of endogenous biological processes.</p><p>While we demonstrated designs of linear and non-linear classifiers in this work with predetermined weights that generate desirable decision boundaries, we note that sigma-based neural networks are not capable of learning through common algorithms like backpropagation, i.e., the input weights that directly determine the decision boundary are chosen by the designer. However, using the mathematical analysis presented here, one can implement optimization approaches such as particle swarm optimization or other heuristic algorithms to find the appropriate weights for the generation of the desired decision boundary in silico prior to testing them in vivo or in vitro. Such optimization algorithms can also be implemented to expand the function of BNNs to regimes where non-linear effects of resource constraints or perturbations due to non-specific or unwanted binding described above appear. Therefore, developing BNNs with biological resource constraints capable of recognizing any arbitrary input pattern using optimization methods is an avenue worth exploring in the future.</p><p>Recently, it was shown that many cancer cell types can be recognized with higher precision if a combination of two antigens is used to identify them instead of the traditional single biomarker <ref type="bibr">(67 )</ref>. Implementing non-linear output patterns with sequestration-based neural networks could increase the recognition ability of engineered cellular systems like CAR T-cells by equipping them with information-processing neural networks that generate desired outputs only in designed antigen concentration regimes.
Similarly, by coupling inputs to the expression of sigma factors and their anti-sigma molecules, more sensitive, precise, and versatile in vitro biosensors for the detection of pathogens, substances, and biomarkers can be constructed.</p><p>With nanofabrication technology in the semiconductor industry approaching the physical limits of manufacturing ever smaller elements <ref type="bibr">(68 )</ref>, alternative computational devices with biological components are gaining increasing interest. However, current biocomputational systems are still in their infancy. Although biocomputation in living systems was first shown more than two decades ago using genetic circuits, the limited range of computational tasks that genetic circuits can perform, as well as the digital nature of their inputs and outputs, limits their application. With the recent rapid advance of AI, biochemical approaches that recapitulate biological neural networks, and that hold the potential to perform intricate computational tasks, have gained attention <ref type="bibr">(3 , 69 )</ref>. Our study provides a general framework for designing biological perceptrons, or linear classifiers, using existing biomolecular tools in the presence of resource constraints that are ubiquitous in physiological conditions. This framework is thus a first step towards designing sophisticated biomolecular neural networks that equip engineered cells with high-level computational and decision-making abilities.</p><p>In addition to transformative applications of sigma-based neural networks used in a forward-engineering manner in both cellular and cell-free systems, the fact that sigma and anti-sigma molecules can construct complicated computational modules elucidates the hidden capabilities of these rather simple transcriptional regulation molecules in cells. It was revealed by Park et al.
that sigma factors share the resources in a pulsatile manner <ref type="bibr">(18 )</ref>. Here, however, we demonstrated that sharing limited resources influences sigma-based processes beyond time sharing. Bacterial cells have many sigma factors, some of which are activated only when their inputs meet certain conditions. However, how bacteria utilize the principles of molecular sequestration as well as sharing limited resources to respond differently to input combinations awaits future studies. Additionally, if bacteria are able to process certain input patterns using their endogenous sigma-based neural networks, the nature of these patterns and their role in guiding bacteria to make particular decisions remain unclear.</p><p>In conclusion, our investigation demonstrates the effects of resource sharing on sigma-based sequestration neural networks with up to four sigma factors and provides an outline for designing sigma-based non-linear neural networks in bacterial systems.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">Methods</head><p>Sequestration and binding reactions were modeled as ODEs using the law of mass action.</p><p>See SI for the derivation of ODEs describing sigma-antisigma molecular sequestration and the construction of a perceptron based on their behavior. Where possible, analytical expressions were derived for the steady-state outputs of reactions. All ODEs were implemented, and their steady-state solutions computed, in custom Python code using the numpy and scipy packages.</p><p>ODEs were integrated using the odeint function from the scipy package to obtain the transient and steady-state responses of reactions. The figures were plotted using the Bokeh and Matplotlib packages. The code used for solving ODEs and generating figures is available on GitHub:</p><p>mhossein7/BNN_shared_resources</p></div></body>
		</text>
</TEI>
