<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Online Bayesian State Estimation for Real-Time Monitoring of Growth Kinetics in Thin Film Synthesis</title></titleStmt>
			<publicationStmt>
				<publisher>ACS Publications</publisher>
				<date>02/12/2025</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10596536</idno>
					<idno type="doi">10.1021/acs.nanolett.4c05921</idno>
					<title level='j'>Nano Letters</title>
<idno>1530-6984</idno>
<biblScope unit="volume">25</biblScope>
<biblScope unit="issue">6</biblScope>					

					<author>Sumner B Harris</author><author>Ruth Y López_Fajardo</author><author>Alexander A Puretzky</author><author>Kai Xiao</author><author>Feng Bao</author><author>Rama K Vasudevan</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Rapid validation of newly predicted materials through autonomous synthesis requires real-time adaptive control methods that exploit physics knowledge, a capability that is lacking in most systems. Here, we demonstrate an approach to enable real-time control of thin film synthesis by combining in situ optical diagnostics with a Bayesian state estimation method. We developed a physical model for film growth and applied the direct filter (DF) method for real-time estimation of nucleation and growth rates during pulsed laser deposition (PLD). We validated the approach using simulated and experimental reflectivity data for WSe 2 growth and ultimately deployed the algorithm on an autonomous PLD system during the growth of 1T′-MoTe 2 . The DF robustly estimates growth parameters in real time at early stages of growth, down to 15% monolayer area coverage. This fusion of in situ diagnostics, data assimilation, and physical modeling opens new opportunities in adaptive control of synthesis trajectories toward desired material states.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The past decade has seen substantial investment in machine learning and high-throughput computational methodologies for materials science in an endeavor to identify new materials with desirable properties. This is exemplified by the Materials Genome Initiative, <ref type="bibr">1</ref> which has paved the way for high-throughput computational screening and data-driven research with automated experimentation. Chemical structures can be efficiently navigated in silico via high-throughput density functional theory or molecular dynamics. The results of these efforts are stored in vast databases such as the Materials Project, <ref type="bibr">2</ref> which machine learning models can utilize to predict candidates for experimental synthesis of new materials.</p><p>Despite years of effort, the key obstacle for realizing the potential of such workflows remains the same: predicting new materials is straightforward compared to experimental validation through some synthesis modality. Recent advances in autonomous experiments have increased throughput for synthesis of thin films, <ref type="bibr">[3]</ref><ref type="bibr">[4]</ref><ref type="bibr">[5]</ref> nanoparticles, <ref type="bibr">6</ref> and bulk powders, <ref type="bibr">7</ref> but few efforts have successfully incorporated theory or literature data to narrow the parameter space and inform the commonly used Bayesian optimization loop with physics knowledge. Moreover, many predicted materials are metastable, <ref type="bibr">8</ref> which requires that synthesis trajectories be carefully controlled to drive the system toward the desired state. 
Growth trajectories could be controlled with model-based predictive control <ref type="bibr">9</ref> (MPC), which fuses a process model with in situ diagnostics data to predict and adjust a system's trajectory in real time.</p><p>While MPC has been used in chemical synthesis and process control dating back to the 1970s, <ref type="bibr">10,</ref><ref type="bibr">11</ref> it is rarely applied in physical vapor deposition (PVD) techniques due to challenges in deriving physical quantities from diagnostic data in real time and the lack of compatible film growth models. For example, reflection high-energy electron diffraction (RHEED) is ubiquitous in molecular beam epitaxy (MBE) and pulsed laser deposition (PLD) to monitor effective growth rates. <ref type="bibr">12</ref> First MPC-like efforts in PVD date back to 1984 with "phase-locked epitaxy", where the MBE source shutter is controlled based on oscillations in the RHEED signal. <ref type="bibr">13</ref> However, modeling realistic RHEED images from surface structures is still an active research area, which has led to sporadic efforts focused on neural network models for prediction and control. Neural networks for MBE control date back to 1993 <ref type="bibr">14</ref> and are typically used with RHEED data to make a predictive model of the future diffraction state <ref type="bibr">15</ref> as well as for pattern classification and clustering, implemented mostly postgrowth. <ref type="bibr">[16]</ref><ref type="bibr">[17]</ref><ref type="bibr">[18]</ref> While effective, these approaches still lack interpretability and physics awareness, which are crucial for guiding systems toward the desired states.</p><p>To realize physics-informed control over thin film synthesis, different methods are needed to infer the future state of the system. 
Optical diagnostics are being increasingly adopted in film growth and can be used with simple and accurate physical models, in contrast to RHEED, and provide similar information related to growth rates and composition. In situ ellipsometry has been used in an MPC scheme to modulate the composition of Si 1-x Ge x films during chemical vapor deposition (CVD). <ref type="bibr">19</ref> In situ Raman has also been shown to be sensitive to composition and strain during CVD <ref type="bibr">20</ref> and PLD <ref type="bibr">21</ref> but has not yet been used for MPC.</p><p>Here, we combine in situ optical diagnostics with a recently developed Bayesian state estimation method toward enabling MPC of thin film nucleation and growth rates during PLD. We developed a simple and interpretable physical model for thin film growth based on the area coverage of discrete layers, which is measurable by laser reflectivity, and apply the direct filter (DF) method of parameter estimation to determine the growth and nucleation rates of thin films in real time during PLD. We deployed our method on an autonomous PLD system and tested the algorithm in real time during growth of 1T&#8242;-MoTe 2 . We show that the DF can accurately estimate the growth model parameters at early stages of synthesis and can determine the nucleation and growth rates of the first monolayer after only &#8764;15% area coverage has been deposited. Any physical model describing film growth in the context of an in situ diagnostic can be adapted to our method, providing a powerful tool to advance adaptive control in thin film synthesis by integrating modern data assimilation techniques with these diagnostics and physical models.</p><p>For the growth of transition metal dichalcogenide (TMD) thin films with PLD, previous work showed that in situ laser reflectivity reveals sub-monolayer growth and nucleation kinetics on SiO 2 /Si substrates, where the multilayer structure enhances reflected contrast. 
<ref type="bibr">22</ref> Building on this, we developed a general growth model for an arbitrary number of TMD layers and calculated the reflected contrast using the Fresnel equations and fractional area coverage of individual layers. We use the same recursive method for calculating the Fresnel reflection coefficient of a homogeneous layer stack and refer to the previous work for a detailed description. <ref type="bibr">22</ref> Thus, our model describes the time evolution of the fractional area coverage for discrete single layers to simulate the experimentally observed reflected contrast.</p><p>To describe growth kinetics of 2D materials, we employ a two-step kinetic model that considers conversion of A to B through the nucleation and autocatalytic growth steps described by the rate constants k n and k gr , respectively.</p><p><formula notation="TeX">\mathrm{A} \xrightarrow{k_{n}} \mathrm{B} \quad \text{(nucleation, Step 1)}</formula></p><p><formula notation="TeX">\mathrm{A} + \mathrm{B} \xrightarrow{k_{gr}} 2\mathrm{B} \quad \text{(autocatalytic growth, Step 2)}</formula></p><p>This approach is the most general and can be applied to interpret sigmoidal kinetics found in many different processes that exhibit cooperative effects when the initial conversion of A to B affects the subsequent conversion process. As shown by Finney et al., <ref type="bibr">23</ref> this approach can be used to interpret the parameters of the Avrami equation <ref type="bibr">[24]</ref><ref type="bibr">[25]</ref><ref type="bibr">[26]</ref> that was initially developed in the 1940s for kinetics of phase changes as well as its numerous modifications and derivatives (see Finney et al. <ref type="bibr">23</ref> for review), e.g., for kinetics of diamond film deposition by CVD. <ref type="bibr">27</ref> Here, we apply this growth model to the PLD growth of TMDs.</p><p>Figure <ref type="figure">1a</ref> shows a schematic of the PLD synthesis process and the model used to calculate the contrast. During PLD, a solid target of the desired material is irradiated with a pulsed laser, which creates the vapor phase precursors for film growth. 
The plume species condense on the SiO 2 /Si substrate where monolayer islands begin to nucleate with a rate k n1 . Other plume species diffuse along the substrate surface and attach to existing islands at a growth rate of k gr1 . The initial area available for monolayer growth is f 0 (t = 0) = 1 with the initial fractional area coverage of the first monolayer f 1 (t = 0) = 0. As growth continues, additional layers nucleate and grow on top of the first layer islands and so on. This process can be generalized by a system of equations (eq 1), where f i is the fractional area coverage of layer number i with nucleation and growth rates of k ni and k gri , respectively, up to a total of N layers. Here, k gri is multiplied by the initial surface area, A 0 = 1, to remove the area dependence and to make both rate dimensions s -1 . The reflected contrast is given by C i = (R i -R 0 )/R 0 where R i and R 0 are the Fresnel reflection coefficients of a layer stack with i layers and the bare substrate, respectively. Thus, we can model the time evolution of contrast C r (t) with eq 2.</p><p><formula notation="TeX">\frac{df_{i}}{dt} = k_{ni}\,(f_{i-1} - f_{i}) + k_{gri} A_{0}\, f_{i}\,(f_{i-1} - f_{i}), \quad i = 1, 2, \ldots, N \qquad (1)</formula></p><p><formula notation="TeX">C_{r}(t) = \sum_{i=1}^{N} C_{i} \left[ f_{i}(t) - f_{i+1}(t) \right], \quad f_{N+1} \equiv 0 \qquad (2)</formula></p><p>Figure <ref type="figure">1b</ref> shows the simulated area coverage vs time for the first 3 TMD layers, while Figure <ref type="figure">1c</ref> shows the corresponding contrast C r (t). Equation 1 can be solved numerically for a defined number of layers. For simplified studies of single monolayer growth, eq 1 is solved analytically (eq 3) where f, k n , and k gr are the first monolayer coverage and nucleation and growth coefficients, respectively.</p><p><formula notation="TeX">f(t) = 1 - \frac{k_{n} + k_{gr}}{k_{gr} + k_{n} \exp\left[ (k_{n} + k_{gr})\, t \right]} \qquad (3)</formula></p><p>This model explains the experimental in situ reflectivity and allows for online or offline fitting methods to estimate the fundamental nucleation and growth rates for individual layers during PLD. 
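</p><p>As a minimal numerical sketch of the layer-coverage model (not the code used in this work), the coupled equations of eq 1 can be integrated with SciPy and checked against the closed-form monolayer solution of eq 3. The sketch assumes A 0 = 1 and a fixed substrate term f 0 = 1; the rate constants, the time grid, and the function names are illustrative assumptions rather than values or identifiers from this study.</p><eg xml:space="preserve"><![CDATA[
```python
import numpy as np
from scipy.integrate import solve_ivp


def layer_rhs(t, f, k_n, k_gr):
    """Right-hand side of the layer-coverage model (eq 1) with A_0 = 1.

    f[i] is the coverage of layer i + 1; the substrate is f_0 = 1.
    """
    below = np.concatenate(([1.0], f[:-1]))  # coverage of the layer underneath
    free = below - f                         # area still available to each layer
    return k_n * free + k_gr * f * free


def monolayer_coverage(t, k_n, k_gr):
    """Closed-form single-layer coverage (eq 3)."""
    s = k_n + k_gr
    return 1.0 - s / (k_gr + k_n * np.exp(s * t))


# Illustrative rate constants in 1/s (assumed values, one entry per layer).
k_n = np.array([3.0e-3, 1.0e-3])
k_gr = np.array([0.15, 0.10])

t_eval = np.linspace(0.0, 60.0, 601)
sol = solve_ivp(layer_rhs, (0.0, 60.0), np.zeros(2), t_eval=t_eval,
                args=(k_n, k_gr), rtol=1e-9, atol=1e-12)

# With a single layer, the numerical solution should match eq 3.
one = solve_ivp(layer_rhs, (0.0, 60.0), np.zeros(1), t_eval=t_eval,
                args=(k_n[:1], k_gr[:1]), rtol=1e-9, atol=1e-12)
max_err = np.max(np.abs(one.y[0] - monolayer_coverage(t_eval, k_n[0], k_gr[0])))
print(f"max |numeric - analytic| = {max_err:.2e}")
```
]]></eg><p>The rows of sol.y hold the coverages of layers 1 and 2 on the time grid; combining them into a contrast curve via eq 2 additionally requires the per-layer contrasts C i from the Fresnel calculation, which this sketch does not attempt.</p><p>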
Specifically, we aim to use this model with online, Bayesian methods to estimate the nucleation and growth rates with uncertainty to enable real-time monitoring of these quantities during PLD.</p><p>To do this, we consider a state estimation problem where eqs 1 and 2 are the state-space and observation models for film growth, and k ni and k gri are the parameters. Here, we are interested in estimating the parameters of the state-space model, rather than the state itself (the contrast), in real time.</p><p>For this purpose, we apply the DF method, <ref type="bibr">28</ref> which is a nonlinear particle filtering method that estimates dynamical state-space model parameters.</p><p>Filtering methods in data assimilation fuse information from model simulations and observed data within specific time frames. This fusion aims to refine our understanding of a dynamical system and its associated uncertainties recursively in time. Suppose one has the following state-space model X n+1 = G(X n , &#952;) + w n , n = 1, 2, 3, ..., where G(X n , &#952;) is the dynamical model describing the evolution of the state process X n at time step n defined by the state parameters &#952;, and w n represents additive noise. In addition, we have an observation model Y n+1 = H(X n+1 ) + &#958; n+1 , where H represents the observation function and &#958; n+1 is observation noise. The DF method considers the model parameters &#952; as the only state to estimate.</p><p>Details of the DF implementation used in this work are given in the Supporting Information (Note S1). We first apply DF estimation to simulated data generated using the monolayer model (eq 3) to estimate the known k n and k gr of WSe 2 monolayers to evaluate the efficacy of this approach. We find that the DF can accurately estimate the rate parameters when using a "burn-in" period that sets the artificial noise to &#1013; n = 0 after a short time. 
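</p><p>To illustrate the general idea of treating the model parameters as the state to be filtered, the following sketch runs a generic bootstrap-style particle filter over (k n , k gr ) on synthetic monolayer coverage data. It is a stand-in written for illustration and is not the DF implementation of Note S1: the prior ranges, the particle count, the jitter size, the noise level, observing the coverage directly instead of the contrast, and all function names are assumptions. The only features borrowed from the description above are the artificial parameter noise and a burn-in time after which that noise is set to zero.</p><eg xml:space="preserve"><![CDATA[
```python
import numpy as np


def coverage(t, k_n, k_gr):
    """Single-layer coverage, eq 3 (A_0 = 1); the exponent is clipped to avoid overflow."""
    s = k_n + k_gr
    return 1.0 - s / (k_gr + k_n * np.exp(np.minimum(s * t, 700.0)))


def estimate_rates(t_obs, y_obs, n_particles=2000, obs_sigma=0.01,
                   jitter=0.05, burn_in=8.0, seed=0):
    """Sequentially estimate (k_n, k_gr) from noisy coverage observations.

    Particles live in log10-rate space. Before `burn_in` seconds they receive
    artificial Gaussian jitter; afterwards the jitter is switched off.
    Returns the weighted-mean estimate after every observation.
    """
    rng = np.random.default_rng(seed)
    # Broad, assumed prior: log10(k_n) in [-4, -1], log10(k_gr) in [-2, 0].
    theta = np.column_stack([rng.uniform(-4.0, -1.0, n_particles),
                             rng.uniform(-2.0, 0.0, n_particles)])
    history = []
    for t, y in zip(t_obs, y_obs):
        if t < burn_in:
            theta = theta + rng.normal(0.0, jitter, theta.shape)
        pred = coverage(t, 10.0 ** theta[:, 0], 10.0 ** theta[:, 1])
        log_w = -0.5 * ((y - pred) / obs_sigma) ** 2  # Gaussian log-likelihood
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        history.append(10.0 ** (w @ theta))
        # Multinomial resampling keeps the ensemble equally weighted.
        theta = theta[rng.choice(n_particles, size=n_particles, p=w)]
    return np.array(history)


# Synthetic data from assumed "true" rates, with Gaussian observation noise.
rng = np.random.default_rng(1)
t_obs = np.arange(0.1, 40.0, 0.1)
y_obs = coverage(t_obs, 3.0e-3, 0.15) + rng.normal(0.0, 0.01, t_obs.size)
est = estimate_rates(t_obs, y_obs)
print("final estimate (k_n, k_gr):", est[-1])
```
]]></eg><p>Because the particles stop moving after the burn-in time, the ensemble in this sketch can only select among the parameter values it already contains, so the quality of its final estimate depends on the assumed jitter and prior; no accuracy claim is made for it.</p><p>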
Full details of this simulated data study are given in the Supporting Information (Note S2, Figure <ref type="figure">S1</ref>).</p><p>We test the DF method with previously acquired experimental data that was collected during an autonomous WSe 2 growth experiment. <ref type="bibr">5</ref> Figure <ref type="figure">2a-c</ref> show the sequential estimation of k n and k gr along with the predicted contrast for experimental data. We repeated the estimation 10 times with a different random seed to check repeatability using a burn-in time of 8 s. We observe convergent behavior similar to that in the simulated data case (Figure <ref type="figure">S1</ref>) and find the 10-trial average values of k n = (2.9 &#177; 0.3) &#215; 10 -3 s -1 and k gr = 0.147 &#177; 0.009 s -1 . This method enables projection of the contrast curve to future times, which is one possible application during online estimation. We take the parameter estimates at fixed times after the burn-in period and project the contrast forward with the uncertainty. Figure <ref type="figure">2d</ref> shows the results of these projections at 8.3, 11.3, 15.3, and 16.9 s. The uncertainty is high immediately after burn-in but decreases as more data are collected. The projections at 15.3 and 16.9 s (approximately twice the burn-in time) both have a small enough uncertainty that they could be used to reasonably predict the final growth time and rates with enough remaining time to make changes to the experimental parameters and alter the course of growth. The contrasts at these two times correspond to 15% and 19% monolayer area coverage, a difference of only 2 or 3 laser pulses. 
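</p><p>A forward projection of this kind can be sketched by pushing an ensemble of parameter samples through eq 3 and by inverting eq 3 for the time at which a target coverage is reached. The ensemble below is an assumption made for illustration: independent Gaussians built from the 10-trial averages quoted above, whereas a real filter would supply its own samples. The function names and the 95% target are likewise hypothetical.</p><eg xml:space="preserve"><![CDATA[
```python
import numpy as np


def coverage(t, k_n, k_gr):
    """Single-layer coverage, eq 3 (A_0 = 1)."""
    s = k_n + k_gr
    return 1.0 - s / (k_gr + k_n * np.exp(s * t))


def time_to_coverage(target, k_n, k_gr):
    """Invert eq 3: time at which the coverage reaches `target` (between 0 and 1)."""
    s = k_n + k_gr
    return np.log((k_n + k_gr * target) / (k_n * (1.0 - target))) / s


rng = np.random.default_rng(0)
# Assumed parameter ensemble: independent Gaussians built from the reported
# 10-trial WSe2 averages; a real filter would supply its own samples.
k_n = rng.normal(2.9e-3, 0.3e-3, 5000)
k_gr = rng.normal(0.147, 0.009, 5000)

# Project the coverage forward and summarize the ensemble spread.
t_future = np.linspace(0.0, 40.0, 81)
curves = coverage(t_future[:, None], k_n[None, :], k_gr[None, :])
mean, std = curves.mean(axis=1), curves.std(axis=1)

t_full = time_to_coverage(0.95, k_n, k_gr)
print(f"time to 95% coverage: {t_full.mean():.1f} +/- {t_full.std():.1f} s")
t15 = time_to_coverage(0.15, 2.9e-3, 0.147)
print(f"time to 15% coverage at the mean rates: {t15:.1f} s")
```
]]></eg><p>For the mean rates, the inverted eq 3 places 15% coverage at about 15.4 s, close to the 15.3 s projection time discussed above.</p><p>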
At these low coverages and at the lower limit of deposition control via the laser pulse number, we anticipate that intervening in the experiment at this time by changing the repetition rate, laser fluence, or substrate temperature could meaningfully alter the future growth and resulting properties of the film.</p><p>Notably, the predicted contrast deviates from the experimental measurements at later times (Figure <ref type="figure">2c</ref>). This is due to the nucleation of additional layers beyond the first monolayer, whose fractional area coverage begins to significantly contribute to the reflected contrast. We previously found that beyond &#8764;40% monolayer coverage, the second layer begins to nucleate. <ref type="bibr">22</ref> Indeed, the contrast at which the experiment deviates from the monolayer growth model, at &#8764;25 s, corresponds to 45% monolayer coverage, so we can anticipate second layer nucleation.</p><p>To our knowledge, state estimation methods, such as the one developed in this work, have never been deployed on any PLD system. Other Bayesian techniques have been used primarily for sequential design of experiments (Bayesian optimization) in synthesis <ref type="bibr">[3]</ref><ref type="bibr">[4]</ref><ref type="bibr">[5]</ref> to achieve some target metric in a small number of experiments. In this case, the DF method is not used for active learning or experimental design but rather uses the calculated probability distributions of the growth model parameters to accurately estimate their values with a small amount of data. To demonstrate this application in a real PLD environment, we integrated the above monolayer growth model and DF algorithm into the control software for an autonomous PLD system <ref type="bibr">5</ref> and grew ultrathin films of 1T&#8242;-MoTe 2 at different Ar background pressures to test the reliability of real-time parameter estimation under conditions with different deposition rates. 
We synthesized 2 films at 5 different pressures between vacuum (&lt;1 &#215; 10 -6 Torr) and 70 mTorr with all other deposition variables and DF algorithm parameters held constant; details are given in the Supporting Information. Deposition was automatically terminated when the contrast reached the value expected for a monolayer of 1T&#8242;-MoTe 2 , which was -0.54.</p><p>MoTe 2 can crystallize in three different phases: the hexagonal semiconducting 2H phase, the monoclinic metallic 1T&#8242; phase, or the orthorhombic T d phase, which exhibits quantum type-II Weyl semimetal behavior <ref type="bibr">29</ref> and superconductivity. <ref type="bibr">30</ref> The 2H phase is the most thermodynamically stable, but the energy difference between the 2H and 1T&#8242; phases is the lowest of all TMDs at only &#8764;35 meV. <ref type="bibr">31</ref> The 1T&#8242; phase forms at higher growth temperatures <ref type="bibr">32</ref> and under Te-deficient conditions, <ref type="bibr">33</ref> while the T d phase becomes the most stable at high temperatures <ref type="bibr">34</ref> compared to the 2H phase. Therefore, MoTe 2 is an ideal material system for exploring synthesis methodologies to tune the phase compositions of thin films. Although no direct growth of MoTe 2 by PLD has been reported, there are two studies on PLD-deposited amorphous (Mo,W)Te 2-x alloys at room temperature with post-growth annealing. <ref type="bibr">35,</ref><ref type="bibr">36</ref> Figure <ref type="figure">3a</ref> shows the Raman spectrum of a typical thick film (&#8764;10 layers) of MoTe 2 grown at 200 &#176;C. The spectrum indicates that the films crystallize predominantly in the 1T&#8242; phase, with the main A g modes of 1T&#8242;-MoTe 2 at 161 and 266 cm -1 and broad peaks in the vicinity of the B g mode at 107 cm -1 and A g mode at 110 cm -1 , consistent with literature. 
<ref type="bibr">37,</ref><ref type="bibr">38</ref> The weaker Raman peaks are not resolvable; therefore, we refer to only the general regions expected for these modes. This spectrum is similar to nanostructured 1T&#8242; films <ref type="bibr">39</ref> and also the Te-deficient films in Sun et al. <ref type="bibr">35</ref> Interestingly, the 2H phase should be more stable than the 1T&#8242; phase below &#8764;300 &#176;C. <ref type="bibr">40</ref> The 1T&#8242; phase is the dominant phase in our case due to Te deficiency, most likely caused by incommensurate evaporation of Mo and Te during ablation. It has been shown that excess Te is required to stabilize the 2H phase and that the 1T&#8242; phase forms otherwise. <ref type="bibr">33,</ref><ref type="bibr">41</ref> Film optimization and properties are not the topics of this work and will be left to future studies. Here, knowledge of the primary crystal phase is required to select the correct optical constants to conduct the DF experiments. We selected the refractive index of 1T&#8242;-MoTe 2 in this case, given in the Supporting Information.</p><p>Figure <ref type="figure">3b</ref> shows the real-time reflected contrast during each growth. The number of pulses needed to grow &#8764;1 monolayer ranges from 120 in a vacuum to 512 at 70 mTorr, with run-to-run variation likely caused by changes in the target surface that modulate the ablation yield with each laser pulse. DF parameter estimation was performed in real time without tuning the DF algorithm parameters during the synthesis of all 10 samples, demonstrating its robustness across various growth rates. Figure <ref type="figure">3c</ref> details one sample grown in vacuum, where DF contrast predictions (red) with a 2&#963; uncertainty envelope (&#177;1 standard deviation &#963;) are compared to monolayer area coverage (blue). 
DF predictions deviate from the experimental data at &#8764;49.5 s, aligning with &#8764;76% monolayer coverage, beyond which we expect significant contributions from additional layers that the model does not consider. Figure <ref type="figure">3d</ref> shows that parameter predictions stabilize after &#8764;19.5 s, with final predicted rates of k n = 3.18 &#215; 10 -3 s -1 and k gr = 8.78 &#215; 10 -2 s -1 . The smaller k gr /k n ratio of 27.6 for 1T&#8242;-MoTe 2 compared to WSe 2 (50.7) can explain the higher monolayer coverage (76% vs 40%) before additional layer nucleation.</p><p>Finally, we compared real-time DF rate predictions with post-growth analysis to determine if the rates are accurately predicted prior to the end of film growth. Using a two-layer model (eqs 1 and 2), we fit all 10 MoTe 2 contrast curves, solving the equations via integration (SciPy <ref type="bibr">42</ref> odeint) and minimizing MSE as a function of k ni and k gri (ODE fit). Figure <ref type="figure">4a</ref> shows one 50 mTorr 1T&#8242;-MoTe 2 deposition, with real-time DF predictions overlaid with the ODE fit results. Both methods fit the data well, though the DF deviates when the second layer forms, as previously noted. The key difference is that the DF is done in real time as data is received from the detector, whereas the ODE fit requires the whole curve. Figure <ref type="figure">4b</ref>,c display k n and k gr vs Ar pressure for layer 1 using DF estimation at &#8764;25% monolayer coverage (contrast of -0.135) for each deposition compared with the ODE fit results. DF accurately matches the postgrowth ODE fits, even with partial data early in the deposition. 
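</p><p>A post-growth fit of this type can be sketched as follows: the two-layer form of eq 1 is integrated with SciPy odeint, converted to a contrast curve, and the mean squared error is minimized over the four rates. This is an illustrative sketch on synthetic data and not the fitting script used here. The monolayer contrast of -0.54 is taken from the text, while the bilayer contrast, the "true" rates, the noise level, the starting guess, the area-weighted form assumed for eq 2, and the choice of a Nelder-Mead search in log space are all assumptions.</p><eg xml:space="preserve"><![CDATA[
```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import minimize

C_LAYER = np.array([-0.54, -0.80])  # C_1 from the text; C_2 is a made-up value


def rhs(f, t, k_n1, k_gr1, k_n2, k_gr2):
    """Two-layer coverage model (eq 1 with N = 2, A_0 = 1)."""
    f1, f2 = f
    return [(1.0 - f1) * (k_n1 + k_gr1 * f1),
            (f1 - f2) * (k_n2 + k_gr2 * f2)]


def contrast(t, rates):
    """Area-weighted contrast of exposed one- and two-layer regions (assumed eq 2)."""
    f = odeint(rhs, [0.0, 0.0], t, args=tuple(rates))
    return C_LAYER[0] * (f[:, 0] - f[:, 1]) + C_LAYER[1] * f[:, 1]


def mse(log_rates, t, c_obs):
    """Mean squared error; rates are fitted in log10 space to stay positive."""
    return np.mean((contrast(t, 10.0 ** log_rates) - c_obs) ** 2)


# Synthetic "measurement" from assumed rates (1/s) plus Gaussian noise.
true_rates = np.array([3.0e-3, 0.09, 1.0e-3, 0.05])
t = np.linspace(0.0, 60.0, 241)
rng = np.random.default_rng(0)
c_obs = contrast(t, true_rates) + rng.normal(0.0, 0.005, t.size)

start = np.log10([1.0e-2, 0.05, 1.0e-2, 0.05])
fit = minimize(mse, start, args=(t, c_obs), method="Nelder-Mead",
               options={"xatol": 1e-6, "fatol": 1e-12, "maxiter": 4000})
print("fitted rates:", 10.0 ** fit.x)
print("final MSE:", fit.fun)
```
]]></eg><p>How well the second-layer rates are recovered by such a fit depends on how much of the second layer is present in the fitted portion of the curve, which is one reason this kind of analysis needs the whole curve.</p><p>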
Thus, this method is highly effective for tracking growth kinetics in real time, supporting future autonomous optimization during thin film synthesis via PLD.</p><p>We applied an online Bayesian state estimation technique called the direct filter method to estimate the parameters of a thin film growth kinetics model in real time during PLD of ultrathin TMD materials. The DF estimates the growth and nucleation rate parameters of a growth model based on the area coverage of discrete layers, measurable with in situ laser reflectivity. We tested the method on simulated and previously acquired data for WSe 2 growth and ultimately deployed the algorithm on an autonomous PLD system to demonstrate real-time parameter estimation during the automated growth of 1T&#8242;-MoTe 2 . The DF method robustly estimates model parameters at early stages of growth with accuracy consistent with postgrowth analysis. By combining in situ diagnostics with a physical model, real-time control in PLD becomes feasible. Moving forward, achieving real-time control of the physical growth parameters during synthesis can be done through several approaches, depending on the experimental goals, desired level of physical insight, and available experimental automation. Simple reactive methods like proportional-integral-derivative (PID) control can be used with a single synthesis parameter to maintain a nucleation or growth rate set point. A more advanced and physically insightful method would be to express the rates as explicit functions of the synthesis parameters and define operating bounds for each to achieve a desired nucleation or growth rate. The functional forms of the rates would need to be derived theoretically, or a neural surrogate function could be constructed through active learning experiments. Finally, reinforcement learning could be used to autonomously learn an optimized control strategy over the course of numerous synthesis experiments. 
Extending this approach to more complex growth models and additional in situ techniques (RHEED, ellipsometry, etc.) enables control of different film qualities, such as crystal phase or refractive index. Combining these control strategies with online parameter estimation methods like the DF holds promise for achieving true control of synthesis for thin films and may finally enable tailored synthesis of desired metastable phases that are very difficult or impossible to reliably fabricate using existing methodologies.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>&#9632; ASSOCIATED CONTENT</head></div><note xmlns="http://www.tei-c.org/ns/1.0" place="foot" xml:id="foot_0"><p>https://doi.org/10.1021/acs.nanolett.4c05921 Nano Lett. 2025, 25, 2444-2451</p></note>
		</body>
		</text>
</TEI>
