<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>Global Predictability of Marine Heatwave Induced Rapid Intensification of Tropical Cyclones</title></titleStmt>
			<publicationStmt>
				<publisher>AGU</publisher>
				<date>12/01/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10640505</idno>
					<idno type="doi">10.1029/2024EF004935</idno>
					<title level='j'>Earth's Future</title>
<idno>2328-4277</idno>
<biblScope unit="volume">12</biblScope>
<biblScope unit="issue">12</biblScope>					

					<author>Soheil Radfar</author><author>Ehsan Foroumandi</author><author>Hamed Moftakhari</author><author>Hamid Moradkhani</author><author>Gregory R Foltz</author><author>Alex Sen Gupta</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[<title>Abstract</title> <p>Prediction of the rapid intensification (RI) of tropical cyclones (TCs) is crucial for improving disaster preparedness against storm hazards. These events can cause extensive damage to coastal areas if occurring close to landfall. Available models struggle to provide accurate RI estimates due to the complexity of underlying physical mechanisms. This study provides new insights into the prediction of a subset of rapidly intensifying TCs influenced by prolonged ocean warming events known as marine heatwaves (MHWs). MHWs could provide sufficient energy to supercharge TCs. Preconditioning by MHW led to RI of recent destructive TCs, Otis (2023), Doksuri (2023), and Ian (2022), with economic losses exceeding $150 billion. Here, we analyze the TC best track and sea surface temperature data from 1981 to 2023 to identify hotspot regions for compound events, where MHWs and RI of tropical cyclones occur concurrently or in succession. Building upon this, we propose an ensemble machine learning model for RI forecasting based on storm and MHW characteristics. This approach is particularly valuable as RI forecast errors are typically largest in favorable environments, such as those created by MHWs. Our study offers insight into predicting MHW TCs, which have been shown to be stronger TCs with potentially higher destructive power. Here, we show that using MHW predictors instead of the conventional method of using sea surface temperature reduces the false alarm rate by 30%. Overall, our findings contribute to coastal hazard risk awareness amidst unprecedented climate warming causing more frequent MHWs.</p>]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Tropical cyclones (TCs) that strike coastal areas have the potential to cause severe flooding, torrential rains, and strong winds, often resulting in severe economic disruption, property damage, and casualties. According to the list of documented catastrophic events from the Emergency Events Database (EM-DAT), the annual average damage of TCs between 2000 and 2023 exceeds $63 billion <ref type="bibr">(Delforge et al., 2023)</ref>. The extensive impacts can be more severe when TCs undergo rapid intensification (RI) <ref type="bibr">(Lok et al., 2021)</ref>. TC RI is defined as a process wherein the maximum sustained wind speed of a TC increases by at least 30 kt over a 24 hr period <ref type="bibr">(Kaplan &amp; DeMaria, 2003)</ref>.</p><p>Predicting RI remains challenging because it involves understanding internal dynamics, the interplay of environmental parameters, and nonlinear processes <ref type="bibr">(Kieu &amp; Zhang, 2009;</ref><ref type="bibr">Neetu et al., 2020;</ref><ref type="bibr">L. Zhang &amp; Oey, 2019)</ref>. RI forecast errors are estimated to be 2-3 times larger due to limited observational data and lack of understanding and prediction of inner-core processes <ref type="bibr">(Bhatia et al., 2022;</ref><ref type="bibr">Trabing &amp; Bell, 2020)</ref>. This positions RI at the forefront of the TC forecasting field <ref type="bibr">(Gall et al., 2013)</ref> and signals the urgent need for enhanced forecasting capabilities and adaptive strategies to mitigate the heightened risks of TCs. To date, researchers have predicted RI using a variety of statistical methods and numerical models (W. <ref type="bibr">Wang et al., 2023;</ref><ref type="bibr">Z. Zhang et al., 2023)</ref>.
Among the widely used models in these categories are the Statistical Hurricane Intensity Prediction Scheme Rapid Intensification Index (SHIPS-RII) <ref type="bibr">(Kaplan &amp; DeMaria, 2003;</ref><ref type="bibr">Kaplan et al., 2010)</ref> and the Hurricane Weather Research and Forecasting (HWRF) model <ref type="bibr">(Tallapragada et al., 2014)</ref>. In addition, substantial effort has gone into using data assimilation techniques such as the ensemble Kalman filter for numerical weather prediction or forecasting models to enhance their RI predictability skill <ref type="bibr">(Hartman et al., 2023;</ref><ref type="bibr">Munsell et al., 2017;</ref><ref type="bibr">Tao et al., 2022)</ref>.</p><p>The primary challenge of data-driven RI forecasting models is choosing predictors that inform RI formation and evolution <ref type="bibr">(Narayanan et al., 2023;</ref><ref type="bibr">Rozoff et al., 2015)</ref>. Therefore, there is a growing interest in developing a more robust set of environmental predictors <ref type="bibr">(Balaguru et al., 2020;</ref><ref type="bibr">Emanuel et al., 2004;</ref><ref type="bibr">Kowch &amp; Emanuel, 2015)</ref>. While high sea surface temperature (SST) and ocean heat content, low vertical wind shear, and abundant atmospheric moisture are generally recognized as favorable environmental conditions for triggering RI <ref type="bibr">(Mawren et al., 2022)</ref>, their coexistence is not necessarily required for RI. For example, Hurricane Irene (1999), Hurricane Edouard (2014) (J. A. <ref type="bibr">Zhang et al., 2017)</ref>, Hurricane Irma (2017) <ref type="bibr">(Fischer et al., 2018)</ref>, and Hurricane Michael (2018) (Le <ref type="bibr">H&#233;naff et al., 2021)</ref> developed and underwent RI despite high vertical wind shear. Similarly, <ref type="bibr">Rathore et al. (2022)</ref> showed that Cyclone Amphan (2020) experienced RI in the Bay of Bengal despite high vertical wind shear.
This was largely attributed to the presence of a strong collocated marine heatwave (MHW) before and during the development phase of Amphan. The role of MHW as a precursor to TC RI has been highlighted in recent studies <ref type="bibr">(Dzwonkowski et al., 2020;</ref><ref type="bibr">Mawren et al., 2022;</ref><ref type="bibr">Pun et al., 2023)</ref>. One of the most recent studies over the western North Pacific and North Atlantic basins showed that the TCs that passed over MHWs intensified roughly three times more rapidly than the TCs that did not <ref type="bibr">(Choi et al., 2024)</ref>. Although it is commonly known that SST is a reliable indicator of the RI process <ref type="bibr">(Foltz et al., 2018)</ref>, the influential characteristics of MHW for initiating RI events have not yet been fully explored. Additionally, to the best of our knowledge, there is no research done on utilizing MHW characteristics as predictors of RI events.</p><p>Over the last decade, machine learning (ML) and deep learning (DL) methods have emerged as viable means of predicting TC RI. For example, the long short-term memory (LSTM) and the convolutional neural network (CNN) are among the most widely used DL methods in experimental RI prediction models <ref type="bibr">(Griffin et al., 2024;</ref><ref type="bibr">Li et al., 2017;</ref><ref type="bibr">C.-H. Zhang &amp; Zhang, 2023;</ref><ref type="bibr">Zhou et al., 2022)</ref>. Aside from the computational costs of DL-based models, additional challenges could arise in selecting a candidate architecture and designing layers. By contrast, shallow ML-based algorithms are more popular for RI predictions because of their simplicity and low computational demands. The supervised ML models typically attempt to use individual or ensemble algorithms to solve a classification or regression problem and identify RI occurrences. The first ML effort in this field focused on the application of the Naive Bayesian and logistic regression techniques. 
<ref type="bibr">Rozoff and Kossin (2011)</ref> used SHIPS to train these models for the probabilistic prediction of rapid intensity change over the North Atlantic and eastern North Pacific Ocean basins. Apart from the logistic regression algorithm, which is the most frequently used algorithm in the literature <ref type="bibr">(Kaplan et al., 2015;</ref><ref type="bibr">Knaff et al., 2023;</ref><ref type="bibr">Ko et al., 2023;</ref><ref type="bibr">Narayanan et al., 2023;</ref><ref type="bibr">Su et al., 2020;</ref><ref type="bibr">Tam et al., 2021)</ref>, several studies have used tree-based algorithms; for instance, Decision Tree <ref type="bibr">(Narayanan et al., 2023;</ref><ref type="bibr">Su et al., 2020)</ref>, Random Forest <ref type="bibr">(Ko et al., 2023;</ref><ref type="bibr">A. Mercer &amp; Grimes, 2017;</ref><ref type="bibr">Su et al., 2020)</ref>, Extremely Randomized Trees <ref type="bibr">(Su et al., 2020)</ref>, and eXtreme Gradient Boosting (XGBoost) <ref type="bibr">(Wei &amp; Yang, 2021)</ref>. Common alternatives included using support vector machines (A. <ref type="bibr">Mercer &amp; Grimes, 2015</ref><ref type="bibr">, 2017;</ref><ref type="bibr">A. E. Mercer et al., 2021)</ref> or artificial neural networks <ref type="bibr">(Ko et al., 2023;</ref><ref type="bibr">A. Mercer &amp; Grimes, 2017)</ref>.</p><p>ML-based methods, as data-driven models, primarily work by utilizing the existing information within the data, making them reliant on the quality and representativeness of the data. Another limitation for predicting RI with more accuracy is the imbalanced data issue, which makes it difficult to train accurate models <ref type="bibr">(DeMaria et al., 2021;</ref><ref type="bibr">Yang et al., 2020)</ref>. In ML, imbalanced data often leads to models being overexposed to the majority samples during training, causing a bias toward those instances and neglect of the minority samples. 
Considering that RI events are generally limited compared to non-RI situations, resampling techniques should be used to prevent models from being skewed toward the majority class. These techniques adjust the dataset composition, either by augmenting the minority class (referred to as oversampling) or by reducing the majority class (known as undersampling), or by combining both methods. <ref type="bibr">Ko et al. (2023)</ref> adopted a type of the Synthetic Minority Oversampling Technique (SMOTE), called Borderline SMOTE, as an oversampling scheme and edited nearest neighbor as an undersampling method to rebalance the training datasets for RI forecasting. In RI studies, SMOTE-based techniques for synthesizing new samples are more popular than other methods, and it has been demonstrated that SMOTE can successfully address and rectify the imbalance issue <ref type="bibr">(Kim et al., 2024;</ref><ref type="bibr">Li et al., 2017;</ref><ref type="bibr">Shaiba &amp; Hahsler, 2016;</ref><ref type="bibr">Wei et al., 2023;</ref><ref type="bibr">Wei &amp; Yang, 2021;</ref><ref type="bibr">Yang et al., 2020)</ref>.</p><p>The overarching goal of this study is to develop a global predictive ML model for a subset of RI events impacted by nearby MHWs. Notably, the RI forecast errors are largest in favorable environments like high SSTs <ref type="bibr">(Trabing &amp; Bell, 2020)</ref>. Given that MHWs create conditions of anomalously high SSTs, portions of TC tracks influenced by MHWs have a greater potential for significant forecast errors. These conditions not only challenge accurate forecasting but also tend to lead to stronger TC intensities <ref type="bibr">(Choi et al., 2024;</ref><ref type="bibr">Radfar et al., 2024)</ref>. Therefore, it is crucial to further improve models to better predict TC behavior in MHW-affected areas. For this purpose, global RI and MHW events are identified over the 1981-2023 period. 
According to the framework by <ref type="bibr">(Bevacqua et al., 2021)</ref>, MHWs and RIs may form temporally and spatially compounding events. This means when they occur close together in time and space, their combined presence can facilitate the rapid intensification process of TCs. To identify compound MHW and RI events, the spatiotemporal association of MHW events to the onset of RI events is analyzed. The global pattern of these events is discussed and then used to identify hotspot regions for the formation of MHW-RI events in Section 3.2. Finally, four tree-based algorithms are trained, validated, and tested to build a robust ensemble machine learning algorithm that performs well in predicting MHW-induced RI events globally. The performance of the model in capturing the costly TC events of 2021-2023 is examined in Section 3.3.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Materials and Methods</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Data Sources</head><p>We use TC best-track data to identify RI events. The dataset is sourced from the publicly available International Best Track Archive for Climate Stewardship (IBTrACS) dataset (K. R. <ref type="bibr">Knapp et al., 2018)</ref>. This dataset contains 3-hourly interpolated TC track data from 1841 to the present. The latitude and longitude values in the dataset are rounded to the nearest 0.1&#176;, which corresponds to approximately 10 km in spatial resolution. The data includes the maximum sustained wind speed, radius, pressure, storm translational speed, and direction. The NOAA Optimum Interpolation Sea Surface Temperature version 2.1 (NOAA OISST v2.1) dataset is used in this study for MHW detection. This product provides the daily SST data <ref type="bibr">(Huang et al., 2021)</ref> from 1 September 1981 through 19 October 2023 at a spatial resolution of 25 km.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Data Preprocessing and Event Detection</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1.">TC Data</head><p>The IBTrACS dataset is a collection of data from multiple agencies. Therefore, the data needs to be preprocessed in order to remove any inconsistencies, including unifying longitudes and merging wind speed formats from different agencies. For a single latitude and longitude pair, the dataset may contain multiple wind speeds from different agencies, while we need a single value for each location. To solve this issue, we select a unique wind speed for each location. We use the wind speed provided by the USA agencies where data is available and non-zero. This choice arises from the fact that this type of information in IBTrACS has the best coverage (around 85% coverage for all reported TC track locations). If this wind speed is not available, we find the distance from the row's coordinates to other agencies' reported locations and select the wind speed of the closest agency. Subsequently, the processed data are used to detect RI events according to the United States' National Hurricane Center definition, whereby RI occurs if the maximum sustained wind speed increases by at least 30 knots in 24 hr <ref type="bibr">(Kaplan &amp; DeMaria, 2003)</ref>. In our analysis, we included only those points in the best track data that were classified as tropical or subtropical cyclones, excluding all other storm types such as extratropical, post-tropical, waves, and disturbances.</p></div>
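<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the RI definition used above, the following Python sketch flags RI onset points along a single track. It assumes a gap-free, 3-hourly wind series in knots, so that 24 hr corresponds to eight records, and it compares only the wind speed at the start and end of each 24 hr window. The function name flag_ri_onsets and the sample track are hypothetical and are not taken from the study's code.</p>
<eg xml:space="preserve"><![CDATA[
```python
from typing import List

RI_THRESHOLD_KT = 30  # minimum 24 hr wind increase (Kaplan & DeMaria, 2003)
STEPS_PER_24H = 8     # assumption: 3-hourly records, so 8 steps span 24 hr


def flag_ri_onsets(wind_kt: List[float]) -> List[bool]:
    """Flag each record whose wind speed rises by >= 30 kt over the next 24 hr.

    Records within the last 24 hr of the track cannot be evaluated
    and are flagged False.
    """
    flags = []
    for i in range(len(wind_kt)):
        j = i + STEPS_PER_24H
        is_onset = j < len(wind_kt) and wind_kt[j] - wind_kt[i] >= RI_THRESHOLD_KT
        flags.append(is_onset)
    return flags


# Hypothetical 3-hourly maximum sustained winds (kt) for one storm.
track = [35, 35, 40, 45, 50, 55, 60, 65, 70, 75, 75, 75]
onsets = flag_ri_onsets(track)
print(onsets)
```
]]></eg>
<p>Because every 3-hourly record is tested, consecutive records can all be flagged as onsets of overlapping 24 hr windows, which matches the point-wise treatment of track records as independent samples.</p></div>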
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2.">MHW Events</head><p>We conducted a pointwise analysis to characterize global MHW events at the grid points of the NOAA OISST v2.1 dataset, which features a 0.25&#176; grid resolution. The detection period is from 1 September 1981 to 19 October 2023. Here, a MHW event is defined as a period in which SST is above the local and seasonally evolving 90th percentile threshold of SSTs for at least five consecutive days <ref type="bibr">(Hobday et al., 2016)</ref>. For each grid point, we identified the days with SSTs above the 90th percentile. Additionally, two events less than 3 days apart are considered a single MHW event.</p><p>A 30 year baseline period, from 1991 to 2020, is used in our analysis. The climatological mean and percentile threshold are calculated for each calendar day from the pool of daily SSTs within an 11 day window centered on that day to ensure sufficient sample sizes for estimating these values <ref type="bibr">(Oliver et al., 2018</ref><ref type="bibr">, 2019)</ref>. The climatological means and thresholds for each day are further smoothed by applying a 31 day moving average centered on each day to the 30 year historical data, to eliminate high-frequency noise (Sen <ref type="bibr">Gupta et al., 2020)</ref>. This MHW characterization is implemented using an R package developed by <ref type="bibr">Schlegel and Smit (2018)</ref>.</p></div>
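<div xmlns="http://www.tei-c.org/ns/1.0"><p>The detection rule described above can be sketched in Python for a single grid point. This is a simplified illustration, not the R package used in the study: it assumes the seasonally varying 90th percentile threshold has already been computed for every day, and it interprets "less than 3 days apart" as a gap of at most two days between events. The names detect_mhw, MIN_DURATION, and MAX_GAP are hypothetical.</p>
<eg xml:space="preserve"><![CDATA[
```python
from typing import List, Tuple

MIN_DURATION = 5  # consecutive days above threshold required for an event
MAX_GAP = 2       # events separated by fewer than 3 days are merged


def detect_mhw(sst: List[float], thresh: List[float]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) day indices of marine heatwave events."""
    if len(sst) != len(thresh):
        raise ValueError("sst and thresh must have the same length")

    # Step 1: find runs of at least MIN_DURATION days above the threshold.
    events = []
    start = None
    for day, (temp, limit) in enumerate(zip(sst, thresh)):
        if temp > limit:
            if start is None:
                start = day
        elif start is not None:
            if day - start >= MIN_DURATION:
                events.append((start, day - 1))
            start = None
    if start is not None and len(sst) - start >= MIN_DURATION:
        events.append((start, len(sst) - 1))

    # Step 2: merge events separated by a gap of MAX_GAP days or fewer.
    merged: List[Tuple[int, int]] = []
    for event in events:
        if merged and event[0] - merged[-1][1] - 1 <= MAX_GAP:
            merged[-1] = (merged[-1][0], event[1])
        else:
            merged.append(event)
    return merged


# Hypothetical 20 day series with a constant 28.0 degC threshold:
# 5 warm days, a 2 day gap, 6 warm days, then a 3 day spell that is too short.
threshold = [28.0] * 20
series = [27.0] * 2 + [29.0] * 5 + [27.0] * 2 + [29.0] * 6 + [27.0] + [29.0] * 3 + [27.0]
mhw_events = detect_mhw(series, threshold)
print(mhw_events)
```
]]></eg>
<p>In this toy series the two qualifying warm spells are joined into one event because only two days separate them, while the final 3 day spell is discarded for being shorter than five days.</p></div>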
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Methodology</head><p>The general framework implemented in this study is presented in Figure <ref type="figure">1</ref>. Our goal is to create a global ML model that can predict RI based on TC characteristics and prior MHWs that occurred close to the associated TCs.</p><p>To achieve this, a set of structurally different forecast models is developed to predict if RI occurs for a particular TC over each 24 hr period over the TC lifetime. The models' predictors include characteristics of the TC in question as well as information of ocean state, including any collocated MHWs (the list of all predictors is included in Table <ref type="table">S1</ref> in Supporting Information S1). The models' hyperparameters (model settings) are first tuned using a grid search approach. We then combine the separate forecast models into a single 'ensemble' model to enhance the forecast skill. Finally, we use the model to predict RI globally and regionally (i.e., basin-wise).</p><p>MHWs are identified based on the Hobday definition (Section 2.2.2). We next identify TCs that pass near to MHW events to create a compound MHW-TC dataset. A compound event occurs if a TC track lies within 200 km of a MHW. This distance spans the inner and outer core radii of a TC <ref type="bibr">(Weatherford &amp; Gray, 1988)</ref>. In addition, a temporal condition must also be met. In this regard, a MHW must be present 10 days or less before the TC record time. This is the typical period used for analyzing MHW conditions before the RI phase <ref type="bibr">(Dzwonkowski et al., 2020;</ref><ref type="bibr">Rathore et al., 2022)</ref>.</p><p>Using our database of TCs that pass close to MHWs, we next examine all locations along the track every 3 hr (the temporal resolution of the IBTrACS dataset). Each location/time is considered an independent sample. 
For each location we determine certain MHW metrics (Table <ref type="table">1</ref>) based on the proximal MHW events. We also determine if RI occurs over the subsequent 24 hr (this is labeled a MHW-RI event). After this step, we have an updated MHW-TC dataset that has two classes (positive class or RI class vs. negative class or non-RI class). This dataset is used as input for our ML models. Based on TC and MHW metrics for each location and time along the TC track, we want to predict whether RI occurs or not. Figure <ref type="figure">2</ref> illustrates how we build the compound MHW-TC dataset using the track of Hurricane Ian (2022). The red points represent the MHW locations that meet our spatiotemporal requirements (within 200 km and within 10 days of the TC track), while the yellow stars indicate intensification points, and the blue line shows the path of Hurricane Ian. If MHW information is paired with a TC track location where intensification happened, it is flagged as a positive class; otherwise, it is flagged as a negative class.</p><p>In developing ML algorithms, it is common practice to partition the dataset into training, validation, and test subsets to build the model, refine its performance during training, and evaluate its prediction accuracy on unseen data, respectively. In our analysis, the testing period (2021-2023) covers approximately 20% of the total compounding MHW-TC events in the dataset (Figure <ref type="figure">3</ref>). It should be noted that this timeframe covers the most recent compounding events and provides a clearer representation of the warming patterns observed in recent years. The rest of the dataset is used to train and validate the model. Our analysis shows that the average number of TC Best Track points per year associated with MHWs during 2021-2023 increased to approximately 3036, compared to an average of 2177 per year during the training/validation period.
This represents about a 40% increase and highlights the potential impact of rising SSTs and greater prevalence of MHWs in a warmer world. For training and validation, we divide the input data from 1981 to 2020 into five equal time periods (a.k.a. folds). We use four folds for training and one for validation.</p><p>Four tree-based ML classification algorithms (namely XGBoost, Random Forest (RF), ExtraTrees (ET) and LightGBM) are selected to develop a predictive RI binary model (see SI for more details on these algorithms). As discussed earlier, RI forecast is an imbalanced problem, where the number of non-RI events along a TC path is significantly larger than RI cases. In our compound MHW-TC dataset, to rebalance the training datasets and improve predictions of RF and ET algorithms for the minority class (RI class), we rely primarily on the original Synthetic Minority Over-sampling Technique (a.k.a. SMOTE) proposed by <ref type="bibr">Chawla et al. (2002)</ref>. In this method, instead of simply replicating minority samples, synthetic samples are generated by interpolating along the line connecting several minority class instances within a defined neighborhood <ref type="bibr">(Fern&#225;ndez et al., 2018;</ref><ref type="bibr">L. Wang et al., 2021)</ref>. In our study, following a trial-and-error approach, we set the neighborhood as the 5 closest instances around the sample of interest. The mathematical representation of SMOTE is as follows <ref type="bibr">(Pradipta et al., 2021)</ref>: <formula notation="TeX">X_{\mathrm{new}} = X_i + \delta \, (X_j - X_i), \qquad \delta \in [0, 1] \qquad (1)</formula></p><p>where <formula notation="TeX">X_i</formula> is a minority sample, <formula notation="TeX">X_j</formula> is one of its 5 nearest neighbors, and <formula notation="TeX">\delta</formula> is a random number between 0 and 1. The difference between the two samples is multiplied by <formula notation="TeX">\delta</formula> and added to the selected sample <formula notation="TeX">X_i</formula>.
This implementation results in a synthetic sample on the line segment connecting the two samples.</p><p>In addition to the SMOTE rebalancing, XGBoost and LightGBM algorithms also have an internal parameter, called "scale_pos_weight", for dealing with imbalanced datasets. The "scale_pos_weight" parameter, determined via hyperparameter tuning, acts as a multiplier for the loss function of the positive class, which is the minority class. The general representation of this parameter in the loss function of a binary classifier is as follows: <formula notation="TeX">L(y_i, \hat{y}_i) = -\left[\alpha \, y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i)\right] \qquad (2)</formula></p><p>where <formula notation="TeX">L(y_i, \hat{y}_i)</formula> is the loss for a single instance, <formula notation="TeX">y_i</formula> is the true label for the instance, <formula notation="TeX">\hat{y}_i</formula> is the predicted probability for the positive class (<formula notation="TeX">y_i = 1</formula>), and &#945; is the scale_pos_weight parameter that acts as a multiplier for the loss associated with the minority class. We assign a value greater than one in order to make the model prioritize the positive class, imposing heavier penalties for inaccurate classification of this class. This parameter helps these algorithms be less biased toward the negative (or majority) class. We also evaluated the computational cost of using this method instead of SMOTE for XGBoost and LightGBM. In our model configuration, using the scale_pos_weight parameter results in about 20% faster runtime compared to the SMOTE option and also a simpler training process, since scale_pos_weight is an internal parameter of these models and can be adjusted iteratively.</p><p>Next, the set of predictors describing the compound MHW-TC relationship for the ML models is determined by performing feature importance analysis. This approach is used to determine which of the potential predictors (i.e., MHW and TC metrics determined previously) are most useful for making RI predictions. We employ the Shapley additive explanations (SHAP) model explainer <ref type="bibr">(Lundberg &amp; Lee, 2017)</ref> to rank the importance of various features and select the most influential ones.
This metric measures the contribution of each predictor by comparing the predictive performance of the model in the absence and presence of that predictor. This method is built upon calculating Shapley values as formulated in Equation 3 <ref type="bibr">(Molnar, 2022)</ref>: <formula notation="TeX">\phi_j = \sum_{S \subseteq \{1, \ldots, N\} \setminus \{j\}} \frac{|S|! \, (N - |S| - 1)!}{N!} \left[ v(S \cup \{j\}) - v(S) \right] \qquad (3)</formula></p><p>where j is the given feature, S is any subset of features that does not include the jth feature, |S| is its size, N is the number of all features, <formula notation="TeX">v(S \cup \{j\})</formula> is the model output for the subset S with the jth feature added, v(S) is the model output for S without the jth feature, the function <formula notation="TeX">v(\cdot)</formula> defines the prediction output of the model, and <formula notation="TeX">\sum_{S \subseteq \{1, \ldots, N\} \setminus \{j\}}</formula> denotes summation over all possible combinations without j.</p><p>Every ML algorithm has hyperparameters or a set of algorithm parameters that control its learning process. During the tuning process, different values for the hyperparameters are tested to find a set of optimal hyperparameters for a learning algorithm. The list of tuned hyperparameters of the four candidate ML algorithms is presented in Tables S2-S5 in Supporting Information S1. The trained ML models are then evaluated over the 2021-2023 period, which is after the training period and represents an unseen and independent MHW-TC dataset.</p><p>Various performance metrics are used during the validation and evaluation steps. When dealing with imbalanced datasets, a common approach is to assess how well a model performs in terms of precision (P) and recall (R) <ref type="bibr">(Cavaiola et al., 2024)</ref>. In particular, evaluation of RI forecasting models often relies on Probability of Detection (POD) and False Alarm Ratio (FAR) (Figure <ref type="figure">3</ref>) as standard metrics for testing the performance of both operational and proposed models <ref type="bibr">(Narayanan et al., 2023)</ref>. FAR is related to P or the Success Ratio (SR), with P being defined as 1-FAR. Additionally, POD or Sensitivity is equal to R. DeMaria et al. (2021) suggest that a model is valuable if it has a non-zero POD and a FAR of 50% or less, since it detects some RI cases and is correct at least half of the time. Another important metric in RI prediction models is the bias score. Bias is related to POD and SR values and is calculated as follows: <formula notation="TeX">\mathrm{bias} = \frac{\mathrm{POD}}{\mathrm{SR}} \qquad (4)</formula></p><p>A bias score of 1.0 indicates that the number of RI events predicted by a model is the same as the observed number of RI cases <ref type="bibr">(DeMaria et al., 2021)</ref>. Therefore, bias scores greater than 1.0 suggest overprediction, while scores less than 1.0 indicate underprediction.</p><p>To achieve a trade-off between the POD and FAR metrics, the models are trained using solely the training sets, and their performance on the validation set is evaluated based on the F1-score during the hyperparameter tuning process. This is accomplished by selecting the hyperparameter set with the highest 5-fold mean F1-score for the four candidate ML models. The F1-score is the harmonic mean of P and R and is calculated as <formula notation="TeX">F_1 = 2PR/(P + R)</formula>. Based on three entries listed in the confusion matrix of Figure <ref type="figure">3</ref>, these metrics can be calculated: "TP" is True Positive, "FP" is False Positive, and "FN" is False Negative, so that POD = TP/(TP + FN) and FAR = FP/(TP + FP).</p><p>Ensemble learning is a ML technique that leverages the complementary strengths of multiple algorithms to improve overall performance (more detail is included in the SI). We explore the possibility of integrating a combination of the four ML models into an ensemble ML model to benefit from the predictive capability of the combination of the individual models. In an ensemble model, each individual ML algorithm contributes to the final prediction of the model according to its weight. The weights of this ensemble model were automatically assigned based on their "POD-FAR" values. Our algorithm tries different combinations of weights for each model (sum of the weights is 1), calculates the metric, and finally picks the combination that achieves the best score.
We select the best ensemble ML model so that it performs on the testing set with a FAR &lt;50% and there is a larger margin between its POD and FAR (i.e., better POD-FAR). The developed ensemble model will be referred to as "LightEX". Next, we evaluate the LightEX's performance across various basins. Here, we select the portion of test period events belonging to each region and run the developed model for them to evaluate the performance of the global model for different TC basins. Finally, we evaluate the actual improvement achieved by using MHW predictors compared to using pre-storm SST as a predictor. The motivation for this comparison is that SST is the most common predictor in the available RI forecast models, and we want to determine whether using MHW predictors instead of SST is sufficiently informative for the model or not.</p></div>
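<div xmlns="http://www.tei-c.org/ns/1.0"><p>The verification metrics and the ensemble weighting step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the ensemble blends model probabilities by a weighted average, classifies RI at a 0.5 probability threshold, and searches weights on a regular 0.1 grid instead of drawing them at random. The names scores, weight_grid, and best_weights, as well as the toy labels and probabilities, are hypothetical.</p>
<eg xml:space="preserve"><![CDATA[
```python
from itertools import product
from typing import Dict, Iterator, Sequence, Tuple


def scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute POD, FAR, bias, and F1 from binary labels (1 = RI)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    pod = tp / (tp + fn) if tp + fn else 0.0  # probability of detection (recall)
    far = fp / (tp + fp) if tp + fp else 0.0  # convention: 0 if nothing is predicted
    sr = 1.0 - far                            # success ratio (precision)
    bias = pod / sr if sr else float("inf")
    f1 = 2 * sr * pod / (sr + pod) if sr + pod else 0.0
    return {"pod": pod, "far": far, "bias": bias, "f1": f1}


def weight_grid(n_models: int, steps: int = 10) -> Iterator[Tuple[float, ...]]:
    """Yield all weight tuples on a 1/steps grid whose entries sum to 1."""
    for combo in product(range(steps + 1), repeat=n_models):
        if sum(combo) == steps:
            yield tuple(c / steps for c in combo)


def best_weights(
    probs: Sequence[Sequence[float]],
    y_true: Sequence[int],
    threshold: float = 0.5,  # assumption: RI is predicted at probability >= 0.5
) -> Tuple[float, Tuple[float, ...]]:
    """Return (score, weights) maximizing POD - FAR for the blended forecast."""
    best_score, best_combo = float("-inf"), ()
    for weights in weight_grid(len(probs)):
        blended = [
            sum(w * model[i] for w, model in zip(weights, probs))
            for i in range(len(y_true))
        ]
        y_pred = [1 if b >= threshold else 0 for b in blended]
        result = scores(y_true, y_pred)
        score = result["pod"] - result["far"]
        if score > best_score:
            best_score, best_combo = score, weights
    return best_score, best_combo


# Toy example with two hypothetical models and six samples.
labels = [1, 1, 0, 0, 0, 1]
model_a = [0.9, 0.4, 0.6, 0.2, 0.1, 0.7]
model_b = [0.6, 0.7, 0.2, 0.3, 0.1, 0.3]
best_score, best_w = best_weights([model_a, model_b], labels)
print(best_score, best_w)
```
]]></eg>
<p>In the toy data each model alone misses one RI case, and the first also raises a false alarm, whereas a suitable blend classifies all six samples correctly. This illustrates why weighting by POD-FAR can favor a combination over any single model.</p></div>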
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results and Discussion</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Global Distribution of RI Points</head><p>There are well-known regions around the world that provide favorable conditions for the formation and strengthening of TCs, also known as tropical cyclogenesis <ref type="bibr">(Hsu, 2003)</ref>. These active TC basins are the North Atlantic, East Pacific, Northwest Pacific, North Indian, Southwest Indian, Australian (including two subregions in the southeastern Indian Ocean and the western Oceania), and East Australian (located in the southwestern Pacific). Figure <ref type="figure">4</ref> illustrates the locations for the starting points of historical RI events along with the cumulative distributions of RI starting wind speeds across active TC basins. Results show that RI events generally do not occur when wind speeds exceed approximately 125 knots in all basins except for the East Pacific, where the limit is around 155 knots. The wind speed of 155 knots was observed during Hurricane Patricia, which experienced RI on 23 October 2015, when the wind speed increased to 185 knots in just 6 hr. Patricia stands out as the most powerful TC ever recorded globally in terms of maximum sustained wind speed <ref type="bibr">(Kimberlain et al., 2016)</ref>.</p><p>Results show that RI events generally start at higher wind speeds over the Northwest Pacific, where the 90th percentile of wind speed at RI initiation is 100 knots, 10-20 knots more than in any other basin. This is also confirmed by a denser population of dark red dots over that basin (Figure <ref type="figure">3</ref>). The tropical Northwest Pacific is the most active TC region, with more than twice as many events as the East Pacific basin. According to the percentile analysis, TCs start undergoing RI between 30 and 90 knots (10th to 90th percentiles).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Global Compound MHW and RI Events</head><p>A comparative analysis of the global compound MHW-RI characteristics in active TC basins is presented in Figure <ref type="figure">5</ref> by depicting the decadal, latitudinal and longitudinal variations in the number of RI events, normalized by the number of TCs, associated with MHWs. It is important to note that the frequency of RI events is based on a 3 hr time gap between two consecutive RI onset points because our ML model analyzes all recorded points on TC tracks for RI prediction. Therefore, these numbers represent overlapping RI events. The decadal plots illustrate an increase in the ratio of MHW-induced RI events over most basins. In particular, the number of RI events in the 2020s alone reached or surpassed the 2010s in all basins except the East Australian. For the Australian and the Northwest Pacific basins, we observe a nearly constant frequency since the 1990s at around 0.6-0.7. In the East Pacific, data for the 1990s, 2000s and 2020s exhibit consistently high activity of MHW-induced RI events, with RI frequencies of 0.6-0.8 per year, which is close to the peak for other basins. In the East Australian basin, peak activity was</p><p>From the spatial distribution of MHW-induced RI events, it is evident that latitude-wise, the highest RI frequencies are distributed across a wide range of latitudes for the East Pacific, the Southwest Indian, and the East Australian basins. For other basins, we observe a more localized distribution of RI events, with peak frequencies occurring within specific latitude ranges that vary by basin. Closer inspection of the Northern Hemisphere basins reveals that the 10-30&#176;N band has the strongest activity. The highest RI frequencies of about 2 are observed in the North Indian and the East Australian basins.
On the other hand, the North Atlantic basin has the lowest latitude ratios of less than 1.5, indicating a lower relative chance of RI occurrence among all TC paths over this region. Considering the longitudes of the events, the highest RI frequencies, greater than 2, are observed in the East Pacific, the North Indian, and both Australian basins. Overall, the lowest RI frequencies, around 1.5, are in the Northwest Pacific basin. However, for this basin and the North Atlantic basin, the calculated RI frequencies have a broader distribution across the longitudinal bands, suggesting diverse regional MHW influences. Generally, the spatial distribution results (Figure <ref type="figure">4</ref>) suggest that the Caribbean Sea (near the Lesser Antilles), westernmost Pacific waters, coastlines of India in the Bay of Bengal, eastern coasts of the Philippines, northwest coasts of Australia, and the east of the Southwest Indian basin are the regional hotspots for the formation of MHW-induced RI events.</p></div>
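<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a rough illustration of how overlapping RI onset points can be counted and normalized by the number of TCs, the following Python sketch flags every 3-hourly track point whose wind speed is at least 30 knots higher 24 hr later (the RI threshold noted in the Conclusions) and divides the MHW-associated count by the number of TCs per decade. The function names, the input layout, and the per-point MHW flag are hypothetical and chosen for illustration only; this is not the processing code used in this study, and the exact normalization behind Figure 5 may differ.</p>

```python
from collections import defaultdict

RI_DELTA_KT = 30   # wind increase that defines RI (see Conclusions)
RI_WINDOW_HR = 24  # window over which the increase is measured
STEP_HR = 3        # assumed spacing between consecutive track points


def ri_onsets(winds_kt):
    """Indices of track points whose wind is at least 30 kt higher 24 hr later."""
    steps = RI_WINDOW_HR // STEP_HR
    return [
        i
        for i in range(len(winds_kt) - steps)
        if winds_kt[i + steps] - winds_kt[i] >= RI_DELTA_KT
    ]


def normalized_ri_by_decade(storms):
    """Count MHW-flagged RI onset points per decade, divided by the TC count.

    storms: iterable of (year, winds_kt, mhw_flags) tuples, one per TC, where
    mhw_flags[i] is True if an MHW is associated with track point i
    (a hypothetical, precomputed flag).
    """
    ri_points = defaultdict(int)
    tc_count = defaultdict(int)
    for year, winds_kt, mhw_flags in storms:
        decade = year // 10 * 10
        tc_count[decade] += 1
        ri_points[decade] += sum(1 for i in ri_onsets(winds_kt) if mhw_flags[i])
    return {d: ri_points[d] / tc_count[d] for d in sorted(tc_count)}


# Synthetic tracks for illustration only (3-hourly winds in knots).
storms = [
    (1995, [35, 40, 45, 50, 55, 60, 65, 70, 75, 80], [True] * 10),
    (1998, [30] * 10, [False] * 10),
    (2021, [40, 45, 55, 65, 75, 85, 95, 100, 105, 110, 115], [True] * 11),
]
ratios = normalized_ri_by_decade(storms)
```

<p>Because consecutive onset points are only 3 hr apart, a single intensification episode contributes several counts, which is why the resulting ratios represent overlapping events and can exceed 1.</p></div>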
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Performance of the Predictive ML Model</head><p>Using the SHAP method, we select the most important features to develop ML algorithms more efficiently. Among the 31 features considered in this analysis (Table <ref type="table">S1</ref> in Supporting Information S1), the SHAP results indicate that the 12 listed in Table <ref type="table">1</ref> are the most influential parameters (i.e., highest mean absolute SHAP values) affecting the predictive ability of the ML model. The results suggest that aside from the geographical coordinates of MHW and TC events, the most important storm characteristics are wind speed and distance to land. Additionally, mean absolute intensity is the most important MHW metric.</p><p>Using the most influential predictors, hyperparameter tuning was performed for the four individual algorithms using the grid search method on a High-Performance Computing (HPC) system equipped with 120 cores (3.0 GHz). The runtimes for executing the models are presented in Figure <ref type="figure">5</ref>. The results show that the LightGBM model runs significantly faster than the other models (8 times faster than XGBoost, 12 times faster than ET, and 24 times faster than RF) while producing predictions with comparable skill. This showcases the superiority of LightGBM in terms of computational efficiency for large datasets (in our case, more than 4 million MHW-TC events; see Supplementary Information for further details about sample size). The predictions from these models were then combined to build an ensemble model. For this purpose, the implemented algorithm randomly assigns a weight between 0 and 1 to each candidate model to find the best weights in terms of the POD-FAR metric. The best ensemble model adopts a weight set of 0.3, 0.0, 0.1, and 0.6 for XGBoost, RF, ET, and LightGBM, respectively. 
As can be seen, RF does not add value to the ensemble model and is therefore excluded from it by the implemented algorithm. The weighted average of the probabilities assigned to each class by all models is then calculated, and the output class is predicted based on the argmax of these weighted averages. The resultant performance metrics for the individual models and the ensemble model (LightEX) are presented in Figure <ref type="figure">6</ref>. These metrics indicate that LightEX exhibits a good balance between detection accuracy and precision (POD = 54% and FAR = 49%), and the results are provided without over- or underprediction (bias = 1.06).</p><p>To measure the applicability of the LightEX model over active TC basins, we performed a basin-wise (regional) evaluation across the seven active TC basins and evaluated the corresponding POD and FAR values. Figure <ref type="figure">7</ref> demonstrates how the model performed across the basins. It should be noted that we evaluated the performance of the LightEX model using the unseen data (2021-2023 events) from each individual basin without retraining the model for each basin. In the North Atlantic, the model shows a relatively high POD, indicating a good ability to detect RI events. However, FAR is as high as POD, which represents a common challenge for all available RI prediction models and highlights the urgent need to improve the predictor set. The model exhibits a close but slightly weaker performance over the Northwest Pacific basin. POD of the model in the East Pacific exceeds that of other basins, while FAR remains comparable. This is in agreement with <ref type="bibr">(Narayanan et al., 2023)</ref>, which obtained the best performance for the East Pacific and the lowest for the Atlantic and Southern Hemisphere basins. 
Similarly, our model resulted in POD = 44% and FAR = 55% for the Australian basin, which shows the worst performance among all basins. It is worth noting that some previous studies did not distinguish between the Southwest Indian, main Australian, and East Australian basins and considered them part of the Southern Hemisphere. We believe this distinction is important because the model in this study performs very well in the East Australian and the Southwest Indian basins, which shows the differences between the characteristics of these basins. The high POD value in the East Pacific is partly related to the fact that the probability of RI occurrence (minority class of the ML model) is higher in this basin than in other basins <ref type="bibr">(Bhatia et al., 2022)</ref>. This fact reduces the model's difficulty in dealing with imbalanced classes and provides a broader training set. In addition, since our model relies on SST characteristics, the differences in the SST impacts between basins would greatly affect the performance of the model across basins. … allocation and risk management in TC-prone regions to make better risk-informed decisions. In this regard, lower FAR values lead to fewer unnecessary preparations or alerts for RI that do not actually occur.</p><p>To get a clearer idea of the value added by including MHW features in RI prediction models, it is necessary to compare our models with ML models developed with SST as a predictor. Therefore, we repeated the training and testing processes for all previously mentioned models, this time using SST at the location/time of the TC plus the seven TC predictors from Table <ref type="table">1</ref> (eight predictors in total). The obtained best parameter sets and weights for the new ensemble model are listed in Table <ref type="table">S7</ref> in Supporting Information S1. 
Figure <ref type="figure">8</ref> compares the POD and FAR metrics for different machine learning models based on MHW characteristics versus only using pre-storm SST (see Supplementary Information for further details about the alternative model). Although we observe a slight improvement in LightEX's POD (&gt;3%) when SST is used, its FAR value deteriorates by 36% (from 50% to 68%). The ML models exhibit a significant increase in false alarms, with FAR values exceeding 60%. For the ET and XGBoost models, the POD values remain at the same level, while we observe a better POD for RF and a lower POD for LightGBM when using SST as a predictor. The relatively high FAR values are unfavorable for RI forecast models and bolster the advantage of MHW predictors over the widely used SST for controlling false alarms. A practical approach for incorporating MHWs into RI models is to aggregate MHW characteristics within a 200 km radius of the TC center into a single set of predictors. This aggregation ensures operational feasibility while maintaining the essential physical relationships between MHWs and TC intensification. Specifically, area-accumulated intensity, defined as the sum of intensities weighted by the respective areas of all MHWs within the radius, can serve as a key predictor. Additionally, the total area of MHWs within the 200 km radius provides a metric for the spatial extent of anomalously warm waters influencing the TC. These aggregated metrics allow for a single forecast at each synoptic time, aligning with existing operational RI model frameworks. Follow-up studies can examine the operational value of this new feature set by integrating it into existing models.</p></div>
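<div xmlns="http://www.tei-c.org/ns/1.0"><p>The weighted ensemble and the verification metrics described in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the LightEX implementation: POD, FAR, and bias are taken as the standard contingency-table quantities (hits/(hits + misses), false alarms/(hits + false alarms), and forecast RI count/observed RI count), the POD-FAR metric is interpreted as POD minus FAR, and the weights are normalized by their sum. All function names and the example probabilities are hypothetical.</p>

```python
import random


def pod_far_bias(y_true, y_pred):
    """POD, FAR, and frequency bias for the RI class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    hits = sum(1 for t, p in pairs if t == 1 and p == 1)
    misses = sum(1 for t, p in pairs if t == 1 and p == 0)
    false_alarms = sum(1 for t, p in pairs if t == 0 and p == 1)
    observed = hits + misses        # observed RI cases
    forecast = hits + false_alarms  # forecast RI cases
    pod = hits / observed if observed else 0.0
    far = false_alarms / forecast if forecast else 0.0
    bias = forecast / observed if observed else 0.0
    return pod, far, bias


def ensemble_predict(prob_ri_by_model, weights):
    """Weighted average of each model's RI probability, then argmax over (no RI, RI)."""
    total = sum(weights)
    preds = []
    for probs in zip(*prob_ri_by_model):  # one tuple of model outputs per sample
        p_ri = sum(w * p for w, p in zip(weights, probs)) / total
        preds.append(1 if p_ri > 1.0 - p_ri else 0)
    return preds


def search_weights(prob_ri_by_model, y_true, n_trials=200, seed=0):
    """Random search: draw weights in [0, 1) and keep the set maximizing POD - FAR."""
    rng = random.Random(seed)
    best_weights, best_score = None, float("-inf")
    for _ in range(n_trials):
        weights = [rng.random() for _ in prob_ri_by_model]
        if sum(weights) == 0.0:
            continue  # skip the degenerate all-zero draw
        y_pred = ensemble_predict(prob_ri_by_model, weights)
        pod, far, _bias = pod_far_bias(y_true, y_pred)
        if pod - far > best_score:
            best_weights, best_score = weights, pod - far
    return best_weights, best_score


# Hypothetical RI probabilities from the four models for four samples.
probs = [
    [0.80, 0.40, 0.30, 0.60],  # XGBoost
    [0.90, 0.90, 0.10, 0.20],  # RF
    [0.70, 0.50, 0.20, 0.40],  # ET
    [0.60, 0.30, 0.70, 0.55],  # LightGBM
]
y_true = [1, 1, 0, 1]
reported_weights = [0.3, 0.0, 0.1, 0.6]  # weight set reported in the text
y_pred = ensemble_predict(probs, reported_weights)
pod, far, bias = pod_far_bias(y_true, y_pred)
best_weights, best_score = search_weights(probs, y_true)
```

<p>Under these definitions, bias equals POD/(1 - FAR), so POD = 54% and FAR = 49% give 0.54/0.51, or about 1.06, which is consistent with the bias reported for LightEX. A zero weight, as found for RF, simply removes that model's probabilities from the weighted average.</p></div>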
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusions</head><p>This study aims to develop a global predictive ML model for the occurrence of RI events that are influenced by proximal MHWs. Notably, a considerable number of recent catastrophic TCs, such as Otis (2023) in the East Pacific basin, Doksuri (2023) in the Northwest Pacific basin, and Ian (2022) in the North Atlantic basin, belong to the category of MHW-induced RI events. This motivated us to develop an alternative RI forecast model that considers the underlying characteristics of this category. In addition to storm characteristics such as wind speed and distance to land, mean and maximum absolute SSTs during MHWs emerged as reliable predictors of compounding MHW-TC events. Building upon four decision tree-based models, we developed an ensemble ML model, called LightEX. A practical implication of LightEX is the possibility of developing a global RI prediction model applicable to various active TC basins with an acceptable performance during the recent period characterized by a warming climate. The ensemble model showed promising performance over the most active TC basins worldwide, that is, the Pacific and the Atlantic basins. Additionally, LightEX was trained/validated over a long period of time and tested on the 2021-2023 period. Supplementary Figure <ref type="figure">1</ref> in Supporting Information S1 illustrates that for the majority of RI events (around 70%), it takes approximately 24 hr to achieve a 30-knot increase in maximum sustained wind speeds, which is the minimum threshold for classifying the event as RI. Therefore, based on the availability of TC and MHW data, the developed model can aid 24 hr forecasts of RI. 
It is important to note that if there is a lag in obtaining TC data or if rare RI events with relatively short time spans occur, the performance of the model may be affected.</p><p>We have demonstrated that including MHW characteristics as predictors not only helps achieve favorable POD levels, but also maintains a FAR below the 50% threshold. On the other hand, using SST as a predictor instead of MHW metrics, which is common in the available models, leads to undesirably higher FAR values (more than 30% higher). This opens up opportunities for further efforts to incorporate specific MHW metrics in the existing models.</p><p>To the best of our knowledge, this is the first study to use LightGBM to classify TC intensity changes. We have shown that LightGBM outperforms other decision tree algorithms with significantly lower computational costs (about one-eighth of XGBoost and one-twentieth of RF). The potential of this ML algorithm can be further explored by implementing it on other dynamic or statistical TC datasets. As the dataset grows larger and/or as more predictors are included, the computational demand required for training can escalate rapidly for other algorithms, whereas LightGBM's optimized gradient boosting framework ensures relatively minimal increases in training time.</p><p>TCs that experience MHWs and RI along their path tend to become stronger and potentially more destructive within a few days <ref type="bibr">(Choi et al., 2024;</ref><ref type="bibr">Radfar et al., 2024)</ref>. During our testing period, there were 51 TCs with a total damage of $360 B (see Supplementary Figure <ref type="figure">2</ref> in Supporting Information S1), of which the RI phase of 28 TCs (55%) was captured by our model. The inability to predict more TC RI events is rooted in the inherent limitations and complexities of ML classification models. 
The reasons can be categorized into two main areas: (a) input quality, which includes data availability, data quality, and feature representations (see <ref type="bibr">(Narayanan et al., 2023;</ref><ref type="bibr">Rozoff et al., 2015;</ref><ref type="bibr">Trabing &amp; Bell, 2020)</ref> among others), and (b) model inadequacies, which encompass model generalization, algorithmic limitations, and the bias-variance tradeoff (see <ref type="bibr">(Chen et al., 2020;</ref><ref type="bibr">Guan &amp; Burton, 2022;</ref><ref type="bibr">Kim et al., 2024;</ref><ref type="bibr">Trabing &amp; Bell, 2020)</ref> among others). Improving data quality, incorporating more environmental features, and refining algorithms could enhance the model's predictive capabilities.</p><p>Given the objectives of this study, we included MHW characteristics as the predictors of RI. However, other environmental predictors, such as vertical wind shear, ocean heat content, convective potential energy, divergence at 200 hPa from the storm center, and vorticity, may also be important. <ref type="bibr">(Choi et al., 2024;</ref><ref type="bibr">Radfar et al., 2024)</ref> have discussed the impact of MHWs on surface latent heat flux and precipitation rates, suggesting that these factors also influence the intensity of TCs. Therefore, expanding the feature set to include these and other relevant environmental parameters will allow for a more robust assessment of the proposed approach through a comprehensive representation of the physical underpinnings of RI events. It is also important to address uncertainties that impact forecast capabilities and adapt approaches to deal with error propagation due to intensity forecast errors and insufficient observations <ref type="bibr">(Emanuel &amp; Zhang, 2016;</ref><ref type="bibr">Lok et al., 2021)</ref>. Incorporating forecast data from reliable numerical weather prediction models would enable real-time applicability of the model. 
However, this introduces complexities such as choosing the appropriate model, dealing with track forecast uncertainties, and assessing the sensitivity of results to these factors. These challenges are beyond the scope of our current study. The IBTrACS data also have uncertainties in the reported position and intensity (K. <ref type="bibr">Knapp, 2019)</ref>. The position uncertainty is estimated to be approximately 10-15 km for strong TCs (W s &gt; 100 kt). The range of intensity uncertainties across ocean basins is &#177;7-20% (lower values since 2000). Additionally, different agencies use varying averaging times for maximum wind estimates (usually 1 or 10 min), and wind speeds reported by different agencies are not directly convertible due to procedural and observational differences <ref type="bibr">(Harper et al., 2008;</ref><ref type="bibr">K. R. Knapp et al., 2010)</ref>.</p><p>In terms of MHW detection criteria, we adhered to the commonly used definition of MHWs to be consistent with common practice when conveying key takeaways. However, there are suggestions to use a shifting temperature baseline rather than a fixed one to avoid detecting more frequent and more intense MHWs <ref type="bibr">(Amaya et al., 2023)</ref>. Also, (X. <ref type="bibr">Zhang et al., 2022)</ref> found that MHWs become shorter and less intense in the absence of smoothing techniques. In light of the circumstances imposed by the unprecedented global warming conditions, future studies should be aware of these alternative definitions and examine how they may influence compounding MHW-TC events.</p></div>
		</body>
		</text>
</TEI>
