Title: 4-State Hidden Markov Model Using PMDI and Temperature for Climate-Informed Scenario Generation
This repository contains R scripts implementing a computationally efficient 4-state Hidden Markov Model (HMM) that uses temperature as a covariate to generate ensembles of plausible Palmer Modified Drought Index (PMDI) scenarios across the Western U.S. The model is trained on paleo PMDI data spanning 1500 to 1980, arranged as a 1823 x 481 matrix (i.e., 1823 grid cells by 481 years). The paleo temperature data cover the same period and are arranged as a matrix of 1637 grid cells by 481 years. To address the high dimensionality of the datasets, Principal Component Analysis (PCA) is applied to each variable, and the first six principal components (PCs) of both PMDI and temperature are retained as input to the HMM. The trained HMM is then used to simulate future PMDI scenarios from bias-corrected CMIP6 temperature projections under the Shared Socioeconomic Pathway (SSP) 2-4.5 scenario. The HMM framework is designed to capture the spatiotemporal variability and regime-shifting behavior of hydroclimatic patterns. It provides insight into the spatial correlation of wet and dry conditions across the Western U.S., supporting regional drought risk assessment and long-term water resource planning. For a more detailed description of the model, please refer to the following paper: Tezcan, B., & Garcia, M. (2025). Training a hidden Markov model with PMDI and temperature to create climate informed scenarios. Frontiers in Water, 7, Article 1472695. https://doi.org/10.3389/frwa.2025.1472695
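The regime-switching idea described above can be illustrated with a small simulation. This is a minimal sketch in Python, not the repository's R implementation. The description does not say how temperature enters the model, so the sketch assumes (hypothetically) that temperature shifts the transition probabilities through a softmax, and it simulates a single scalar score per year rather than six principal components. All parameter values and the names `transition_row`, `simulate`, `base`, `slope`, and `means` are invented for illustration.

```python
import math
import random

N_STATES = 4  # four hidden regimes, matching the 4-state HMM in the description


def softmax(scores):
    """Convert a list of real-valued scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transition_row(state, temp, base, slope):
    """Transition probabilities out of `state` for a given temperature.

    ASSUMPTION: temperature enters linearly inside a softmax. The actual
    model may use a different covariate structure.
    """
    return softmax([base[state][j] + slope[j] * temp for j in range(N_STATES)])


def simulate(temps, base, slope, means, sd, rng):
    """Simulate one path of hidden states and one Gaussian score per year."""
    state = rng.randrange(N_STATES)
    states, scores = [], []
    for temp in temps:
        probs = transition_row(state, temp, base, slope)
        state = rng.choices(range(N_STATES), weights=probs)[0]
        states.append(state)
        scores.append(rng.gauss(means[state], sd))
    return states, scores


# Hypothetical parameters: regimes tend to persist (diagonal of `base`),
# and warmer years make state 0 (given the lowest mean score) more likely.
base = [[2.0 if i == j else 0.0 for j in range(N_STATES)] for i in range(N_STATES)]
slope = [0.8, 0.2, -0.2, -0.8]
means = [-2.0, -0.5, 0.5, 2.0]

rng = random.Random(42)
temps = [0.02 * t for t in range(50)]  # made-up warming trend, 50 years
ensemble = [simulate(temps, base, slope, means, 0.5, rng) for _ in range(100)]
```

Each element of `ensemble` is one plausible trajectory for the same temperature sequence, which mirrors how an ensemble of scenarios can be conditioned on a single temperature projection.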
Award ID(s):
1942370
PAR ID:
10599269
Author(s) / Creator(s):
;
Publisher / Repository:
HydroShare
Date Published:
Edition / Version:
1
Subject(s) / Keyword(s):
Hidden Markov Model, Climate Change, Drought, Paleoclimate Data, Western U.S.
Format(s):
Medium: X
Institution:
CUAHSI
Sponsoring Org:
National Science Foundation
More Like this
  1. Understanding the nature of climatic change impacts on spatial and temporal hydroclimatic patterns is important to the development of timely and spatially explicit adaptation options. However, the regime-switching behavior of hydroclimatic variables complicates the modelling process, as many traditional time series methods do not capture this behavior. Accurately representing spatial correlation across hydroclimatic regimes is particularly important for water resources planning in large watersheds such as the Colorado River, and in regions where interbasin transfers and shared demand nodes link multiple watersheds. Here, we developed a hidden Markov model (HMM) with covariates that generates an ensemble of plausible future regional scenarios of the Palmer modified drought index (PMDI) for any projected temperature sequence. The resulting spatially explicit scenarios represent the historical spatial and temporal patterns of the training data while incorporating non-stationarity by conditioning on temperature. These ensembles can aid water resources managers, infrastructure planners, and government policymakers tasked with building more resilient water systems. Moreover, these ensembles can be used to generate streamflow ensembles, which, in turn, will be a valuable input for studying the impact of climate change on regional hydrology.
  2. The Po River Basin (PRB) is Italy’s largest river system and provides a vital source of water for varying demands, including agriculture, energy (hydropower), and water supply. The current (2022) drought has been associated with low winter–early spring (2021–2022) snow accumulation in higher elevations (European Alps) and a lack of late spring–early summer (2022) precipitation, resulting in deficit PRB streamflow. Many local scientists are now estimating a 50- to 100-year (return period) drought for 2022. Given the importance of this river system, information about past (paleo) drought and pluvial periods would be valuable to water managers and planners. Annual streamflow data were obtained for thirteen gauges located across the PRB. The Old World Drought Atlas (OWDA) provides annual June–July–August (JJA) self-calibrating Palmer Drought Severity Index (scPDSI) data for 5414 grid points across Europe from 0 to 2012 AD. In lieu of tree-ring chronologies, this dataset was used as a proxy to reconstruct PRB regional streamflow. Singular value decomposition (SVD) was applied to PRB streamflow gauges and gridded scPDSI data for two periods of record, referred to as the short period of record (SPOR), 1980 to 2012 (33 years), and the long period of record (LPOR), 1967 to 2012 (46 years). SVD serves both as a data reduction technique, identifying significant scPDSI grid points within the selected 450 km search radius, and as a means of developing a single vector that represents regional PRB streamflow variability. Due to the high intercorrelations of PRB streamflow gauges, the SVD-generated PRB regional streamflow vector was used as the dependent variable in regression models for both the SPOR and LPOR, while the significant scPDSI grid points (cells) identified by SVD were used as the independent variables. This resulted in two highly skillful regional reconstructions of PRB streamflow from 0 to 2012. Multiple drought and pluvial periods were identified in the paleo record that exceed those observed in the recent historical record, and several of these droughts aligned with paleo streamflow reconstructions of neighboring European watersheds. Future research will utilize the PRB reconstructions to quantify the current (2022) drought, providing a first-time paleo-perspective of drought frequency in the watershed.
  3. The Adige River Basin (ARB) provides a vital water supply source for varying demands including agriculture (wine production), energy (hydropower) and municipal water supply. Given the importance of this river system, information about past (paleo) drought and pluvial (wet) periods would help quantify risk for water managers and planners. Annual streamflow data were obtained for four gauges located within the upper ARB. The Old World Drought Atlas (OWDA) provides an annual June–July–August (JJA) self-calibrating Palmer Drought Severity Index (scPDSI) derived from 106 tree-ring chronologies for 5414 grid points across Europe from 0 to 2012 AD. In lieu of tree-ring chronologies, the OWDA dataset was used as a proxy to reconstruct both individual gauge and ARB regional streamflow from 0 to 2012. Principal component analysis (PCA) was applied to the four ARB streamflow gauges to generate one representative vector of regional streamflow. This regional streamflow vector was highly correlated with the four individual gauges, with coefficient of determination (R²) values ranging from 85% to 96%. Prescreening methods included correlating annual streamflow with scPDSI cells (within a 450 km radius) to identify significant (p ≤ 0.01, or 99% significance) scPDSI cells. The significant scPDSI cells were then evaluated for temporal stability to ensure practical and reliable reconstructions. Statistically significant and temporally stable scPDSI cells were used as predictors (independent variables) to reconstruct streamflow (predictand or dependent variable) for both individual gauges and at the regional scale. This resulted in highly skillful reconstructions of upper ARB streamflow from 0 to 2012 AD. Multiple drought and pluvial periods were identified in the paleo record that exceed those observed in the recent historical record. Moreover, this study concurred with streamflow reconstructions in nearby European watersheds.
  4. In this paper, a deep neural network hidden Markov model (DNN-HMM) is proposed to detect the location of pipeline leakage. A long pipeline is divided into several sections, and leakage in each section is defined as a distinct state of the hidden Markov model (HMM). The hybrid HMM, i.e., DNN-HMM, consists of a deep neural network (DNN) with multiple layers to exploit the non-linear data. The DNN is initialized by using a deep belief network (DBN). The DBN is a pre-trained model built by stacking top-down restricted Boltzmann machines (RBMs) that compute the emission probabilities for the HMM instead of a Gaussian mixture model (GMM). Two comparative studies based on different numbers of states are performed using the Gaussian mixture model-hidden Markov model (GMM-HMM) and the DNN-HMM. Test accuracy, comparing the detected state sequence with the actual state sequence, is measured by the micro F1 score. The micro F1 score approaches 0.94 for the GMM-HMM method and is close to 0.95 for the DNN-HMM method when the pipeline is divided into three sections. In the experiment that divides the pipeline into five sections, the micro F1 score for GMM-HMM is 0.69, while it approaches 0.96 with the DNN-HMM method. The results demonstrate that the DNN-HMM can learn a better model of non-linear data and achieve better performance than the GMM-HMM method.
  5. Advances in single cell transcriptomics have allowed us to study the identity of single cells. This has led to the discovery of new cell types and high resolution tissue maps of them. Technologies that measure multiple modalities of such data add more detail, but they also complicate data integration. We offer an integrated analysis of the spatial location and gene expression profiles of cells to determine their identity. We propose scHybridNMF (single-cell Hybrid Nonnegative Matrix Factorization), which performs cell type identification by combining sparse nonnegative matrix factorization (sparse NMF) with k-means clustering to cluster high-dimensional gene expression and low-dimensional location data. We show that, under multiple scenarios, including the cases where there is a small number of genes profiled and the location data is noisy, scHybridNMF outperforms sparse NMF, k-means, and an existing method that uses a hidden Markov random field to encode cell location and gene expression data for cell type identification. 