Subseasonal forecasting—predicting temperature and precipitation 2 to 6 weeks ahead—is critical for effective water allocation, wildfire management, and drought and flood mitigation. Recent international research efforts have advanced the subseasonal capabilities of operational dynamical models, yet temperature and precipitation prediction skills remain poor, partly due to stubborn errors in representing atmospheric dynamics and physics inside dynamical models. Here, to counter these errors, we introduce an
This content will become publicly available on December 12, 2023
Status and prospects for drought forecasting: opportunities in artificial intelligence and hybrid physical–statistical forecasting
Despite major improvements in weather and climate modelling and substantial increases in remotely sensed observations, drought prediction remains a major challenge. After a review of the existing methods, we discuss major research gaps and opportunities to improve drought prediction. We argue that current approaches are top-down, assuming that the process(es) and/or driver(s) are known—i.e. starting with a model and then imposing it on the observed events (reality). With the help of an experiment, we show that there are opportunities to develop bottom-up drought prediction models—i.e. starting from the reality (here, observed events) and searching for model(s) and driver(s) that work. Recent advances in artificial intelligence and machine learning provide significant opportunities for developing bottom-up drought forecasting models. Regardless of the type of drought forecasting model (e.g. machine learning, dynamical simulations, analogue based), we need to shift our attention to robustness of theories and outputs rather than event-based verification. A shift in our focus towards quantifying the stability of uncertainty in drought prediction models, rather than the goodness of fit or reproducing the past, could be the first step towards this goal. Finally, we highlight the advantages of hybrid dynamical and statistical models for improving current drought prediction models. This article is part of the Royal Society Science+ meeting issue ‘Drought risk in the Anthropocene’.
more »
« less
- NSF-PAR ID:
- 10423082
- Date Published:
- Journal Name:
- Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
- Volume:
- 380
- Issue:
- 2238
- ISSN:
- 1364-503X
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract adaptive bias correction (ABC) method that combines state-of-the-art dynamical forecasts with observations using machine learning. We show that, when applied to the leading subseasonal model from the European Centre for Medium-Range Weather Forecasts (ECMWF), ABC improves temperature forecasting skill by 60–90% (over baseline skills of 0.18–0.25) and precipitation forecasting skill by 40–69% (over baseline skills of 0.11–0.15) in the contiguous U.S. We couple these performance improvements with a practical workflow to explain ABC skill gains and identify higher-skill windows of opportunity based on specific climate conditions. -
Arctic amplification has altered the climate patterns both regionally and globally, resulting in more frequent and more intense extreme weather events in the past few decades. The essential part of Arctic amplification is the unprecedented sea ice loss as demonstrated by satellite observations. Accurately forecasting Arctic sea ice from sub-seasonal to seasonal scales has been a major research question with fundamental challenges at play. In addition to physics-based Earth system models, researchers have been applying multiple statistical and machine learning models for sea ice forecasting. Looking at the potential of data-driven approaches to study sea ice variations, we propose MT-IceNet – a UNet-based spatial and multi-temporal (MT) deep learning model for forecasting Arctic sea ice concentration (SIC). The model uses an encoder-decoder architecture with skip connections and processes multi-temporal input streams to regenerate spatial maps at future timesteps. Using bi-monthly and monthly satellite retrieved sea ice data from NSIDC as well as atmospheric and oceanic variables from ERA5 reanalysis product during 1979-2021, we show that our proposed model provides promising predictive performance for per-pixel SIC forecasting with up to 60% decrease in prediction error for a lead time of 6 months as compared to its state-of-the-art counterparts.more » « less
-
Solar flare prediction is a central problem in space weather forecasting and has captivated the attention of a wide spectrum of researchers due to recent advances in both remote sensing as well as machine learning and deep learning approaches. The experimental findings based on both machine and deep learning models reveal significant performance improvements for task specific datasets. Along with building models, the practice of deploying such models to production environments under operational settings is a more complex and often time-consuming process which is often not addressed directly in research settings. We present a set of new heuristic approaches to train and deploy an operational solar flare prediction system for ≥M1.0-class flares with two prediction modes: full-disk and active region-based. In full-disk mode, predictions are performed on full-disk line-of-sight magnetograms using deep learning models whereas in active region-based models, predictions are issued for each active region individually using multivariate time series data instances. The outputs from individual active region forecasts and full-disk predictors are combined to a final full-disk prediction result with a meta-model. We utilized an equal weighted average ensemble of two base learners’ flare probabilities as our baseline meta learner and improved the capabilities of our two base learners by training a logistic regression model. The major findings of this study are: 1) We successfully coupled two heterogeneous flare prediction models trained with different datasets and model architecture to predict a full-disk flare probability for next 24 h, 2) Our proposed ensembling model, i.e., logistic regression, improves on the predictive performance of two base learners and the baseline meta learner measured in terms of two widely used metrics True Skill Statistic (TSS) and Heidke Skill Score (HSS), and 3) Our result analysis suggests that the logistic regression-based ensemble (Meta-FP) improves on the full-disk model (base learner) by ∼9% in terms TSS and ∼10% in terms of HSS. Similarly, it improves on the AR-based model (base learner) by ∼17% and ∼20% in terms of TSS and HSS respectively. Finally, when compared to the baseline meta model, it improves on TSS by ∼10% and HSS by ∼15%.more » « less
-
Obeid, Iyad Selesnick (Ed.)The Temple University Hospital EEG Corpus (TUEG) [1] is the largest publicly available EEG corpus of its type and currently has over 5,000 subscribers (we currently average 35 new subscribers a week). Several valuable subsets of this corpus have been developed including the Temple University Hospital EEG Seizure Corpus (TUSZ) [2] and the Temple University Hospital EEG Artifact Corpus (TUAR) [3]. TUSZ contains manually annotated seizure events and has been widely used to develop seizure detection and prediction technology [4]. TUAR contains manually annotated artifacts and has been used to improve machine learning performance on seizure detection tasks [5]. In this poster, we will discuss recent improvements made to both corpora that are creating opportunities to improve machine learning performance. Two major concerns that were raised when v1.5.2 of TUSZ was released for the Neureka 2020 Epilepsy Challenge were: (1) the subjects contained in the training, development (validation) and blind evaluation sets were not mutually exclusive, and (2) high frequency seizures were not accurately annotated in all files. Regarding (1), there were 50 subjects in dev, 50 subjects in eval, and 592 subjects in train. There was one subject common to dev and eval, five subjects common to dev and train, and 13 subjects common between eval and train. Though this does not substantially influence performance for the current generation of technology, it could be a problem down the line as technology improves. Therefore, we have rebuilt the partitions of the data so that this overlap was removed. This required augmenting the evaluation and development data sets with new subjects that had not been previously annotated so that the size of these subsets remained approximately the same. Since these annotations were done by a new group of annotators, special care was taken to make sure the new annotators followed the same practices as the previous generations of annotators. Part of our quality control process was to have the new annotators review all previous annotations. This rigorous training coupled with a strict quality control process where annotators review a significant amount of each other’s work ensured that there is high interrater agreement between the two groups (kappa statistic greater than 0.8) [6]. In the process of reviewing this data, we also decided to split long files into a series of smaller segments to facilitate processing of the data. Some subscribers found it difficult to process long files using Python code, which tends to be very memory intensive. We also found it inefficient to manipulate these long files in our annotation tool. In this release, the maximum duration of any single file is limited to 60 mins. This increased the number of edf files in the dev set from 1012 to 1832. Regarding (2), as part of discussions of several issues raised by a few subscribers, we discovered some files only had low frequency epileptiform events annotated (defined as events that ranged in frequency from 2.5 Hz to 3 Hz), while others had events annotated that contained significant frequency content above 3 Hz. Though there were not many files that had this type of activity, it was enough of a concern to necessitate reviewing the entire corpus. An example of an epileptiform seizure event with frequency content higher than 3 Hz is shown in Figure 1. Annotating these additional events slightly increased the number of seizure events. In v1.5.2, there were 673 seizures, while in v1.5.3 there are 1239 events. One of the fertile areas for technology improvements is artifact reduction. Artifacts and slowing constitute the two major error modalities in seizure detection [3]. This was a major reason we developed TUAR. It can be used to evaluate artifact detection and suppression technology as well as multimodal background models that explicitly model artifacts. An issue with TUAR was the practicality of the annotation tags used when there are multiple simultaneous events. An example of such an event is shown in Figure 2. In this section of the file, there is an overlap of eye movement, electrode artifact, and muscle artifact events. We previously annotated such events using a convention that included annotating background along with any artifact that is present. The artifacts present would either be annotated with a single tag (e.g., MUSC) or a coupled artifact tag (e.g., MUSC+ELEC). When multiple channels have background, the tags become crowded and difficult to identify. This is one reason we now support a hierarchical annotation format using XML – annotations can be arbitrarily complex and support overlaps in time. Our annotators also reviewed specific eye movement artifacts (e.g., eye flutter, eyeblinks). Eye movements are often mistaken as seizures due to their similar morphology [7][8]. We have improved our understanding of ocular events and it has allowed us to annotate artifacts in the corpus more carefully. In this poster, we will present statistics on the newest releases of these corpora and discuss the impact these improvements have had on machine learning research. We will compare TUSZ v1.5.3 and TUAR v2.0.0 with previous versions of these corpora. We will release v1.5.3 of TUSZ and v2.0.0 of TUAR in Fall 2021 prior to the symposium. ACKNOWLEDGMENTS Research reported in this publication was most recently supported by the National Science Foundation’s Industrial Innovation and Partnerships (IIP) Research Experience for Undergraduates award number 1827565. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the official views of any of these organizations. REFERENCES [1] I. Obeid and J. Picone, “The Temple University Hospital EEG Data Corpus,” in Augmentation of Brain Function: Facts, Fiction and Controversy. Volume I: Brain-Machine Interfaces, 1st ed., vol. 10, M. A. Lebedev, Ed. Lausanne, Switzerland: Frontiers Media S.A., 2016, pp. 394 398. https://doi.org/10.3389/fnins.2016.00196. [2] V. Shah et al., “The Temple University Hospital Seizure Detection Corpus,” Frontiers in Neuroinformatics, vol. 12, pp. 1–6, 2018. https://doi.org/10.3389/fninf.2018.00083. [3] A. Hamid et, al., “The Temple University Artifact Corpus: An Annotated Corpus of EEG Artifacts.” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1-3. https://ieeexplore.ieee.org/document/9353647. [4] Y. Roy, R. Iskander, and J. Picone, “The NeurekaTM 2020 Epilepsy Challenge,” NeuroTechX, 2020. [Online]. Available: https://neureka-challenge.com/. [Accessed: 01-Dec-2021]. [5] S. Rahman, A. Hamid, D. Ochal, I. Obeid, and J. Picone, “Improving the Quality of the TUSZ Corpus,” in Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), 2020, pp. 1–5. https://ieeexplore.ieee.org/document/9353635. [6] V. Shah, E. von Weltin, T. Ahsan, I. Obeid, and J. Picone, “On the Use of Non-Experts for Generation of High-Quality Annotations of Seizure Events,” Available: https://www.isip.picone press.com/publications/unpublished/journals/2019/elsevier_cn/ira. [Accessed: 01-Dec-2021]. [7] D. Ochal, S. Rahman, S. Ferrell, T. Elseify, I. Obeid, and J. Picone, “The Temple University Hospital EEG Corpus: Annotation Guidelines,” Philadelphia, Pennsylvania, USA, 2020. https://www.isip.piconepress.com/publications/reports/2020/tuh_eeg/annotations/. [8] D. Strayhorn, “The Atlas of Adult Electroencephalography,” EEG Atlas Online, 2014. [Online]. Availablmore » « less
-
Power outage prediction is important for planning electric power system response, restoration, and maintenance efforts. It is important for utility managers to understand the impact of outages on the local distribution infrastructure in order to develop appropriate maintenance and resilience measures. Power outage prediction models in literature are often limited in scope, typically tailored to model extreme weather related outage events. While these models are sufficient in predicting widespread outages from adverse weather events, they may fail to capture more frequent, non-weather related outages (NWO). In this study, we explore time series models of NWO by incorporating state-of-the-art techniques that leverage the Prophet model in Bayesian optimization and hierarchical forecasting. After defining a robust metric for NWO (non-weather outage count index, NWOCI), time series forecasting models that leverage advanced preprocessing and forecasting techniques in Kats and Prophet, respectively, were built and tested using six years of daily state- and county-level outage data in Massachusetts (MA). We develop a Prophet model with Bayesian True Parzen Estimator optimization (Prophet-TPE) using state-level outage data and a hierarchical Prophet-Bottom-Up model using county-level data. We find that these forecasting models outperform other Bayesian and hierarchical model combinations of Prophet and Seasonal Autoregressive Integrated Moving Average (SARIMA) models in predicting NWOCI at both county and state levels. Our time series trend decomposition reveals a concerning trend in the growth of NWO in MA. We conclude with a discussion of these observations and possible recommendations for mitigating NWO.more » « less