Abstract Machine learning (ML) has been applied to space weather problems with increasing frequency in recent years, driven by an influx of in-situ measurements and a desire to improve modeling and forecasting capabilities throughout the field. Space weather originates from solar perturbations and is comprised of the resulting complex variations they cause within the numerous systems between the Sun and Earth. These systems are often tightly coupled and not well understood. This creates a need for skillful models with knowledge about the confidence of their predictions. One example of such a dynamical system highly impacted by space weather is the thermosphere, the neutral region of Earth’s upper atmosphere. Our inability to forecast it has severe repercussions in the context of satellite drag and computation of probability of collision between two space objects in low Earth orbit (LEO) for decision making in space operations. Even with (assumed) perfect forecast of model drivers, our incomplete knowledge of the system results in often inaccurate thermospheric neutral mass density predictions. Continuing efforts are being made to improve model accuracy, but density models rarely provide estimates of confidence in predictions. In this work, we propose two techniques to develop nonlinear ML regression models to predict thermospheric density while providing robust and reliable uncertainty estimates: Monte Carlo (MC) dropout and direct prediction of the probability distribution, both using the negative logarithm of predictive density (NLPD) loss function. We show the performance capabilities for models trained on both local and global datasets. We show that the NLPD loss provides similar results for both techniques but the direct probability distribution prediction method has a much lower computational cost. For the global model regressed on the Space Environment Technologies High Accuracy Satellite Drag Model (HASDM) density database, we achieve errors of approximately 11% on independent test data with well-calibrated uncertainty estimates. Using an in-situ CHAllenging Minisatellite Payload (CHAMP) density dataset, models developed using both techniques provide test error on the order of 13%. The CHAMP models—on validation and test data—are within 2% of perfect calibration for the twenty prediction intervals tested. We show that this model can also be used to obtain global density predictions with uncertainties at a given epoch.
more »
« less
This content will become publicly available on May 6, 2026
The 2023/24 VIEWS Prediction challenge: Predicting the number of fatalities in armed conflict, with uncertainty
Governmental and nongovernmental organizations have increasingly relied on early-warning systems of conflict to support their decisionmaking. Predictions of war intensity as probability distributions prove closer to what policymakers need than point estimates, as they encompass useful representations of both the most likely outcome and the lower-probability risk that conflicts escalate catastrophically. Point-estimate predictions, by contrast, fail to represent the inherent uncertainty in the distribution of conflict fatalities. Yet, current early warning systems are preponderantly focused on providing point estimates, while efforts to forecast conflict fatalities as a probability distribution remain sparse. Building on the predecessor VIEWS competition, we organize a prediction challenge to encourage endeavours in this direction. We invite researchers across multiple disciplinary fields, from conflict studies to computer science, to forecast the number of fatalities in state-based armed conflicts, in the form of the UCDP ‘best’ estimates aggregated to two units of analysis (country-months and PRIO-GRID-months), with estimates of uncertainty. This article introduces the goal and motivation behind the prediction challenge, presents a set of evaluation metrics to assess the performance of the forecasting models, describes the benchmark models which the contributions are evaluated against, and summarizes the salient features of the submitted contributions.
more »
« less
- Award ID(s):
- 2311142
- PAR ID:
- 10600872
- Author(s) / Creator(s):
- ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; more »
- Publisher / Repository:
- Sage
- Date Published:
- Journal Name:
- Journal of Peace Research
- ISSN:
- 0022-3433
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Heatwaves are extreme near-surface temperature events that can have substantial impacts on ecosystems and society. Early warning systems help to reduce these impacts by helping communities prepare for hazardous climate-related events. However, state-of-the-art prediction systems can often not make accurate forecasts of heatwaves more than two weeks in advance, which are required for advance warnings. We therefore investigate the potential of statistical and machine learning methods to understand and predict central European summer heatwaves on time scales of several weeks. As a first step, we identify the most important regional atmospheric and surface predictors based on previous studies and supported by a correlation analysis: 2-m air temperature, 500-hPa geopotential, precipitation, and soil moisture in central Europe, as well as Mediterranean and North Atlantic sea surface temperatures, and the North Atlantic jet stream. Based on these predictors, we apply machine learning methods to forecast two targets: summer temperature anomalies and the probability of heatwaves for 1–6 weeks lead time at weekly resolution. For each of these two target variables, we use both a linear and a random forest model. The performance of these statistical models decays with lead time, as expected, but outperforms persistence and climatology at all lead times. For lead times longer than two weeks, our machine learning models compete with the ensemble mean of the European Centre for Medium-Range Weather Forecast’s hindcast system. We thus show that machine learning can help improve subseasonal forecasts of summer temperature anomalies and heatwaves. Significance StatementHeatwaves (prolonged extremely warm temperatures) cause thousands of fatalities worldwide each year. These damaging events are becoming even more severe with climate change. This study aims to improve advance predictions of summer heatwaves in central Europe by using statistical and machine learning methods. Machine learning models are shown to compete with conventional physics-based models for forecasting heatwaves more than two weeks in advance. These early warnings can be used to activate effective and timely response plans targeting vulnerable communities and regions, thereby reducing the damage caused by heatwaves.more » « less
-
Time series forecasting with additional spatial information has attracted a tremendous amount of attention in recent research, due to its importance in various real-world applications on social studies, such as conflict prediction and pandemic forecasting. Conventional machine learning methods either consider temporal dependencies only, or treat spatial and temporal relations as two separate autoregressive models, namely, space-time autoregressive models. Such methods suffer when it comes to long-term forecasting or predictions for large-scale areas, due to the high nonlinearity and complexity of spatio-temporal data. In this paper, we propose to address these challenges using spatio-temporal graph neural networks. Empirical results on Violence Early Warning System (ViEWS) dataset and U.S. Covid-19 dataset indicate that our method significantly improved performance over the baseline approaches.more » « less
-
Abstract Many fish and marine organisms are responding to our planet’s changing climate by shifting their distribution. Such shifts can drive international conflicts and are highly problematic for the communities and businesses that depend on these living marine resources. Advances in climate prediction mean that in some regions the drivers of these shifts can be forecast up to a decade ahead, although forecasts of distribution shifts on this critical time-scale, while highly sought after by stakeholders, have yet to materialise. Here, we demonstrate the application of decadal-scale climate predictions to the habitat and distribution of marine fish species. We show statistically significant forecast skill of individual years that outperform baseline forecasts 3–10 years ahead; forecasts of multi-year averages perform even better, yielding correlation coefficients in excess of 0.90 in some cases. We also demonstrate that the habitat shifts underlying conflicts over Atlantic mackerel fishing rights could have been foreseen. Our results show that climate predictions can provide information of direct relevance to stakeholders on the decadal-scale. This tool will be critical in foreseeing, adapting to and coping with the challenges of a changing future climate, particularly in the most ocean-dependent nations and communities.more » « less
-
Abstract Near‐term ecological forecasts provide resource managers advance notice of changes in ecosystem services, such as fisheries stocks, timber yields, or water quality. Importantly, ecological forecasts can identify where there is uncertainty in the forecasting system, which is necessary to improve forecast skill and guide interpretation of forecast results. Uncertainty partitioning identifies the relative contributions to total forecast variance introduced by different sources, including specification of the model structure, errors in driver data, and estimation of current states (initial conditions). Uncertainty partitioning could be particularly useful in improving forecasts of highly variable cyanobacterial densities, which are difficult to predict and present a persistent challenge for lake managers. As cyanobacteria can produce toxic and unsightly surface scums, advance warning when cyanobacterial densities are increasing could help managers mitigate water quality issues. Here, we fit 13 Bayesian state‐space models to evaluate different hypotheses about cyanobacterial densities in a low nutrient lake that experiences sporadic surface scums of the toxin‐producing cyanobacterium,Gloeotrichia echinulata. We used data from several summers of weekly cyanobacteria samples to identify dominant sources of uncertainty for near‐term (1‐ to 4‐week) forecasts ofG. echinulatadensities. Water temperature was an important predictor of cyanobacterial densities during model fitting and at the 4‐week forecast horizon. However, no physical covariates improved model performance over a simple model including the previous week's densities in 1‐week‐ahead forecasts. Even the best fit models exhibited large variance in forecasted cyanobacterial densities and did not capture rare peak occurrences, indicating that significant explanatory variables when fitting models to historical data are not always effective for forecasting. Uncertainty partitioning revealed that model process specification and initial conditions dominated forecast uncertainty. These findings indicate that long‐term studies of different cyanobacterial life stages and movement in the water column as well as measurements of drivers relevant to different life stages could improve model process representation of cyanobacteria abundance. In addition, improved observation protocols could better define initial conditions and reduce spatial misalignment of environmental data and cyanobacteria observations. Our results emphasize the importance of ecological forecasting principles and uncertainty partitioning to refine and understand predictive capacity across ecosystems.more » « less
An official website of the United States government
