Abstract. Series of univariate distributions indexed by equally spaced time points are ubiquitous in applications, and their analysis constitutes one of the challenges of the emerging field of distributional data analysis. To quantify such distributional time series, we propose a class of intrinsic autoregressive models that operate in the space of optimal transport maps. The autoregressive transport models that we introduce here are based on regressing optimal transport maps on each other, where predictors can be transport maps from an overall barycenter to a current distribution or transport maps between past consecutive distributions of the distributional time series. Autoregressive transport models and their associated distributional regression models specify the link between predictor and response transport maps by moving along geodesics in Wasserstein space. These models emerge as natural extensions of the classical autoregressive models in Euclidean space. Unique stationary solutions of autoregressive transport models are shown to exist under a geometric moment contraction condition of Wu & Shao [(2004) Limit theorems for iterated random functions, Journal of Applied Probability 41, 425–436], using properties of iterated random functions. We also discuss an extension to a varying coefficient model for first-order autoregressive transport models. In addition to simulations, the proposed models are illustrated with distributional time series of house prices across U.S. counties and annual summer temperature distributions.
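Since the abstract stops short of a model equation, the mechanism can be sketched for univariate distributions, where optimal transport maps reduce to compositions of quantile functions and Wasserstein geodesics to quantile interpolation. The Python snippet below is a minimal, hypothetical illustration of a first-order recursion on transport maps minus the identity; the N(0, 1) barycenter, the coefficient beta, and the noise form are all our assumptions, not the authors' specification.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
u = np.linspace(0.01, 0.99, 99)   # quantile grid on (0, 1)
q_bar = norm.ppf(u)               # assumed barycenter: N(0, 1) quantile function

beta, n_steps = 0.5, 200          # hypothetical AR coefficient, series length
V = np.zeros_like(u)              # V_t(u) = Q_t(u) - q_bar(u): the transport map
                                  # from the barycenter, minus the identity
quantile_series = []
for _ in range(n_steps):
    eps = 0.2 * rng.standard_normal() * u * (1.0 - u)  # smooth map-valued noise
    V = beta * V + eps            # contract toward the barycenter, then perturb
    Q = np.sort(q_bar + V)        # sorting restores a monotone quantile function,
                                  # i.e. projects back onto Wasserstein space
    quantile_series.append(Q)
```

With |beta| < 1 the map shrinks geometrically toward the barycenter between innovations, which is the intuition behind the geometric moment contraction condition cited in the abstract.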
On the Optimal Prediction of Extreme Events in Heavy-Tailed Time Series With Applications to Solar Flare Forecasting

(This content will become publicly available on February 13, 2026.)
Abstract. The prediction of extreme events in time series is a fundamental problem arising in many financial, scientific, engineering, and other applications. We begin by establishing a general Neyman–Pearson-type characterization of optimal extreme event predictors in terms of density ratios. This yields new insights and several closed-form optimal extreme event predictors for additive models. These results naturally extend to time series, where we study optimal extreme event prediction for both light- and heavy-tailed autoregressive and moving average models. Using a uniform law of large numbers for ergodic time series, we establish the asymptotic optimality of an empirical version of the optimal predictor for autoregressive models. Using multivariate regular variation, we obtain an expression for the optimal extremal precision in heavy-tailed infinite moving averages, which provides theoretical bounds on the ability to predict extremes in this general class of models. We address the important problem of predicting solar flares by applying our theory and methodology to a state-of-the-art time series consisting of solar soft X-ray flux measurements. Our results demonstrate the success and limitations in solar flare forecasting of long-memory autoregressive models and long-range-dependent, heavy-tailed FARIMA models.
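As a concrete instance of the density-ratio characterization, consider a Gaussian AR(1) model, where the optimal predictor of the extreme event {X_{t+1} > u} reduces to thresholding the conditional exceedance probability. The sketch below is our own minimal illustration (phi, sigma, u, and the alarm rate are invented for the demo), not the paper's empirical predictor:

```python
import numpy as np
from scipy.stats import norm

# Gaussian AR(1): X_{t+1} = phi * X_t + Z_{t+1}, Z ~ N(0, sigma^2).
# Given X_t = x, P(X_{t+1} > u) = 1 - Phi((u - phi * x) / sigma); a
# Neyman-Pearson-type alarm fires when this probability exceeds a cutoff
# chosen to meet a false-alarm budget.
phi, sigma, u = 0.8, 1.0, 4.0

def exceedance_prob(x):
    return norm.sf((u - phi * x) / sigma)

rng = np.random.default_rng(1)
x = np.zeros(10_000)
for t in range(1, len(x)):
    x[t] = phi * x[t - 1] + sigma * rng.standard_normal()

probs = exceedance_prob(x[:-1])      # predictor at time t for the event at t+1
events = x[1:] > u                   # realized extreme events
cutoff = np.quantile(probs, 0.99)    # alarm on the top 1% of predicted risk
alarms = probs > cutoff
precision = events[alarms].mean() if alarms.any() else float("nan")
print(f"alarms: {alarms.sum()}, precision: {precision:.3f}")
```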
- PAR ID: 10620887
- Publisher / Repository: https://doi.org/10.1111/jtsa.12820
- Journal Name: Journal of Time Series Analysis
- ISSN: 0143-9782
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Wildfire risk is greatest during high winds after sustained periods of dry and hot conditions. This paper is a statistical extreme-event risk attribution study that aims to answer whether extreme wildfire seasons are more likely now than under past climate. This requires modeling temporal dependence at extreme levels. We propose the use of transformed-linear time series models, which are constructed similarly to traditional autoregressive–moving-average (ARMA) models while having a dependence structure that is tied to a widely used framework for extremes (regular variation). We fit the models to the extreme values of the seasonally adjusted fire weather index (FWI) time series to capture the dependence in the upper tail for past and present climate. We simulate 10,000 fire seasons from each fitted model and compare the proportion of simulated high-risk fire seasons to quantify the increase in risk. Our method suggests that the risk of experiencing an extreme wildfire season in Grand Lake, Colorado, under current climate has increased dramatically relative to the risk under the climate of the mid-twentieth century. Our method also finds some evidence of increased risk of extreme wildfire seasons in Quincy, California, but large uncertainties do not allow us to reject a null hypothesis of no change.
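The Monte Carlo attribution step in this abstract (simulate many seasons from each fitted model, then compare the proportions of high-risk seasons) can be sketched generically. In the snippet below, the fitted transformed-linear extreme-value model is replaced by a placeholder Pareto simulator with invented tail indices; only the risk-ratio bookkeeping mirrors the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

# Placeholder stand-in for a fitted seasonal model of FWI extremes: each
# "season" is a heavy-tailed sample whose tail weight differs between
# climates. The Pareto shapes and threshold are invented for the demo.
def simulate_season(tail_index, n_days=90):
    return rng.pareto(tail_index, n_days) + 1.0

def season_is_high_risk(season, threshold=20.0):
    return season.max() > threshold

def risk(tail_index, n_seasons=10_000):
    return np.mean([season_is_high_risk(simulate_season(tail_index))
                    for _ in range(n_seasons)])

past, present = risk(tail_index=3.0), risk(tail_index=2.0)  # heavier tail now
print(f"past: {past:.4f}, present: {present:.4f}, ratio: {present / past:.2f}")
```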
- We study offline reinforcement learning (RL) with heavy-tailed reward distributions and data corruption: (i) moving beyond sub-Gaussian reward distributions, we allow the rewards to have infinite variances; (ii) we allow corruptions where an attacker can arbitrarily modify a small fraction of the rewards and transitions in the dataset. We first derive a sufficient optimality condition for generalized Pessimistic Value Iteration (PEVI), which allows various estimators with proper confidence bounds and can be applied to multiple learning settings. To handle the data corruption and heavy-tailed reward setting, we prove that trimmed-mean estimation achieves the minimax optimal error rate for robust mean estimation under heavy-tailed distributions. In the PEVI algorithm, we plug in the trimmed-mean estimate and its confidence bound to solve the robust offline RL problem. Standard analysis reveals that data corruption induces a bias term in the suboptimality gap, which gives the false impression that any data corruption prevents optimal policy learning. By using the optimality condition for the generalized PEVI, we show that as long as the bias term is less than the "action gap", the policy returned by PEVI achieves the optimal value given sufficient data.
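The trimmed-mean estimator at the core of this result is simple to state: sort, discard a fixed fraction from each end, and average what remains. A minimal sketch follows; the trimming fraction eps is left as a free parameter here, whereas the paper ties it to the corruption level and the moment bound:

```python
import numpy as np

def trimmed_mean(x, eps):
    """Drop the eps-fraction of smallest and largest points, then average."""
    x = np.sort(np.asarray(x, dtype=float))
    k = int(np.floor(eps * len(x)))
    return x[k:len(x) - k].mean() if len(x) > 2 * k else float(np.median(x))

rng = np.random.default_rng(3)
clean = rng.standard_t(df=2, size=1000)                # heavy-tailed rewards
corrupted = np.concatenate([clean, np.full(50, 1e6)])  # adversarial outliers
print(trimmed_mean(corrupted, eps=0.1), corrupted.mean())
```

On this toy input the trimmed mean stays near zero while the naive sample mean is dragged to roughly 5e4 by the planted outliers, which is the robustness property the abstract invokes.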
- Zhang, Aidong; Rangwala, Huzefa (Eds.): Zero-inflated, heavy-tailed spatiotemporal data is common across science and engineering, from climate science to meteorology and seismology. A central modeling objective in such settings is to forecast the intensity, frequency, and timing of extreme and non-extreme events; yet in the context of deep learning, this objective presents several key challenges. First, a deep learning framework applied to such data must unify a mixture of distributions characterizing the zero events, moderate events, and extreme events. Second, the framework must be capable of enforcing parameter constraints across each component of the mixture distribution. Finally, the framework must be flexible enough to accommodate any changes in the threshold used to define an extreme event after training. To address these challenges, we propose the Deep Extreme Mixture Model (DEMM), fusing a deep learning-based hurdle model with extreme value theory to enable point and distribution prediction of zero-inflated, heavy-tailed spatiotemporal variables. The framework enables users to dynamically set a threshold for defining extreme events at inference time without the need for retraining. We present an extensive experimental analysis applying DEMM to precipitation forecasting, and observe significant improvements in point and distribution prediction. All code is available at https://github.com/andrewmcdonald27/DeepExtremeMixtureModel.
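The three-part mixture this abstract describes (a point mass at zero, a body distribution for moderate values, and a generalized Pareto tail above a movable threshold) can be written down directly. The sketch below uses fixed, invented parameters and a lognormal body; in DEMM itself these parameters are produced by a neural network (see the linked repository), so treat this only as the distributional skeleton:

```python
import numpy as np
from scipy.stats import lognorm, genpareto

p_zero = 0.6                          # assumed hurdle probability of a zero
body = lognorm(s=1.0, scale=2.0)      # assumed body for moderate values
xi, tail_scale = 0.3, 1.5             # assumed generalized Pareto tail params

def predictive_density(y, threshold):
    """Mixture mass/density: zero hurdle + body below threshold + GPD tail."""
    if y == 0.0:
        return p_zero                                   # point mass at zero
    if y <= threshold:
        return (1 - p_zero) * body.pdf(y)               # moderate events
    p_tail = (1 - p_zero) * body.sf(threshold)          # mass routed to tail
    return p_tail * genpareto(c=xi, scale=tail_scale).pdf(y - threshold)

print(predictive_density(0.0, 5.0),
      predictive_density(1.0, 5.0),
      predictive_density(9.0, 5.0))
```

Because `threshold` is an argument rather than a trained quantity, it can be changed at evaluation time, which is exactly the retraining-free property the abstract highlights.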
- We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a $k$th-moment bound on the Lipschitz constants of sample functions rather than a uniform bound. We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error $G_2 \cdot \frac{1}{\sqrt{n}} + G_k \cdot \left(\frac{\sqrt{d}}{n\epsilon}\right)^{1-\frac{1}{k}}$ under $(\epsilon,\delta)$-approximate differential privacy, up to a mild $\textup{polylog}(\frac{1}{\delta})$ factor, where $G_2^2$ and $G_k^k$ are the 2nd and $k$th moment bounds on sample Lipschitz constants, nearly matching a lower bound of [Lowy and Razaviyayn 2023]. We further give a suite of private algorithms in the heavy-tailed setting which improve upon our basic result under additional assumptions, including an optimal algorithm under a known-Lipschitz-constant assumption, a near-linear time algorithm for smooth functions, and an optimal linear time algorithm for smooth generalized linear models.
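For orientation only, the generic privacy mechanism underlying this line of work clips per-sample gradients to bound sensitivity, averages them, and adds Gaussian noise. This is textbook Gaussian-mechanism DP, not the paper's reduction-based algorithm; the clip norm and noise multiplier below are placeholders, and calibrating sigma to a target (epsilon, delta) is omitted:

```python
import numpy as np

def private_mean_gradient(grads, clip_norm, sigma, rng):
    """Average per-sample gradients after L2 clipping, then add Gaussian noise."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    clipped = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Clipping bounds the mean's sensitivity by clip_norm / n, so noise of
    # this scale (times a calibration factor, omitted here) yields DP.
    noise = sigma * clip_norm / len(grads) * rng.standard_normal(grads.shape[1])
    return clipped.mean(axis=0) + noise

rng = np.random.default_rng(4)
grads = rng.standard_t(df=2, size=(256, 10))   # heavy-tailed sample gradients
print(private_mean_gradient(grads, clip_norm=1.0, sigma=2.0, rng=rng))
```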