Title: Interpretable Hawkes Process Spatial Crime Forecasting with TV-Regularization
Interpretable models for criminal justice forecasting are desirable due to the high-stakes nature of the application. While interpretable models have been developed for individual-level forecasts of recidivism, interpretable models are lacking for space-time crime hotspot forecasting. Here we introduce an interpretable Hawkes process model of crime that allows forecasts to capture near-repeat effects and spatial heterogeneity while being consumable in the form of easy-to-read score cards. For this purpose we employ penalized likelihood estimation of the point process with a total-variation (TV) regularization that constrains the triggering kernel to be piecewise constant. To estimate the model, we derive an efficient expectation-maximization algorithm coupled with forward-backward splitting for the TV constraint. We apply our methodology to synthetic data and space-time crime data from Indianapolis. The TV-Hawkes process achieves similar accuracy to standard Hawkes process models of crime while increasing interpretability and transparency.
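As a rough illustration of the kind of objective the abstract describes, the sketch below evaluates a TV-penalized negative log-likelihood for a purely temporal Hawkes process with a piecewise-constant triggering kernel. It is not the authors' implementation: the paper's model is spatio-temporal and is fit by expectation-maximization with forward-backward splitting, whereas this only evaluates the objective. The function name `tv_penalized_nll` and the parameters `mu`, `heights`, `edges`, and `lam` are hypothetical names chosen for this example.

```python
import numpy as np

def tv_penalized_nll(times, T, mu, heights, edges, lam):
    """TV-penalized negative log-likelihood of a temporal Hawkes process.

    Assumed setup (illustrative, not from the paper):
      times   -- sorted event times in [0, T]
      mu      -- constant background rate
      heights -- kernel value on each bin [edges[k], edges[k+1])
      edges   -- increasing bin edges with edges[0] == 0
      lam     -- weight of the total-variation penalty
    """
    times = np.asarray(times, dtype=float)
    heights = np.asarray(heights, dtype=float)
    edges = np.asarray(edges, dtype=float)
    n_bins = len(heights)

    # Pairwise lags t_i - t_j; only strictly positive lags can trigger.
    lags = times[:, None] - times[None, :]
    # Index of the kernel bin containing each lag.
    idx = np.searchsorted(edges, lags, side="right") - 1
    valid = (lags > 0) & (idx >= 0) & (idx < n_bins)
    kernel = np.where(valid, heights[np.clip(idx, 0, n_bins - 1)], 0.0)
    intensity = mu + kernel.sum(axis=1)

    # Compensator: integral of the intensity over [0, T]. For each event,
    # integrate the kernel over the part of each bin that ends before T.
    remaining = T - times
    overlap = np.clip(
        np.minimum(edges[1:], remaining[:, None]) - edges[:-1], 0.0, None
    )
    compensator = mu * T + (overlap * heights).sum()

    nll = compensator - np.log(intensity).sum()
    # Total variation of the kernel: sum of absolute jumps between bins.
    tv = lam * np.abs(np.diff(heights)).sum()
    return nll + tv
```

The TV term penalizes the absolute jumps between adjacent kernel bins, so a larger `lam` favors kernels with fewer distinct levels, which is what would make a score-card style summary short.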
Award ID(s):
1737585 1737996
PAR ID:
10276747
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
IEEE International Conference on Big Data
Volume:
2020
ISSN:
2639-1589
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. As new grid edge technologies emerge, such as rooftop solar panels, battery storage, and controllable water heaters, quantifying the uncertainties of building load forecasts is becoming more critical. The recent adoption of smart meter infrastructures provided new granular data streams, largely unavailable just ten years ago, that can be utilized to better forecast building-level demand. This paper uses Bayesian Structural Time Series for probabilistic load forecasting at the residential building level to capture uncertainties in forecasting. We use sub-hourly electrical submeter data from 120 residential apartments in Singapore that were part of a behavioral intervention study. The proposed model addresses several fundamental limitations through its flexibility to handle univariate and multivariate scenarios, perform feature selection, and include either static or dynamic effects, as well as its inherent applicability for measurement and verification. We highlight the benefits of this process in three main application areas: (1) Probabilistic Load Forecasting for Apartment-Level Hourly Loads; (2) Submeter Load Forecasting and Segmentation; (3) Measurement and Verification for Behavioral Demand Response. Results show the model achieves a similar performance to ARIMA, another popular time series model, when predicting individual apartment loads, and superior performance when predicting aggregate loads. Furthermore, we show that the model robustly captures uncertainties in the forecasts while providing interpretable results, indicating the importance of, for example, temperature data in its predictions. Finally, our estimates for a behavioral demand response program indicate that it achieved energy savings; however, the confidence interval provided by the probabilistic model is wide. Overall, this probabilistic forecasting model accurately measures uncertainties in forecasts and provides interpretable results that can support building managers and policymakers with the goal of reducing energy use.
  2. Networks and temporal point processes serve as fundamental building blocks for modeling complex dynamic relational data in various domains. We propose the latent space Hawkes (LSH) model, a novel generative model for continuous-time networks of relational events, using a latent space representation for nodes. We model relational events between nodes using mutually exciting Hawkes processes with baseline intensities dependent upon the distances between the nodes in the latent space and sender and receiver specific effects. We demonstrate that our proposed LSH model can replicate many features observed in real temporal networks including reciprocity and transitivity, while also achieving superior prediction accuracy and providing more interpretable fits than existing models. 
  3. We present a framework for spatio-temporal (ST) data modeling, analysis, and forecasting, with a focus on data that is sparse in space and time. Our multi-scaled framework couples two components: a self-exciting point process that models the macroscale statistical behaviors of the ST data and a graph structured recurrent neural network (GSRNN) to discover the microscale patterns of the ST data on the inferred graph. This novel deep neural network (DNN) incorporates the real time interactions of the graph nodes to enable more accurate real time forecasting. The effectiveness of our method is demonstrated on both crime and traffic forecasting. 
  4. Hierarchical probability models are being used more often than non-hierarchical deterministic process models in environmental prediction and forecasting, and Bayesian approaches to fitting such models are becoming increasingly popular. In particular, models describing ecosystem dynamics with multiple states that are autoregressive at each step in time can be treated as statistical state space models (SSMs). In this paper, we examine this subset of ecosystem models, embed a process-based ecosystem model into an SSM, and give closed-form Gibbs sampling updates for latent states and process precision parameters when process and observation errors are normally distributed. Here, we use simulated data from an example model (DALECev) and study the effects of changing the temporal resolution of observations on the states (observation data gaps), the temporal resolution of the state process (model time step), and the level of aggregation of observations on fluxes (measurements of transfer rates on the state process). We show that parameter estimates become unreliable as temporal gaps between observed state data increase. To improve parameter estimates, we introduce a method of tuning the time resolution of the latent states while still using higher-frequency driver information and show that this helps to improve estimates. Further, we show that data cloning is a suitable method for assessing parameter identifiability in this class of models. Overall, our study helps inform the application of state space models to ecological forecasting applications where (1) data are not available for all states and transfers at the operational time step for the ecosystem model and (2) process uncertainty estimation is desired.
  5. Near-term, iterative ecological forecasts can be used to help understand and proactively manage ecosystems. To date, more forecasts have been developed for aquatic ecosystems than other ecosystems worldwide, likely motivated by the pressing need to conserve these essential and threatened ecosystems and the increasing availability of high-frequency data. Forecasters have implemented many different modeling approaches to forecast freshwater variables, which have demonstrated promise at individual sites. However, a comprehensive analysis of the performance of varying forecast models across multiple sites is needed to understand broader controls on forecast performance. Forecasting challenges (i.e., community-scale efforts to generate forecasts while also developing shared software, training materials, and best practices) present a useful platform for bridging this gap to evaluate how a range of modeling methods perform across axes of space, time, and ecological systems. Here, we analyzed forecasts from the aquatics theme of the National Ecological Observatory Network (NEON) Forecasting Challenge hosted by the Ecological Forecasting Initiative. Over 100,000 probabilistic forecasts of water temperature and dissolved oxygen concentration for 1–30 days ahead across seven NEON-monitored lakes were submitted in 2023. We assessed how forecast performance varied among models with different structures, covariates, and sources of uncertainty relative to baseline null models. A similar proportion of forecast models were skillful across both variables (34%–40%), although more individual models outperformed the baseline models in forecasting water temperature (10 models out of 29) than dissolved oxygen (6 models out of 15). These top-performing models came from a range of classes and structures. For water temperature, we found that forecast skill degraded as the forecast horizon increased, that process-based models and models that included air temperature as a covariate generally exhibited the highest forecast performance, and that the most skillful forecasts often accounted for more sources of uncertainty than the lower-performing models. The most skillful forecasts were for sites where observations were most divergent from historical conditions (resulting in poor baseline model performance). Overall, the NEON Forecasting Challenge provides an exciting opportunity for a model intercomparison to learn about the relative strengths of a diverse suite of models and advance our understanding of freshwater ecosystem predictability.