
Title: Spatial Difference Boundary Detection for Multiple Outcomes Using Bayesian Disease Mapping

Regional aggregates of health outcomes over delineated administrative units (e.g., states, counties, and zip codes), or areal units, are widely used by epidemiologists to map mortality or incidence rates and capture geographic variation. To capture health disparities over regions, we seek “difference boundaries” that separate neighboring regions with significantly different spatial effects. Matters are more challenging with multiple outcomes over each unit, where we capture dependence among diseases as well as across the areal units. Here, we address multivariate difference boundary detection for correlated diseases. We formulate the problem in terms of Bayesian pairwise multiple comparisons and seek the posterior probabilities of neighboring spatial effects being different. To achieve this, we endow the spatial random effects with a discrete probability law using a class of multivariate areally referenced Dirichlet process models that accommodate spatial and interdisease dependence. We evaluate our method through simulation studies and detect difference boundaries for multiple cancers using data from the Surveillance, Epidemiology, and End Results Program of the National Cancer Institute.
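The posterior probability of two neighboring spatial effects being different can be illustrated with a small sketch. It is not the authors' implementation. It assumes posterior draws of the spatial effects are already available as an array, and that the discrete prior lets neighboring effects tie exactly within a draw. The function name, the toy draws, and the 0.5 cutoff are all hypothetical and serve only as illustration.

```python
import numpy as np


def difference_probabilities(draws, edges):
    """Estimate P(effect_i != effect_j | data) for each neighboring pair.

    draws: array of shape (n_samples, n_regions), posterior draws of the
           spatial effects (assumed input; ties are possible because the
           prior is discrete).
    edges: list of (i, j) index pairs for neighboring regions.
    """
    draws = np.asarray(draws)
    return {
        (i, j): float(np.mean(draws[:, i] != draws[:, j]))
        for i, j in edges
    }


# Toy posterior draws for three regions (made-up numbers, not real data).
draws = np.array([
    [0.5, 0.5, -1.0],
    [0.2, 0.2, 0.2],
    [0.7, 0.7, -0.3],
    [0.1, -0.4, -0.4],
])
edges = [(0, 1), (1, 2)]

probs = difference_probabilities(draws, edges)

# Hypothetical cutoff: report an edge as a difference boundary when the
# estimated probability reaches the threshold.
threshold = 0.5
boundaries = [edge for edge, p in probs.items() if p >= threshold]
```

In practice the cutoff would come from an error-control criterion rather than a fixed value; the fixed threshold here only keeps the sketch short.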

Publisher / Repository:
Oxford University Press
Medium: X Size: p. 922-944
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Disease mapping is an important statistical tool used by epidemiologists to assess geographic variation in disease rates and identify lurking environmental risk factors from spatial patterns. Such maps rely upon spatial models for regionally aggregated data, where neighboring regions tend to exhibit more similar outcomes than those farther apart. We contribute to the literature on multivariate disease mapping, which deals with measurements on multiple (two or more) diseases in each region. We aim to disentangle associations among the multiple diseases from spatial autocorrelation in each disease. We develop multivariate directed acyclic graphical autoregression models to accommodate spatial and inter‐disease dependence. The hierarchical construction imparts flexibility and richness, interpretability of spatial autocorrelation and inter‐disease relationships, and computational ease, but depends upon the order in which the cancers are modeled. To obviate this, we demonstrate how Bayesian model selection and averaging across orders are easily achieved using bridge sampling. We compare our method with a competitor using simulation studies and present an application to multiple cancer mapping using data from the Surveillance, Epidemiology, and End Results program.

  2. Abstract

    Infectious diseases continue to pose a significant threat to the health of humans globally. While the spread of pathogens transcends geographical boundaries, the management of infectious diseases typically occurs within distinct spatial units, determined by geopolitical boundaries. The allocation of management resources within and across regions (the “governance structure”) can affect epidemiological outcomes considerably, and policy-makers are often confronted with a choice between applying control measures uniformly or differentially across regions. Here, we investigate the extent to which uniform and non-uniform governance structures affect the costs of an infectious disease outbreak in two-patch systems using an optimal control framework. A uniform policy implements control measures with the same time-varying rate functions across both patches, while these measures are allowed to differ between the patches in a non-uniform policy. We compare results from two systems of differential equations representing transmission of cholera and Ebola, respectively, to understand the interplay between transmission mode, governance structure and the optimal control of outbreaks. In our case studies, the governance structure has a meaningful impact on the allocation of resources and burden of cases, although the difference in total costs is minimal. Understanding how governance structure affects both the optimal control functions and epidemiological outcomes is crucial for the effective management of infectious diseases going forward.

  3. Abstract

    Multivariate spatially oriented data sets are prevalent in the environmental and physical sciences. Scientists seek to jointly model multiple variables, each indexed by a spatial location, to capture any underlying spatial association for each variable and associations among the different dependent variables. Multivariate latent spatial process models have proved effective in driving statistical inference and rendering better predictive inference at arbitrary locations for the spatial process. High‐dimensional multivariate spatial data, which are the theme of this article, refer to data sets where the number of spatial locations and the number of spatially dependent variables are both very large. The field has witnessed substantial developments in scalable models for univariate spatial processes, but such methods for multivariate spatial processes, especially when the number of outcomes is moderately large, are limited in comparison. Here, we extend scalable modeling strategies for a single process to multivariate processes. We pursue Bayesian inference, which is attractive for full uncertainty quantification of the latent spatial process. Our approach exploits distribution theory for the matrix‐normal distribution, which we use to construct scalable versions of a hierarchical linear model of coregionalization (LMC) and spatial factor models that deliver inference over a high‐dimensional parameter space including the latent spatial process. We illustrate the computational and inferential benefits of our algorithms over competing methods using simulation studies and an analysis of a massive vegetation index data set.

  4. Abstract

    Joint modeling of spatially oriented dependent variables is commonplace in the environmental sciences, where scientists seek to estimate the relationships among a set of environmental outcomes accounting for dependence among these outcomes and the spatial dependence for each outcome. Such modeling is now sought for massive data sets with variables measured at a very large number of locations. Bayesian inference, while attractive for accommodating uncertainties through hierarchical structures, can become computationally onerous for modeling massive spatial data sets because of its reliance on iterative estimation algorithms. This article develops a conjugate Bayesian framework for analyzing multivariate spatial data using analytically tractable posterior distributions that obviate iterative algorithms. We discuss differences between modeling the multivariate response itself as a spatial process and that of modeling a latent process in a hierarchical model. We illustrate the computational and inferential benefits of these models using simulation studies and analysis of a vegetation index data set with spatially dependent observations numbering in the millions.

  5. Nearly 20% of tropical forests are within 100 m of a nonforest edge, a consequence of rapid deforestation for agriculture. Despite widespread conversion, roughly 1.2 billion ha of tropical forest remain, constituting the largest terrestrial component of the global carbon budget. Effects of deforestation on carbon dynamics in remnant forests, and spatial variation in underlying changes in structure and function at the plant scale, remain highly uncertain. Using airborne imaging spectroscopy and light detection and ranging (LiDAR) data, we mapped and quantified changes in forest structure and foliar characteristics along forest/oil palm boundaries in Malaysian Borneo to understand spatial and temporal variation in the influence of edges on aboveground carbon and associated changes in ecosystem structure and function. We uncovered declines in aboveground carbon averaging 22% along edges that extended over 100 m into the forest. Aboveground carbon losses were correlated with significant reductions in canopy height and leaf mass per area and increased foliar phosphorus, three plant traits related to light capture and growth. Carbon declines amplified with edge age. Our results indicate that carbon losses along forest edges can arise from multiple, distinct effects on canopy structure and function that vary with edge age and environmental conditions, pointing to a need for consideration of differences in ecosystem sensitivity when developing land-use and conservation strategies. Our findings reveal that, although edge effects on ecosystem structure and function vary, forests neighboring agricultural plantations are consistently vulnerable to long-lasting negative effects on fundamental ecosystem characteristics controlling primary productivity and carbon storage.
