
Title: Gaussian-Dirichlet Random Fields for Inference over High Dimensional Categorical Observations
We propose a generative model for the spatiotemporal distribution of high-dimensional categorical observations. These are commonly produced by robots equipped with an imaging sensor such as a camera, paired with an image classifier, potentially producing observations over thousands of categories. The proposed approach combines the use of Dirichlet distributions to model sparse co-occurrence relations between the observed categories using a latent variable, and Gaussian processes to model the latent variable’s spatiotemporal distribution. Experiments in this paper show that the resulting model is able to efficiently and accurately approximate the temporal distribution of high-dimensional categorical measurements such as taxonomic observations of microscopic organisms in the ocean, even in unobserved (held out) locations, far from other samples. This work’s primary motivation is to enable the deployment of informative path planning techniques over high-dimensional categorical fields, which until now have been limited to scalar or low-dimensional vector observations.
Award ID(s):
1734400 1655686
NSF-PAR ID:
10206411
Journal Name:
2020 IEEE International Conference on Robotics and Automation (ICRA)
Page Range or eLocation-ID:
2924 to 2931
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Ecologists use classifications of individuals in categories to understand composition of populations and communities. These categories might be defined by demographics, functional traits, or species. Assignment of categories is often imperfect, but frequently treated as observations without error. When individuals are observed but not classified, these “partial” observations must be modified to include the missing data mechanism to avoid spurious inference.

    We developed two hierarchical Bayesian models to overcome the assumption of perfect assignment to mutually exclusive categories in the multinomial distribution of categorical counts, when classifications are missing. These models incorporate auxiliary information to adjust the posterior distributions of the proportions of membership in categories. In one model, we use an empirical Bayes approach, where a subset of data from one year serves as a prior for the missing data the next. In the other approach, we use a small random sample of data within a year to inform the distribution of the missing data.

    We performed a simulation to show the bias that occurs when partial observations were ignored and demonstrated the altered inference for the estimation of demographic ratios. We applied our models to demographic classifications of elk (Cervus elaphus nelsoni) to demonstrate improved inference for the proportions of sex and stage classes.

    We developed multiple modeling approaches using a generalizable nested multinomial structure to account for partially observed data that were missing not at random for classification counts. Accounting for classification uncertainty is important to accurately understand the composition of populations and communities in ecological studies.

  2. Abstract High-dimensional categorical data are routinely collected in biomedical and social sciences. It is of great importance to build interpretable parsimonious models that perform dimension reduction and uncover meaningful latent structures from such discrete data. Identifiability is a fundamental requirement for valid modeling and inference in such scenarios, yet is challenging to address when there are complex latent structures. In this article, we propose a class of identifiable multilayer (potentially deep) discrete latent structure models for discrete data, termed Bayesian Pyramids. We establish the identifiability of Bayesian Pyramids by developing novel transparent conditions on the pyramid-shaped deep latent directed graph. The proposed identifiability conditions can ensure Bayesian posterior consistency under suitable priors. As an illustration, we consider the two-latent-layer model and propose a Bayesian shrinkage estimation approach. Simulation results for this model corroborate the identifiability and estimatability of model parameters. Applications of the methodology to DNA nucleotide sequence data uncover useful discrete latent features that are highly predictive of sequence types. The proposed framework provides a recipe for interpretable unsupervised learning of discrete data and can be a useful alternative to popular machine learning methods.
  3. Abstract

    An Arctic sea ice‐ocean model is run with three uniform horizontal resolutions (6, 4, and 2 km) and identical sea ice and ocean model parameterizations, including an isotropic viscous‐plastic sea ice rheology, a mechanical ice strength parameterization, and an ice ridging parameterization. Driven by the same atmospheric forcing, the three model versions all produce similar spatial patterns and temporal variations of ice thickness and motion fields, resulting in almost identical magnitude and seasonal evolution of total ice volume and mean ice concentration, ice speed, and fractions of ice of various thickness categories over the Arctic Ocean. Increasing model resolution from 6 to 2 km does not significantly improve model performance when compared to NASA IceBridge ice thickness observations. This suggests that the large‐scale sea ice properties of the model are insensitive to varying high resolutions within the multifloe scale (2–10 km), and it may be unnecessary to adjust model parameters constantly with increasingly high resolutions. This is also true with models within the aggregate scale (10–75 km), indicating that model parameters used at coarse resolution may be used at high or multiscale resolution. However, even though the three versions all yield similar mean state of sea ice, they differ in representing anisotropic properties of sea ice. While they produce a basic pattern of major sea ice leads similar to satellite observations, their leads are distributed differently in space and time. Without changing model parameters and sea ice spatiotemporal variability, the 2‐km resolution model tends to capture more leads than the other two models.

  4. Abstract

    Multivariate spatially oriented data sets are prevalent in the environmental and physical sciences. Scientists seek to jointly model multiple variables, each indexed by a spatial location, to capture any underlying spatial association for each variable and associations among the different dependent variables. Multivariate latent spatial process models have proved effective in driving statistical inference and rendering better predictive inference at arbitrary locations for the spatial process. High‐dimensional multivariate spatial data, which are the theme of this article, refer to data sets where the number of spatial locations and the number of spatially dependent variables is very large. The field has witnessed substantial developments in scalable models for univariate spatial processes, but such methods for multivariate spatial processes, especially when the number of outcomes is moderately large, are limited in comparison. Here, we extend scalable modeling strategies for a single process to multivariate processes. We pursue Bayesian inference, which is attractive for full uncertainty quantification of the latent spatial process. Our approach exploits distribution theory for the matrix‐normal distribution, which we use to construct scalable versions of a hierarchical linear model of coregionalization (LMC) and spatial factor models that deliver inference over a high‐dimensional parameter space including the latent spatial process. We illustrate the computational and inferential benefits of our algorithms over competing methods using simulation studies and an analysis of a massive vegetation index data set.

  5. Abstract

    Many modern sea ice models used in global climate models represent the subgrid‐scale heterogeneity in sea ice thickness with an ice thickness distribution (ITD), which improves model realism by representing the significant impact of the high spatial heterogeneity of sea ice thickness on thermodynamic and dynamic processes. Most models default to five thickness categories. However, little has been done to explore the effects of the resolution of this distribution (number of categories) on sea‐ice feedbacks in a coupled model framework and resulting representation of the sea ice mean state. Here, we explore this using sensitivity experiments in CESM2 with the standard 5 ice thickness categories and 15 ice thickness categories. Increasing the resolution of the ITD in a run with preindustrial climate forcing results in substantially thicker Arctic sea ice year‐round. Analyses show that this is a result of the ITD influence on ice strength. With 15 ITD categories, weaker ice occurs for the same average thickness, resulting in a higher fraction of ridged sea ice. In contrast, the higher resolution of thin ice categories results in enhanced heat conduction and bottom growth and leads to only somewhat increased winter Antarctic sea ice volume. The spatial resolution of the ICESat‐2 satellite mission provides a new opportunity to compare model outputs with observations of seasonal evolution of the ITD in the Arctic (ICESat‐2; 2018–2021). Comparisons highlight significant differences from the ITD modeled with both runs over this period, likely pointing to underlying issues contributing to the representation of average thickness.
