Title: Multimodal Learning Models for Traffic Datasets
Predictive routing is effective for knowledge transfer, but it ignores the information carried by probability distributions with more than one peak. We introduce traffic multimodal information learning, a new class of transportation decision-making models that can learn and transfer online information from multiple simultaneous observations of a probability distribution with multiple peaks, or with multiple outcome variables, from one time stage to the next. Multimodal learning improves the scientific and engineering value of autonomous vehicles by determining the best routes for the intended levels of exploration, risk, and operating limits.
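The contrast matters in practice: a route whose travel time is bimodal (usually fast, occasionally congested) can look identical in expectation to a consistently moderate route. The sketch below is an illustration of this idea, not the paper's method; the mixture parameters, route names, and risk-adjusted criterion are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-peak travel-time models (minutes) for two routes.
# Route A: usually fast, occasionally congested; Route B: consistently moderate.
routes = {
    "A": [(0.8, 12.0, 1.0), (0.2, 35.0, 4.0)],  # (weight, mean, std) per mode
    "B": [(1.0, 20.0, 2.0)],
}

def sample_travel_time(mixture, n):
    """Draw n samples from a Gaussian-mixture travel-time model."""
    weights = np.array([w for w, _, _ in mixture])
    comps = rng.choice(len(mixture), size=n, p=weights)
    return np.array([rng.normal(mixture[c][1], mixture[c][2]) for c in comps])

def risk_adjusted_cost(samples, risk_aversion=1.0):
    """Mean travel time plus a penalty on spread; higher risk_aversion
    favors predictable routes over fast-on-average but bimodal ones."""
    return samples.mean() + risk_aversion * samples.std()

for name, mixture in routes.items():
    s = sample_travel_time(mixture, 10_000)
    print(name, round(risk_adjusted_cost(s, risk_aversion=1.5), 2))
```

With these invented numbers, a mean-only router would pick route A (expected time about 16.6 minutes versus 20), while the risk-adjusted criterion prefers B because A's second mode makes its spread large; averaging the distribution hides exactly this difference.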
Award ID(s):
1910397
PAR ID:
10350042
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2022) Undergraduate Consortium
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Recent advances in computer vision for space exploration have handled prediction uncertainties well by approximating a multimodal output distribution rather than averaging it. While these multimodal deep learning models could enhance the scientific and engineering value of autonomous systems by making optimal decisions in uncertain environments, sequential learning from the approximated distributions has depended on unimodal or bimodal probability distributions. In a sequence of information learning and transfer decisions, traditional reinforcement learning cannot accommodate noise in the data that could be useful for gaining information about other locations, and thus cannot handle multimodal and multivariate gains in its transition function. Little work has addressed learning and transferring multimodal spatial information effectively so as to maximally remove uncertainty. In this study, a new information-theoretic approach overcomes the traditional entropy approach by actively sensing and learning information in sequence. In particular, the autonomous navigation of a team of heterogeneous unmanned ground and aerial vehicles on Mars outperforms benchmarks through indirect learning.
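As a rough illustration of why the multimodal shape of a belief matters for active sensing, the sketch below estimates the differential entropy of Gaussian-mixture beliefs by Monte Carlo and picks the candidate sensing site with the most uncertainty. The site names and parameters are hypothetical, and this is a generic entropy criterion rather than the paper's new information measure (which it proposes as an improvement over exactly this kind of baseline).

```python
import numpy as np

rng = np.random.default_rng(1)

def mixture_entropy(weights, means, stds, n=50_000):
    """Monte Carlo estimate of the differential entropy of a 1-D
    Gaussian mixture; a single-Gaussian fit would misstate this."""
    weights = np.asarray(weights)
    comps = rng.choice(len(weights), size=n, p=weights)
    x = rng.normal(np.asarray(means)[comps], np.asarray(stds)[comps])
    # Density of each sample under the full mixture.
    dens = sum(w * np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
               for w, m, s in zip(weights, means, stds))
    return -np.mean(np.log(dens))

# Hypothetical candidate sensing sites, each with a belief over terrain cost:
# pick the site whose belief carries the most remaining uncertainty.
sites = {
    "crater_rim": ([0.5, 0.5], [1.0, 6.0], [0.5, 0.5]),  # bimodal: high entropy
    "plain":      ([1.0],      [3.0],      [0.6]),       # unimodal: low entropy
}
best = max(sites, key=lambda k: mixture_entropy(*sites[k]))
print("sense next at:", best)
```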
  2. We introduce temporal multimodal multivariate learning, a new family of decision-making models that can indirectly learn and transfer online information from simultaneous observations of a probability distribution with more than one peak or more than one outcome variable from one time stage to another. We approximate the posterior by sequentially removing additional uncertainties across different variables and time, based on data-physics-driven correlation, to address a broader class of challenging time-dependent decision-making problems under uncertainty. Extensive experiments on real-world datasets (i.e., urban traffic data and hurricane ensemble forecasting data) demonstrate the superior performance of the proposed targeted decision-making over state-of-the-art baseline prediction methods across various settings.
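A single step of the "indirect" uncertainty removal described above can be illustrated with plain Gaussian conditioning: observing one variable tightens the belief over a correlated variable that was never measured. The numbers below are invented, and the paper sequences this idea across multimodal distributions, multiple variables, and time stages rather than applying one conditioning step to one Gaussian.

```python
import numpy as np

# Observing traffic speed on link i tightens the belief over a correlated,
# unobserved link j. Means, covariance, and observation are illustrative.
mean = np.array([30.0, 45.0])          # prior means for links i, j (km/h)
cov = np.array([[25.0, 18.0],          # strong positive correlation
                [18.0, 36.0]])

def condition_on_first(mean, cov, observed):
    """Gaussian conditioning: posterior of x2 given x1 = observed."""
    m1, m2 = mean
    s11, s12, s22 = cov[0, 0], cov[0, 1], cov[1, 1]
    post_mean = m2 + s12 / s11 * (observed - m1)
    post_var = s22 - s12 ** 2 / s11
    return post_mean, post_var

post_mean, post_var = condition_on_first(mean, cov, observed=22.0)
print("link j prior:     mean 45.0, var 36.0")
print(f"link j posterior: mean {post_mean:.1f}, var {post_var:.1f}")
```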
  3. Learning multimodal representations is a fundamentally complex research problem due to the presence of multiple heterogeneous sources of information. Although the presence of multiple modalities provides additional valuable information, there are two key challenges to address when learning from multimodal data: 1) models must learn the complex intra-modal and cross-modal interactions for prediction and 2) models must be robust to unexpected missing or noisy modalities during testing. In this paper, we propose to optimize for a joint generative-discriminative objective across multimodal data and labels. We introduce a model that factorizes representations into two sets of independent factors: multimodal discriminative and modality-specific generative factors. Multimodal discriminative factors are shared across all modalities and contain joint multimodal features required for discriminative tasks such as sentiment prediction. Modality-specific generative factors are unique for each modality and contain the information required for generating data. Experimental results show that our model is able to learn meaningful multimodal representations that achieve state-of-the-art or competitive performance on six multimodal datasets. Our model demonstrates flexible generative capabilities by conditioning on independent factors and can reconstruct missing modalities without significantly impacting performance. Lastly, we interpret our factorized representations to understand the interactions that influence multimodal learning.
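The factorization described above can be sketched in a few lines of PyTorch. The dimensions, module names, and the simple fusion-by-averaging are assumptions for illustration, not the paper's architecture: a shared discriminative factor feeds a label head, each modality's private generative factor joins the shared factor to reconstruct that modality, and the two losses are summed into a joint generative-discriminative objective.

```python
import torch
import torch.nn as nn

class FactorizedMultimodal(nn.Module):
    """Hypothetical sketch: shared discriminative + private generative factors."""
    def __init__(self, dims=(64, 32), d_shared=16, d_private=8, n_classes=3):
        super().__init__()
        self.enc_shared = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, d_shared))
            for d in dims)
        self.enc_private = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, d_private))
            for d in dims)
        self.decoders = nn.ModuleList(
            nn.Linear(d_shared + d_private, d) for d in dims)
        self.classifier = nn.Linear(d_shared, n_classes)

    def forward(self, xs):
        # Fuse per-modality shared factors by averaging (a simplification).
        shared = torch.stack([e(x) for e, x in zip(self.enc_shared, xs)]).mean(0)
        privates = [e(x) for e, x in zip(self.enc_private, xs)]
        # Each modality is reconstructed from shared + its private factor.
        recons = [dec(torch.cat([shared, p], -1))
                  for dec, p in zip(self.decoders, privates)]
        return self.classifier(shared), recons

model = FactorizedMultimodal()
xs = [torch.randn(4, 64), torch.randn(4, 32)]  # two toy modalities
logits, recons = model(xs)
y = torch.randint(0, 3, (4,))
loss = nn.functional.cross_entropy(logits, y) + sum(
    nn.functional.mse_loss(r, x) for r, x in zip(recons, xs))
loss.backward()  # joint generative-discriminative objective
```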
  4. For robots to seamlessly interact with humans, we first need to make sure that humans and robots understand one another. Diverse algorithms have been developed to enable robots to learn from humans (i.e., transferring information from humans to robots). In parallel, visual, haptic, and auditory communication interfaces have been designed to convey the robot's internal state to the human (i.e., transferring information from robots to humans). Prior research often separates these two directions of information transfer, and focuses primarily on either learning algorithms or communication interfaces. By contrast, in this survey we take an interdisciplinary approach to identify common themes and emerging trends that close the loop between learning and communication. Specifically, we survey state-of-the-art methods and outcomes for communicating a robot's learning back to the human teacher during human-robot interaction. This discussion connects human-in-the-loop learning methods and explainable robot learning with multimodal feedback systems and measures of human-robot interaction. We find that, when learning and communication are developed together, the resulting closed-loop system can lead to improved human teaching, increased human trust, and human-robot co-adaptation. The paper includes a perspective on several of the interdisciplinary research themes and open questions that could advance how future robots communicate their learning to everyday operators. Finally, we implement a selection of the reviewed methods in a case study where participants kinesthetically teach a robot arm. This case study documents and tests an integrated approach for learning in ways that can be communicated, conveying this learning across multimodal interfaces, and measuring the resulting changes in human and robot behavior.