
Title: Streaming Scene Maps for Co-Robotic Exploration in Bandwidth Limited Environments
This paper proposes a bandwidth-tunable technique for real-time probabilistic scene modeling and mapping to enable co-robotic exploration in communication-constrained environments such as the deep sea. The parameters of the system enable the user to characterize the scene complexity represented by the map, which in turn determines the bandwidth requirements. The approach is demonstrated using an underwater robot that learns an unsupervised scene model of the environment and then uses this scene model to communicate the spatial distribution of various high-level semantic scene constructs to a human operator. Preliminary experiments in an artificially constructed tank environment, as well as simulated missions over a 10 m x 10 m coral reef using real data, show the tunability of the maps to different bandwidth constraints and science interests. To our knowledge, this is the first paper to quantify how the free parameters of the unsupervised scene model impact both the scientific utility of the resulting scene model and the bandwidth required to communicate it.
Journal Name: 2019 International Conference on Robotics and Automation (ICRA)
Page Range or eLocation-ID: 7940 to 7946
Sponsoring Org: National Science Foundation
More Like this
  1. Unsupervised learning techniques, such as Bayesian topic models, are capable of discovering latent structure directly from raw data. These unsupervised models can endow robots with the ability to learn from their observations without human supervision, and then use the learned models for tasks such as autonomous exploration, adaptive sampling, or surveillance. This paper extends single-robot topic models to the domain of multiple robots. The main difficulty of this extension lies in achieving and maintaining global consensus among the unsupervised models learned locally by each robot. This is especially challenging for multi-robot teams operating in communication-constrained environments, such as marine robots. We present a novel approach for multi-robot distributed learning in which each robot maintains a local topic model to categorize its observations and model parameters are shared to achieve global consensus. We apply a combinatorial optimization procedure that combines local robot topic distributions into a globally consistent model based on topic similarity, which we find mitigates topic drift when compared to a baseline approach that matches topics naively. We evaluate our methods experimentally by demonstrating multi-robot underwater terrain characterization using simulated missions on real seabed imagery. Our proposed method achieves similar model quality under bandwidth constraints to that achieved by models that continuously communicate, despite requiring less than one percent of the data transmission needed for continuous communication.
  2. Recent years have seen the introduction of large-scale platforms for experimental wireless research. These platforms, which include testbeds like those of the PAWR program and emulators like Colosseum, allow researchers to prototype and test their solutions in a sound yet realistic wireless environment before actual deployment. Emulators, in particular, enable wireless experiments that are not site-specific as those on real testbeds. Researchers can choose among different radio frequency (RF) scenarios for real-time emulation of a vast variety of different situations, with different numbers of users, RF bandwidth, antenna counts, hardware requirements, etc. Although very powerful, in that they can emulate virtually any real-world deployment, emulated scenarios are only as useful as how accurately they can capture the targeted wireless channel and environment. Achieving emulation accuracy is particularly challenging, especially for experiments at scale for which emulators require considerable amounts of computational resources. In this paper we propose a framework to create RF scenarios for emulators like Colosseum from rich forms of inputs, like those obtained by measurements through radio equipment or via software (e.g., ray-tracers and electromagnetic field solvers). Our framework optimally scales down the large set of RF data in input to the fewer parameters allowed by the emulator by using efficient clustering techniques and channel impulse response re-sampling. We showcase our method by generating wireless scenarios for Colosseum by using Remcom’s Wireless InSite, a commercial-grade ray-tracer that produces key characteristics of the wireless channel. Examples are provided for line-of-sight and non-line-of-sight scenarios on portions of the Northeastern University main campus.
  3. Learning interpretable representations in an unsupervised setting is an important yet challenging task. Existing unsupervised interpretable methods focus on extracting independent salient features from data. However, they overlook the fact that the entanglement of salient features may also be informative. Acknowledging these entanglements can improve interpretability, resulting in the extraction of higher-quality and more varied salient features. In this paper, we propose a new method to enable Generative Adversarial Networks (GANs) to discover salient features that may be entangled in an informative manner, instead of extracting only disentangled features. Specifically, we propose a regularizer to punish the disagreement between the extracted feature interactions and a given dependency structure while training. We model these interactions using a Bayesian network, estimate the maximum likelihood parameters and calculate a negative likelihood score to measure the disagreement. Upon qualitatively and quantitatively evaluating the proposed method using both synthetic and real-world datasets, we show that our proposed regularizer guides GANs to learn representations with disentanglement scores competing with the state-of-the-art, while extracting a wider variety of salient features.

  4. The optic nerve transmits visual information to the brain as trains of discrete events, a low-power, low-bandwidth communication channel also exploited by silicon retina cameras. Extracting high-fidelity visual input from retinal event trains is thus a key challenge for both computational neuroscience and neuromorphic engineering. Here, we investigate whether sparse coding can enable the reconstruction of high-fidelity images and video from retinal event trains. Our approach is analogous to compressive sensing, in which only a random subset of pixels are transmitted and the missing information is estimated via inference. We employed a variant of the Locally Competitive Algorithm to infer sparse representations from retinal event trains, using a dictionary of convolutional features optimized via stochastic gradient descent and trained in an unsupervised manner using a local Hebbian learning rule with momentum. We used an anatomically realistic retinal model with stochastic graded release from cones and bipolar cells to encode thumbnail images as spike trains arising from ON and OFF retinal ganglion cells. The spikes from each model ganglion cell were summed over a 32 msec time window, yielding a noisy rate-coded image. Analogous to how the primary visual cortex is postulated to infer features from noisy spike trains arising from the optic nerve, we inferred a higher-fidelity sparse reconstruction from the noisy rate-coded image using a convolutional dictionary trained on the original CIFAR10 database. To investigate whether a similar approach works on non-stochastic data, we demonstrate that the same procedure can be used to reconstruct high-frequency video from the asynchronous events arising from a silicon retina camera moving through a laboratory environment.
  5. NLP is currently dominated by language models like RoBERTa which are pretrained on billions of words. But what exact knowledge or skills do Transformer LMs learn from large-scale pretraining that they cannot learn from less data? To explore this question, we adopt five styles of evaluation: classifier probing, information-theoretic probing, unsupervised relative acceptability judgments, unsupervised language model knowledge probing, and fine-tuning on NLU tasks. We then draw learning curves that track the growth of these different measures of model ability with respect to pretraining data volume using the MiniBERTas, a group of RoBERTa models pretrained on 1M, 10M, 100M and 1B words. We find that these LMs require only about 10M to 100M words to learn to reliably encode most syntactic and semantic features we test. They need a much larger quantity of data in order to acquire enough commonsense knowledge and other skills required to master typical downstream NLU tasks. The results suggest that, while the ability to encode linguistic features is almost certainly necessary for language understanding, it is likely that other, unidentified, forms of knowledge are the major drivers of recent improvements in language understanding among large pretrained models.