We study two approaches for predicting an appropriate pose for a robot to take part in group formations typical of social human conversations subject to the physical layout of the surrounding environment. One method is model-based and explicitly encodes key geometric aspects of conversational formations. The other method is data-driven. It implicitly models key properties of spatial arrangements using graph neural networks and an adversarial training regimen. We evaluate the proposed approaches through quantitative metrics designed for this problem domain and via a human experiment. Our results suggest that the proposed methods are effective at reasoning about the environment layout and conversational group formations. They can also be used repeatedly to simulate conversational spatial arrangements despite being designed to output a single pose at a time. However, the methods showed different strengths. For example, the geometric approach was more successful at avoiding poses generated in nonfree areas of the environment, but the data-driven method was better at capturing the variability of conversational spatial formations. We discuss ways to address open challenges for the pose generation problem and other interesting avenues for future work.
Modeling semantics and pragmatics of spatial prepositions via hierarchical common-sense primitives
Understanding spatial expressions and using them appropriately are necessary for seamless and natural human-machine interaction. However, capturing the semantics and appropriate usage of spatial prepositions is notoriously difficult because of their vagueness and polysemy. Although modern data-driven approaches are good at capturing statistical regularities in usage, they usually require substantial sample sizes, often do not generalize well to unseen instances and, most importantly, have a structure that is essentially opaque to analysis, which makes diagnosing problems and understanding their reasoning process difficult. In this work, we discuss our attempt at modeling spatial senses of prepositions in English using a combination of rule-based and statistical learning approaches. Each preposition model is implemented as a tree where each node computes certain intuitive relations associated with the preposition, with the root computing the final value of the prepositional relation itself. The models operate on a set of artificial 3D “room world” environments, designed in Blender, taking the scene itself as an input. We also discuss the annotation framework we used to collect the human judgments employed in model training. Both our factored models and black-box baseline models perform quite well, but the factored models will enable reasoned explanations of spatial relation judgments.
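The tree structure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `Node` class, the scene format, the two primitives (`higher_than`, `horizontally_aligned`), and the choice of a product at the root for a toy "above" relation. The paper's models read Blender scenes and learn parts of the tree from annotated data; this sketch uses hand-written rules only.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical scene format: object name -> coordinates in metres.
Scene = Dict[str, Dict[str, float]]
Relation = Callable[[Scene, str, str, List[float]], float]


@dataclass
class Node:
    """One node of a preposition tree: turns its children's values into a score."""
    name: str
    relation: Relation
    children: List["Node"] = field(default_factory=list)

    def evaluate(self, scene: Scene, trajector: str, landmark: str,
                 trace: Optional[Dict[str, float]] = None) -> float:
        # Evaluate children first, then combine their values at this node.
        child_values = [c.evaluate(scene, trajector, landmark, trace)
                        for c in self.children]
        value = self.relation(scene, trajector, landmark, child_values)
        if trace is not None:
            trace[self.name] = value  # record intermediate values for explanation
        return value


def higher_than(scene, a, b, _children):
    # Illustrative primitive: 1.0 if a is strictly higher than b, else 0.0.
    return 1.0 if scene[a]["z"] > scene[b]["z"] else 0.0


def horizontally_aligned(scene, a, b, _children):
    # Illustrative primitive: decays with horizontal distance between a and b.
    dx = scene[a]["x"] - scene[b]["x"]
    dy = scene[a]["y"] - scene[b]["y"]
    return math.exp(-math.hypot(dx, dy))


def all_of(_scene, _a, _b, children):
    # Root combiner (an assumption): product of the child scores.
    return math.prod(children)


above = Node("above", all_of, [
    Node("higher_than", higher_than),
    Node("horizontally_aligned", horizontally_aligned),
])

scene = {
    "lamp": {"x": 0.0, "y": 0.0, "z": 2.0},
    "table": {"x": 0.0, "y": 0.0, "z": 1.0},
}
trace: Dict[str, float] = {}
score = above.evaluate(scene, "lamp", "table", trace)
print(score, trace)
```

Because every node records its value in `trace`, a low root score can be attributed to a specific primitive, which is the kind of explanation the factored design is meant to allow.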
- Award ID(s): 1940981
- PAR ID: 10299975
- Date Published:
- Journal Name: Workshop on Spatial Language Understanding and Grounded Communication for Robotics (SpLU-RoboNLP 2021)
- Page Range / eLocation ID: 32-41
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Second language learners studying languages with a diverse set of prepositions often find preposition usage difficult to master, which can manifest in second language writing as preposition errors that appear to result from transfer from a native language, or interlingual errors. We envision a digital writing assistant for language learners and teachers that can provide targeted feedback on these errors. To address these errors, we turn to the task of preposition error detection, which remains an open problem despite the many methods that have been proposed. In this paper, we explore various classifiers, with and without neural network-based features, and finetuned BERT models for detecting preposition errors between verbs and their noun arguments.
Despite considerable progress in tropical cyclone (TC) research, our current understanding and prediction capabilities regarding the TC intensity–size relation remain limited. This study systematically analyzes the key characteristics and performance of different types of mathematical models for TC intensity–size relations using the 6-hourly Tropical Cyclone Extended Best Track Dataset spanning 1988 to 2020. The models investigated include statistical, idealized (e.g., Rankine vortex), parametric, and theoretical models. In addition to directly comparing the solutions obtained from individual models to the observed TC records, we assess the models that can produce a unique finite-sized radial profile of surface winds for each TC record, a minimal requirement to ensure that the predicted radial profile of the surface winds would align with the observed profile. The results reveal that a sufficient condition to guarantee a unique radial profile of surface winds is that the associated model can be written as a radial invariant quantity, although it does not guarantee a finite-sized profile. Only the effective absolute angular momentum (eAAM) model, among all the models examined in this study, meets the minimum requirement. Furthermore, the solutions obtained from the eAAM model are well correlated with their observational counterparts (85 to 95%) with little systematic bias and small mean absolute errors that are very close to the observational resolution. The eAAM model’s ability to capture the complex intensity–size relation of observed TCs, in combination with these desirable features, suggests its high potential for gaining a better understanding of the underlying physics governing the observed TC intensity–size relation.
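As a rough illustration of why a unique radial profile need not be finite-sized, the classical Rankine vortex mentioned in the abstract can be sketched in a few lines. This is the textbook idealized profile (solid-body rotation inside the radius of maximum wind, 1/r decay outside), not the eAAM model from the study; the function name, parameter names, and example values are assumptions made for illustration.

```python
def rankine_wind(r: float, v_max: float, r_max: float) -> float:
    """Tangential wind speed of a classical Rankine vortex at radius r.

    v_max is the maximum wind speed and r_max the radius at which it occurs
    (illustrative names; any consistent units).
    """
    if r < 0 or r_max <= 0:
        raise ValueError("r must be non-negative and r_max positive")
    if r <= r_max:
        return v_max * r / r_max   # inner core: wind grows linearly with radius
    return v_max * r_max / r       # outer region: wind decays as 1/r


# Example values (assumed): 50 m/s maximum wind at a 20 km radius.
for radius_km in (0.0, 10.0, 20.0, 40.0, 1000.0):
    print(radius_km, rankine_wind(radius_km, v_max=50.0, r_max=20.0))
```

The profile is uniquely determined by `v_max` and `r_max`, but the outer branch only approaches zero asymptotically, so the vortex has no finite outer radius.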
Understanding human perceptions of robot performance is crucial for designing socially intelligent robots that can adapt to human expectations. Current approaches often rely on surveys, which can disrupt ongoing human–robot interactions. As an alternative, we explore predicting people’s perceptions of robot performance using non-verbal behavioral cues and machine learning techniques. We contribute the SEAN TOGETHER Dataset consisting of observations of an interaction between a person and a mobile robot in Virtual Reality, together with perceptions of robot performance provided by users on a 5-point scale. We then analyze how well humans and supervised learning techniques can predict perceived robot performance based on different observation types (like facial expression and spatial behavior features). Our results suggest that facial expressions alone provide useful information, but in the navigation scenarios that we considered, reasoning about spatial features in context is critical for the prediction task. Also, supervised learning techniques outperformed humans’ predictions in most cases. Further, when predicting robot performance as a binary classification task on unseen users’ data, the F1-Score of machine learning models more than doubled that of predictions on a 5-point scale. This suggested good generalization capabilities, particularly in identifying performance directionality over exact ratings. Based on these findings, we conducted a real-world demonstration where a mobile robot uses a machine learning model to predict how a human who follows it perceives it. Finally, we discuss the implications of our results for implementing these supervised learning models in real-world navigation. Our work paves the path to automatically enhancing robot behavior based on observations of users and inferences about their perceptions of a robot.
MLMOD is a software package for incorporating machine learning approaches and models into simulations of microscale mechanics and molecular dynamics in LAMMPS. Recent machine learning methods provide promising data-driven approaches for learning representations of system behaviors from experimental data and high-fidelity simulations. The package facilitates learning and using data-driven models for (i) dynamics of the system at larger spatial-temporal scales, (ii) interactions between system components, (iii) features yielding coarser degrees of freedom, and (iv) features for new quantities of interest characterizing system behaviors. MLMOD provides hooks in LAMMPS for (i) modeling dynamics and time-step integration, (ii) modeling interactions, and (iii) computing quantities of interest characterizing system states. The package allows for use of machine learning methods with general model classes including Neural Networks, Gaussian Process Regression, Kernel Models, and other approaches. Here we discuss our prototype C++/Python package, aims, and example usage. The package is currently integrated with the mesoscale and molecular dynamics simulation package LAMMPS and with PyTorch.