skip to main content

Search for: All records

Award ID contains: 1637927

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Toward enabling next-generation robots capable of socially intelligent interaction with humans, we present a computational model of interactions in a social environment of multiple agents and multiple groups. The Multiagent Group Perception and Interaction (MGpi) network is a deep neural network that predicts the appropriate social action to execute in a group conversation (e.g., speak, listen, respond, leave), taking into account neighbors' observable features (e.g., location of people, gaze orientation, distraction, etc.). A central component of MGpi is the Kinesic-Proxemic-Message (KPM) gate, that performs social signal gating to extract important information from a group conversation. In particular, KPM gate filters incoming social cues from nearby agents by observing their body gestures (kinesics) and spatial behavior (proxemics). The MGpi network and its KPM gate are learned via imitation learning, using demonstrations from our designed social interaction simulator. Further, we demonstrate the efficacy of the KPM gate as a social attention mechanism, achieving state-of-the-art performance on the task of group identification without using explicit group annotations, layout assumptions, or manually chosen parameters.
  2. We explore the possibility of using a single monocular camera to forecast the time to collision between a suitcase-shaped robot being pushed by its user and other nearby pedestrians. We develop a purely image-based deep learning approach that directly estimates the time to collision without the need of relying on explicit geometric depth estimates or velocity information to predict future collisions. While previous work has focused on detecting immediate collision in the context of navigating Unmanned Aerial Vehicles, the detection was limited to a binary variable (i.e., collision or no collision). We propose a more fine-grained approach to collision forecasting by predicting the exact time to collision in terms of milliseconds, which is more helpful for collision avoidance in the context of dynamic path planning. To evaluate our method, we have collected a novel dataset of over 13,000 indoor video segments each showing a trajectory of at least one person ending in a close proximity (a near collision) with the camera mounted on a mobile suitcase-shaped platform. Using this dataset, we do extensive experimentation on different temporal windows as input using an exhaustive list of state-of-the-art convolutional neural networks (CNNs). Our results show that our proposed multi-stream CNN is themore »best model for predicting time to near-collision. The average prediction error of our time to near-collision is 0.75 seconds across the test videos. The project webpage can be found at https://aashi7.github.io/NearCollision.html.« less
  3. NavCog3 is a smartphone turn-by-turn navigation assistant system we developed specifically designed to enable independent navigation for people with visual impairments. Using off-the-shelf Bluetooth beacons installed in the surrounding environment and a commodity smartphone carried by the user, NavCog3 achieves unparalleled localization accuracy in real-world large-scale scenarios. By leveraging its accurate localization capabilities, NavCog3 guides the user through the environment and signals the presence of semantic features and points of interest in the vicinity (e.g., doorways, shops).To assess the capability of NavCog3 to promote independent mobility of individuals with visual impairments, we deployed and evaluated the system in two challenging real-world scenarios. The first scenario demonstrated the scalability of the system, which was permanently installed in a five-story shopping mall spanning three buildings and a public underground area. During the study, 10 participants traversed three fixed routes, and 43 participants traversed free-choice routes across the environment. The second scenario validated the system’s usability in the wild in a hotel complex temporarily equipped with NavCog3 during a conference for individuals with visual impairments. In the hotel, almost 14.2h of system usage data were collected from 37 unique users who performed 280 travels across the environment, for a total of 30,200m
  4. Navigation assistive technologies have been designed to support individuals with visual impairments during independent mobility by providing sensory augmentation and contextual awareness of their surroundings. Such information is habitually provided through predefned audio-haptic interaction paradigms. However, individual capabilities, preferences and behavior of people with visual impairments are heterogeneous, and may change due to experience, context and necessity. Therefore, the circumstances and modalities for providing navigation assistance need to be personalized to different users, and through time for each user. We conduct a study with 13 blind participants to explore how the desirability of messages provided during assisted navigation varies based on users' navigation preferences and expertise. The participants are guided through two different routes, one without prior knowledge and one previously studied and traversed. The guidance is provided through turn-by-turn instructions, enriched with contextual information about the environment. During navigation and follow-up interviews, we uncover that participants have diversifed needs for navigation instructions based on their abilities and preferences. Our study motivates the design of future navigation systems capable of verbosity level personalization in order to keep the users engaged in the current situational context while minimizing distractions.
  5. Museums are gradually becoming more accessible to blind people, who have shown interest in visiting museums and in appreciating visual art. Yet, their ability to visit museums is still dependent on the assistance they get from their family and friends or from the museum personnel. Based on this observation and on prior research, we developed a solution to support an independent, interactive museum experience that uses the continuous tracking of the user’s location and orientation to enable a seamless interaction between Navigation and Art Appreciation. Accurate localization and context-awareness allow for turn-by-turn guidance (Navigation Mode), as well as detailed audio content when facing an artwork within close proximity (Art Appreciation Mode). In order to evaluate our system, we installed it at The Andy Warhol Museum in Pittsburgh and conducted a user study where nine blind participants followed routes of interest while learning about the artworks. We found that all participants were able to follow the intended path, immediately grasped how to switch between Navigation and Art Appreciation modes, and valued listening to the audio content in front of each artwork. Also, they showed high satisfaction and an increased motivation to visit museums more often
  6. People with visual impairments often have to rely on the assistance of sighted guides in airports, which prevents them from having an independent travel experience. In order to learn about their perspectives on current airport accessibility, we conducted two focus groups that discussed their needs and experiences in-depth, as well as the potential role of assistive technologies. We found that independent navigation is a main challenge and severely impacts their overall experience. As a result, we equipped an airport with a Bluetooth Low Energy (BLE) beacon-based navigation system and performed a real-world study where users navigated routes relevant for their travel experience. We found that despite the challenging environment participants were able to complete their itinerary independently, presenting none to few navigation errors and reasonable timings. This study presents the first systematic evaluation posing BLE technology as a strong approach to increase the independence of visually impaired people in airports.
  7. We explore the possibility of using a single monocular camera to forecast the time to collision between a suitcase-shaped robot being pushed by its user and other nearby pedestrians. We develop a purely image-based deep learning approach that directly estimates the time to collision without the need of relying on explicit geometric depth estimates or velocity information to predict future collisions. While previous work has focused on detecting immediate collision in the context of navigating Unmanned Aerial Vehicles, the detection was limited to a binary variable (i.e., collision or no collision). We propose a more fine-grained approach to collision forecasting by predicting the exact time to collision in terms of milliseconds, which is more helpful for collision avoidance in the context of dynamic path planning. To evaluate our method, we have collected a novel large-scale dataset of over 13,000 indoor video segments each showing a trajectory of at least one person ending in a close proximity (a near collision) with the camera mounted on a mobile suitcase-shaped platform. Using this dataset, we do extensive experimentation on different temporal windows as input using an exhaustive list of state-of-the-art convolutional neural networks (CNNs). Our results show that our proposed multi-stream CNN ismore »the best model for predicting time to near-collision. The average prediction error of our time to near collision is 0.75 seconds across our test environments.« less
  8. We study self-supervised adaptation of a robot's policy for social interaction, i.e., a policy for active communication with surrounding pedestrians through audio or visual signals. Inspired by the observation that humans continually adapt their behavior when interacting under varying social context, we propose Adaptive EXP4 (A-EXP4), a novel online learning algorithm for adapting the robot-pedestrian interaction policy. To address limitations of bandit algorithms in adaptation to unseen and highly dynamic scenarios, we employ a mixture model over the policy parameter space. Specifically, a Dirichlet Process Gaussian Mixture Model (DPMM) is used to cluster the parameters of sampled policies and maintain a mixture model over the clusters, hence effectively discovering policies that are suitable to the current environmental context in an unsupervised manner. Our simulated and real-world experiments demonstrate the feasibility of A-EXP4 in accommodating interaction with different types of pedestrians while jointly minimizing social disruption through the adaptation process. While the A-EXP4 formulation is kept general for application in a variety of domains requiring continual adaptation of a robot's policy, we specifically evaluate the performance of our algorithm using a suitcase-inspired assistive robotic platform. In this concrete assistive scenario, the algorithm observes how audio signals produced by the navigational systemmore »affect the behavior of pedestrians and adapts accordingly. Consequently, we find A-EXP4 to effectively adapt the interaction policy for gently clearing a navigation path in crowded settings, resulting in significant reduction in empirical regret compared to the EXP4 baseline.« less
  9. We present an assistive suitcase system, BBeep, for supporting blind people when walking through crowded environments. BBeep uses pre-emptive sound notifications to help clear a path by alerting both the user and nearby pedestrians about the potential risk of collision. BBeep triggers notifications by tracking pedestrians, predicting their future position in real-time, and provides sound notifications only when it anticipates a future collision. We investigate how different types and timings of sound affect nearby pedestrian behavior. In our experiments, we found that sound emission timing has a significant impact on nearby pedestrian trajectories when compared to different sound types. Based on these findings, we performed a real-world user study at an international airport, where blind participants navigated with the suitcase in crowded areas. We observed that the proposed system significantly reduces the number of imminent collisions.