

Title: Multi-Modal eHMIs: The Relative Impact of Light and Sound in AV-Pedestrian Interaction
External Human-Machine Interfaces (eHMIs) have been evaluated to facilitate interactions between Automated Vehicles (AVs) and pedestrians. Most eHMIs are, however, visual/light-based solutions, and multi-modal eHMIs have received little attention to date. We ran an experimental video study (N = 29) to systematically understand the effect of a light-based eHMI (a light bar on the bumper), two sound-based eHMIs (a bell sound and a droning sound), and combinations thereof on pedestrians' willingness to cross the road and on user preferences. We found no objective change in pedestrians' willingness to cross the road based on the nature of the eHMI, although people expressed different subjective preferences for the different ways an eHMI may communicate, and sometimes even strong dislike for multi-modal eHMIs. This shows that the modality of the evaluated eHMI concepts had relatively little impact on their effectiveness. Consequently, this lays important groundwork for accessibility considerations for future eHMIs, and points towards the insight that user preferences can be accommodated without compromising effectiveness.
Award ID(s):
2212431
PAR ID:
10539120
Author(s) / Creator(s):
; ; ; ; ;
Publisher / Repository:
ACM
Date Published:
ISBN:
9798400703300
Page Range / eLocation ID:
1 to 16
Subject(s) / Keyword(s):
Automated vehicle, eHMI, vulnerable road user (VRU), pedestrian, vehicle-pedestrian interaction, multimodal interface
Format(s):
Medium: X
Location:
Honolulu, HI, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Accurate road networks play a crucial role in modern mobile applications such as navigation and last-mile delivery. Most existing studies focus on generating road networks in open areas such as main roads and avenues; little attention has been given to generating community road networks in closed areas such as residential complexes, which are becoming increasingly important due to the growing demand for door-to-door services such as food delivery. This gap is primarily attributable to challenges in sensing-data availability and quality. In this paper, we design a novel framework called SmallMap that leverages ubiquitous multi-modal sensing data from last-mile delivery to automatically generate community road networks at low cost. SmallMap consists of two key modules: (1) a Trajectory of Interest Detection module enhanced by multi-modal sensing data collected during the delivery process; and (2) a Dual Spatio-temporal Generative Adversarial Network module that incorporates Trajectories of Interest through unsupervised road network adaptation to generate road networks automatically. To evaluate SmallMap, we use a two-month dataset from one of the largest logistics companies in China. Extensive evaluation results demonstrate that our framework significantly outperforms state-of-the-art baselines, achieving a precision of 90.5%, a recall of 87.5%, and an F1-score of 88.9%. Moreover, we conduct three case studies in Beijing for courier workload estimation, Estimated Time of Arrival (ETA) in last-mile delivery, and fine-grained order assignment.
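A quick sanity check on the reported numbers (ours, not from the paper): under the standard definition, the F1-score is the harmonic mean of precision and recall, which is consistent with the reported figure.

    # Verify the reported F1-score from the stated precision and recall.
    precision, recall = 0.905, 0.875
    f1 = 2 * precision * recall / (precision + recall)
    print(f"F1 = {f1:.4f}")  # -> F1 = 0.8897, matching the reported 88.9%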
  2. While node semantics have been extensively explored in social networks, little research attention has been paid to profiling edge semantics, i.e., social relations. Ideal edge semantics should not only show that two users are connected, but also why they know each other and what they share in common. However, relations in social networks are often hard to profile, due to noisy multi-modal signals and limited user-generated ground-truth labels. In this work, we aim to develop a unified and principled framework that can profile user relations as edge semantics in social networks by integrating multi-modal signals in the presence of noisy and incomplete data. Our framework also remains flexible under limited or missing supervision. Specifically, we assume a latent distribution of multiple relations underlying each user link, and learn them with multi-modal graph edge variational autoencoders: we encode the network data with a graph convolutional network and decode arbitrary signals with multiple reconstruction networks. Extensive experiments and case studies on two public DBLP author networks and two internal LinkedIn member networks demonstrate the superior effectiveness and efficiency of our proposed model.
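For orientation, a minimal PyTorch sketch of the encoder/edge-head pattern this abstract describes; the class names, dimensions, and single-layer GCN are illustrative assumptions, not the authors' implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GCNLayer(nn.Module):
        # One dense graph convolution: H' = relu(A_hat @ H @ W),
        # where A_hat is a normalized adjacency matrix.
        def __init__(self, d_in, d_out):
            super().__init__()
            self.lin = nn.Linear(d_in, d_out)

        def forward(self, a_hat, h):
            return F.relu(self.lin(a_hat @ h))

    class EdgeVAE(nn.Module):
        # Encode nodes with a GCN, then place a latent relation
        # distribution on each edge from its endpoint embeddings.
        def __init__(self, d_in, d_hid, n_relations):
            super().__init__()
            self.gcn = GCNLayer(d_in, d_hid)
            self.mu = nn.Linear(2 * d_hid, n_relations)
            self.logvar = nn.Linear(2 * d_hid, n_relations)

        def forward(self, a_hat, x, edges):  # edges: (2, E) index tensor
            h = self.gcn(a_hat, x)
            pair = torch.cat([h[edges[0]], h[edges[1]]], dim=-1)
            mu, logvar = self.mu(pair), self.logvar(pair)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
            return z, mu, logvar

In the paper's setup, multiple reconstruction networks (one per signal) would decode z, with the usual KL-divergence term on (mu, logvar) completing the variational objective.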
  3. Over the last decade, research has revealed the high prevalence of cyberbullying among youth and raised serious concerns in society. Information on the social media platforms where cyberbullying is most prevalent (e.g., Instagram, Facebook, Twitter) is inherently multi-modal, yet most existing work on cyberbullying identification has focused on building generic classification models that rely exclusively on text analysis of online social media sessions (e.g., posts). Despite their empirical success, these efforts ignore the multi-modal information manifested in social media data (e.g., image, video, user profile, time, and location), and thus fail to offer a comprehensive understanding of cyberbullying. Conventionally, when information from different modalities is presented together, it reveals complementary insights about the application domain and facilitates better learning performance. In this paper, we study the novel problem of cyberbullying detection in a multi-modal context by exploiting social media data in a collaborative way. This task is challenging due to the complex combination of cross-modal correlations among modalities, structural dependencies between social media sessions, and the diverse attribute information of the different modalities. To address these challenges, we propose XBully, a novel cyberbullying detection framework that first reformulates multi-modal social media data as a heterogeneous network and then learns node embedding representations on it. Extensive experimental evaluations on real-world multi-modal social media datasets show that XBully is superior to state-of-the-art cyberbullying detection models.
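To make the reformulation step concrete, here is a hypothetical sketch (using networkx) of turning one multi-modal session into the nodes and typed edges of a heterogeneous network; the node and edge types are illustrative, not XBully's actual schema:

    import networkx as nx

    G = nx.Graph()
    session = {"id": "s1", "user": "u42", "image": "img_001.jpg",
               "time": "2019-03-01T12:00"}

    # One node per entity, typed via attributes.
    G.add_node(("session", session["id"]), ntype="session")
    G.add_node(("user", session["user"]), ntype="user")
    G.add_node(("image", session["image"]), ntype="image")
    G.add_node(("time", session["time"]), ntype="time")

    # Typed edges tie each modality back to the session.
    G.add_edge(("session", "s1"), ("user", "u42"), etype="posted_by")
    G.add_edge(("session", "s1"), ("image", "img_001.jpg"), etype="contains")
    G.add_edge(("session", "s1"), ("time", "2019-03-01T12:00"), etype="posted_at")

    # A node-embedding method would then run on G, and a classifier
    # would score the session embeddings for cyberbullying.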
  4. Cross-modal retrieval aims to learn discriminative and modality-invariant features for data from different modalities. Unlike existing methods, which usually learn from features extracted by offline networks, in this paper we propose an approach that jointly trains the components of a cross-modal retrieval framework with metadata, enabling the network to find optimal features. The proposed end-to-end framework is trained with three loss functions: (1) a novel cross-modal center loss to eliminate cross-modal discrepancy, (2) a cross-entropy loss to maximize inter-class variation, and (3) a mean-square-error loss to reduce modality variation. In particular, the proposed cross-modal center loss minimizes the distances of features of objects belonging to the same class across all modalities. Extensive experiments have been conducted on retrieval tasks across multiple modalities, including 2D images, 3D point clouds, and mesh data. The proposed framework significantly outperforms state-of-the-art methods for both cross-modal and in-domain retrieval of 3D objects on the ModelNet10 and ModelNet40 datasets.
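The cross-modal center loss is this abstract's main novelty; below is a minimal PyTorch sketch of the idea as described (one learnable center per class, shared across modalities, with features pulled toward their class center). Names and reduction choices are assumptions, not the paper's exact formulation:

    import torch
    import torch.nn as nn

    class CrossModalCenterLoss(nn.Module):
        # One learnable center per class, shared by every modality.
        def __init__(self, n_classes, d):
            super().__init__()
            self.centers = nn.Parameter(torch.randn(n_classes, d))

        def forward(self, feats_per_modality, labels):
            # feats_per_modality: list of (batch, d) tensors, one per modality
            # labels: (batch,) class indices shared across modalities
            loss = 0.0
            for feats in feats_per_modality:
                # Squared distance of each feature to its class center.
                loss = loss + ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()
            return loss / len(feats_per_modality)

In training, this term would be summed with the cross-entropy and mean-square-error losses listed above.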
  5. Light-on-dark color schemes, so-called ā€œDark Mode,ā€ are becoming more and more popular across a wide range of display technologies and application fields. Many people who have to look at computer screens for hours at a time, such as computer programmers and computer graphics artists, indicate a preference for switching colors on a computer screen from dark text on a light background to light text on a dark background, due to perceived advantages related to visual comfort and acuity, specifically when working in low-light environments. In this article, we investigate the effects of dark mode color schemes in the field of optical see-through head-mounted displays (OST-HMDs), where the characteristic ā€œadditiveā€ light model implies that bright graphics are visible but dark graphics are transparent. We describe two human-subject studies in which we evaluated a normal and an inverted color mode in front of different physical backgrounds and under different lighting conditions. Our results indicate that dark mode graphics displayed on the HoloLens have significant benefits for visual acuity and usability, while user preferences depend largely on the lighting in the physical environment. We discuss the implications of these effects for user interfaces and applications.
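The ā€œadditiveā€ light model mentioned above is easy to illustrate: an optical see-through display can only add light to what the eye already receives from the physical background, so dark pixels contribute nothing and read as transparent. A minimal NumPy illustration (values in [0, 1]; the clamp is an assumed simplification of the real optics):

    import numpy as np

    def perceived(background, graphics):
        # OST-HMDs add rendered light on top of the real scene;
        # they cannot subtract light to darken it.
        return np.clip(background + graphics, 0.0, 1.0)

    bright_room = np.array([0.8, 0.8, 0.8])
    dark_pixel = np.array([0.05, 0.05, 0.05])  # dark-mode background region
    light_pixel = np.array([0.9, 0.9, 0.9])    # dark-mode text

    print(perceived(bright_room, dark_pixel))   # ~unchanged: dark pixels vanish
    print(perceived(bright_room, light_pixel))  # saturated: light pixels stay visible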