Title: Storytelling to Sensemaking: A Systematic Framework for Designing Auditory Description Display for Interactives
Auditory description display is verbalized text typically used to describe live, recorded, or graphical displays to support access for people who are blind or visually impaired. Significant prior research has resulted in guidelines for auditory description for non-interactive or minimally interactive contexts. A lack of auditory description for complex interactive environments remains a tremendous barrier to access for people with visual impairments. In this work, we present a systematic design framework for designing auditory description within complex interactive environments. We illustrate how modular descriptions aligned with this framework can result in an interactive storytelling experience constructed through user interactions. This framework has been used in a set of published and widely used interactive science simulations, and in its generalized form could be applied to a variety of contexts.
Award ID(s):
1814220 1621363 1503439
PAR ID:
10216127
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
Page Range / eLocation ID:
1 to 12
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. The evolution of Web Speech has increased the ease of development and public availability of auditory description without the use of screen reader software, broadening its exposure to users who may benefit from spoken descriptions. Building off an existing design framework for auditory description of interactive web media, we have designed an optional Voicing feature instantiated in two PhET Interactive Simulations regularly used by students and educators globally. We surveyed over 2000 educators to investigate their perceptions and preferences of the Web Speech-based Voicing feature and its broad appeal and effectiveness for teaching and learning. We find a general approval by educators of the Voicing feature and more moderate statement ratings than expected to the different preset speech levels we presented to them. We find that educators perceive the feature as beneficial both broadly and for specific populations while some acknowledge particular populations for whom it remains ineffective. Lastly, we identify some variance in the perceptions of the feature based on different aspects of the simulation experience. 
  2. Science simulations are widely used in classrooms to support inquiry-based learning of complex science concepts. These tools typically rely on interactive visual displays to convey relationships. Auditory displays, including verbal description and sonification (non-speech audio), combined with alternative input capabilities, may provide an enhanced experience for learners, particularly learners with visual impairment. We completed semi-structured interviews and usability testing with eight adult learners with visual impairment for two audio-enhanced simulations. We analyzed trends and edge cases in participants' interaction patterns, interpretations, and preferences. Findings include common interaction patterns across simulation use, increased efficiency with second use, and the complementary role that description and sonification play in supporting learning opportunities. We discuss how these control and display layers work to encourage exploration and engagement with science simulations. We conclude with general and specific design takeaways to support the implementation of auditory displays for accessible simulations. 
  3. Effective human-robot interaction is increasingly vital across various domains, including assistive robotics, emotional communication, entertainment, and industrial automation. Visual feedback, a common feature of current interfaces, may not be suitable for all environments. Audio feedback serves as a critical supplementary communication layer in settings where visibility is low or where robotic operations generate extensive data. Sonification, which transforms a robot's trajectory, motion, and environmental signals into sound, enhances users' comprehension of robot behavior. This improvement in understanding fosters more effective, safe, and reliable Human-Robot Interaction (HRI). Demonstrations of auditory data sonification's benefits are evident in real-world applications such as industrial assembly, robot-assisted rehabilitation, and interactive robotic exhibitions, where it promotes cooperation, boosts performance, and heightens engagement. Beyond conventional HRI environments, auditory data sonification shows substantial potential in managing complex robotic systems and intricate structures, such as hyper-redundant robots and robotic teams. These systems often challenge operators with complex joint monitoring, mathematical kinematic modeling, and visual behavior verification. This dissertation explores the sonification of motion in hyper-redundant robots and teams of industrial robots. It delves into the Wave Space Sonification (WSS) framework developed by Hermann, applying it to the motion datasets of protein molecules modeled as hyper-redundant mechanisms with numerous rigid nano-linkages. This research leverages the WSS framework to develop a sonification methodology for protein molecules' dihedral angle folding trajectories. Furthermore, it introduces a novel approach for the systematic sonification of robotic motion across varying configurations. By employing localized wave fields oriented within the robots' configuration space, this methodology generates auditory outputs with specific timbral qualities as robots move through predefined configurations or along certain trajectories. Additionally, the dissertation examines a team of wheeled industrial/service robots whose motion patterns are sonified using sinusoidal vibratory sounds, demonstrating the practical applications and benefits of this innovative approach. 
  4. Mobile robots must navigate efficiently, reliably, and appropriately around people when acting in shared social environments. For robots to be accepted in such environments, we explore robot navigation for the social contexts of each setting. Navigating through dynamic environments solely considering a collision-free path has long been solved. In human-robot environments, the challenge is no longer about efficiently navigating from one point to another. Autonomously detecting the context and adapting to an appropriate social navigation strategy is vital for social robots’ long-term applicability in dense human environments. As complex social environments, museums are suitable for studying such behavior as they have many different navigation contexts in a small space. Our prior Socially-Aware Navigation model considered context classification, object detection, and pre-defined rules to define navigation behavior in more specific contexts, such as a hallway or queue. This work uses environmental context, object information, and more realistic interaction rules for complex social spaces. In the first part of the project, we convert real-world interactions into algorithmic rules for use in a robot’s navigation system. Moreover, we use context recognition, object detection, and scene data for context-appropriate rule selection. We introduce our methodology of studying social behaviors in complex contexts, different analyses of our text corpus for museums, and the presentation of extracted social norms. Finally, we demonstrate applying some of the rules in scenarios in the simulation environment. 
  5. We study interactive learning of LLM-based language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent response to personalize it based on their latent preference, in addition to improving the correctness. The edit feedback is naturally generated, making it a suitable candidate for improving the agent's alignment with the user's preference, and for reducing the cost of user edits over time. We propose a learning framework, PRELUDE, that infers a description of the user's latent preference based on historic edit data. The inferred user preference descriptions are used to define prompts for generating responses in the future. This avoids fine-tuning the agent, which is costly, challenging to scale with the number of users, and may even degrade its performance on other tasks. Furthermore, learning descriptive preference improves interpretability, allowing the user to view and modify the learned preference. However, user preference can be complex, subtle, and vary based on context, making it challenging to learn. To address this, we propose a simple yet effective algorithm named CIPHER that leverages the LLM to infer the user preference for a given context based on user edits. In the future, CIPHER retrieves inferred preferences from the k-closest contexts in the history, and forms an aggregate preference for response generation. We introduce two interactive environments (summarization and email writing) and use a GPT-4 simulated user for evaluation. On both tasks, CIPHER outperforms several baselines by achieving the lowest edit distance cost while only having a small overhead in LLM query cost. Our analysis reports that user preferences learned by CIPHER show significant similarity to the ground truth latent preferences. 