Physical activity is an important part of a good quality of life; however, people with visual impairments (PVIs) are less likely to participate in physical activity than their sighted peers. One barrier is that exercise instructors may not give accessible verbal instructions. Text analysis could identify such inaccessible phrases and, in response, provide more accessible instructions, but this first requires a taxonomy of accessible phrases. To develop one, we conducted user studies with 10 PVIs exercising along with audio and video aerobic workouts. We analyzed video footage of their exercise along with interviews to determine a preliminary set of phrases that are helpful or confusing. We then conducted an iterative qualitative analysis of six additional exercise videos and sought expert feedback to derive our taxonomy. We hope these findings inform systems that analyze instructional phrases for accessibility to PVIs.
This content will become publicly available on July 25, 2026
Vid2Coach: Transforming How-To Videos into Task Assistants
People use videos to learn new recipes, exercises, and crafts. Such videos remain difficult for blind and low vision (BLV) people to follow because doing so relies on visual comparison. Our observations of visual rehabilitation therapists (VRTs) guiding BLV people through how-to videos revealed that VRTs provide both proactive and responsive support, including detailed descriptions, non-visual workarounds, and progress feedback. We propose Vid2Coach, a system that transforms how-to videos into wearable camera-based assistants that provide accessible instructions and mixed-initiative feedback. From the video, Vid2Coach generates accessible instructions by augmenting narrated instructions with demonstration details and completion criteria for each step. It then uses retrieval-augmented generation (RAG) to extract relevant non-visual workarounds from BLV-specific resources. Vid2Coach then monitors user progress with a camera embedded in commercial smart glasses to provide context-aware instructions, proactive feedback, and answers to user questions. BLV participants (N=8) using Vid2Coach completed cooking tasks with 58.5% fewer errors than when using their typical workflow and wanted to use Vid2Coach in their daily lives. Vid2Coach demonstrates an opportunity for AI visual assistance that strengthens rather than replaces non-visual expertise.
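The abstract above describes a pipeline step in which retrieval-augmented generation pulls relevant non-visual workarounds from BLV-specific resources. As a rough illustration of the retrieval half of such a step, the sketch below ranks a tiny in-memory corpus of tips against the current instruction. The corpus entries, the example step, and the use of TF-IDF similarity are assumptions for illustration only, not Vid2Coach's actual resources, models, or prompts.

```python
# Minimal sketch of a retrieval step for a RAG-style workaround lookup.
# The tip corpus and step text below are hypothetical stand-ins, not the
# BLV-specific resources used by Vid2Coach.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

blv_tips = [
    "Use a talking thermometer to check when oil is hot instead of watching for shimmer.",
    "Nest a smaller bowl inside a larger one to catch spills while pouring.",
    "Spread butter with the back of a spoon and check coverage by touch.",
]

def retrieve_workarounds(step_text: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus tips most similar to the current instruction step."""
    vectorizer = TfidfVectorizer().fit(corpus + [step_text])
    tip_vecs = vectorizer.transform(corpus)
    step_vec = vectorizer.transform([step_text])
    scores = cosine_similarity(step_vec, tip_vecs)[0]
    ranked = sorted(zip(scores, corpus), reverse=True)
    return [tip for _, tip in ranked[:k]]

if __name__ == "__main__":
    step = "Heat two tablespoons of oil in the pan until it shimmers."
    for tip in retrieve_workarounds(step, blv_tips):
        print("-", tip)
```

In a full RAG setup, the top-ranked tips would then be passed, together with the step text, to a language model that rewrites the instruction; that generation step is omitted here.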
- Award ID(s): 2505865
- PAR ID: 10631766
- Publisher / Repository: https://doi.org/10.48550/arXiv.2506.00717
- arXiv: 2506.00717
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
People with visual impairments (PVIs) are less likely to participate in physical activity than their sighted peers. One barrier is the lack of accessible group-based aerobic exercise classes, often because instructors do not give accessible verbal instructions. While there is research on exercise tracking, these tools often require vision or familiarity with the exercise. Accessible solutions that give personalized verbal feedback exist for slower-paced exercises but do not generalize to aerobics. In response, we have developed an algorithm that detects shoeprints on a sensor mat using computer vision and a convolutional neural network (CNN). From these detections we can infer whether a person is following along with a step aerobics workout, and we are designing reactive verbal feedback to guide the person to rejoin the class. Future work includes finishing development and conducting a user study to assess the effectiveness of the reactive verbal feedback.
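As a rough illustration of the kind of mat-frame classification the entry above describes, the sketch below pairs a small convolutional network over a pressure-mat frame with a check against the routine's expected step. The mat resolution, class labels, model size, and feedback wording are assumptions for illustration, not the authors' actual algorithm.

```python
# Minimal sketch: classify one pressure-mat frame with a small CNN and compare
# the predicted foot placement to the step the routine expects. All sizes and
# labels here are hypothetical.
import torch
import torch.nn as nn

STEP_CLASSES = ["left_front", "right_front", "left_back", "right_back", "center"]

class ShoeprintCNN(nn.Module):
    def __init__(self, n_classes: int = len(STEP_CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 8 * 8, n_classes)  # assumes a 32x32 mat grid

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

def is_following(model: nn.Module, mat_frame: torch.Tensor, expected: str) -> bool:
    """Compare the predicted foot placement for one mat frame to the routine's expected step."""
    with torch.no_grad():
        logits = model(mat_frame.unsqueeze(0))           # shape (1, n_classes)
        predicted = STEP_CLASSES[int(logits.argmax(dim=1))]
    return predicted == expected

if __name__ == "__main__":
    model = ShoeprintCNN()
    frame = torch.rand(1, 32, 32)                        # one synthetic pressure frame
    if not is_following(model, frame, expected="left_front"):
        print("Cue: step forward onto your left foot to rejoin the routine.")
```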
People who are blind share their images and videos with companies that provide visual assistance technologies (VATs) to gain access to information about their surroundings. A challenge is that people who are blind cannot independently validate the content of the images and videos before they share them, and their visual data commonly contains private content. We examine privacy concerns for blind people who share personal visual data with VAT companies that provide descriptions authored by humans or artificial intelligence (AI). We first interviewed 18 people who are blind about their perceptions of privacy when using both types of VATs. Then we asked the participants to rate 21 types of image content according to their level of privacy concern if the information was shared knowingly versus unknowingly with human- or AI-powered VATs. Finally, we analyzed what information VAT companies communicate to users about their collection and processing of users' personal visual data through their privacy policies. Our findings have implications for the development of VATs that safeguard blind users' visual privacy, and our methods may be useful for other camera-based technology companies and their users.
Background: Personal health technologies, including wearable tracking devices and mobile apps, have great potential to equip the general population with the ability to monitor and manage their health. However, being designed for sighted people, much of their functionality is largely inaccessible to the blind and low-vision (BLV) population, threatening equitable access to personal health data (PHD) and health care services. Objective: This study aims to understand why and how BLV people collect and use their PHD and the obstacles they face in doing so. Such knowledge can inform accessibility researchers and technology companies of the unique self-tracking needs and accessibility challenges that BLV people experience. Methods: We conducted a web-based and phone survey with 156 BLV people. We reported on quantitative and qualitative findings regarding their PHD tracking practices, needs, accessibility barriers, and work-arounds. Results: BLV respondents had strong desires and needs to track PHD, and many of them were already tracking their data despite many hurdles. Popular tracking items (i.e., exercise, weight, sleep, and food) and the reasons for tracking were similar to those of sighted people. BLV people, however, face many accessibility challenges throughout all phases of self-tracking, from identifying tracking tools to reviewing data. The main barriers our respondents experienced included suboptimal tracking experiences and benefits insufficient to offset the added burden for BLV people. Conclusions: We reported findings that contribute to an in-depth understanding of BLV people's motivations for PHD tracking, tracking practices, challenges, and work-arounds. Our findings suggest that various accessibility challenges hinder BLV individuals from effectively gaining the benefits of self-tracking technologies. On the basis of the findings, we discussed design opportunities and research areas to focus on making PHD tracking technologies accessible for all, including BLV people.
For people with visual impairments, photography is essential in identifying objects through remote sighted help and image recognition apps. This is especially the case for teachable object recognizers, where recognition models are trained on users' photos. Here, we propose real-time feedback for communicating the location of an object of interest in the camera frame. Our audio-haptic feedback is powered by a deep learning model that estimates the object center location based on its proximity to the user's hand. To evaluate our approach, we conducted a user study in the lab, where participants with visual impairments (N=9) used our feedback to train and test their object recognizer in vanilla and cluttered environments. We found that very few photos did not include the object (2% in the vanilla and 8% in the cluttered environment), and recognition performance was promising even for participants with no prior camera experience. Participants tended to trust the feedback even though they knew it could be wrong. Our cluster analysis indicates that better feedback is associated with photos that include the entire object. Our results provide insights into factors that can degrade feedback and recognition performance in teachable interfaces.
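As a rough illustration of how an estimated object center could drive audio-haptic feedback like the entry above describes, the sketch below maps a normalized center position, standing in for the output of such a model, to a spoken direction cue and a vibration strength. The thresholds, wording, and coordinate conventions are illustrative assumptions, not the feedback design evaluated in the study.

```python
# Minimal sketch: turn a normalized object-center estimate into a direction cue
# and a haptic strength. Coordinates are assumed in [0, 1] with y increasing
# downward; all thresholds and phrasing are hypothetical.
import math

def center_to_feedback(cx: float, cy: float) -> tuple[str, float]:
    """Map a normalized object center (x, y) to a spoken cue and a 0-1 haptic strength."""
    dx, dy = cx - 0.5, cy - 0.5                 # offset of the object from the middle of the frame
    distance = math.hypot(dx, dy)
    if distance < 0.1:                          # object close enough to the center of the frame
        return "Object centered, hold still.", 1.0
    horizontal = "right" if dx > 0 else "left"  # aim the camera toward the object's side of the frame
    vertical = "down" if dy > 0 else "up"
    cue = f"Move the camera {horizontal} and {vertical}."
    strength = max(0.0, 1.0 - distance)         # vibration grows stronger as the object nears the center
    return cue, strength

if __name__ == "__main__":
    cue, strength = center_to_feedback(0.72, 0.35)   # stand-in for a model's estimated center
    print(cue, f"(vibration strength {strength:.2f})")
```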
