

Title: Designing and Evaluating a Customizable Head-mounted Vision Enhancement System for People with Low Vision
Award ID(s):
1657315
NSF-PAR ID:
10142661
Author(s) / Creator(s):
Date Published:
Journal Name:
ACM Transactions on Accessible Computing
Volume:
12
Issue:
4
ISSN:
1936-7228
Page Range / eLocation ID:
1 to 46
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper presents a mobile-based solution that integrates 3D vision and voice interaction to assist people who are blind or have low vision in exploring and interacting with their surroundings. The key components of the system are its two 3D vision modules: a 3D object detection module that integrates a deep-learning-based 2D object detector with ARKit-based point cloud generation, and an interest-direction recognition module that integrates hand/finger recognition with ARKit-based 3D direction estimation. The integrated system consists of a voice interface, a task scheduler, and an instruction generator. The voice interface contains a customized user-request mapping module that maps the user's spoken input to one of the four primary system operation modes (exploration, search, navigation, and settings adjustment). The task scheduler coordinates with two web services that host the two vision modules, allocating computational resources based on the user request and network connectivity strength. Finally, the instruction generator computes the corresponding instructions based on the user request and the results from the two vision modules. The system is capable of running in real time on mobile devices. We present preliminary experimental results on the performance of the voice-to-user-request mapping module and the two vision modules. 
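The voice-to-request mapping step described above can be sketched as a simple keyword classifier over a transcribed utterance. This is an illustrative sketch only, not the authors' implementation; the keyword lists and function names are assumptions.

```python
# Hypothetical sketch: map a transcribed utterance to one of the four
# operation modes named in the abstract (exploration, search, navigation,
# settings adjustment). Keyword lists are illustrative.
MODE_KEYWORDS = {
    "exploration": ["explore", "look around", "describe"],
    "search": ["find", "search", "where is"],
    "navigation": ["navigate", "take me", "go to"],
    "settings": ["settings", "volume", "speed"],
}

def map_request(utterance: str) -> str:
    """Return the first mode whose keyword appears in the utterance."""
    text = utterance.lower()
    for mode, keywords in MODE_KEYWORDS.items():
        if any(k in text for k in keywords):
            return mode
    return "exploration"  # fall back to the default mode

print(map_request("Find my coffee mug"))   # search
print(map_request("Take me to the exit"))  # navigation
```

A real system would use a learned intent classifier, but a keyword table makes the mapping from utterance to operation mode concrete.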
  2. Vision-based localization approaches now underpin newly emerging navigation pipelines for myriad use cases, from robotics to assistive technologies. Compared to sensor-based solutions, vision-based localization does not require pre-installed sensor infrastructure, which is costly, time-consuming, and/or often infeasible at scale. Herein, we propose a novel vision-based localization pipeline for a specific use case: navigation support for end users with blindness and low vision. Given a query image taken by an end user on a mobile application, the pipeline leverages a visual place recognition (VPR) algorithm to find similar images in a reference image database of the target space. The geolocations of these similar images are utilized in a downstream task that employs a weighted-average method to estimate the end user's location. Another downstream task utilizes the perspective-n-point (PnP) algorithm to estimate the end user's direction by exploiting the 2D–3D point correspondences between the query image and the 3D environment, as extracted from matched images in the database. Additionally, the system implements Dijkstra's algorithm to calculate a shortest path on a navigable map that includes the trip origin and destination. The topometric map used for localization and navigation is built with a customized graphical user interface that projects a 3D reconstructed sparse map, built from a sequence of images, onto the corresponding a priori 2D floor plan. Sequential images used for map construction can be collected in a pre-mapping step or scavenged from public databases/citizen science. The end-to-end system can be installed on any internet-accessible device with a camera that hosts a custom mobile application. For evaluation purposes, mapping and localization were tested in a complex hospital environment. The evaluation results demonstrate that the system achieves localization with an average error of less than 1 m without knowledge of the camera's intrinsic parameters, such as focal length. 
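The weighted-average localization step described above can be sketched directly: the geolocations of the top-k reference images retrieved by VPR are averaged, weighted by retrieval similarity. This is a minimal sketch under assumed variable names, not the paper's code.

```python
# Minimal sketch of weighted-average localization: each VPR match
# contributes its geolocation, weighted by its similarity score.
def estimate_location(matches):
    """matches: list of (x, y, similarity) for retrieved reference images.

    Returns the similarity-weighted mean of the match locations.
    """
    total = sum(s for _, _, s in matches)
    x = sum(mx * s for mx, _, s in matches) / total
    y = sum(my * s for _, my, s in matches) / total
    return x, y

# A match with higher similarity pulls the estimate toward its location.
print(estimate_location([(0.0, 0.0, 3.0), (4.0, 0.0, 1.0)]))  # (1.0, 0.0)
```

The same structure extends to latitude/longitude pairs or map-frame coordinates; only the coordinate convention changes.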
  3. Fast, reliable, and efficient data transfer across wide-area networks is a predominant bottleneck for data-intensive cloud applications. This paper introduces OneDataShare, which is designed to eliminate the issues plaguing effective cloud-based data transfers of varying file sizes and across incompatible transfer endpoints. The vision of OneDataShare is to achieve high-speed data transfer, interoperability between multiple transfer protocols, and accurate estimation of delivery time for advance planning, thereby maximizing user profit through improved and faster data analysis for business intelligence. The paper elaborates on the desirable features of OneDataShare as a cloud-hosted data transfer scheduling and optimization service, and how it is aligned with the vision of harnessing the power of the cloud and distributed computing. Experimental evaluation and comparison with existing real-life file transfer services show that the transfer throughput achieved by OneDataShare is up to 6.5 times greater than that of other approaches. 
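One of the features named above, delivery-time estimation for advance planning, can be illustrated with a toy model: smooth recent throughput samples and divide the file size by the smoothed rate. This is an assumption-laden sketch, not OneDataShare's actual estimator.

```python
# Illustrative sketch only: estimate delivery time from file size and an
# exponentially smoothed throughput figure. The smoothing factor and the
# use of an EMA are assumptions, not OneDataShare's published model.
def smoothed_throughput(samples_mbps, alpha=0.3):
    """Exponential moving average over recent throughput samples (Mbit/s)."""
    est = samples_mbps[0]
    for s in samples_mbps[1:]:
        est = alpha * s + (1 - alpha) * est
    return est

def delivery_time_s(size_mb, samples_mbps):
    """Seconds to deliver size_mb megabytes at the smoothed rate."""
    return 8 * size_mb / smoothed_throughput(samples_mbps)  # MB -> Mbit

print(delivery_time_s(100, [80.0]))  # 10.0
```

Any real estimator would also account for protocol overhead, concurrency, and network variability; the point here is only the shape of the calculation.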
  4. This paper proposes a computer-vision-based workflow that analyzes Google 360-degree street views to understand the quality of urban spaces in terms of vegetation coverage and the accessibility of urban amenities such as benches. Image segmentation methods were utilized to produce an annotated image quantifying vegetation, sky, and street coloration. Two deep learning models were used -- Monodepth2 for depth estimation and YoloV5 for object detection -- to create a 360-degree diagram of vegetation and benches at a given location. The automated workflow allows non-expert users like planners, designers, and communities to analyze and evaluate urban environments with Google Street Views. The workflow consists of three components: (1) a user interface for location selection; (2) vegetation analysis, bench detection, and depth estimation; and (3) visualization of vegetation coverage and amenities. The analysis and visualization could inform better urban design outcomes. 
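The 360-degree diagram in component (3) amounts to placing each detection at a bearing and distance around the viewpoint. A hedged sketch: in an equirectangular street-view panorama, the horizontal pixel position maps linearly to a bearing, which can be paired with the model's depth estimate. The function names and input format are illustrative, not the paper's code.

```python
# Hedged sketch: convert a detection's horizontal pixel position in an
# equirectangular panorama to a bearing in degrees, and pair it with the
# estimated depth to place it on a 360-degree diagram.
def to_bearing(x_px, image_width):
    """Map a pixel column in an equirectangular panorama to [0, 360) degrees."""
    return (x_px / image_width) * 360.0

def diagram_entries(detections, image_width):
    """detections: list of (label, x_px, depth_m), e.g. from a detector
    (YoloV5) plus a monocular depth model (Monodepth2)."""
    return [(label, to_bearing(x, image_width), depth)
            for label, x, depth in detections]

entries = diagram_entries([("bench", 2048, 4.2)], image_width=8192)
print(entries)  # [('bench', 90.0, 4.2)]
```

Plotting these (bearing, depth) pairs on a polar chart yields the kind of 360-degree amenity diagram the abstract describes.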