Title: Which AI Models Create Accurate Alt Text for Picture Books?
In the last decade, there has been a surge in the development and mainstream adoption of Artificial Intelligence (AI) systems that generate textual descriptions of images. However, only a few of these, such as Microsoft’s SeeingAI, are specifically tailored to the needs of blind people who use screen readers, and none has been applied to the particular challenges faced by parents who want image descriptions of children’s picture books. Such images have distinct qualities, but no research has explored the current state of the art or the opportunities to improve image-to-text AI systems for this problem domain. We conducted a content analysis of the image descriptions generated for a sample of 20 images selected from 17 recently published children’s picture books, using five AI systems: asticaVision, BLIP, SeeingAI, TapTapSee, and VertexAI. We found that descriptions varied widely in their accuracy and completeness, with only 13% meeting both criteria. Overall, our findings suggest a need for AI image-to-text generation systems that are trained on the types, contents, styles, and layouts characteristic of children’s picture book images, to increase accessibility for blind parents.
Award ID(s):
2048145
PAR ID:
10522066
Publisher / Repository:
California State University, Northridge
Journal Name:
Journal on Technology & Persons with Disabilities
ISSN:
2330-4219
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Co-reading, an activity in which adults collaboratively read books with children, is important for literacy learning and for forming human connection. However, parents and guardians with visual impairments do not have the same level of access to resources when co-reading with their children as their sighted counterparts, especially with regard to images in children’s books. Through an interview study with five visually impaired parents and guardians, we illuminate the importance parents place on images in children’s books, how they access visual information in children’s print books, and the potential of smart speakers to assist their existing co-reading practices.
  2. People who are blind share their images and videos with companies that provide visual assistance technologies (VATs) to gain access to information about their surroundings. A challenge is that people who are blind cannot independently validate the content of the images and videos before they share them, and their visual data commonly contains private content. We examine privacy concerns for blind people who share personal visual data with VAT companies that provide descriptions authored by humans or artificial intelligence (AI). We first interviewed 18 people who are blind about their perceptions of privacy when using both types of VATs. Then we asked the participants to rate 21 types of image content according to their level of privacy concern if the information was shared knowingly versus unknowingly with human- or AI-powered VATs. Finally, we analyzed what information VAT companies communicate to users about their collection and processing of users’ personal visual data through their privacy policies. Our findings have implications for the development of VATs that safeguard blind users’ visual privacy, and our methods may be useful for other camera-based technology companies and their users.
  3. Many images on the Web, including photographs and artistic images, feature spatial relationships between objects that are inaccessible to someone who is blind or visually impaired, even when a text description is provided. While some tools exist to manually create accessible image descriptions, this work is time-consuming and requires specialized tools. We introduce an approach that automatically creates spatially registered image labels based on how a sighted person naturally interacts with the image. Our system collects behavioral data from sighted viewers of an image, specifically eye gaze data and spoken descriptions, and uses them to generate a spatially indexed accessible image that can then be explored using an audio-based touch screen application. We describe our approach to assigning text labels to locations in an image based on eye gaze. We then report on two formative studies with blind users testing EyeDescribe. Our approach resulted in correct labels for all objects in our image set. Participants were better able to recall the location of objects when given both object labels and spatial locations. This approach provides a new method for creating accessible images with minimal effort.
  4. Children’s early understanding of mathematics provides a foundation for later success in school. Identifying ways to enhance mathematical instruction is crucial to understanding the ideal ways to promote academic success. Previous work has identified mathematical language (i.e., the words and concepts related to early mathematical development such as more, same, or similar) as a key mechanism that can be targeted to improve children’s development of early numeracy skills (e.g., counting, cardinality, and addition). Current recommendations suggest a combination of numeracy instruction and quantitative language instruction to promote numeracy skills. However, there is limited direct support of this recommendation. The goal of the proposed study is to compare the unique and combined effects of each type of instruction on children’s numeracy skills in the context of picture book reading. We randomly assigned 234 children (ages 3–5) to one of four conditions where they worked with trained project staff who read picture books targeting: (a) quantitative language only (e.g., more or less), (b) numeracy only (e.g., cardinality, addition), (c) combined [quantitative language + numeracy], or (d) nonnumerical (active control) picture books. Results revealed no significant effects of the quantitative language only or numeracy only conditions, but mixed effects of the combined condition. These findings indicate that more work is needed on how mathematical language and numeracy instruction should best be delivered to preschool children. 
  5. Culbertson, J.; Perfors, A.; Rabagliati, H.; Ramenzoni, V. (Ed.)
    Learning to read is a critical skill, yet only a small portion of children in the United States are reading at or above grade level. Attention is one crucial process that affects the acquisition of reading skills. The process involves selectively choosing task-relevant information and requires monitoring competing demands. Many books for beginning readers include illustrations, but this design choice may require learners to split their attention between multiple sources of information. This study employed eye tracking to examine whether embedding text within illustrations in children’s e-books inadvertently induces attentional competition. The results showed that spatially separating illustrations from the text in beginning reader books reduces attentional competition and improves children’s reading comprehension. This study shows that changes to the design of books for beginning readers can help promote literacy development in children.