Search for: All records

Creators/Authors contains: "Dragut, Eduard"


  1. Flowcharts are graphical tools that represent complex concepts in concise visual form. This paper introduces the FlowLearn dataset, a resource tailored to enhance the understanding of flowcharts. FlowLearn contains complex scientific flowcharts and simulated flowcharts. The scientific subset contains 3,858 flowcharts sourced from scientific literature, and the simulated subset contains 10,000 flowcharts created using a customizable script. The dataset is enriched with annotations for visual components, OCR, Mermaid code representation, and VQA question-answer pairs. Despite the proven capabilities of Large Vision-Language Models (LVLMs) in various visual understanding tasks, their effectiveness in decoding flowcharts, a crucial element of scientific communication, has yet to be thoroughly investigated. The FlowLearn test set is crafted to assess the performance of LVLMs in flowchart comprehension. Our study thoroughly evaluates state-of-the-art LVLMs, identifying existing limitations and establishing a foundation for future enhancements in this relatively underexplored domain. For instance, in tasks involving simulated flowcharts, GPT-4V achieved the highest accuracy (58%) in counting the number of nodes, while Claude recorded the highest accuracy (83%) in OCR tasks. Notably, no single model excels in all tasks within the FlowLearn framework, highlighting significant opportunities for further development.
  2. Abstract: Millions of individuals who have limited or no functional speech use augmentative and alternative communication (AAC) technology to participate in daily life and exercise the human right to communication. While advances in AAC technology lag significantly behind those in other technology sectors, mainstream technology innovations such as artificial intelligence (AI) present potential for the future of AAC. However, a new future of AAC will only be as effective as it is responsive to the needs and dreams of the people who rely upon it every day. AAC innovation must reflect an iterative, collaborative process with AAC users. To do this, we worked collaboratively with AAC users to complete participatory qualitative research about AAC innovation through AI. We interviewed 13 AAC users regarding (1) their current AAC engagement; (2) the barriers they experience in using AAC; (3) their dreams regarding future AAC development; and (4) reflections on potential AAC innovations. We analyzed these data using a rapid research evaluation and appraisal approach. This article discusses the themes that emerged during the interviews and their implications for future AAC development. It also describes strengths, barriers, and considerations for participatory design.
    Free, publicly-accessible full text available November 1, 2025
  3. Introduction: Social participation for emerging symbolic communicators on the autism spectrum is often restricted. This is due in part to the time and effort required for both children and partners to use traditional augmentative and alternative communication (AAC) technologies during fast-paced social routines. Innovations in artificial intelligence provide the potential for context-aware AAC technology that can provide just-in-time communication options based on linguistic input from partners to minimize the time and effort needed to use AAC technologies for social participation. Methods: This preliminary study used an alternating treatment design to compare the effects of a context-aware AAC prototype with automated cloze phrase response options to traditional AAC for supporting three young children who were emerging symbolic communicators on the autism spectrum in participating within a social routine. Results: Visual analysis and effect size estimates suggest the context-aware AAC condition resulted in increases in linguistic participation, vocal approximations, and visual attention for all three children. Conclusion: Although this study was only an initial exploration and its results are preliminary, context-aware AAC technologies have the potential to enhance participation and communication outcomes for young emerging symbolic communicators on the autism spectrum. More research is needed.
  4. We present SciDMT, an enhanced and expanded corpus for scientific mention detection, offering a significant advancement over existing related resources. SciDMT contains annotated scientific documents for datasets (D), methods (M), and tasks (T). The corpus consists of two components: 1) the SciDMT main corpus, which includes 48 thousand scientific articles with over 1.8 million weakly annotated mentions in the format of in-text spans, and 2) an evaluation set, which comprises 100 scientific articles manually annotated for evaluation purposes. To the best of our knowledge, SciDMT is the largest corpus for scientific entity mention detection. The corpus’s scale and diversity are instrumental in developing and refining models for tasks such as indexing scientific papers, enhancing information retrieval, and improving the accessibility of scientific knowledge. We demonstrate the corpus’s utility through experiments with advanced deep learning architectures like SciBERT and GPT-3.5. Our findings establish performance baselines and highlight unresolved challenges in scientific mention detection. SciDMT serves as a robust benchmark for the research community, encouraging the development of innovative models to further the field of scientific information extraction.
  5. Purpose: Augmentative and alternative communication (AAC) technology innovation is urgently needed to improve outcomes for children on the autism spectrum who are minimally verbal. One potential technology innovation is applying artificial intelligence (AI) to automate strategies such as augmented input to increase language learning opportunities while mitigating communication partner time and learning barriers. Innovation in AAC research and design methodology is also needed to empirically explore this and other applications of AI to AAC. The purpose of this report was to describe (a) the development of an AAC prototype using a design methodology new to AAC research and (b) a preliminary investigation of the efficacy of this potential new AAC capability. Method: The prototype was developed using a Wizard-of-Oz prototyping approach that allows for initial exploration of a new technology capability without the time and effort required for full-scale development. The preliminary investigation with three children on the autism spectrum who were minimally verbal used an adapted alternating treatment design to compare the effects of a Wizard-of-Oz prototype that provided automated augmented input (i.e., pairing color photos with speech) to a standard topic display (i.e., a grid display with line drawings) on visual attention, linguistic participation, and (for one participant) word learning during a circle activity. Results: Preliminary investigation results were variable, but overall participants increased visual attention and linguistic participation when using the prototype. Conclusions: Wizard-of-Oz prototyping could be a valuable approach to spur much needed innovation in AAC. Further research into efficacy, reliability, validity, and attitudes is required to more comprehensively evaluate the use of AI to automate augmented input in AAC.
  6. Abstract: The recognition of dataset names is a critical task for automatic information extraction in scientific literature, enabling researchers to understand and identify research opportunities. However, existing corpora for dataset mention detection are limited in size and naming diversity. In this paper, we introduce the Dataset Mentions Detection Dataset (DMDD), the largest publicly available corpus for this task. DMDD consists of the DMDD main corpus, comprising 31,219 scientific articles with over 449,000 dataset mentions weakly annotated in the format of in-text spans, and an evaluation set, which comprises 450 scientific articles manually annotated for evaluation purposes. We use DMDD to establish baseline performance for dataset mention detection and linking. By analyzing the performance of various models on DMDD, we are able to identify open problems in dataset mention detection. We invite the community to use our dataset as a challenge to develop novel dataset mention detection models.