This content will become publicly available on April 16, 2026

Title: Developing a CSViewer for Education Application with Natural Language Interactions
The CSViewer for Analysts application provides access to a comprehensive database collected from the Cayo Santiago (CS) rhesus monkey colony, covering 11,000 subjects over the past 86 years. Its new version 1.2 adds assorted data selection, visualization, and analytical features, and mining of newly collected osteological measures has revealed new skeletal and dental development models. To expose the intended knowledge model of the CS colony to public audiences, especially science classes at colleges and schools, a CSViewer for Education edition is planned. Supporting queries in plain English is considered beneficial in helping students seek answers. This paper presents initial experiments with the Claude language model. Queries in plain English are used to explore a dental checkup dataset through the Claude API, and the results are integrated with CSViewer so that its charting features can display dental development trends of the CS monkey population. Further development of natural language interactions that make use of generative AI features will continue.
Award ID(s):
1926402
PAR ID:
10652411
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
SPRINGER NATURE
Date Published:
Subject(s) / Keyword(s):
Knowledge Model, Cayo Santiago Rhesus Colony, AI in Education, Querying Database in Natural Language
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Growth modeling is a key aspect of statistical analysis, particularly in fields such as biology, economics, and social sciences. In studies of primate ontogeny, it is well known that skeletal growth tends to stop at a certain age. Bone dimension measures of over 1200 skeletal sets derived from the Cayo Santiago (CS) rhesus colony were recently collected by a collaborative effort supported through NSF grants. These measures provide a valuable resource for extending a knowledge model of primate skeletal development with respect to ontogeny and to variations based on sex and matrilineal lineage. This paper presents initial results of a proposed custom regression model, as well as comparisons with other popular models used in similar line-fitting tasks. Related data analytics and visualization support as implemented in the CSViewer for Analysts system are also described. 
  2. ABSTRACT The Cayo Santiago (CS) rhesus macaque colony has raised a total of over 11,000 animals in a free‐ranging setting very close to the natural environment. The well‐kept individual and family records, as well as social group management data, have been a valuable source for anthropological research. However, the various sources of data have been stored separately, and there was no straightforward way for researchers to access them directly. Since 2019, an ongoing effort supported through an NSF collaborative grant has been collecting morphology and imagery data from the CS‐derived skeleton collection. One specific aim is to build an integrative database to combine newly collected osteology data (bone measurement) and existing genealogy and demographic information. A second aim is to develop a software application (codenamed CSViewer for Analysts) to provide user‐friendly interfaces for the research community to access and analyze the data. In this paper, we present a set of results generated by using standard data science tools and techniques, which help construct a holistic view of the CS rhesus colony along multiple dimensions. The matrilineal family lineage and pedigree can be visualized using various tree forms, as well as patrilineal lineages traced back to the mid‐1970s. Social group evolution charts are generated and add new features to the original records. Reproduction patterns are studied in the context of group interaction and animal transfer logs. Cross‐referencing between genealogy and osteology data can also be accomplished. Most of these charts are supported in the CSViewer app with convenient tooltip features to show details as needed. Selection based on attributes like founder line, sex, and birth season can be applied to tailor charts to a research project so that researchers can zoom into a data set that can best support their analytics goals. 
  3. ABSTRACT The Cayo Santiago rhesus colony and its derived skeletal collections provide abundant data made available since its founding in 1938. A project supported by an NSF collaborative grant has been committed to building a database that integrates the genetic and age‐related information of the colony, together with social group interactions and environmental effects, aiming to provide a knowledge model to researchers with insights from this powerful non‐human data repository for analyzing human conditions including growth, development, adaptation, resilience, aging, and disease in a contextualized manner. This paper introduces CSViewer for Analysts, a computer application that provides user‐friendly tools for researchers to access the integrated database and to generate a variety of visuals encompassing matrilineal or patrilineal family lines, social groups, time spans, phenotypic measurements, and photos recently collected through this project. Adopting Java‐based technologies and third‐party libraries for data analytics and visualization, CSViewer can help its users select meaningful datasets using various criteria, conduct data analytics and visualization tasks, and manage their “project artifacts” (such as selected datasets, models, and charts). Version 1.0 of the CSViewer app has been tested by collaborators and in a workshop by a limited number of researchers and science educators since 2023. Based on users' feedback, additional features have been implemented in version 1.1.0, and more features are planned for subsequent subversions with bundles for researchers to download and explore. 
  4. Introduction: Because developing integrated computer science (CS) curriculum is a resource-intensive process, there is interest in leveraging the capabilities of AI tools, including large language models (LLMs), to streamline this task. However, given the novelty of LLMs, little is known about their ability to generate appropriate curriculum content. Research Question: How do current LLMs perform on the task of creating appropriate learning activities for integrated computer science education? Methods: We tested two LLMs (Claude 3.5 Sonnet and ChatGPT 4-o) by providing them with a subset of national learning standards for both CS and language arts and asking them to generate a high-level description of learning activities that met standards for both disciplines. Four humans rated the LLM output – using an aggregate rating approach – in terms of (1) whether it met the CS learning standard, (2) whether it met the language arts learning standard, (3) whether it was equitable, and (4) its overall quality. Results: For Claude AI, 52% of the activities met language arts standards, 64% met CS standards, and the average quality rating was middling. For ChatGPT, 75% of the activities met language arts standards, 63% met CS standards, and the average quality rating was low. Virtually all activities from both LLMs were rated as neither actively promoting nor inhibiting equitable instruction. Discussion: Our results suggest that LLMs are not (yet) able to create appropriate learning activities from learning standards. The activities were generally not usable by classroom teachers without further elaboration and/or modification. There were also grammatical errors in the output, something not common with LLM-produced text. Further, standards in one or both disciplines were often not addressed, and the quality of the activities was often low. We conclude with recommendations for the use of LLMs in curriculum development in light of these findings. 
  5. The more new features are added to smartphones, the harder it becomes for users to find them. This is because the feature names are usually short and there are too many of them for users to remember the exact words. Users are more comfortable asking contextual queries that describe the features they are looking for, but standard term frequency-based search cannot process them. This paper presents a novel retrieval system for mobile features that accepts intuitive and contextual search queries. We trained a relevance model via contrastive learning from a pre-trained language model to perceive the contextual relevance between a query embedding and indexed mobile features. Also, to make it run efficiently on-device using minimal resources, we applied knowledge distillation to compress the model without much loss in performance. To verify the feasibility of our method, we collected test queries and conducted comparative experiments with the currently deployed search baselines. The results show that our system outperforms the others on contextual sentence queries and even on usual keyword-based queries. 