NSF PAR Search | NSF Public Access Repository

Taxonomic Reasoning for Rare Arthropods: Combining Dense Image Captioning and RAG for Interpretable Classification

Lesperance, Nathaniel; Ratnasingham, Sujeevan; Taylor, Graham (May 2025, Canadian Artificial Intelligence Association)

In the context of pressing climate change challenges and the significant biodiversity loss among arthropods, automated taxonomic classification from organismal images is a subject of intense research. However, traditional AI pipelines based on deep neural visual architectures such as CNNs or ViTs face limitations such as degraded performance on the long-tail of classes and the inability to reason about their predictions. We integrate image captioning and retrieval-augmented generation (RAG) with large language models (LLMs) to enhance biodiversity monitoring, showing particular promise for characterizing rare and unknown arthropod species. While a naive Vision-Language Model (VLM) excels in classifying images of common species, the RAG model enables classification of rarer taxa by matching explicit textual descriptions of taxonomic features to contextual biodiversity text data from external sources. The RAG model shows promise in reducing overconfidence and enhancing accuracy relative to naive LLMs, suggesting its viability in capturing the nuances of taxonomic hierarchy, particularly at the challenging family and genus levels. Our findings highlight the potential for modern vision-language AI pipelines to support biodiversity conservation initiatives, emphasizing the role of comprehensive data curation and collaboration with citizen science platforms to improve species identification, unknown species characterization and ultimately inform conservation strategies.
Full Text Available

Align and Distill: Unifying and Improving Domain Adaptive Object Detection

Kay, Justin; Haucke, Timm; Stathatos, Suzanne; Deng, Siqi; Young, Erik; Perona, Pietro; Beery, Sara; Van Horn, Grant (March 2025, Transactions on Machine Learning Research)

Object detectors often perform poorly on data that differs from their training set. Domain adaptive object detection (DAOD) methods have recently demonstrated strong results on addressing this challenge. Unfortunately, we identify systemic benchmarking pitfalls that call past results into question and hamper further progress: (a) Overestimation of performance due to underpowered baselines, (b) Inconsistent implementation practices preventing transparent comparisons of methods, and (c) Lack of generality due to outdated backbones and lack of diversity in benchmarks. We address these problems by introducing: (1) A unified benchmarking and implementation framework, Align and Distill (ALDI), enabling comparison of DAOD methods and supporting future development, (2) A fair and modern training and evaluation protocol for DAOD that addresses benchmarking pitfalls, (3) A new DAOD benchmark dataset, CFC-DAOD, increasing the diversity of available DAOD benchmarks, and (4) A new method, ALDI++, that achieves state-of-the-art results by a large margin. ALDI++ outperforms the previous state-of-the-art by +3.5 AP50 on Cityscapes → Foggy Cityscapes, +5.7 AP50 on Sim10k → Cityscapes (where ours is the only method to outperform a fair baseline), and +0.6 AP50 on CFC-DAOD. ALDI and ALDI++ are architecture-agnostic, setting a new state-of-the-art for YOLO and DETR-based DAOD as well without additional hyperparameter tuning. Our framework, dataset, and method offer a critical reset for DAOD and provide a strong foundation for future research.
Full Text Available

Harnessing artificial intelligence to fill global shortfalls in biodiversity knowledge

https://doi.org/10.1038/s44358-025-00022-3

Pollock, Laura J; Kitzes, Justin; Beery, Sara; Gaynor, Kaitlyn M; Jarzyna, Marta A; Mac Aodha, Oisin; Meyer, Bernd; Rolnick, David; Taylor, Graham W; Tuia, Devis; et al. (March 2025, Nature Reviews Biodiversity)

Large, well described gaps exist in both what we know and what we need to know to address the biodiversity crisis. Artificial intelligence (AI) offers new potential for filling these knowledge gaps, but where the biggest and most influential gains could be made remains unclear. To date, biodiversity-related uses of AI have largely focused on tracking and monitoring of wildlife populations. Rapid progress is being made in the use of AI to build phylogenetic trees and species distribution models. However, AI also has considerable unrealized potential in the re-evaluation of important ecological questions, especially those that require the integration of disparate and inherently complex data types, such as images, video, text, audio and DNA. This Review describes the current and potential future use of AI to address seven clearly defined shortfalls in biodiversity knowledge. Recommended steps for AI-based improvements include the re-use of existing image data and the development of novel paradigms, including the collaborative generation of new testable hypotheses. The resulting expansion of biodiversity knowledge could lead to science spanning from genes to ecosystems — advances that might represent our best hope for meeting the rapidly approaching 2030 targets of the Global Biodiversity Framework.
Full Text Available
