This content will become publicly available on June 26, 2026

Title: Simulated soundscapes and transfer learning boost the performance of acoustic classifiers under data scarcity
Abstract The biodiversity crisis necessitates spatially extensive methods to monitor multiple taxonomic groups for evidence of change in response to evolving environmental conditions. Programs that combine passive acoustic monitoring and machine learning are increasingly used to meet this need. These methods require large, annotated datasets, which are time‐consuming and expensive to produce, creating potential barriers to adoption in data‐ and funding‐poor regions. Recently released pre‐trained avian acoustic classification models provide opportunities to reduce the need for manual labelling and accelerate the development of new acoustic classification algorithms through transfer learning. Transfer learning is a strategy for developing algorithms under data scarcity that uses pre‐trained models from related tasks to adapt to new tasks. Our primary objective was to develop a transfer learning strategy using the feature embeddings of a pre‐trained avian classification model to train custom acoustic classification models in data‐scarce contexts. We used three annotated avian acoustic datasets to test whether transfer learning and soundscape simulation‐based data augmentation could substantially reduce the annotated training data necessary to develop performant custom acoustic classifiers. We also conducted a sensitivity analysis for hyperparameter choice and model architecture. We then assessed the generalizability of our strategy to increasingly novel non‐avian classification tasks. With as few as two training examples per class, our soundscape simulation data augmentation approach consistently yielded new classifiers with improved performance relative to the pre‐trained classification model and transfer learning classifiers trained with other augmentation approaches. Performance increases were evident for three avian test datasets, including single‐class and multi‐label contexts. We observed that the relative performance among our data augmentation approaches varied for the avian datasets and nearly converged for one dataset when we included more training examples. We demonstrate an efficient approach to developing new acoustic classifiers leveraging open‐source sound repositories and pre‐trained networks to reduce manual labelling. With very few examples, our soundscape simulation approach to data augmentation yielded classifiers with performance equivalent to those trained with many more examples, showing it is possible to reduce manual labelling while still achieving high‐performance classifiers and, in turn, expanding the potential for passive acoustic monitoring to address rising biodiversity monitoring needs.
Award ID(s):
2025755
PAR ID:
10644584
Author(s) / Creator(s):
 ;  ;  ;  ;  ;  ;  ;  ;  
Publisher / Repository:
Wiley
Date Published:
Journal Name:
Methods in Ecology and Evolution
ISSN:
2041-210X
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Landscape‐scale bioacoustic projects have become a popular approach to biodiversity monitoring. Combining passive acoustic monitoring recordings and automated detection provides an effective means of monitoring sound‐producing species' occupancy and phenology and can lend insight into unobserved behaviours and patterns. The availability of low‐cost recording hardware has lowered barriers to large‐scale data collection, but technological barriers in data analysis remain a bottleneck for extracting biological insight from bioacoustic datasets. We provide a robust and open‐source Python toolkit for detecting and localizing biological sounds in acoustic data. OpenSoundscape provides access to automated acoustic detection, classification and localization methods through a simple and easy‐to‐use set of tools. Extensive documentation and tutorials provide step‐by‐step instructions and examples of end‐to‐end analysis of bioacoustic data. Here, we describe the functionality of this package and provide concise examples of bioacoustic analyses with OpenSoundscape. By providing an interface for bioacoustic data and methods, we hope this package will lead to increased adoption of bioacoustics methods and ultimately to enhanced insights for ecology and conservation.
  2. Abstract The interface between field biology and technology is energizing the collection of vast quantities of environmental data. Passive acoustic monitoring, the use of unattended recording devices to capture environmental sound, is an example where technological advances have facilitated an influx of data that routinely exceeds the capacity for analysis. Computational advances, particularly the integration of machine learning approaches, will support data extraction efforts. However, the analysis and interpretation of these data will require parallel growth in conceptual and technical approaches for data analysis. Here, we use a large hand‐annotated dataset to showcase analysis approaches that will become increasingly useful as datasets grow and data extraction can be partially automated. We propose and demonstrate seven technical approaches for analyzing bioacoustic data. These include the following: (1) generating species lists and descriptions of vocal variation, (2) assessing how abiotic factors (e.g., rain and wind) impact vocalization rates, (3) testing for differences in community vocalization activity across sites and habitat types, (4) quantifying the phenology of vocal activity, (5) testing for spatiotemporal correlations in vocalizations within species, (6) among species, and (7) using rarefaction analysis to quantify diversity and optimize bioacoustic sampling. To demonstrate these approaches, we sampled in 2016 and 2018 and used hand annotations of 129,866 bird vocalizations from two forests in New Hampshire, USA, including sites in the Hubbard Brook Experimental Forest where bioacoustic data could be integrated with more than 50 years of observer‐based avian studies. Acoustic monitoring revealed differences in community patterns in vocalization activity between forests of different ages, as well as between nearby similar watersheds. Of numerous environmental variables that were evaluated, background noise was most clearly related to vocalization rates. The songbird community included one cluster of species where vocalization rates declined as ambient noise increased and another cluster where vocalization rates declined over the nesting season. In some common species, the number of vocalizations produced per day was correlated at scales of up to 15 km. Rarefaction analyses showed that adding sampling sites increased species detections more than adding sampling days. Although our analyses used hand‐annotated data, the methods will extend readily to large‐scale automated detection of vocalization events. Such data are likely to become increasingly available as autonomous recording units become more advanced, affordable, and power efficient. Passive acoustic monitoring with human or automated identification at the species level offers growing potential to complement observer‐based studies of avian ecology.
  3. Abstract Insect populations are changing rapidly, and monitoring these changes is essential for understanding the causes and consequences of such shifts. However, large‐scale insect identification projects are time‐consuming and expensive when done solely by human identifiers. Machine learning offers a possible solution to help collect insect data quickly and efficiently. Here, we outline a methodology for training classification models to identify pitfall trap‐collected insects from image data and then apply the method to identify ground beetles (Carabidae). All beetles were collected by the National Ecological Observatory Network (NEON), a continental scale ecological monitoring project with sites across the United States. We describe the procedures for image collection, image data extraction, data preparation, and model training, and compare the performance of five machine learning algorithms and two classification methods (hierarchical vs. single‐level) identifying ground beetles from the species to subfamily level. All models were trained using pre‐extracted feature vectors, not raw image data. Our methodology allows for data to be extracted from multiple individuals within the same image, thus enhancing time efficiency; utilizes relatively simple models that allow for direct assessment of model performance; and can be performed on relatively small datasets. The best performing algorithm, linear discriminant analysis (LDA), reached an accuracy of 84.6% at the species level when naively identifying species, which was further increased to >95% when classifications were limited by known local species pools. Model performance was negatively correlated with taxonomic specificity, with the LDA model reaching an accuracy of ~99% at the subfamily level. When classifying carabid species not included in the training dataset at higher taxonomic levels, the models performed significantly better than if classifications were made randomly. We also observed greater performance when classifications were made using the hierarchical classification method compared to the single‐level classification method at higher taxonomic levels. The general methodology outlined here serves as a proof‐of‐concept for classifying pitfall trap‐collected organisms using machine learning algorithms, and the image data extraction methodology may be used for purposes other than machine learning. We propose that integration of machine learning in large‐scale identification pipelines will increase efficiency and lead to a greater flow of insect macroecological data, with the potential to be expanded for use with other noninsect taxa.
  4. Abstract Changes in land use and climate change threaten global biodiversity and ecosystems, calling for the urgent development of effective conservation strategies. Recognizing landscape heterogeneity, which refers to the variation in natural features within an area, is crucial for these strategies. While remote sensing images quantify landscape heterogeneity, they might fail to detect ecological patterns in moderately disturbed areas, particularly at minor spatial scales. This is partly because satellite imagery may not effectively capture undergrowth conditions due to its resolution constraints. In contrast, soundscape analysis, which studies environmental acoustic signals, emerges as a novel tool for understanding ecological patterns, providing reliable information on habitat conditions and landscape heterogeneity in complex environments across diverse scales and serving as a complement to remote sensing methods. We propose an unsupervised approach using passive acoustic monitoring data and network inference methods to analyse acoustic heterogeneity patterns based on biophony composition. This method uses sonotypes, unique acoustic entities characterized by their specific time‐frequency spaces, to establish the acoustic structure of a site through sonotype occurrences, focusing on general biophony rather than specific species and providing information on the acoustic footprint of a site. From a sonotype composition matrix, we use the Graphical Lasso method, a sparse Gaussian graphical model, to identify acoustic similarities across sites, map ecological complexity relationships through the nodes (sites) and edges (similarities), and transform acoustic data into a graphical representation of ecological interactions and landscape acoustic diversity. We implemented the proposed method across 17 sites within an oil palm plantation in Santander, Colombia. The resulting inferred graphs visualize the acoustic similarities among sites, reflecting the biophony achieved by characterizing the landscape through its acoustic structures. Correlating our findings with ecological metrics like the Bray–Curtis dissimilarity index and satellite imagery indices reveals significant insights into landscape heterogeneity. This unsupervised approach offers a new perspective on understanding ecological and biological interactions and advances soundscape analysis. The soundscape decomposition into sonotypes underscores the method's advantage, offering the possibility to associate sonotypes with species and identify their contribution to the similarity proposed by the graph.
  5. Many applications of machine learning require a model to make accurate predictions on test examples that are distributionally different from training ones, while task-specific labels are scarce during training. An effective approach to this challenge is to pre-train a model on related tasks where data is abundant, and then fine-tune it on a downstream task of interest. While pre-training has been effective in many language and vision domains, it remains an open question how to effectively use pre-training on graph datasets. In this paper, we develop a new strategy and self-supervised methods for pre-training Graph Neural Networks (GNNs). The key to the success of our strategy is to pre-train an expressive GNN at the level of individual nodes as well as entire graphs so that the GNN can learn useful local and global representations simultaneously. We systematically study pre-training on multiple graph classification datasets. We find that naïve strategies, which pre-train GNNs at the level of either entire graphs or individual nodes, give limited improvement and can even lead to negative transfer on many downstream tasks. In contrast, our strategy avoids negative transfer and improves generalization significantly across downstream tasks, leading up to 9.4% absolute improvements in ROC-AUC over non-pre-trained models and achieving state-of-the-art performance for molecular property prediction and protein function prediction. 