Abstract Landscape‐scale bioacoustic projects have become a popular approach to biodiversity monitoring. Combining passive acoustic monitoring recordings and automated detection provides an effective means of monitoring sound‐producing species' occupancy and phenology and can lend insight into unobserved behaviours and patterns. The availability of low‐cost recording hardware has lowered barriers to large‐scale data collection, but technological barriers in data analysis remain a bottleneck for extracting biological insight from bioacoustic datasets.We provide a robust and open‐source Python toolkit for detecting and localizing biological sounds in acoustic data.OpenSoundscape provides access to automated acoustic detection, classification and localization methods through a simple and easy‐to‐use set of tools. Extensive documentation and tutorials provide step‐by‐step instructions and examples of end‐to‐end analysis of bioacoustic data. Here, we describe the functionality of this package and provide concise examples of bioacoustic analyses with OpenSoundscape.By providing an interface for bioacoustic data and methods, we hope this package will lead to increased adoption of bioacoustics methods and ultimately to enhanced insights for ecology and conservation.
more »
« less
Simulated soundscapes and transfer learning boost the performance of acoustic classifiers under data scarcity
Abstract The biodiversity crisis necessitates spatially extensive methods to monitor multiple taxonomic groups for evidence of change in response to evolving environmental conditions. Programs that combine passive acoustic monitoring and machine learning are increasingly used to meet this need. These methods require large, annotated datasets, which are time‐consuming and expensive to produce, creating potential barriers to adoption in data‐ and funding‐poor regions. Recently released pre‐trained avian acoustic classification models provide opportunities to reduce the need for manual labelling and accelerate the development of new acoustic classification algorithms through transfer learning. Transfer learning is a strategy for developing algorithms under data scarcity that uses pre‐trained models from related tasks to adapt to new tasks.Our primary objective was to develop a transfer learning strategy using the feature embeddings of a pre‐trained avian classification model to train custom acoustic classification models in data‐scarce contexts. We used three annotated avian acoustic datasets to test whether transfer learning and soundscape simulation‐based data augmentation could substantially reduce the annotated training data necessary to develop performant custom acoustic classifiers. We also conducted a sensitivity analysis for hyperparameter choice and model architecture. We then assessed the generalizability of our strategy to increasingly novel non‐avian classification tasks.With as few as two training examples per class, our soundscape simulation data augmentation approach consistently yielded new classifiers with improved performance relative to the pre‐trained classification model and transfer learning classifiers trained with other augmentation approaches. Performance increases were evident for three avian test datasets, including single‐class and multi‐label contexts. We observed that the relative performance among our data augmentation approaches varied for the avian datasets and nearly converged for one dataset when we included more training examples.We demonstrate an efficient approach to developing new acoustic classifiers leveraging open‐source sound repositories and pre‐trained networks to reduce manual labelling. With very few examples, our soundscape simulation approach to data augmentation yielded classifiers with performance equivalent to those trained with many more examples, showing it is possible to reduce manual labelling while still achieving high‐performance classifiers and, in turn, expanding the potential for passive acoustic monitoring to address rising biodiversity monitoring needs.
more »
« less
- Award ID(s):
- 2025755
- PAR ID:
- 10644584
- Publisher / Repository:
- Wiley
- Date Published:
- Journal Name:
- Methods in Ecology and Evolution
- ISSN:
- 2041-210X
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract Changes in land use and climate change threaten global biodiversity and ecosystems, calling for the urgent development of effective conservation strategies. Recognizing landscape heterogeneity, which refers to the variation in natural features within an area, is crucial for these strategies. While remote sensing images quantify landscape heterogeneity, they might fail to detect ecological patterns in moderately disturbed areas, particularly at minor spatial scales. This is partly because satellite imagery may not effectively capture undergrowth conditions due to its resolution constraints. In contrast, soundscape analysis, which studies environmental acoustic signals, emerges as a novel tool for understanding ecological patterns, providing reliable information on habitat conditions and landscape heterogeneity in complex environments across diverse scales and serving as a complement to remote sensing methods.We propose an unsupervised approach using passive acoustic monitoring data and network inference methods to analyse acoustic heterogeneity patterns based on biophony composition. This method uses sonotypes, unique acoustic entities characterized by their specific time‐frequency spaces, to establish the acoustic structure of a site through sonotype occurrences, focusing on general biophony rather than specific species and providing information on the acoustic footprint of a site. From a sonotype composition matrix, we use the Graphical Lasso method, a sparse Gaussian graphical model, to identify acoustic similarities across sites, map ecological complexity relationships through the nodes (sites) and edges (similarities), and transform acoustic data into a graphical representation of ecological interactions and landscape acoustic diversity.We implemented the proposed method across 17 sites within an oil palm plantation in Santander, Colombia. The resulting inferred graphs visualize the acoustic similarities among sites, reflecting the biophony achieved by characterizing the landscape through its acoustic structures. Correlating our findings with ecological metrics like the Bray–Curtis dissimilarity index and satellite imagery indices reveals significant insights into landscape heterogeneity.This unsupervised approach offers a new perspective on understanding ecological and biological interactions and advances soundscape analysis. The soundscape decomposition into sonotypes underscores the method's advantage, offering the possibility to associate sonotypes with species and identify their contribution to the similarity proposed by the graph.more » « less
-
Many applications of machine learning require a model to make accurate predictions on test examples that are distributionally different from training ones, while task-specific labels are scarce during training. An effective approach to this challenge is to pre-train a model on related tasks where data is abundant, and then fine-tune it on a downstream task of interest. While pre-training has been effective in many language and vision domains, it remains an open question how to effectively use pre-training on graph datasets. In this paper, we develop a new strategy and self-supervised methods for pre-training Graph Neural Networks (GNNs). The key to the success of our strategy is to pre-train an expressive GNN at the level of individual nodes as well as entire graphs so that the GNN can learn useful local and global representations simultaneously. We systematically study pre-training on multiple graph classification datasets. We find that naïve strategies, which pre-train GNNs at the level of either entire graphs or individual nodes, give limited improvement and can even lead to negative transfer on many downstream tasks. In contrast, our strategy avoids negative transfer and improves generalization significantly across downstream tasks, leading up to 9.4% absolute improvements in ROC-AUC over non-pre-trained models and achieving state-of-the-art performance for molecular property prediction and protein function prediction.more » « less
-
Data augmentation has been a popular method for fine-tuning pre-trained language models to increase model robustness and performance. With augmentation data coming from modifying gold train data (in-sample augmentation) or being harvested from general domain unlabeled data (out-of-sample augmentation), the quality of such data is the key to successful fine-tuning. In this paper, we propose a dynamic data selection method to select effective augmentation data from different augmentation sources according to the model’s learning stage, by identifying a set of augmentation samples that optimally facilitates the learning process of the most current model. The method firstly filters out augmentation samples with noisy pseudo labels through a curriculum learning strategy, then estimates the effectiveness of reserved augmentation data by its influence scores on the current model at every update, allowing the data selection process tightly tailored to model parameters. And the two-stage augmentation strategy considers in-sample augmentation and out-of-sample augmentation in different learning stages. Experiments with both kinds of augmentation data on a variety of sentence classification tasks show that our method outperforms strong baselines, proving the effectiveness of our method. Analysis confirms the dynamic nature of the data effectiveness and the importance of model learning stages in utilization of augmentation data.more » « less
-
Abstract Text augmentation is an effective technique in alleviating overfitting in NLP tasks. In existing methods, text augmentation and downstream tasks are mostly performed separately. As a result, the augmented texts may not be optimal to train the downstream model. To address this problem, we propose a three-level optimization framework to perform text augmentation and the downstream task end-to- end. The augmentation model is trained in a way tailored to the downstream task. Our framework consists of three learning stages. A text summarization model is trained to perform data augmentation at the first stage. Each summarization example is associated with a weight to account for its domain difference with the text classification data. At the second stage, we use the model trained at the first stage to perform text augmentation and train a text classification model on the augmented texts. At the third stage, we evaluate the text classification model trained at the second stage and update weights of summarization examples by minimizing the validation loss. These three stages are performed end-to-end. We evaluate our method on several text classification datasets where the results demonstrate the effectiveness of our method. Code is available at https://github.com/Sai-Ashish/End-to-End-Text-Augmentation.more » « less
An official website of the United States government

