Title: Scaper: A library for soundscape synthesis and augmentation
Sound event detection (SED) in environmental recordings is a key topic of research in machine listening, with applications in noise monitoring for smart cities, self-driving cars, surveillance, bioacoustic monitoring, and indexing of large multimedia collections. Developing new solutions for SED often relies on the availability of strongly labeled audio recordings, where the annotation includes the onset, offset and source of every event. Generating such precise annotations manually is very time-consuming, and as a result existing datasets for SED with strong labels are scarce and limited in size. To address this issue, we present Scaper, an open-source library for soundscape synthesis and augmentation. Given a collection of isolated sound events, Scaper acts as a high-level sequencer that can generate multiple soundscapes from a single, probabilistically defined, 'specification'. To increase the variability of the output, Scaper supports the application of audio transformations such as pitch shifting and time stretching individually to every event. To illustrate the potential of the library, we generate a dataset of 10,000 soundscapes and use it to compare the performance of two state-of-the-art algorithms, including a breakdown by soundscape characteristics. We also describe how Scaper was used to generate audio stimuli for an audio labeling crowdsourcing experiment, and conclude with a discussion of Scaper's limitations and potential applications.
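The idea of generating many soundscapes from one probabilistically defined specification can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Scaper's actual API: the names `Event`, `sample`, `instantiate`, and `SPEC`, the tuple notation for distributions, and the example labels are all hypothetical. The sketch samples only the strong-label annotation (source, onset, offset, and per-event transformation parameters); it does not read or mix any audio.

```python
import random
from dataclasses import dataclass


@dataclass
class Event:
    """One strongly labeled event: source label plus onset and duration."""
    label: str
    onset: float         # seconds from the start of the soundscape
    duration: float      # seconds, after time stretching
    pitch_shift: float   # semitones
    time_stretch: float  # duration multiplier

    @property
    def offset(self) -> float:
        return self.onset + self.duration


def sample(dist, rng):
    """Draw one value from a (hypothetical) distribution tuple."""
    kind, *args = dist
    if kind == "const":
        return args[0]
    if kind == "choose":
        return rng.choice(args[0])
    if kind == "uniform":
        return rng.uniform(args[0], args[1])
    raise ValueError(f"unknown distribution: {kind!r}")


def instantiate(spec, n_events, total_duration, seed=None):
    """Sample one concrete soundscape annotation from a specification."""
    rng = random.Random(seed)
    events = []
    for _ in range(n_events):
        stretch = sample(spec["time_stretch"], rng)
        # Stretching changes the event length; clip so the event still fits.
        duration = min(sample(spec["duration"], rng) * stretch, total_duration)
        # Place the onset so that the offset stays inside the soundscape.
        onset = rng.uniform(0.0, total_duration - duration)
        events.append(Event(
            label=sample(spec["label"], rng),
            onset=onset,
            duration=duration,
            pitch_shift=sample(spec["pitch_shift"], rng),
            time_stretch=stretch,
        ))
    return sorted(events, key=lambda e: e.onset)


# Illustrative specification; labels and ranges are made up for this sketch.
SPEC = {
    "label": ("choose", ["siren", "dog_bark", "car_horn"]),
    "duration": ("uniform", 0.5, 4.0),
    "pitch_shift": ("uniform", -2.0, 2.0),
    "time_stretch": ("uniform", 0.8, 1.2),
}

if __name__ == "__main__":
    # Each seed yields a different soundscape from the same specification.
    for seed in range(2):
        for ev in instantiate(SPEC, n_events=3, total_duration=10.0, seed=seed):
            print(f"{ev.onset:6.2f} {ev.offset:6.2f} {ev.label}")
```

Because every parameter is drawn from a distribution rather than fixed, calling `instantiate` repeatedly with different seeds produces different annotated soundscapes from the same `SPEC`, while a fixed seed reproduces the same one.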
Award ID(s):
1633259
NSF-PAR ID:
10074705
Journal Name:
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA-17)
Page Range / eLocation ID:
344-348
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Aquatic environments encompass the world’s most extensive habitats, rich with sounds produced by a diversity of animals. Passive acoustic monitoring (PAM) is an increasingly accessible remote sensing technology that uses hydrophones to listen to the underwater world and represents an unprecedented, non-invasive method to monitor underwater environments. This information can assist in the delineation of biologically important areas via detection of sound-producing species or characterization of ecosystem type and condition, inferred from the acoustic properties of the local soundscape. At a time when worldwide biodiversity is in significant decline and underwater soundscapes are being altered as a result of anthropogenic impacts, there is a need to document, quantify, and understand biotic sound sources–potentially before they disappear. A significant step toward these goals is the development of a web-based, open-access platform that provides: (1) a reference library of known and unknown biological sound sources (by integrating and expanding existing libraries around the world); (2) a data repository portal for annotated and unannotated audio recordings of single sources and of soundscapes; (3) a training platform for artificial intelligence algorithms for signal detection and classification; and (4) a citizen science-based application for public users. Although individually, these resources are often met on regional and taxa-specific scales, many are not sustained and, collectively, an enduring global database with an integrated platform has not been realized. 
We discuss the benefits such a program can provide, previous calls for global data-sharing and reference libraries, and the challenges that need to be overcome to bring together bio- and ecoacousticians, bioinformaticians, propagation experts, web engineers, and signal processing specialists (e.g., artificial intelligence) with the necessary support and funding to build a sustainable and scalable platform that could address the needs of all contributors and stakeholders into the future. 
  2. Acoustic recordings of soundscapes are an important category of audio data that can be useful for answering a variety of questions, and an entire discipline within ecology, dubbed “soundscape ecology,” has arisen to study them. Bird sound is often the focus of studies of soundscapes due to the ubiquity of birds in most terrestrial environments and their high vocal activity. Autonomous acoustic recorders have increased the quantity and availability of recordings of natural soundscapes while mitigating the impact of human observers on community behavior. However, such recordings are of little use without analysis of the sounds they contain. Manual analysis currently stands as the best means of processing this form of data for use in certain applications within soundscape ecology, but it is a laborious task, sometimes requiring many hours of human review to process comparatively few hours of recording. For this reason, few annotated data sets of soundscape recordings are publicly available. Further still, there are no publicly available strongly labeled soundscape recordings of bird sounds that contain information on timing, frequency, and species. Therefore, we present the first data set of strongly labeled bird sound soundscape recordings under a free-use license. These data were collected in the Northeastern United States at Powdermill Nature Reserve, Rector, Pennsylvania, USA. Recordings encompass 385 minutes of dawn chorus recordings collected by autonomous acoustic recorders between April and July 2018. Recordings were collected in continuous bouts on four days during the study period and contain 48 species and 16,052 annotations. Applications of this data set may be numerous and include the training, validation, and testing of certain advanced machine-learning models that detect or classify bird sounds. There are no copyright or proprietary restrictions; please cite this paper when using materials within.
  3. Knowledge that can be gained from acoustic data collection in tropical ecosystems is low-hanging fruit. There is every reason to record and with every day, there are fewer excuses not to do it. In recent years, the cost of acoustic recorders has decreased substantially (some can be purchased for under US$50, e.g., Hill et al. 2018) and the technology needed to store and analyze acoustic data is continuously improving (e.g., Corrada Bravo et al. 2017, Xie et al. 2017). Soundscape recordings provide a permanent record of a site at a given time and contain a wealth of invaluable and irreplaceable information. Although challenges remain, failure to collect acoustic data now in tropical ecosystems would represent a failure to future generations of tropical researchers and the citizens that benefit from ecological research. In this commentary, we (1) argue for the need to increase acoustic monitoring in tropical systems; (2) describe the types of research questions and conservation issues that can be addressed with passive acoustic monitoring (PAM) using both short- and long-term data in terrestrial and freshwater habitats; and (3) present an initial plan for establishing a global repository of tropical recordings.
  4. Smartphones and mobile applications have become an integral part of our daily lives. This is reflected by the increase in mobile devices, applications, and revenue generated each year. However, this growth is being met with an increasing concern for user privacy, and there have been many incidents of privacy and data breaches related to smartphones and mobile applications in recent years. In this work, we focus on improving privacy for audio-based mobile systems. These applications will generally listen to all sounds in the environment and may record privacy-sensitive signals, such as speech, that may not be needed for the application. We present PAMS, a software development package for mobile applications. PAMS integrates a novel sound source filtering algorithm called Probabilistic Template Matching to generate a set of privacy-enhancing filters that remove extraneous sounds using learned statistical "templates" of these sounds. We demonstrate the effectiveness of PAMS by integrating it into a sleep monitoring system, with the intent to remove extraneous speech from breathing, snoring, and other sleep sounds that the system is monitoring. By comparing our PAMS-enhanced sleep monitoring system with existing mobile systems, we show that PAMS can reduce speech intelligibility by up to 74.3% while maintaining similar performance in detecting sleeping sounds.
  5. The ocean’s soundscape is fundamental to marine ecosystems, not only as a source of sensory information critical to many ecological processes but also as an indicator of biodiversity and habitat health. Yet, little is known about how ecoacoustic activity in marine habitats is altered by environmental changes such as temperature. The sounds produced by dense colonies of snapping shrimp dominate temperate and tropical coastal soundscapes worldwide and are a major driver of broadband sound pressure level (SPL) patterns. Field recordings of soundscape patterns from the range limit of a snapping shrimp distribution showed that rates of snap production and associated SPL were closely positively correlated to water temperature. Snap rates changed by 15-60% per °C change in regional temperature, accompanied by fluctuations in SPL between 1-2 dB per °C. To test if this relationship was due to a direct effect of temperature, we measured snap rates in controlled experiments using two snapping shrimp species dominant in the Western Atlantic Ocean and Gulf of Mexico (Alpheus heterochaelis and A. angulosus). Snap rates were measured for shrimp held at different temperatures (across a 10-30 °C range, with the upper limit 2 °C above current summer mean temperatures) and under different social groupings. Temperature had a significant effect on shrimp snap rates for all social contexts tested (individuals, pairs, and groups). For individuals and shrimp groups, snap production more than doubled between mid-range (20 °C) and high (30 °C) temperature treatments. Given that snapping shrimp sounds dominate the soundscapes of diverse habitats, including coral reefs, rocky bottoms, seagrass, and oyster beds, the strong influence of temperature on their activity will potentially alter soundscape patterns broadly.
Increases in ambient sound levels driven by elevated water temperatures have ecological implications for signal detection, communication, and navigation in key coastal ecosystems for a wide range of organisms, including humans.