Title: Classifying Humpback Whale Calls to Song and Non-Song Vocalizations using Bag of Words Descriptor on Acoustic Data
Humpback whale behavior, population distribution, and structure can be inferred from long-term underwater passive acoustic monitoring of their vocalizations. Here we develop automatic approaches for classifying humpback whale vocalizations into two categories, song and non-song, using machine learning techniques. The vocalization behavior of humpback whales was monitored instantaneously over vast areas of the Gulf of Maine using a large-aperture coherent hydrophone array system via the passive ocean acoustic waveguide remote sensing technique over multiple diel cycles in Fall 2006. We use wavelet signal denoising and coherent array processing to enhance the signal-to-noise ratio. To build a feature vector for every time sequence of the beamformed signals, we apply a Bag of Words approach to time-frequency features. Finally, we apply Support Vector Machine (SVM), Neural Network, and Naive Bayes classifiers to the acoustic data and compare their performance. The best results are obtained using Mel Frequency Cepstrum Coefficient (MFCC) features with an SVM, which yields 94% accuracy and a 72.73% F1-score for humpback whale song versus non-song vocalization classification, demonstrating the effectiveness of the proposed approach for real-time classification at sea.
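The Bag of Words step described in the abstract can be illustrated with a minimal sketch. It assumes that each time sequence has already been converted to a matrix of frame-level features (for example, 13 MFCCs per frame), that the codebook is learned with plain k-means, and that the descriptor is a normalized histogram of codeword assignments. The function names `build_codebook` and `bag_of_words`, the codebook size, and the synthetic data are all hypothetical; the abstract does not state the clustering method or parameters actually used.

```python
import numpy as np

def build_codebook(frames, k, n_iter=20, seed=0):
    """Plain k-means over frame-level feature vectors; returns k codewords."""
    rng = np.random.default_rng(seed)
    # Initialize codewords from k distinct, randomly chosen frames.
    codebook = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each frame to its nearest codeword (squared Euclidean distance).
        dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = frames[labels == j]
            if len(members):  # keep the old codeword if a cluster is empty
                codebook[j] = members.mean(axis=0)
    return codebook

def bag_of_words(frames, codebook):
    """Normalized histogram of codeword assignments for one time sequence."""
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(dists.argmin(axis=1), minlength=len(codebook))
    return counts / counts.sum()

# Synthetic stand-in for per-frame features (assumed: 13 coefficients per frame).
rng = np.random.default_rng(1)
training_frames = rng.normal(size=(500, 13))
codebook = build_codebook(training_frames, k=8)

# One sequence of 40 frames becomes a single fixed-length descriptor.
one_sequence = rng.normal(size=(40, 13))
descriptor = bag_of_words(one_sequence, codebook)
```

Whatever the number of frames in a sequence, the descriptor has one entry per codeword, so sequences of different durations map to vectors of the same length that a classifier such as an SVM can consume.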
Journal Name:
2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
Page Range / eLocation ID:
865 to 870
Sponsoring Org:
National Science Foundation
More Like this
  1. A large variety of sound sources in the ocean, including biological, geophysical, and man-made, can be simultaneously monitored over instantaneous continental-shelf scale regions via the passive ocean acoustic waveguide remote sensing (POAWRS) technique by employing a large-aperture densely-populated coherent hydrophone array system. Millions of acoustic signals received on the POAWRS system per day can make it challenging to identify individual sound sources. An automated classification system is necessary to enable sound sources to be recognized. Here, the objectives are to (i) gather a large training and test data set of fin whale vocalization and other acoustic signal detections; (ii) build multiple fin whale vocalization classifiers, including a logistic regression, support vector machine (SVM), decision tree, convolutional neural network (CNN), and long short-term memory (LSTM) network; (iii) evaluate and compare performance of these classifiers using multiple metrics including accuracy, precision, recall and F1-score; and (iv) integrate one of the classifiers into the existing POAWRS array and signal processing software. The findings presented here will (1) provide an automatic classifier for near real-time fin whale vocalization detection and recognition, useful in marine mammal monitoring applications; and (2) lay the foundation for building an automatic classifier applied for near real-time detection and recognition of a wide variety of biological, geophysical, and man-made sound sources typically detected by the POAWRS system in the ocean. 
  2. An eight-element oil-filled hydrophone array is used to measure the acoustic field in littoral waters. This prototype array was deployed during an experiment between Jeffrey’s Ledge and the Stellwagen Bank region off the coast of Rockport, Massachusetts USA. During the experiment, several humpback whale vocalizations, distant ship tonals and high frequency conventional echosounder pings were recorded. Visual confirmation of humpback moving in bearing relative to the array verifies the directional sensing from array beamforming. During deployment, the array is towed at speeds varying from 4-7 kts in water depths of roughly 100 m with conditions at sea state 2 to 3. This array system consists of a portable winch with array, tow cable and 3 water-resistant boxes housing electronics. This system is deployed and operated by 2 crew members onboard a 13 m commercial fishing vessel during the experiment. Non-acoustic sensor (NAS) information is obtained to provide depth, temperature, and heading data using commercial off the shelf (COTS) components utilizing RS485/232 data communications. Acoustic data sampling was performed at 8 kHz, 30 kHz and 100 kHz with near real-time processing of data and enhanced Signal to Noise Ratio (SNR) from beamforming. The electrical system components are deployed with 3 stacked electronics boxes housing power, data acquisition and data processing components in water resistant compartments. A laptop computer with 8 TB of external storage and an independent Global Positioning System (GPS) antenna is used to run Passive Ocean Acoustic Waveguide Remote Sensing (POAWRS) software providing beamformed spectrogram data and live NAS data with capability of capturing several days of data. The acquisition system consists of Surface Mount Device (SMD) pre-amplifiers with filter to an analog differential pair shipboard COTS acquisition system. 
Pre-amplifiers are constructed using SMD technology where components are pressure tolerant and potting is not necessary. Potting of connectors, electronics and hydrophones via 3D printed molding techniques will be discussed. Array internal components are manufactured with Thermoplastic Polyurethane (TPU) 3D printed material to dampen array vibrations with forward and aft vibration isolation modules (VIM). Polyurethane foam (PUF) is used to scatter breathing waves and dampen contact from wires inside the array without attenuating high frequencies, allowing for significant noise reduction. A single Tygon array section with a length of 7.5 m and diameter of 38 mm contains 8 transducer elements with a spacing of 75 cm (1 kHz design frequency). Pre-amplifiers and NAS modules are affixed using Vectran and steel wire rope positioned by swaged stops along the strength member. The tow cable is 100 m long with a diameter of 22 mm and is potted to a hose adapter that breaks out 12 braided copper wire twisted pair conductors and terminates the tow cable Vectran braid. This array in its current state of development is a low-cost alternative for obtaining quality acoustic data from a towed array system. Used here for observation of whale vocalizations, this type of array also has many applications in military sonar and seismic surveying. Maintenance on the array can be performed without the use of special facilities or equipment for dehosing and conveniently uses castor oil as an environmentally safe pressure compensating and coupling fluid. Array development including selection of transducers, NAS modules, acoustic acquisition system, array materials and method of construction with results from several deployments will be discussed. We also present beamformed spectrograms containing humpback whale downsweep moans and underwater blowing (bubbles) sounds associated with feeding on sand lance (Ammodytes dubius).
  3. The 2019 ENRICH Voyage (Euphausiids and Nutrient Recycling in Cetacean Hotspots) was conducted from 19 January – 5 March 2019 aboard the RV Investigator. The voyage departed from and returned to Hobart, Tasmania, Australia, and conducted most marine science operations in the area between 60°S – 67°S and 138°E – 152°E. As part of the multidisciplinary research programme, a passive acoustic survey for marine mammals was undertaken for the duration of the voyage, with the main goal to monitor for and locate groups of calling Antarctic blue whales (Balaenoptera musculus intermedia). Directional sonobuoys were used at 295 listening stations, which resulted in 828 hours of acoustic recordings. Monitoring also took place at each listening station for pygmy blue (B. m. brevicauda), fin (B. physalus), sperm (Physeter macrocephalus), humpback (Megaptera novaeangliae), sei (B. borealis), and Antarctic minke whales (B. bonaerensis); for leopard (Hydrurga leptonyx), crabeater (Lobodon carcinophaga), Ross (Ommatophoca rossii), and Weddell seals (Leptonychotes weddellii); and for odontocete vocalisations (low frequency whistles). Calibrated measurements of the bearing and intensity of the majority of calls from blue and fin whales were obtained in real time. 33,435 calls from Antarctic blue whales were detected at 238 listening stations throughout the voyage, most of them south of 60°S. Southeast Indian Ocean blue whale song was detected primarily between 47° and 55°S while the southwest Pacific blue whale song was recorded between 44° and 48°S. Most baleen whale and seal calls were detected along the continental shelf break in the study region but some were also detected in deeper waters. Marine mammal calls were uncommon on the shelf, which did not have any ice cover during the survey.
Calling Antarctic blue whales were tracked and located on multiple occasions to enable closer study of their fine-scale movements and calling behaviour as well as enabling collection of photo ID, behavioural, and photogrammetry data. The passive acoustic data collected during this voyage will allow investigation of the distribution of Antarctic blue whales in relation to environmental correlates measured during ENRICH, with a focus on blue whale prey. 
  4. Multiple mechanized ocean vessels, including both surface ships and submerged vehicles, can be simultaneously monitored over instantaneous continental-shelf scale regions >10,000 km² via passive ocean acoustic waveguide remote sensing. A large-aperture densely-sampled coherent hydrophone array system is employed in the Norwegian Sea in Spring 2014 to provide directional sensing in 360 degree horizontal azimuth and to significantly enhance the signal-to-noise ratio (SNR) of ship-radiated underwater sound, which improves ship detection ranges by roughly two orders of magnitude over that of a single hydrophone. Here, 30 mechanized ocean vessels, spanning ranges from nearby to over 150 km from the coherent hydrophone array, are detected, localized and classified. The vessels comprise 20 identified commercial ships and 10 unidentified vehicles present in 8 h/day of Passive Ocean Acoustic Waveguide Remote Sensing (POAWRS) observation for two days. The underwater sounds from each of these ocean vessels received by the coherent hydrophone array are dominated by narrowband signals that are either constant frequency tonals or have frequencies that waver or oscillate slightly in time. The estimated bearing-time trajectory of a sequence of detections obtained from coherent beamforming is employed to determine the horizontal location of each vessel using the Moving Array Triangulation (MAT) technique. For commercial ships present in the region, the estimated horizontal positions obtained from passive acoustic sensing are verified by Global Positioning System (GPS) measurements of the ship locations found in a historical Automatic Identification System (AIS) database. We provide time-frequency characterizations of the underwater sounds radiated from the commercial ships and the unidentified vessels. The time-frequency features along with the bearing-time trajectory of the detected signals are applied to simultaneously track and distinguish these vessels.
  5. Quantifying how animals respond to disturbance events bears relevance for understanding consequences to population health. We investigate whether blue whales respond acoustically to naturally occurring episodic noise by examining calling before and after earthquakes (27 040 calls, 32 earthquakes; 27 January–29 June 2016). Two vocalization types were evaluated: New Zealand blue whale song and downswept vocalizations ('D calls'). Blue whales did not alter the number of D calls, D call received level or song intensity following earthquakes (paired t-tests, p > 0.7 for all). Linear models accounting for earthquake strength and proximity revealed significant relationships between change in calling activity surrounding earthquakes and prior calling activity (D calls: R² = 0.277, p < 0.0001; song: R² = 0.080, p = 0.028); however, these same relationships were true for 'null' periods without earthquakes (D calls: R² = 0.262, p < 0.0001; song: R² = 0.149, p = 0.0002), indicating that the pattern is driven by blue whale calling context regardless of earthquake presence. Our findings that blue whales do not respond to episodic natural noise provide context for interpreting documented acoustic responses to anthropogenic noise sources, including shipping traffic and petroleum development, indicating that they potentially evolved tolerance for natural noise sources but not novel noise from anthropogenic origins.
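Several of the abstracts above compare classifiers using accuracy, precision, recall, and F1-score. As a minimal sketch of how these four metrics relate for a binary detection task, the hypothetical helper `binary_metrics` below computes them from true and predicted labels. The labels are made up for illustration and are not taken from any of the studies listed.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up, imbalanced example: 2 positives among 10 samples, one of them missed.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

In this toy case accuracy is 0.9 while F1 is about 0.67, because the many correctly rejected negatives raise accuracy but do not enter the F1 calculation. When the positive class is rare, accuracy can therefore sit well above F1, which is why the two are usually reported together.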