In-situ visual observations of marine organisms are crucial to understanding their behaviour and their relationship to the surrounding ecosystem. Typically, these observations are collected via divers, tags, and remotely-operated or human-piloted vehicles. Recently, however, autonomous underwater vehicles equipped with cameras and embedded computers with GPU capabilities are being developed for a variety of applications and, in particular, can be used to supplement these existing data collection mechanisms where human operation or tagging is more difficult. Existing approaches have focused on fully-supervised tracking methods, but labelled data for many underwater species are severely lacking. Semi-supervised trackers may offer alternative tracking solutions because they require less data than their fully-supervised counterparts. However, because there are no existing realistic underwater tracking datasets, the performance of semi-supervised tracking algorithms in the marine domain is not well understood. To better evaluate their performance and utility, in this paper we provide (1) a novel dataset specific to marine animals located at
Comprehensive Underwater Object Tracking Benchmark Dataset and Underwater Image Enhancement With GAN
Abstract: Current state-of-the-art object tracking methods have largely benefited from the public availability of numerous benchmark datasets. However, the focus has been on open-air imagery and much less on underwater visual data. Inherent underwater distortions, such as color loss, poor contrast, and underexposure, caused by attenuation of light, refraction, and scattering, greatly affect the visual quality of underwater data, and as such, existing open-air trackers perform less efficiently on such data. To help bridge this gap, this article proposes a first comprehensive underwater object tracking (UOT100) benchmark dataset to facilitate the development of tracking algorithms well-suited for underwater environments. The proposed dataset consists of 104 underwater video sequences and more than 74,000 annotated frames derived from both natural and artificial underwater videos, with a great variety of distortions. We benchmark the performance of 20 state-of-the-art object tracking algorithms and further introduce a cascaded residual network as an underwater image enhancement model to improve the tracking accuracy and success rate of trackers. Our experimental results demonstrate the shortcomings of existing tracking algorithms on underwater data and how our generative adversarial network (GAN)-based enhancement model can be used to improve tracking performance. We also evaluate the visual quality of our model’s output against existing GAN-based methods using well-accepted quality metrics and demonstrate that our model yields better visual data.
Index Terms: Underwater benchmark dataset, underwater generative adversarial network (GAN), underwater image enhancement (UIE), underwater object tracking (UOT).
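The abstract reports tracking accuracy and success rate without defining them. As a rough illustration only, the sketch below uses a convention common in tracking benchmarks: success is the fraction of frames whose predicted box overlaps the ground truth by more than an IoU threshold, and precision is the fraction of frames whose box-center error is within a pixel threshold. The function names, the (x, y, w, h) box format, and the default thresholds (0.5 IoU, 20 pixels) are assumptions, not values taken from the article.

```python
from math import hypot


def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance in pixels between the centers of two boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return hypot((ax + aw / 2) - (bx + bw / 2), (ay + ah / 2) - (by + bh / 2))


def success_rate(pred, gt, iou_threshold=0.5):
    """Fraction of frames whose IoU exceeds the (assumed) threshold."""
    hits = sum(1 for p, g in zip(pred, gt) if iou(p, g) > iou_threshold)
    return hits / len(gt)


def precision(pred, gt, pixel_threshold=20.0):
    """Fraction of frames whose center error is within the (assumed) threshold."""
    hits = sum(1 for p, g in zip(pred, gt) if center_error(p, g) <= pixel_threshold)
    return hits / len(gt)


if __name__ == "__main__":
    ground_truth = [(0, 0, 10, 10), (0, 0, 10, 10)]
    predictions = [(0, 0, 10, 10), (20, 20, 10, 10)]
    print(success_rate(predictions, ground_truth))
    print(precision(predictions, ground_truth))
```

Sweeping the thresholds rather than fixing them gives the success and precision curves often used to compare trackers with and without an enhancement step in front of them.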
- Award ID(s):
- 1942053
- NSF-PAR ID:
- 10309916
- Date Published:
- Journal Name:
- IEEE Journal of Oceanic Engineering
- ISSN:
- 0364-9059
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract http://warp.whoi.edu/vmat/, (2) an evaluation of state-of-the-art semi-supervised algorithms in the context of underwater animal tracking, and (3) an evaluation of real-world performance through demonstrations using a semi-supervised algorithm on-board an autonomous underwater vehicle to track marine animals in the wild.
The fluctuation of the water surface causes refractive distortions that severely degrade the image of an underwater scene. Here, we present the distortion-guided network (DG-Net) for restoring distortion-free underwater images. The key idea is to use a distortion map to guide network training. The distortion map models the pixel displacement caused by water refraction. We first use a physically constrained convolutional network to estimate the distortion map from the refracted image. We then use a generative adversarial network guided by the distortion map to restore the sharp distortion-free image. Since the distortion map indicates correspondences between the distorted image and the distortion-free one, it guides the network to make better predictions. We evaluate our network on several real and synthetic underwater image datasets and show that it outperforms the state-of-the-art algorithms, especially in the presence of large distortions. We also show results for complex scenarios, including outdoor swimming pool images captured by drone and indoor aquarium images taken by cellphone camera.
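To make the idea of a distortion map as per-pixel displacement concrete, here is a minimal sketch that resamples a distorted image through a known displacement field using nearest-neighbour lookup. It is not DG-Net: the abstract does not specify the map's direction convention or interpolation, so the function name `unwarp`, the backward-lookup convention (each restored pixel reads from its displaced location in the distorted image), and the border clamping are all assumptions made for illustration.

```python
def unwarp(distorted, dx, dy):
    """Resample a 2D image through a per-pixel displacement field.

    distorted: list of rows of pixel values.
    dx, dy:    per-pixel horizontal and vertical displacements (same shape).
    Assumed convention: restored[y][x] = distorted[y + dy[y][x]][x + dx[y][x]],
    with nearest-neighbour rounding and coordinates clamped to the image border.
    """
    height, width = len(distorted), len(distorted[0])
    restored = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x = min(max(int(round(x + dx[y][x])), 0), width - 1)
            src_y = min(max(int(round(y + dy[y][x])), 0), height - 1)
            restored[y][x] = distorted[src_y][src_x]
    return restored


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    shift_right = [[1] * 3 for _ in range(3)]
    no_shift = [[0] * 3 for _ in range(3)]
    for row in unwarp(image, shift_right, no_shift):
        print(row)
```

Because the map states which distorted pixel corresponds to each restored pixel, even this crude lookup shows why a good estimate of the map is a useful guide for a restoration network.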
-
Underwater image enhancement and turbidity removal (dehazing) is a very challenging problem, not only due to the sheer variety of environments where it is applicable, but also due to the lack of high-resolution, labelled image data. In this paper, we present a novel, two-step deep learning approach for underwater image dehazing and colour correction. In iDehaze, we leverage computer graphics to physically model light propagation in underwater conditions. Specifically, we construct a three-dimensional, photorealistic simulation of underwater environments, and use it to gather a large supervised training dataset. We then train a deep convolutional neural network to remove the haze in these images, and train a second network to transform the colour space of the dehazed images onto a target domain. Experiments demonstrate that our two-step iDehaze method is substantially more effective at producing high-quality underwater images, achieving state-of-the-art performance on multiple datasets. Code, data and benchmarks will be open sourced.
-
The ocean is a vast three-dimensional space that is poorly explored and understood, and harbors unobserved life and processes that are vital to ecosystem function. To fully interrogate the space, novel algorithms and robotic platforms are required to scale up observations. Locating animals of interest and extended visual observations in the water column are particularly challenging objectives. Towards that end, we present a novel Machine Learning-integrated Tracking (or ML-Tracking) algorithm for underwater vehicle control that builds on the class of algorithms known as tracking-by-detection. By coupling a multi-object detector (trained on in situ underwater image data), a 3D stereo tracker, and a supervisor module to oversee the mission, we show how ML-Tracking can create robust tracks needed for long duration observations, as well as enable fully automated acquisition of objects for targeted sampling. Using a remotely operated vehicle as a proxy for an autonomous underwater vehicle, we demonstrate continuous input from the ML-Tracking algorithm to the vehicle controller during a record 5+ hr continuous observation of a midwater gelatinous animal known as a siphonophore. These efforts clearly demonstrate the potential that tracking-by-detection algorithms can have on exploration in unexplored environments and discovery of undiscovered life in our ocean.
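The general tracking-by-detection pattern named in this abstract can be sketched as follows: each frame yields a list of detections, the tracker keeps the one closest to the last known target position, and a simple miss counter stands in for a supervisor that decides when the target is lost. This is a generic illustration, not the ML-Tracking algorithm; the function name, the 3D point representation of detections, the distance gate, and the miss limit are all hypothetical.

```python
from math import dist


def track_by_detection(frames, start, gate=1.0, max_misses=3):
    """Follow one target through per-frame detection lists.

    frames:     list of frames, each a list of (x, y, z) detections.
    start:      initial (x, y, z) target position.
    gate:       maximum distance (assumed units) for a detection to be
                associated with the target.
    max_misses: consecutive frames without a match tolerated before the
                track is declared lost.
    Returns the target position reported for each tracked frame.
    """
    position = start
    misses = 0
    track = []
    for detections in frames:
        candidates = [d for d in detections if dist(d, position) <= gate]
        if candidates:
            # Associate the nearest in-gate detection with the target.
            position = min(candidates, key=lambda d: dist(d, position))
            misses = 0
        else:
            # Coast on the last known position; a supervisor module would
            # trigger re-acquisition once the miss budget is exhausted.
            misses += 1
            if misses > max_misses:
                break
        track.append(position)
    return track


if __name__ == "__main__":
    frames = [[(0.1, 0.0, 0.0), (5.0, 5.0, 5.0)], [], [(0.3, 0.0, 0.0)]]
    print(track_by_detection(frames, start=(0.0, 0.0, 0.0)))
```

The gate keeps distant detections of other animals from hijacking the track, and the miss budget lets the track survive short detector dropouts, which matters for multi-hour observations.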
-
This paper presents an attention-based, deep learning framework that converts robot camera frames with dynamic content into static frames to more easily apply simultaneous localization and mapping (SLAM) algorithms. The vast majority of SLAM methods have difficulty in the presence of dynamic objects appearing in the environment and occluding the area being captured by the camera. Despite past attempts to deal with dynamic objects, challenges remain in reconstructing large, occluded areas with complex backgrounds. Our proposed Dynamic-GAN framework employs a generative adversarial network to remove dynamic objects from a scene and inpaint a static image free of dynamic objects. The Dynamic-GAN framework utilizes spatial-temporal transformers and a novel spatial-temporal loss function. Dynamic-GAN was evaluated both quantitatively and qualitatively on benchmark datasets and on a mobile robot in indoor navigation environments. As people appeared dynamically in close proximity to the robot, results showed that large, feature-rich occluded areas can be accurately reconstructed with our attention-based deep learning framework for dynamic object removal. Through experiments we demonstrate that our proposed algorithm performs up to 25% better on average than the standard benchmark algorithms.