Abstract In-situ visual observations of marine organisms is crucial to developing behavioural understandings and their relations to their surrounding ecosystem. Typically, these observations are collected via divers, tags, and remotely-operated or human-piloted vehicles. Recently, however, autonomous underwater vehicles equipped with cameras and embedded computers with GPU capabilities are being developed for a variety of applications, and in particular, can be used to supplement these existing data collection mechanisms where human operation or tags are more difficult. Existing approaches have focused on using fully-supervised tracking methods, but labelled data for many underwater species are severely lacking. Semi-supervised trackers may offer alternative tracking solutions because they require less data than fully-supervised counterparts. However, because there are not existing realistic underwater tracking datasets, the performance of semi-supervised tracking algorithms in the marine domain is not well understood. To better evaluate their performance and utility, in this paper we provide (1) a novel dataset specific to marine animals located athttp://warp.whoi.edu/vmat/, (2) an evaluation of state-of-the-art semi-supervised algorithms in the context of underwater animal tracking, and (3) an evaluation of real-world performance through demonstrations using a semi-supervised algorithm on-board an autonomous underwater vehicle to track marine animals in the wild.
more »
« less
Real-Time Multi-Diver Tracking and Re-identification for Underwater Human-Robot Collaboration
Autonomous underwater robots working with teams of human divers may need to distinguish between different divers, e.g., to recognize a lead diver or to follow a specific team member. This paper describes a technique that enables autonomous underwater robots to track divers in real time as well as to reidentify them. The approach is an extension of Simple Online Realtime Tracking (SORT) with an appearance metric (deep SORT). Initial diver detection is performed with a custom CNN designed for realtime diver detection, and appearance features are subsequently extracted for each detected diver. Next, realtime tracking by-detection is performed with an extension of the deep SORT algorithm. We evaluate this technique on a series of videos of divers performing human-robot collaborative tasks and show that our methods result in more divers being accurately identified during tracking. We also discuss the practical considerations of applying multi-person tracking to on-board autonomous robot operations, and we consider how failure cases can be addressed during on-board tracking.
more »
« less
- Award ID(s):
- 1845364
- PAR ID:
- 10146568
- Date Published:
- Journal Name:
- 2020 IEEE International Conference on Robotics and Automation
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
null (Ed.)Abstract This work proposes vision-only navigation strategies for an autonomous underwater robot. This approach is a step towards solving the coverage path planning problem in a 3-D environment for surveying underwater structures. Given the challenging conditions of the underwater domain, it is very complicated to obtain accurate state estimates reliably. Consequently, it is a great challenge to extend known path planning or coverage techniques developed for aerial or ground robot controls. In this work, we are investigating a navigation strategy utilizing only vision to assist in covering a complex underwater structure. We propose to use a navigation strategy akin to what a human diver will execute when circumnavigating around a region of interest, in particular when collecting data from a shipwreck. The focus of this article is a step towards enabling the autonomous operation of lightweight robots near underwater wrecks in order to collect data for creating photo-realistic maps and volumetric 3-D models while at the same time avoiding collisions. The proposed method uses convolutional neural networks to learn the control commands based on the visual input. We have demonstrated the feasibility of using a system based only on vision to learn specific strategies of navigation with 80% accuracy on the prediction of control command changes. Experimental results and a detailed overview of the proposed method are discussed.more » « less
-
Bearing only cooperative localization has been used successfully on aerial and ground vehicles. In this paper we present an extension of the approach to the underwater domain. The focus is on adapting the technique to handle the challenging visibility conditions underwater. Furthermore, data from inertial, magnetic, and depth sensors are utilized to improve the robustness of the estimation. In addition to robotic applications, the presented technique can be used for cave mapping and for marine archeology surveying, both by human divers. Experimental results from different environments, including a fresh water, low visibility, lake in South Carolina; a cavern in Florida; and coral reefs in Barbados during the day and during the night, validate the robustness and the accuracy of the proposed approach.more » « less
-
Underwater caves are challenging environments that are crucial for water resource management, and for our understanding of hydro-geology and history. Mapping underwater caves is a time-consuming, labor-intensive, and hazardous operation. For autonomous cave mapping by underwater robots, the major challenge lies in vision-based estimation in the complete absence of ambient light, which results in constantly moving shadows due to the motion of the camera-light setup. Thus, detecting and following the caveline as navigation guidance is paramount for robots in autonomous cave mapping missions. In this paper, we present a computationally light caveline detection model based on a novel Vision Transformer (ViT)-based learning pipeline. We address the problem of scarce annotated training data by a weakly supervised formulation where the learning is reinforced through a series of noisy predictions from intermediate sub-optimal models. We validate the utility and effectiveness of such weak supervision for caveline detection and tracking in three different cave locations: USA, Mexico, and Spain. Experimental results demonstrate that our proposed model, CL-ViT, balances the robustness-efficiency trade-off, ensuring good generalization performance while offering 10+ FPS on single-board (Jetson TX2) devices.more » « less
-
This paper explores the problem of deploying machine learning (ML)-based object detection and segmentation models on edge platforms to enable realtime caveline detection for Autonomous Underwater Vehicles (AUVs) used for under-water cave exploration and mapping. We specifically investigate three ML models, i.e., U-Net, Vision Transformer (ViT), and YOLOv8, deployed on three edge platforms: Raspberry Pi-4, Intel Neural Compute Stick 2 (NCS2), and NVIDIA Jetson Nano. The experimental results unveil clear tradeoffs between model accuracy, processing speed, and energy consumption. The most accurate model has shown to be U-Net with an 85.53 F1-score and 85.38 Intersection Over Union (IoU) value. Meanwhile, the highest inference speed and lowest energy consumption are achieved by the YOLOv8 model deployed on Jetson Nano operating in the high-power and low-power modes, respectively. The comprehensive quantitative analyses and comparative results provided in the paper highlight important nuances that can guide the deployment of caveline detection systems on underwater robots for ensuring safe and reliable AUV navigation during underwater cave exploration and mapping missions.more » « less
An official website of the United States government

