Abstract: In-situ visual observations of marine organisms are crucial to developing an understanding of their behaviour and its relation to their surrounding ecosystem. Typically, these observations are collected via divers, tags, and remotely operated or human-piloted vehicles. Recently, however, autonomous underwater vehicles equipped with cameras and embedded computers with GPU capabilities have been developed for a variety of applications, and in particular can be used to supplement these existing data collection mechanisms where human operation or tagging is more difficult. Existing approaches have focused on fully-supervised tracking methods, but labelled data for many underwater species are severely lacking. Semi-supervised trackers may offer alternative tracking solutions because they require less data than their fully-supervised counterparts. However, because no realistic underwater tracking datasets exist, the performance of semi-supervised tracking algorithms in the marine domain is not well understood. To better evaluate their performance and utility, in this paper we provide (1) a novel dataset specific to marine animals, located at http://warp.whoi.edu/vmat/, (2) an evaluation of state-of-the-art semi-supervised algorithms in the context of underwater animal tracking, and (3) an evaluation of real-world performance through demonstrations using a semi-supervised algorithm on board an autonomous underwater vehicle to track marine animals in the wild.
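The semi-supervised trackers evaluated here are initialized from a single first-frame bounding box rather than trained on large labelled corpora. A minimal sketch of that workflow is shown below, using OpenCV's CSRT tracker as a stand-in for the evaluated algorithms; the video file name and initial box are hypothetical.

```python
# Minimal sketch: first-frame-initialized ("semi-supervised") tracking on an
# underwater clip. OpenCV's CSRT tracker is only a stand-in for the algorithms
# evaluated in the paper; the video path and initial box are assumptions.
import cv2

cap = cv2.VideoCapture("reef_clip.mp4")   # hypothetical underwater video
ok, frame = cap.read()
if not ok:
    raise RuntimeError("could not read first frame")

init_box = (410, 220, 120, 90)            # (x, y, w, h) around the animal
tracker = cv2.TrackerCSRT_create()        # requires opencv-contrib-python
tracker.init(frame, init_box)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, box = tracker.update(frame)    # propagate the box to the next frame
    if found:
        x, y, w, h = map(int, box)
        print(f"track box: x={x} y={y} w={w} h={h}")

cap.release()
```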
Real-Time Multi-Diver Tracking and Re-identification for Underwater Human-Robot Collaboration
Autonomous underwater robots working with teams of human divers may need to distinguish between different divers, e.g., to recognize a lead diver or to follow a specific team member. This paper describes a technique that enables autonomous underwater robots to track divers in real time as well as to re-identify them. The approach is an extension of Simple Online and Realtime Tracking (SORT) with an appearance metric (deep SORT). Initial diver detection is performed with a custom CNN designed for real-time diver detection, and appearance features are subsequently extracted for each detected diver. Next, real-time tracking-by-detection is performed with an extension of the deep SORT algorithm. We evaluate this technique on a series of videos of divers performing human-robot collaborative tasks and show that our methods result in more divers being accurately identified during tracking. We also discuss the practical considerations of applying multi-person tracking to on-board autonomous robot operations, and we consider how failure cases can be addressed during on-board tracking.
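The appearance-augmented association at the heart of the deep SORT extension combines a motion/overlap cost with an appearance (re-identification) cost before solving the assignment. A minimal sketch of that association step with synthetic inputs is shown below; the blending weight and gating threshold are illustrative assumptions, not the paper's values.

```python
# Sketch of deep SORT-style data association: blend an IoU (motion) cost with a
# cosine appearance cost, then solve the assignment with the Hungarian method.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_embs, det_embs, iou, alpha=0.3, max_cost=0.7):
    """track_embs: (T, D) L2-normalized track appearance features
       det_embs:   (N, D) L2-normalized detection features
       iou:        (T, N) IoU between predicted track boxes and detections"""
    appearance_cost = 1.0 - track_embs @ det_embs.T   # cosine distance
    motion_cost = 1.0 - iou
    cost = alpha * motion_cost + (1.0 - alpha) * appearance_cost
    rows, cols = linear_sum_assignment(cost)
    # Gate out implausible matches; unmatched detections would spawn new tracks.
    return [(t, d) for t, d in zip(rows, cols) if cost[t, d] < max_cost]

# Toy example: two tracks, three detections with 4-D embeddings.
rng = np.random.default_rng(0)
t = rng.normal(size=(2, 4)); t /= np.linalg.norm(t, axis=1, keepdims=True)
d = rng.normal(size=(3, 4)); d /= np.linalg.norm(d, axis=1, keepdims=True)
print(associate(t, d, rng.uniform(size=(2, 3))))
```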
- Award ID(s): 1845364
- PAR ID: 10146568
- Date Published:
- Journal Name: 2020 IEEE International Conference on Robotics and Automation
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
This work proposes vision-only navigation strategies for an autonomous underwater robot. The approach is a step towards solving the coverage path planning problem in a 3-D environment for surveying underwater structures. Given the challenging conditions of the underwater domain, it is difficult to obtain accurate state estimates reliably. Consequently, extending known path planning or coverage techniques developed for aerial or ground robots is a significant challenge. In this work, we investigate a navigation strategy that relies only on vision to assist in covering a complex underwater structure. We propose a navigation strategy akin to what a human diver executes when circumnavigating a region of interest, in particular when collecting data from a shipwreck. The focus of this article is a step towards enabling the autonomous operation of lightweight robots near underwater wrecks in order to collect data for creating photo-realistic maps and volumetric 3-D models while avoiding collisions. The proposed method uses convolutional neural networks to learn control commands from the visual input. We have demonstrated the feasibility of a vision-only system that learns specific navigation strategies with 80% accuracy on the prediction of control-command changes. Experimental results and a detailed overview of the proposed method are discussed. (An illustrative image-to-command sketch follows this list.)
-
Real-time computer vision and remote visual sensing platforms are increasingly used in numerous underwater applications such as shipwreck mapping, subsea inspection, coastal water monitoring, surveillance, coral reef surveying, invasive fish tracking, and more. Recent advancements in robot vision and powerful single-board computers have paved the way for an imminent revolution in the next generation of subsea technologies. In this chapter, we present these exciting emerging applications and discuss relevant open problems and practical considerations. First, we delineate the specific environmental and operational challenges of underwater vision and highlight some prominent scientific and engineering solutions to ensure robust visual perception. We specifically focus on the characteristics of underwater light propagation from the perspective of image formation and photometry. We also discuss recent developments and trends in the underwater imaging literature to facilitate the restoration, enhancement, and filtering of inherently noisy visual data. Subsequently, we demonstrate how these ideas are extended and deployed in the perception pipelines of Autonomous Underwater Vehicles (AUVs) and Remotely Operated Vehicles (ROVs). In particular, we present several use cases for marine life monitoring and conservation, human-robot cooperative missions for inspecting submarine cables and archaeological sites, subsea structure or cave mapping, aquaculture, and marine ecology. We discuss in detail how deep visual learning and on-device AI breakthroughs are transforming the perception, planning, localization, and navigation capabilities of visually-guided underwater robots. Along this line, we also highlight prospective future research directions and open problems at the intersection of the computer vision and underwater robotics domains. (A minimal enhancement-baseline sketch follows this list.)
-
Bearing-only cooperative localization has been used successfully on aerial and ground vehicles. In this paper we present an extension of the approach to the underwater domain. The focus is on adapting the technique to handle the challenging visibility conditions underwater. Furthermore, data from inertial, magnetic, and depth sensors are utilized to improve the robustness of the estimation. In addition to robotic applications, the presented technique can be used by human divers for cave mapping and for marine archeology surveying. Experimental results from different environments, including a freshwater, low-visibility lake in South Carolina; a cavern in Florida; and coral reefs in Barbados during the day and at night, validate the robustness and accuracy of the proposed approach. (A minimal bearing-update sketch follows this list.)
-
Underwater caves are challenging environments that are crucial for water resource management and for our understanding of hydrogeology and history. Mapping underwater caves is a time-consuming, labor-intensive, and hazardous operation. For autonomous cave mapping by underwater robots, the major challenge lies in vision-based estimation in the complete absence of ambient light, which results in constantly moving shadows due to the motion of the camera-light setup. Detecting and following the caveline as navigation guidance is therefore paramount for robots in autonomous cave mapping missions. In this paper, we present a computationally light caveline detection model based on a novel Vision Transformer (ViT)-based learning pipeline. We address the problem of scarce annotated training data with a weakly supervised formulation in which learning is reinforced through a series of noisy predictions from intermediate sub-optimal models. We validate the utility and effectiveness of such weak supervision for caveline detection and tracking at three different cave locations in the USA, Mexico, and Spain. Experimental results demonstrate that our proposed model, CL-ViT, balances the robustness-efficiency trade-off, ensuring good generalization performance while offering 10+ FPS on single-board (Jetson TX2) devices. (A minimal self-training sketch follows this list.)
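For the vision-only navigation work above, the controller is described as a convolutional network that maps camera frames directly to control-command classes. A minimal PyTorch sketch of such an image-to-command classifier is given below; the layer sizes, input resolution, and three-way yaw command set are assumptions for illustration, not the authors' network.

```python
# Sketch: a small CNN that predicts a discrete control command (e.g. yaw left /
# keep heading / yaw right) from a single camera frame. Sizes are illustrative.
import torch
import torch.nn as nn

class VisionController(nn.Module):
    def __init__(self, num_commands=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_commands)

    def forward(self, x):                  # x: (B, 3, H, W) image batch
        return self.classifier(self.features(x).flatten(1))

model = VisionController()
frame = torch.rand(1, 3, 120, 160)         # dummy low-resolution frame
command = model(frame).argmax(dim=1)       # index of the predicted command
print(command.item())
```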
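For the underwater vision chapter above, a recurring building block is restoration and enhancement of colour-cast, low-contrast imagery. The sketch below shows one common baseline (gray-world white balance followed by CLAHE) using OpenCV; it is only an illustrative stand-in for the methods surveyed, and the file name in the usage comment is hypothetical.

```python
# Sketch of a baseline underwater image enhancement: gray-world white balance
# to counter the blue/green colour cast, then CLAHE on the lightness channel.
import cv2
import numpy as np

def enhance(bgr):
    # Gray-world white balance: scale each channel toward the global mean.
    img = bgr.astype(np.float32)
    means = img.reshape(-1, 3).mean(axis=0)
    img *= means.mean() / (means + 1e-6)
    img = np.clip(img, 0, 255).astype(np.uint8)

    # CLAHE on the L channel of LAB to restore local contrast.
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)

# Usage with a hypothetical file name:
# out = enhance(cv2.imread("wreck_frame.png"))
```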
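For the bearing-only cooperative localization paper above, the core operation is fusing a bearing measurement toward a teammate of known position into a position estimate. A minimal 2-D EKF bearing update is sketched below; the state layout, noise level, and geometry are illustrative assumptions rather than the paper's formulation.

```python
# Sketch: EKF update for a single bearing measurement z = atan2(dy, dx) + noise,
# with a 2-D position state. Noise values and geometry are illustrative only.
import numpy as np

def bearing_update(x, P, z, landmark, sigma=np.deg2rad(3.0)):
    dx, dy = landmark[0] - x[0], landmark[1] - x[1]
    q = dx**2 + dy**2
    h = np.arctan2(dy, dx)                        # predicted bearing
    H = np.array([[dy / q, -dx / q]])             # Jacobian of h w.r.t. (x, y)
    y = np.arctan2(np.sin(z - h), np.cos(z - h))  # angle-wrapped innovation
    S = H @ P @ H.T + sigma**2
    K = P @ H.T / S                               # Kalman gain (2x1)
    x_new = x + (K * y).ravel()
    P_new = (np.eye(2) - K @ H) @ P
    return x_new, P_new

x = np.array([0.0, 0.0]); P = np.eye(2) * 4.0
x, P = bearing_update(x, P, z=np.deg2rad(47.0), landmark=np.array([10.0, 10.0]))
print(x)
```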
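For the caveline-detection paper above, the weakly supervised formulation amounts to iterating between pseudo-labelling unlabeled frames with an intermediate model and retraining on the confident predictions. The sketch below shows that self-training loop in generic PyTorch form; the models, data, and confidence threshold are placeholders, not CL-ViT.

```python
# Sketch of the weak-supervision (self-training) idea: an intermediate "teacher"
# pseudo-labels unlabeled frames and only confident predictions retrain the
# "student". Models, loaders, and threshold are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

def self_training_round(teacher, student, unlabeled_loader, optimizer, conf_thresh=0.9):
    teacher.eval()
    student.train()
    for frames in unlabeled_loader:                  # frames: (B, 3, H, W)
        with torch.no_grad():
            probs = torch.softmax(teacher(frames), dim=1)
            conf, pseudo = probs.max(dim=1)          # per-frame pseudo-labels
        mask = conf > conf_thresh                    # keep only confident labels
        if mask.sum() == 0:
            continue
        loss = F.cross_entropy(student(frames)[mask], pseudo[mask])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return student                                   # next round's teacher

# Toy usage with tiny stand-in models and random "frames".
net = lambda: nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
teacher, student = net(), net()
loader = [torch.rand(4, 3, 8, 8) for _ in range(3)]
opt = torch.optim.SGD(student.parameters(), lr=0.01)
self_training_round(teacher, student, loader, opt)
```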