This paper explores the problem of deploying machine learning (ML)-based object detection and segmentation models on edge platforms to enable realtime caveline detection for Autonomous Underwater Vehicles (AUVs) used for under-water cave exploration and mapping. We specifically investigate three ML models, i.e., U-Net, Vision Transformer (ViT), and YOLOv8, deployed on three edge platforms: Raspberry Pi-4, Intel Neural Compute Stick 2 (NCS2), and NVIDIA Jetson Nano. The experimental results unveil clear tradeoffs between model accuracy, processing speed, and energy consumption. The most accurate model has shown to be U-Net with an 85.53 F1-score and 85.38 Intersection Over Union (IoU) value. Meanwhile, the highest inference speed and lowest energy consumption are achieved by the YOLOv8 model deployed on Jetson Nano operating in the high-power and low-power modes, respectively. The comprehensive quantitative analyses and comparative results provided in the paper highlight important nuances that can guide the deployment of caveline detection systems on underwater robots for ensuring safe and reliable AUV navigation during underwater cave exploration and mapping missions.
more »
« less
Weakly Supervised Caveline Detection for AUV Navigation Inside Underwater Caves
Underwater caves are challenging environments that are crucial for water resource management, and for our understanding of hydro-geology and history. Mapping underwater caves is a time-consuming, labor-intensive, and hazardous operation. For autonomous cave mapping by underwater robots, the major challenge lies in vision-based estimation in the complete absence of ambient light, which results in constantly moving shadows due to the motion of the camera-light setup. Thus, detecting and following the caveline as navigation guidance is paramount for robots in autonomous cave mapping missions. In this paper, we present a computationally light caveline detection model based on a novel Vision Transformer (ViT)-based learning pipeline. We address the problem of scarce annotated training data by a weakly supervised formulation where the learning is reinforced through a series of noisy predictions from intermediate sub-optimal models. We validate the utility and effectiveness of such weak supervision for caveline detection and tracking in three different cave locations: USA, Mexico, and Spain. Experimental results demonstrate that our proposed model, CL-ViT, balances the robustness-efficiency trade-off, ensuring good generalization performance while offering 10+ FPS on single-board (Jetson TX2) devices.
more »
« less
- PAR ID:
- 10496708
- Publisher / Repository:
- IEEE
- Date Published:
- Journal Name:
- Proceedings of the IEEERSJ International Conference on Intelligent Robots and Systems
- ISSN:
- 2153-0858
- ISBN:
- 978-1-6654-9190-7
- Page Range / eLocation ID:
- 9933 to 9940
- Format(s):
- Medium: X
- Location:
- Detroit, MI, USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
— In this paper, we present CaveSeg - the first visual learning pipeline for semantic segmentation and scene parsing for AUV navigation inside underwater caves. We address the problem of scarce annotated training data by preparing a comprehensive dataset for semantic segmentation of underwater cave scenes. It contains pixel annotations for important navigation markers (e.g. caveline, arrows), obstacles (e.g. ground plain and overhead layers), scuba divers, and open areas for servoing. Through comprehensive benchmark analyses on cave systems in USA, Mexico, and Spain locations, we demonstrate that robust deep visual models can be developed based on CaveSeg for fast semantic scene parsing of underwater cave environments. In particular, we formulate a novel transformer-based model that is computationally light and offers near real-time execution in addition to achieving state-of-the-art performance. Finally, we explore the design choices and implications of semantic segmentation for visual servoing by AUVs inside underwater caves. The proposed model and benchmark dataset open up promising opportunities for future research in autonomous underwater cave exploration and mapping.more » « less
-
This paper presents a systematic approach for the 3-D mapping of underwater caves. Exploration of underwater caves is very important for furthering our understanding of hydrogeology, managing efficiently water resources, and advancing our knowledge in marine archaeology. Underwater cave exploration by human divers however, is a tedious, labor intensive, extremely dangerous operation, and requires highly skilled people. As such, it is an excellent fit for robotic technology, which has never before been addressed. In addition to the underwater vision constraints, cave mapping presents extra challenges in the form of lack of natural illumination and harsh contrasts, resulting in failure for most of the state-ofthe-art visual based state estimation packages. A new approach employing a stereo camera and a video-light is presented. Our approach utilizes the intersection of the cone of the video-light with the cave boundaries: walls, floor, and ceiling, resulting in the construction of a wire frame outline of the cave. Successive frames are combined using a state of the art visual odometry algorithm while simultaneously inferring scale through the stereo reconstruction. Results from experiments at a cave, part of the Sistema Camilo, Quintana Roo, Mexico, validate our approach. The cave wall reconstruction presented provides an immersive experience in 3-D.more » « less
-
This paper addresses the Informative Path Planning (IPP) algorithm for autonomous robots to explore unknown 2D environments for mapping purposes. IPP can be beneficial to many applications such as search and rescue and cave exploration, where mapping an unknown environment is necessary. Autonomous robots' limited operation time due to their finite battery necessitates an efficient IPP algorithm, however, it is challenging because autonomous robots may not have any information about the environment. In this paper, we formulate a mathematical structure of the IPP problem along with the derivation of the optimal control input. Then, a discretized model for the IPP algorithm is presented as a solution for exploring an unknown environment. The proposed approach provides relatively fast computation time while being applicable to broad robot and sensor platforms. Various simulation results are provided to show the performance of the proposed IPP algorithm.more » « less
-
This paper presents a systematic approach on realtime reconstruction of an underwater environment using Sonar, Visual, Inertial, and Depth data. In particular, low lighting conditions, or even complete absence of natural light inside caves, results in strong lighting variations, e.g., the cone of the artificial video light intersecting underwater structures, and the shadow contours. The proposed method utilizes the well defined edges between well lit areas and darkness to provide additional features, resulting into a denser 3D point cloud than the usual point clouds from a Visual SLAM system. Experimental results in an underwater cave at Ginnie Springs, FL, with a custommade underwater sensor suite demonstrate the performance of our system. This will enable more robust navigation of AUVs using the denser 3D point cloud to detect obstacles and achieve higher resolution reconstructions.more » « less