This content will become publicly available on December 18, 2025

Title: Edge-Centric Real-Time Segmentation for Autonomous Underwater Cave Exploration
This paper addresses the challenge of deploying machine learning (ML)-based segmentation models on edge platforms to enable real-time scene segmentation for Autonomous Underwater Vehicles (AUVs) in underwater cave exploration and mapping. We focus on three ML models (U-Net, CaveSeg, and YOLOv8n) deployed on four edge platforms: Raspberry Pi-4, Intel Neural Compute Stick 2 (NCS2), Google Edge TPU, and NVIDIA Jetson Nano. Experimental results reveal that mobile models with modern architectures, such as YOLOv8n, and specialized models for semantic segmentation, such as U-Net, offer higher accuracy with lower latency. YOLOv8n emerged as the most accurate model, achieving an Intersection Over Union (IoU) score of 72.5. Meanwhile, the U-Net model deployed on the Coral Dev board delivered the highest speed, at 79.24 FPS, and the lowest energy consumption, at 6.23 mJ. The detailed quantitative analyses and comparative results presented in this paper offer critical insights for deploying cave segmentation systems on underwater robots, supporting safe and reliable AUV navigation during cave exploration and mapping missions.
Award ID(s):
1943205; 2024741
PAR ID:
10614035
Publisher / Repository:
IEEE
ISBN:
979-8-3503-7488-9
Page Range / eLocation ID:
1404 to 1411
Format(s):
Medium: X
Location:
Miami, FL, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper explores the problem of deploying machine learning (ML)-based object detection and segmentation models on edge platforms to enable real-time caveline detection for Autonomous Underwater Vehicles (AUVs) used for underwater cave exploration and mapping. We specifically investigate three ML models (U-Net, Vision Transformer (ViT), and YOLOv8) deployed on three edge platforms: Raspberry Pi-4, Intel Neural Compute Stick 2 (NCS2), and NVIDIA Jetson Nano. The experimental results reveal clear tradeoffs between model accuracy, processing speed, and energy consumption. The most accurate model is U-Net, with an F1-score of 85.53 and an Intersection Over Union (IoU) value of 85.38. Meanwhile, the highest inference speed and the lowest energy consumption are achieved by the YOLOv8 model deployed on Jetson Nano operating in the high-power and low-power modes, respectively. The comprehensive quantitative analyses and comparative results provided in the paper highlight important nuances that can guide the deployment of caveline detection systems on underwater robots for safe and reliable AUV navigation during underwater cave exploration and mapping missions.
  2. In this paper, we present CaveSeg, the first visual learning pipeline for semantic segmentation and scene parsing for AUV navigation inside underwater caves. We address the problem of scarce annotated training data by preparing a comprehensive dataset for semantic segmentation of underwater cave scenes. It contains pixel annotations for important navigation markers (e.g., caveline, arrows), obstacles (e.g., ground plane and overhead layers), scuba divers, and open areas for servoing. Through comprehensive benchmark analyses on cave systems in the USA, Mexico, and Spain, we demonstrate that robust deep visual models can be developed based on CaveSeg for fast semantic scene parsing of underwater cave environments. In particular, we formulate a novel transformer-based model that is computationally light and offers near real-time execution in addition to achieving state-of-the-art performance. Finally, we explore the design choices and implications of semantic segmentation for visual servoing by AUVs inside underwater caves. The proposed model and benchmark dataset open up promising opportunities for future research in autonomous underwater cave exploration and mapping.
  3. This paper presents a systematic approach for the 3-D mapping of underwater caves. Exploration of underwater caves is very important for furthering our understanding of hydrogeology, efficiently managing water resources, and advancing our knowledge in marine archaeology. Underwater cave exploration by human divers, however, is a tedious, labor-intensive, and extremely dangerous operation that requires highly skilled people. As such, it is an excellent fit for robotic technology, which has never before been addressed. In addition to the underwater vision constraints, cave mapping presents extra challenges in the form of lack of natural illumination and harsh contrasts, resulting in failure for most state-of-the-art vision-based state estimation packages. A new approach employing a stereo camera and a video-light is presented. Our approach utilizes the intersection of the cone of the video-light with the cave boundaries (walls, floor, and ceiling), resulting in the construction of a wire-frame outline of the cave. Successive frames are combined using a state-of-the-art visual odometry algorithm while simultaneously inferring scale through the stereo reconstruction. Results from experiments at a cave that is part of the Sistema Camilo, Quintana Roo, Mexico, validate our approach. The cave wall reconstruction presented provides an immersive experience in 3-D.
  4. Precise coastal shoreline mapping is essential for monitoring changes in erosion rates, surface hydrology, and ecosystem structure and function. Monitoring water bodies in the Arctic National Wildlife Refuge (ANWR) is of high importance, especially considering the potential for oil and natural gas exploration in the region. In this work, we propose a modified variant of the deep-neural-network-based U-Net architecture for the automated mapping of 4-band orthorectified NOAA airborne imagery using sparsely labeled training data. We compare its performance to that of traditional machine learning (ML) approaches (namely, random forest and XGBoost) and spectral water indices (the Normalized Difference Water Index (NDWI) and the Normalized Difference Surface Water Index (NDSWI)) to support shoreline mapping of Arctic coastlines. We conclude that it is possible to modify the U-Net model to accept sparse labels as input and that the results are comparable to other ML methods (an Intersection-over-Union (IoU) of 94.86% using U-Net vs. an IoU of 95.05% using the best-performing method).