Search for: All records

Award ID contains: 2024741


  1. In this paper, we present CaveSeg, the first visual learning pipeline for semantic segmentation and scene parsing for AUV navigation inside underwater caves. We address the problem of scarce annotated training data by preparing a comprehensive dataset for semantic segmentation of underwater cave scenes. It contains pixel annotations for important navigation markers (e.g., caveline, arrows), obstacles (e.g., ground plane and overhead layers), scuba divers, and open areas for servoing. Through comprehensive benchmark analyses on cave systems in the USA, Mexico, and Spain, we demonstrate that robust deep visual models can be developed based on CaveSeg for fast semantic scene parsing of underwater cave environments. In particular, we formulate a novel transformer-based model that is computationally light and offers near real-time execution in addition to achieving state-of-the-art performance. Finally, we explore the design choices and implications of semantic segmentation for visual servoing by AUVs inside underwater caves. The proposed model and benchmark dataset open up promising opportunities for future research in autonomous underwater cave exploration and mapping.
    Free, publicly-accessible full text available May 13, 2025
  2. This paper presents an extension to visual inertial odometry (VIO) by introducing tightly-coupled fusion of magnetometer measurements. A sliding window of keyframes is optimized by minimizing re-projection errors, relative inertial errors, and relative magnetometer orientation errors. The results of IMU orientation propagation are used to efficiently transform magnetometer measurements between frames, producing relative orientation constraints between consecutive frames. The soft and hard iron effects are calibrated using an ellipsoid fitting algorithm. The introduction of magnetometer data results in significant reductions in the orientation error and also in recovery of the true yaw orientation with respect to magnetic north. The proposed framework operates in all environments with slowly varying magnetic fields, mainly outdoors and underwater. We have focused our work on the underwater domain, especially on underwater caves, as the narrow passages and turbulent flow make it difficult to perform loop closures and reset the localization drift. Underwater caves present challenges to VIO due to the absence of ambient light and the confined nature of the environment, while also being a crucial source of fresh water and providing valuable historical records. Experimental results from underwater caves demonstrate the improvements in accuracy and robustness introduced by the proposed VIO extension.
    Free, publicly-accessible full text available May 13, 2025
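The soft and hard iron calibration by ellipsoid fitting mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an axis-aligned ellipsoid (no cross-axis soft-iron terms) whose surface does not pass through the origin, and the function names `solve_linear`, `fit_axis_aligned_ellipsoid`, and `calibrate` are hypothetical.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit_axis_aligned_ellipsoid(samples):
    """Least-squares fit of a*x^2 + b*y^2 + c*z^2 + d*x + e*y + f*z = 1.

    Assumption: the ellipsoid axes are aligned with the sensor axes.
    Returns (center, radii); the center models the hard-iron offset and
    the radii model a per-axis soft-iron scale.
    """
    rows = [[x * x, y * y, z * z, x, y, z] for x, y, z in samples]
    # Normal equations (R^T R) p = R^T 1 for the six unknown coefficients.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    atb = [sum(r[i] for r in rows) for i in range(6)]
    a, b, c, d, e, f = solve_linear(ata, atb)
    center = (-d / (2 * a), -e / (2 * b), -f / (2 * c))
    # Completing the squares gives a*(x-cx)^2 + b*(y-cy)^2 + c*(z-cz)^2 = g.
    g = 1 + a * center[0] ** 2 + b * center[1] ** 2 + c * center[2] ** 2
    radii = (math.sqrt(g / a), math.sqrt(g / b), math.sqrt(g / c))
    return center, radii

def calibrate(sample, center, radii):
    """Map a raw magnetometer reading onto the unit sphere."""
    return tuple((m - c) / r for m, c, r in zip(sample, center, radii))
```

A full calibration would also estimate the off-diagonal soft-iron terms by fitting a general quadric; the axis-aligned form is used here only to keep the sketch short.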
  3. This paper explores the problem of deploying machine learning (ML)-based object detection and segmentation models on edge platforms to enable real-time caveline detection for Autonomous Underwater Vehicles (AUVs) used for underwater cave exploration and mapping. We specifically investigate three ML models, i.e., U-Net, Vision Transformer (ViT), and YOLOv8, deployed on three edge platforms: Raspberry Pi-4, Intel Neural Compute Stick 2 (NCS2), and NVIDIA Jetson Nano. The experimental results reveal clear trade-offs between model accuracy, processing speed, and energy consumption. The most accurate model is U-Net, with an F1-score of 85.53 and an Intersection over Union (IoU) of 85.38. Meanwhile, the highest inference speed and lowest energy consumption are achieved by the YOLOv8 model deployed on Jetson Nano operating in the high-power and low-power modes, respectively. The comprehensive quantitative analyses and comparative results provided in the paper highlight important nuances that can guide the deployment of caveline detection systems on underwater robots for ensuring safe and reliable AUV navigation during underwater cave exploration and mapping missions.
    Free, publicly-accessible full text available December 15, 2024
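For readers unfamiliar with the two accuracy metrics quoted in the abstract above, the following minimal sketch computes F1-score and IoU for a single binary mask pair. The abstract does not say how the reported scores were aggregated over classes or images, so this is only the standard per-mask definition; the function name `f1_and_iou` and the convention of returning 1.0 for two empty masks are assumptions.

```python
def f1_and_iou(pred, truth):
    """Return (F1, IoU) for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # pixel predicted as caveline and labeled caveline
            elif p and not t:
                fp += 1      # predicted caveline, labeled background
            elif t and not p:
                fn += 1      # missed caveline pixel
    f1_denominator = 2 * tp + fp + fn
    iou_denominator = tp + fp + fn
    # Assumption: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_denominator if f1_denominator else 1.0
    iou = tp / iou_denominator if iou_denominator else 1.0
    return f1, iou
```

For example, with `pred = [[1, 1, 0], [0, 1, 0]]` and `truth = [[1, 0, 0], [0, 1, 1]]` there are two true positives, one false positive, and one false negative, so the function returns an F1 of 2/3 and an IoU of 0.5.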
  4. Underwater caves are challenging environments that are crucial for water resource management, and for our understanding of hydro-geology and history. Mapping underwater caves is a time-consuming, labor-intensive, and hazardous operation. For autonomous cave mapping by underwater robots, the major challenge lies in vision-based estimation in the complete absence of ambient light, which results in constantly moving shadows due to the motion of the camera-light setup. Thus, detecting and following the caveline as navigation guidance is paramount for robots in autonomous cave mapping missions. In this paper, we present a computationally light caveline detection model based on a novel Vision Transformer (ViT)-based learning pipeline. We address the problem of scarce annotated training data by a weakly supervised formulation where the learning is reinforced through a series of noisy predictions from intermediate sub-optimal models. We validate the utility and effectiveness of such weak supervision for caveline detection and tracking in three different cave locations: USA, Mexico, and Spain. Experimental results demonstrate that our proposed model, CL-ViT, balances the robustness-efficiency trade-off, ensuring good generalization performance while offering 10+ FPS on single-board (Jetson TX2) devices. 
  5. Vision-based state estimation is challenging in underwater environments due to color attenuation, low visibility, and floating particulates. All visual-inertial estimators are prone to failure due to degradation in image quality. However, underwater robots are required to keep track of their pose during field deployments. We propose a robust estimator that fuses the robot's dynamic and kinematic model with proprioceptive sensors to propagate the pose whenever visual-inertial odometry (VIO) fails. To detect VIO failures, health tracking is used, which enables switching between pose estimates from VIO and a kinematic estimator. Loop closure is implemented on a weighted pose graph for global trajectory optimization. Experimental results from field deployments of an Aqua2 Autonomous Underwater Vehicle demonstrate the robustness of our approach in different underwater environments, such as over shipwrecks and coral reefs. The proposed hybrid approach is robust to VIO failures, producing consistent trajectories even in harsh conditions.
  6. IEEE (Ed.)
    This paper addresses the robustness problem of visual-inertial state estimation for underwater operations. Underwater robots operating in a challenging environment are required to know their pose at all times. All vision-based localization schemes are prone to failure due to poor visibility conditions, color loss, and lack of features. The proposed approach utilizes a model of the robot's kinematics together with proprioceptive sensors to maintain the pose estimate during visual-inertial odometry (VIO) failures. Furthermore, the trajectories from successful VIO and the ones from the model-driven odometry are integrated in a coherent set that maintains a consistent pose at all times. Health-monitoring tracks the VIO process ensuring timely switches between the two estimators. Finally, loop closure is implemented on the overall trajectory. The resulting framework is a robust estimator switching between model-based and visual-inertial odometry (SM/VIO). Experimental results from numerous deployments of the Aqua2 vehicle demonstrate the robustness of our approach over coral reefs and a shipwreck. 
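The switching idea described in the abstract above can be sketched as follows. This is an illustration only, not the SM/VIO implementation: it assumes 2D position increments, uses the number of tracked features as the health signal, and the names `Step`, `fuse_trajectory`, and the threshold `min_features=20` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Step:
    vio_delta: tuple       # (dx, dy) increment reported by VIO
    model_delta: tuple     # (dx, dy) increment from the kinematic model
    tracked_features: int  # assumed health signal for VIO

def fuse_trajectory(steps, min_features=20, start=(0.0, 0.0)):
    """Integrate one consistent trajectory, switching estimators per step.

    Assumption: VIO is considered healthy when it tracks at least
    `min_features` features; otherwise the model-based increment is used.
    """
    x, y = start
    trajectory = [start]
    sources = []
    for step in steps:
        healthy = step.tracked_features >= min_features
        dx, dy = step.vio_delta if healthy else step.model_delta
        x, y = x + dx, y + dy
        trajectory.append((x, y))
        sources.append("vio" if healthy else "model")
    return trajectory, sources
```

Because each increment is applied to the last fused pose, the output stays continuous across switches, which mirrors the abstract's requirement of a consistent pose at all times; loop closure is outside the scope of this sketch.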
  7. This paper presents SVIn2, a novel tightly-coupled keyframe-based Simultaneous Localization and Mapping (SLAM) system, which fuses Scanning Profiling Sonar, Visual, Inertial, and water-pressure information in a non-linear optimization framework for small and large scale challenging underwater environments. The developed real-time system features robust initialization, loop-closing, and relocalization capabilities, which make the system reliable in the presence of haze, blurriness, low light, and lighting variations, typically observed in underwater scenarios. Over the last decade, Visual-Inertial Odometry and SLAM systems have shown excellent performance for mobile robots in indoor and outdoor environments, but often fail underwater due to the inherent difficulties in such environments. Our approach addresses the weaknesses of previous approaches by utilizing additional sensors and exploiting their complementary characteristics. In particular, we use (1) acoustic range information for improved reconstruction and localization, thanks to the reliable distance measurement; and (2) depth information from a water-pressure sensor for robust initialization, refining the scale, and helping to limit the drift in the tightly-coupled integration. The developed software, released as open source, has been successfully used to test and validate the proposed system on both benchmark datasets and numerous real-world underwater scenarios, including datasets collected with a custom-made underwater sensor suite and an Aqua2 autonomous underwater vehicle. SVIn2 demonstrated outstanding performance in terms of accuracy and robustness on those datasets and enabled other robotic tasks, for example, planning for underwater robots in the presence of obstacles.
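As a small illustration of how a water-pressure reading yields the depth information mentioned in the abstract above, the sketch below applies the hydrostatic relation depth = (p - p_surface) / (rho * g). The constants are assumptions (fresh water at 1000 kg/m^3, standard gravity, standard atmospheric pressure at the surface), and the function name `depth_from_pressure` is hypothetical; it is not taken from the SVIn2 software.

```python
STANDARD_GRAVITY = 9.80665      # m/s^2
SURFACE_PRESSURE_PA = 101325.0  # Pa, assumed standard atmosphere

def depth_from_pressure(pressure_pa, water_density=1000.0,
                        surface_pressure_pa=SURFACE_PRESSURE_PA):
    """Convert absolute pressure (Pa) to depth below the surface (m).

    Assumption: constant water density; use roughly 1025 kg/m^3 for seawater.
    """
    return (pressure_pa - surface_pressure_pa) / (water_density * STANDARD_GRAVITY)
```

Under these assumptions, an absolute pressure of about 199391.5 Pa corresponds to a depth of roughly 10 m in fresh water.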

     
  8. This paper discusses a novel approach for the exploration of an underwater structure. A team of robots splits into two roles: certain robots approach the structure collecting detailed information (proximal observers) while the rest (distal observers) keep a distance providing an overview of the mission and assist in the localization of the proximal observers via a Cooperative Localization framework. Proximal observers utilize a novel robust switching model-based/visual-inertial odometry to overcome vision-based localization failures. Exploration strategies for the proximal and the distal observer are discussed. 