-
Free, publicly-accessible full text available June 21, 2026
-
This paper explores the design and development of a language-based interface for dynamic mission programming of autonomous underwater vehicles (AUVs). The proposed `Word2Wave' (W2W) framework enables interactive programming and parameter configuration of AUVs for remote subsea missions. The W2W framework includes: (i) a set of novel language rules and command structures for efficient language-to-mission mapping; (ii) a GPT-based prompt engineering module for training data generation; (iii) a small language model (SLM)-based sequence-to-sequence learning pipeline for mission command generation from human speech or text; and (iv) a novel user interface for 2D mission map visualization and human-machine interfacing. The proposed learning pipeline adapts an SLM named T5-Small that learns language-to-mission mapping from processed language data effectively, providing robust and efficient performance. In addition to a benchmark evaluation against state-of-the-art methods, we conduct a user interaction study to demonstrate the effectiveness of W2W over commercial AUV programming interfaces. Across participants, W2W-based programming required less than 10% of the time needed by traditional interfaces for mission programming, and it was deemed a simpler and more natural paradigm for subsea mission programming with a usability score of 76.25. W2W opens up promising future research opportunities on hands-free AUV mission programming for efficient subsea deployments.
Free, publicly-accessible full text available May 19, 2026
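The abstract above describes a learned sequence-to-sequence pipeline (T5-Small) for language-to-mission mapping, which cannot be reproduced in a short snippet. The underlying idea, turning a natural-language instruction into a structured mission command, can still be sketched with a toy rule-based parser. All command names and fields below (`lawnmower_survey`, `depth_m`, etc.) are hypothetical illustrations, not the actual W2W command structures.

```python
import re

def parse_mission_command(text):
    """Map a natural-language instruction to a structured mission command.

    Toy rule-based stand-in for a learned language-to-mission model;
    field names and mission types are illustrative only.
    """
    text = text.lower()
    cmd = {}
    # e.g. "survey a 20 m by 10 m area"
    m = re.search(r"survey .*?(\d+(?:\.\d+)?)\s*m\s*(?:by|x)\s*(\d+(?:\.\d+)?)\s*m", text)
    if m:
        cmd["type"] = "lawnmower_survey"
        cmd["width_m"] = float(m.group(1))
        cmd["length_m"] = float(m.group(2))
    # e.g. "at 3 m depth"
    m = re.search(r"(\d+(?:\.\d+)?)\s*m\s*depth", text)
    if m:
        cmd["depth_m"] = float(m.group(1))
    # e.g. "at 1.5 m/s"
    m = re.search(r"(\d+(?:\.\d+)?)\s*(?:m/s|meters per second)", text)
    if m:
        cmd["speed_mps"] = float(m.group(1))
    return cmd
```

A learned model replaces the brittle regex rules with a trained mapping, but the output contract, a structured command the mission planner consumes, stays the same.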
-
In this letter, we introduce the idea of AquaFuse, a physics-based method for synthesizing waterbody properties in underwater imagery. We formulate a closed-form solution for waterbody fusion that facilitates realistic data augmentation and geometrically consistent underwater scene rendering. AquaFuse leverages the physical characteristics of light propagation underwater to synthesize the waterbody from one scene to the object contents of another. Unlike data-driven style transfer methods, AquaFuse preserves the depth consistency and object geometry in an input scene. We validate this unique feature by comprehensive experiments over diverse sets of underwater scenes. We find that the AquaFused images preserve over 94% depth consistency and 90–95% structural similarity of the input scenes. We also demonstrate that it generates accurate 3D view synthesis by preserving object geometry while adapting to the inherent waterbody fusion process. AquaFuse opens up a new research direction in data augmentation by geometry-preserving style transfer for underwater imaging and robot vision.
Free, publicly-accessible full text available May 1, 2026
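The closed-form fusion described above can be illustrated with a simplified underwater image-formation model, I = J * exp(-beta * d) + B_inf * (1 - exp(-beta * d)): invert the source waterbody to recover the object content J, then re-render J under the target waterbody. This is a minimal sketch under a single per-channel attenuation coefficient; the function names and simplified model are assumptions, not AquaFuse's exact formulation.

```python
import numpy as np

def render_waterbody(J, depth, beta, B_inf):
    """Apply a simplified image-formation model:
    I = J * exp(-beta * d) + B_inf * (1 - exp(-beta * d))."""
    T = np.exp(-beta[None, None, :] * depth[..., None])  # per-channel transmission
    return J * T + B_inf[None, None, :] * (1.0 - T)

def fuse_waterbody(I_src, depth, beta_src, B_src, beta_tgt, B_tgt):
    """Closed-form waterbody swap: invert the source waterbody to recover
    the object content J, then re-render J under the target waterbody."""
    T_src = np.exp(-beta_src[None, None, :] * depth[..., None])
    J = (I_src - B_src[None, None, :] * (1.0 - T_src)) / T_src
    return render_waterbody(J, depth, beta_tgt, B_tgt)
```

Because the inversion is exact under this model, the geometry (the depth map and the recovered J) is untouched; only the waterbody parameters change, which mirrors the depth-consistency property the letter reports.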
-
Underwater image restoration aims to recover color, contrast, and appearance in underwater scenes, crucial for fields like marine ecology and archaeology. While pixel-domain diffusion methods work for simple scenes, they are computationally heavy and produce artifacts in complex, depth-varying scenes. We present a single-step latent diffusion method, SLURPP (Single-step Latent Underwater Restoration with Pretrained Priors), that overcomes these limitations by combining a novel network architecture with an accurate synthetic data generation pipeline. SLURPP combines pretrained latent diffusion models, which encode strong priors on scene geometry and depth, with an explicit scene decomposition that allows one to model and account for the effects of light attenuation and backscattering. To train SLURPP, we design a physics-based underwater image synthesis pipeline that applies varied and realistic underwater degradation effects to existing terrestrial image datasets. We evaluate our method extensively on both synthetic and real-world benchmarks and demonstrate state-of-the-art performance.
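The physics-based synthesis pipeline mentioned above can be sketched as sampling waterbody parameters and applying wavelength-dependent attenuation plus backscatter to a terrestrial image with known depth. The parameter ranges and the simplified two-coefficient model below are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def degrade_underwater(img, depth, rng):
    """Sketch of a physics-based degradation step: sample waterbody
    parameters, then attenuate the direct signal and add backscatter.
    Parameter ranges are illustrative, not the paper's."""
    beta_D = rng.uniform([0.2, 0.05, 0.02], [1.5, 0.6, 0.3])  # direct-signal attenuation (R, G, B)
    beta_B = rng.uniform([0.2, 0.05, 0.02], [1.5, 0.6, 0.3])  # backscatter coefficient
    B_inf = rng.uniform([0.0, 0.1, 0.2], [0.2, 0.5, 0.7])     # veiling-light color
    d = depth[..., None]
    direct = img * np.exp(-beta_D * d)
    backscatter = B_inf * (1.0 - np.exp(-beta_B * d))
    return np.clip(direct + backscatter, 0.0, 1.0)
```

Each terrestrial image with depth then yields a (degraded, clean) training pair, and re-sampling the parameters produces varied waterbody appearances from the same source image.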
-
Underwater ROVs (Remotely Operated Vehicles) are unmanned submersibles designed for exploring and operating in the depths of the ocean. Despite using high-end cameras, typical teleoperation engines based on first-person (egocentric) views limit a surface operator's ability to maneuver the ROV in complex deep-water missions. In this paper, we present an interactive teleoperation interface that enhances the operational capabilities via increased situational awareness. This is accomplished by (i) offering on-demand third-person (exocentric) visuals from past egocentric views, and (ii) facilitating enhanced peripheral information with augmented ROV pose in real-time. We achieve this by integrating a 3D geometry-based Ego-to-Exo view synthesis algorithm into a monocular SLAM system for accurate trajectory estimation. The proposed closed-form solution only uses past egocentric views from the ROV and a SLAM backbone for pose estimation, which makes it portable to existing ROV platforms. Unlike data-driven solutions, it is invariant to applications and waterbody-specific scenes. We validate the geometric accuracy of the proposed framework through extensive experiments on 2-DOF indoor navigation and 6-DOF underwater cave exploration in challenging low-light conditions. A subjective evaluation with 15 human teleoperators further confirms the effectiveness of the integrated features for improved teleoperation. We demonstrate the benefits of dynamic Ego-to-Exo view generation and real-time pose rendering for remote ROV teleoperation by following navigation guides such as cavelines inside underwater caves. This new way of interactive ROV teleoperation opens up promising opportunities for future research in subsea telerobotics.
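One building block of such an Ego-to-Exo interface, rendering the ROV's current pose into a past egocentric keyframe, can be sketched with standard pinhole projection between SLAM poses. The pose conventions, function name, and intrinsics below are assumptions for illustration, not the paper's closed-form solution.

```python
import numpy as np

def project_rov_into_past_view(T_world_past, T_world_now, K):
    """Project the ROV's current position into a past keyframe's pixels.

    T_world_past, T_world_now: 4x4 camera-to-world poses (e.g. from a
    SLAM backbone); K: 3x3 pinhole intrinsic matrix.
    Returns (u, v) pixel coordinates, or None if behind the past camera.
    """
    p_world = T_world_now[:3, 3]                 # ROV position in the world frame
    T_past_world = np.linalg.inv(T_world_past)   # world -> past-camera transform
    p_cam = T_past_world[:3, :3] @ p_world + T_past_world[:3, 3]
    if p_cam[2] <= 0:
        return None                              # ROV is behind the past camera
    uv = K @ (p_cam / p_cam[2])                  # perspective divide, then intrinsics
    return uv[:2]
```

Overlaying this projected position (or a full rendered ROV model) on the stored past frame gives the operator an on-demand exocentric view without any extra camera.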
-
Real-time computer vision and remote visual sensing platforms are increasingly used in numerous underwater applications such as shipwreck mapping, subsea inspection, coastal water monitoring, surveillance, coral reef surveying, invasive fish tracking, and more. Recent advancements in robot vision and powerful single-board computers have paved the way for an imminent revolution in the next generation of subsea technologies. In this chapter, we present these exciting emerging applications and discuss relevant open problems and practical considerations. First, we delineate the specific environmental and operational challenges of underwater vision and highlight some prominent scientific and engineering solutions to ensure robust visual perception. We specifically focus on the characteristics of underwater light propagation from the perspective of image formation and photometry. We also discuss the recent developments and trends in underwater imaging literature to facilitate the restoration, enhancement, and filtering of inherently noisy visual data. Subsequently, we demonstrate how these ideas are extended and deployed in the perception pipelines of Autonomous Underwater Vehicles (AUVs) and Remotely Operated Vehicles (ROVs). In particular, we present several use cases for marine life monitoring and conservation, human-robot cooperative missions for inspecting submarine cables and archaeological sites, subsea structure or cave mapping, aquaculture, and marine ecology. We discuss in detail how deep visual learning and on-device AI breakthroughs are transforming the perception, planning, localization, and navigation capabilities of visually-guided underwater robots. Along this line, we also highlight the prospective future research directions and open problems at the intersection of computer vision and underwater robotics domains.
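As a concrete example of the restoration and enhancement ideas surveyed in such a chapter, the classic gray-world white balance is a simple baseline for correcting the blue-green color cast that wavelength-dependent attenuation imposes on underwater imagery. It is illustrative only and far simpler than the learning-based methods the chapter discusses.

```python
import numpy as np

def gray_world_balance(img):
    """Gray-world white balance for an RGB image in [0, 1]: rescale each
    channel so all channel means match, assuming the true scene is gray
    on average. A simple baseline, not a full restoration method."""
    means = img.reshape(-1, 3).mean(axis=0)
    gain = means.mean() / np.maximum(means, 1e-8)  # boost attenuated channels
    return np.clip(img * gain, 0.0, 1.0)
```

In deep water the red channel attenuates fastest, so its gain is largest; more sophisticated restoration methods additionally model depth-dependent attenuation and backscatter rather than a single global gain per channel.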