In this paper, we propose a real-time deep-learning approach for determining the 6D relative pose of Autonomous Underwater Vehicles (AUVs) from a single image. A team of autonomous robots localizing themselves in a communication-constrained underwater environment is essential for many applications such as underwater exploration, mapping, multi-robot convoying, and other multi-robot tasks. Due to the profound difficulty of collecting ground-truth images with accurate 6D poses underwater, this work utilizes rendered images from an Unreal game engine simulation for training. An image translation network is employed to bridge the gap between the rendered and the real images, producing synthetic images for training. The proposed method predicts the 6D pose of an AUV from a single image as 2D image keypoints representing the 8 corners of the 3D model of the AUV, and the 6D pose in camera coordinates is then determined using RANSAC-based PnP. Experimental results in underwater environments (swimming pool and ocean) with different cameras demonstrate the robustness of the proposed technique, with the trained system decreasing translation error by 75.5% and orientation error by 64.6% relative to state-of-the-art methods.
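The keypoint targets in this abstract are the 8 corners of the AUV's 3D bounding box. As a minimal sketch (the box dimensions below are illustrative, not the actual AUV model), the corners can be enumerated in the model frame as follows; once a network predicts their 2D image locations, a RANSAC-based PnP solver such as OpenCV's `cv2.solvePnPRansac` can recover the camera-frame pose from the 2D–3D correspondences.

```python
from itertools import product

def box_corners(length, width, height):
    """Return the 8 corners of an axis-aligned 3D bounding box
    centred at the model origin -- the 3D points whose 2D
    projections the network is trained to regress."""
    hx, hy, hz = length / 2, width / 2, height / 2
    return [(sx * hx, sy * hy, sz * hz)
            for sx, sy, sz in product((-1, 1), repeat=3)]

# Hypothetical AUV extents in metres (length, width, height).
corners = box_corners(1.0, 0.6, 0.4)
```

These 8 model-frame points, paired with the predicted 2D keypoints, form exactly the correspondence set a PnP solver expects.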
iDehaze: Supervised Underwater Image Enhancement and Dehazing via Physically Accurate Photorealistic Simulations
Underwater image enhancement and turbidity removal (dehazing) is a very challenging problem, not only due to the sheer variety of environments where it is applicable, but also due to the lack of high-resolution, labelled image data. In this paper, we present a novel, two-step deep learning approach for underwater image dehazing and colour correction. In iDehaze, we leverage computer graphics to physically model light propagation in underwater conditions. Specifically, we construct a three-dimensional, photorealistic simulation of underwater environments, and use it to gather a large supervised training dataset. We then train a deep convolutional neural network to remove the haze in these images, and train a second network to transform the colour space of the dehazed images onto a target domain. Experiments demonstrate that our two-step iDehaze method is substantially more effective at producing high-quality underwater images, achieving state-of-the-art performance on multiple datasets. Code, data and benchmarks will be open sourced.
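A common closed-form stand-in for the image-formation process that such physically based simulations emulate (not the paper's renderer, just an illustration per colour channel) blends scene radiance with backscatter through a transmission factor:

```python
def add_haze(scene, transmission, backscatter):
    """Standard haze image-formation model, per channel:
    observed = scene * t + backscatter * (1 - t)."""
    return scene * transmission + backscatter * (1 - transmission)

def remove_haze(observed, transmission, backscatter):
    """Closed-form inverse, valid when t and backscatter are known.
    A learned dehazing network approximates this inversion without
    needing either quantity explicitly."""
    return (observed - backscatter) / transmission + backscatter
```

Synthesizing hazy images with the forward model and training a network to recover the clean input is precisely the kind of supervised pairing a simulator makes cheap to produce.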
- Award ID(s):
- 2000475
- PAR ID:
- 10422761
- Date Published:
- Journal Name:
- Electronics
- Volume:
- 12
- Issue:
- 11
- ISSN:
- 2079-9292
- Page Range / eLocation ID:
- 2352
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
This paper presents a policy-driven sequential image augmentation approach for image-related tasks. Our approach applies a sequence of image transformations (e.g., translation, rotation) to a training image, one transformation at a time, with the augmented image from the previous time step treated as the input for the next transformation. This sequential data augmentation substantially improves sample diversity, leading to improved test performance, especially for data-hungry models (e.g., deep neural networks). However, the search for the optimal transformation of each image at each time step has high complexity due to its combinatorial nature. To address this challenge, we formulate the search task as a sequential decision process and introduce a deep policy network that learns to produce transformations based on image content. We also develop an iterative algorithm to jointly train a classifier and the policy network in the reinforcement learning setting. The immediate reward of a potential transformation is defined to encourage transformations that produce hard samples for the current classifier. At each iteration, we employ the policy network to augment the training dataset, train a classifier with the augmented data, and train the policy network with the aid of the classifier. We apply this approach to both public image classification benchmarks and a newly collected image dataset for material recognition. Comparisons to alternative augmentation approaches show that our policy-driven approach achieves comparable or improved classification performance while using significantly fewer augmented images. The code is available at https://github.com/Paul-LiPu/rl_autoaug.
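The sequential augmentation loop described above can be sketched in a few lines. Here the "policy" is a hand-written stand-in for the learned policy network, and the 1-D "images" and two transforms are toy placeholders, used only to show the feed-back structure of the loop:

```python
# Toy transforms standing in for real image augmentations.
flip = lambda img: img[::-1]          # e.g. horizontal flip
shift = lambda img: img[1:] + img[:1] # e.g. translation

def greedy_policy(img):
    """Stand-in for the learned policy network: picks a transform
    from the current (already partially augmented) image content."""
    return flip if img[0] < img[-1] else shift

def apply_sequence(image, policy, steps):
    """Each step's output becomes the next step's input, exactly the
    sequential scheme the paper describes."""
    for _ in range(steps):
        image = policy(image)(image)
    return image
```

In the paper, the policy is trained with RL so that the chosen sequence yields hard samples for the current classifier, rather than following a fixed rule.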
-
In this paper, we present OpenWaters, a real-time open-source underwater simulation kit for generating photorealistic underwater scenes. OpenWaters supports the creation of massive amounts of underwater images by emulating diverse real-world conditions. It allows fine control over every variable in a simulation instance, including geometry, rendering parameters such as ray-traced water caustics and scattering, and ground-truth labels. Using underwater depth (distance between camera and object) estimation as the use case, we showcase and validate the capabilities of OpenWaters to model underwater scenes that are used to train a deep neural network for depth estimation. Our experimental evaluation demonstrates depth estimation using synthetic underwater images with high accuracy, and the feasibility of transfer learning of features from synthetic to real-world images.
-
Messinger, David W.; Velez-Reyes, Miguel (Ed.) Recently, multispectral and hyperspectral data fusion models based on deep learning have been proposed to generate images with high spatial and spectral resolution. The general objective is to obtain images that improve spatial resolution while preserving high spectral content. In this work, two deep learning data fusion techniques are characterized in terms of classification accuracy. These methods fuse a high-spatial-resolution multispectral image with a lower-spatial-resolution hyperspectral image to generate a high spatial-spectral hyperspectral image. The first model is based on a multi-scale long short-term memory (LSTM) network. The LSTM approach performs the fusion in multiple steps that transition from low to high spatial resolution, using an intermediate step capable of reducing spatial information loss while preserving spectral content. The second fusion model is based on a convolutional neural network (CNN) data fusion approach. We present fused images using four multi-source datasets with different spatial and spectral resolutions. Both models provide fused images with spatial resolution increased from 8 m to 1 m. The fused images obtained with the two models are evaluated in terms of classification accuracy with several classifiers: Minimum Distance, Support Vector Machines, Class-Dependent Sparse Representation, and CNN classification. The classification results show better performance, in both overall and average accuracy, for the images generated with the multi-scale LSTM fusion than for the CNN fusion.
-
Underwater motion recognition using acoustic wireless networks has promising potential for diver activity monitoring and aquatic animal recognition, without the burden of the expensive underwater cameras used by image-based underwater classification techniques. However, accurately extracting features that are independent of complicated underwater environments, such as inhomogeneous deep seawater, is a serious challenge for underwater motion recognition. Velocities of the target body (VTB) during motion are excellent environment-independent features for WiFi-based recognition techniques in indoor environments; underwater, however, VTB features are hard to extract accurately. The inaccurate VTB estimation is caused by the fact that the signal propagates along a curve, rather than the straight line it follows in air. In this paper, we propose an underwater motion recognition mechanism for inhomogeneous deep seawater using acoustic wireless networks. To accurately extract velocities of the target body, we first derive Doppler Frequency Shift (DFS) coefficients that can be utilized for VTB estimation when signals propagate deviously. Secondly, we propose a dynamic self-refining (DSR) optimization algorithm for acoustic wireless networks consisting of multiple transmitter-receiver links to estimate the VTB. These VTB features can then be used to train a convolutional neural network (CNN). In simulation, the estimated VTB features are evaluated, and the recognition results on the test set validate that our proposed underwater motion recognition mechanism achieves high classification accuracy.
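As background for the DFS-based features, the first-order Doppler relation links the observed frequency shift to the radial velocity of the target body. This sketch assumes straight-line propagation and a nominal sound speed, i.e. exactly the simplification whose error the paper's curved-path DFS derivation and DSR refinement are designed to correct:

```python
SPEED_OF_SOUND = 1500.0  # m/s, nominal for seawater; varies with depth and temperature

def radial_velocity_from_dfs(dfs_hz, carrier_hz):
    """First-order Doppler relation dfs = f0 * v / c, solved for v.
    Valid only under the straight-line-propagation assumption."""
    return dfs_hz * SPEED_OF_SOUND / carrier_hz
```

For example, a 20 Hz shift on a 30 kHz acoustic carrier corresponds to a radial velocity of about 1 m/s under these assumptions.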