PRGFlow: Unified SWAP-aware deep global optical flow for aerial robot navigation
Global optical flow estimation is the cornerstone of odometry estimation, which in turn enables aerial robot navigation. Such a method, however, has to be low-latency and highly robust while also respecting the size, weight, area and power (SWAP) constraints of the robot. Cameras coupled with inertial measurement units (IMUs) have proven to be the best sensor combination for obtaining low-latency odometry on resource-constrained aerial robots. Recently, deep learning approaches for visual-inertial fusion have gained momentum due to their high accuracy and robustness. An equally noteworthy benefit of these techniques for robotics, however, is their inherent scalability (adaptation to different-sized aerial robots) and unification (the same method works on different-sized aerial robots). To this end, we present a deep learning approach called PRGFlow that obtains global optical flow and then loosely fuses it with an IMU for full 6-DoF (Degrees of Freedom) relative pose estimation, which is then integrated to obtain odometry. The network is evaluated on the MSCOCO dataset and the dead-reckoned odometry on multiple real-flight trajectories, without any fine-tuning or re-training. A detailed benchmark comparing different network architectures and loss functions to enable scalability is also presented.
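To make the pipeline concrete, the following is a minimal sketch (not the authors' released code) of how a global optical flow estimate could be loosely fused with IMU attitude into a 6-DoF relative pose and then dead-reckoned into odometry. The flow parameterization (a 2D image shift plus an isotropic zoom), the externally supplied scene depth, and all function names are illustrative assumptions rather than PRGFlow's actual interface.

```python
# Minimal loose-fusion sketch, assuming: the network outputs a global 2D
# image shift plus an isotropic zoom, the IMU supplies relative attitude,
# and scene depth comes from an external source (e.g. an altimeter).
import numpy as np

def relative_pose(R_imu, flow_xy, flow_scale, depth, K):
    """Compose a 6-DoF relative pose from IMU attitude and global flow.

    R_imu      3x3 relative rotation integrated from the gyroscope
    flow_xy    (dx, dy) global image shift in pixels (network output)
    flow_scale isotropic zoom of the image (network output)
    depth      assumed scene depth in metres
    K          3x3 camera intrinsics matrix
    """
    fx, fy = K[0, 0], K[1, 1]
    # Back-project image-space motion to a metric translation: lateral
    # motion from the pixel shift, axial motion from the change in scale.
    t = np.array([flow_xy[0] * depth / fx,
                  flow_xy[1] * depth / fy,
                  depth * (flow_scale - 1.0)])
    T = np.eye(4)
    T[:3, :3] = R_imu
    T[:3, 3] = t
    return T

def dead_reckon(relative_poses):
    """Chain 4x4 relative poses into a trajectory (dead-reckoned odometry)."""
    world = np.eye(4)
    trajectory = [world.copy()]
    for T in relative_poses:
        world = world @ T
        trajectory.append(world.copy())
    return trajectory
```

Because the odometry is dead-reckoned, small per-frame errors accumulate over a trajectory, which is why drift is evaluated on multiple real-flight trajectories rather than on single frame pairs.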
- Award ID(s):
- 2020624
- Publication Date:
- NSF-PAR ID:
- 10309877
- Journal Name:
- Electronics Letters
- Volume:
- 57
- Issue:
- 16
- ISSN:
- 1757-160X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Recovering rigid registration between successive camera poses lies at the heart of 3D reconstruction, SLAM and visual odometry. Registration relies on the ability to compute discriminative 2D features in successive camera images for determining feature correspondences, which is very challenging in feature-poor environments, i.e. low-texture and/or low-light environments. In this paper, we aim to address the challenge of recovering rigid registration between successive camera poses in feature-poor environments in a Visual Inertial Odometry (VIO) setting. In addition to inertial sensing, we instrument a small aerial robot with an RGBD camera and propose a framework that unifies the incorporation of 3D …
-
Human-Robot Collaboration (HRC), which enables a workspace where humans and robots can dynamically and safely collaborate for improved operational efficiency, has been identified as a key element in smart manufacturing. Human action recognition plays a key role in the realization of HRC, as it helps identify the current human action and provides the basis for future action prediction and robot planning. While Deep Learning (DL) has demonstrated great potential in advancing human action recognition, effectively leveraging the temporal information of human motions to improve the accuracy and robustness of action recognition has remained a challenge. Furthermore, it is often difficult to …
-
Unmanned aerial vehicles (UAVs) must keep track of their location in order to maintain flight plans. Currently, this task is almost entirely performed by a combination of Inertial Measurement Units (IMUs) and reference to GNSS (Global Navigation Satellite System). Navigation by GNSS, however, is not always reliable, due to various causes both natural (reflection and blockage from objects, technical faults, inclement weather) and artificial (GPS spoofing and denial). In such GPS-denied situations, it is desirable to have additional methods for aerial geolocalization. One such method is visual geolocalization, where aircraft use their ground-facing cameras to localize and navigate. The …
-
Rolling shutter distortion is highly undesirable for photography and computer vision algorithms (e.g., visual SLAM) because pixels can potentially be captured at different times and poses. In this paper, we propose a deep neural network to predict depth and row-wise pose from a single image for rolling shutter correction. Our contribution in this work is to incorporate inertial measurement unit (IMU) data into the pose refinement process, which, compared to the state of the art, greatly enhances the pose prediction. The improved accuracy and robustness make it possible for numerous vision algorithms to use imagery captured by rolling shutter cameras and produce highly …
-
Event-based cameras have shown great promise in a variety of situations where frame-based cameras suffer, such as high-speed motions and high-dynamic-range scenes. However, developing algorithms for event measurements requires a new class of hand-crafted algorithms. Deep learning has shown great success in providing model-free solutions to many problems in the vision community, but existing networks have been developed with frame-based images in mind, and there does not exist the wealth of labeled data for events as there does for images for supervised training. To these points, we present EV-FlowNet, a novel self-supervised deep …
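The EV-FlowNet snippet above hinges on self-supervision: rather than requiring labeled flow, the network can be trained by warping one frame toward another with the predicted flow and penalizing the photometric difference. Below is a toy NumPy illustration of that loss; the shapes, the bilinear sampler, and the function names are assumptions for illustration, not EV-FlowNet's implementation.

```python
# Toy self-supervised photometric loss: a predicted flow field is judged
# by how well it warps the next frame back onto the current one, so no
# ground-truth flow labels are required.
import numpy as np

def warp(image, flow):
    """Backward-warp `image` (H, W) by `flow` (H, W, 2) with bilinear sampling."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    # Source coordinates, clipped so the four interpolation taps stay in bounds.
    x_src = np.clip(xs + flow[..., 0], 0.0, w - 1.001)
    y_src = np.clip(ys + flow[..., 1], 0.0, h - 1.001)
    x0, y0 = np.floor(x_src).astype(int), np.floor(y_src).astype(int)
    ax, ay = x_src - x0, y_src - y0
    return ((1 - ax) * (1 - ay) * image[y0, x0]
            + ax * (1 - ay) * image[y0, x0 + 1]
            + (1 - ax) * ay * image[y0 + 1, x0]
            + ax * ay * image[y0 + 1, x0 + 1])

def photometric_loss(frame_t, frame_t1, flow):
    """Mean absolute photometric error after warping frame_t1 back by flow."""
    return np.abs(frame_t - warp(frame_t1, flow)).mean()
```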