

Title: Deep Visual Gravity Vector Detection for Unmanned Aircraft Attitude Estimation
This paper demonstrates a feasible method for using a deep neural network as a sensor to estimate the attitude of a flying vehicle from flight video alone. A dataset of still images and associated gravity vectors was collected and used for supervised learning. The network builds on a previously trained network and was trained to approximate the attitude of the camera with an average error of about 8 degrees. Flight test video was recorded and processed with a relatively simple visual odometry method. The aircraft attitude is then estimated in an extended Kalman filter, with the visual odometry providing the state propagation and the network providing the attitude measurement. Results show that having the neural network provide a gravity-vector attitude measurement from the flight imagery reduces the standard deviation of the attitude error by approximately 12 times compared to a baseline approach.
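The fusion loop the abstract describes can be sketched as a toy scalar EKF: visual odometry supplies the propagation increment, and the angle implied by the network's gravity vector serves as a direct attitude measurement. This is a minimal illustration, not the paper's implementation; the function name, state dimension, and noise values are all assumptions.

```python
def ekf_attitude_step(x, P, d_theta, Q, z, R):
    """One predict/update cycle for a single attitude angle (radians).

    d_theta : incremental rotation from visual odometry (propagation)
    z       : attitude angle implied by the network's gravity vector
    Q, R    : process and measurement noise variances (illustrative)
    """
    # Predict: integrate the odometry increment; covariance grows by Q.
    x_pred = x + d_theta
    P_pred = P + Q
    # Update: the measurement observes the angle directly, so H = 1.
    K = P_pred / (P_pred + R)            # Kalman gain
    x_new = x_pred + K * (z - x_pred)
    P_new = (1.0 - K) * P_pred
    return x_new, P_new
```

Without the update step the odometry error accumulates without bound; the gravity-vector measurement bounds it, which is the mechanism behind the reported reduction in attitude-error standard deviation.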
Award ID(s):
1650547
NSF-PAR ID:
10053362
Journal Name:
IEEE/RSJ International Conference on Intelligent Robots and Systems
Sponsoring Org:
National Science Foundation
More Like this
  1. Human remote-control (RC) pilots can perceive the position and orientation of an aircraft using only third-person-perspective visual sensing. While novice pilots often struggle when learning to control RC aircraft, they can sense the orientation of the aircraft with relative ease. In this paper, we hypothesize and demonstrate that deep learning methods can mimic the human ability to perceive the orientation of an aircraft from monocular imagery. This work uses a neural network to directly sense the aircraft attitude. The network is combined with more conventional image-processing methods for visual tracking of the aircraft. The aircraft track and the attitude measurements from the convolutional neural network (CNN) are combined in a particle filter that provides a complete state estimate of the aircraft. The network topology, training, and testing results are presented, along with the filter development and results. The proposed method was tested in simulation and in hardware flight demonstrations.
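The measurement update of such a particle filter can be sketched for a one-dimensional attitude state: each particle's weight is scaled by the likelihood of the CNN's attitude output. This assumes a Gaussian likelihood and omits resampling and the visual-tracking update; it is an illustration, not the paper's filter.

```python
import math

def reweight(particles, weights, z_att, sigma_att):
    """Scale each particle's weight by the Gaussian likelihood of the
    CNN attitude measurement z_att, then renormalize the weights."""
    new_w = [w * math.exp(-0.5 * ((z_att - p) / sigma_att) ** 2)
             for p, w in zip(particles, weights)]
    total = sum(new_w)
    return [w / total for w in new_w]
```

Particles whose attitude hypothesis agrees with the network's measurement gain weight; repeated over frames, the particle cloud concentrates around the true state.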
  2. Unmanned aerial vehicles (UAVs) must keep track of their location in order to maintain flight plans. Currently, this task is almost entirely performed by a combination of Inertial Measurement Units (IMUs) and reference to GNSS (Global Navigation Satellite System). Navigation by GNSS, however, is not always reliable, due to causes both natural (reflection and blockage from objects, technical fault, inclement weather) and artificial (GPS spoofing and denial). In such GPS-denied situations, it is desirable to have additional methods for aerial geolocalization. One such method is visual geolocalization, where aircraft use their ground-facing cameras to localize and navigate. The state of the art in many ground-level image-processing tasks involves the use of Convolutional Neural Networks (CNNs). We present here a study of how effectively a modern CNN designed for visual classification can be applied to the problem of Absolute Visual Geolocalization (AVL, localization without a prior location estimate). An Xception-based architecture is trained from scratch over a >1000 km² section of Washington County, Arkansas to directly regress latitude and longitude from images taken on different orthorectified high-altitude survey flights. It achieves average localization error as low as 115 m on unseen image sets over the same region from different years and seasons, which localizes to 0.004% of the training area, or about 8% of the width of the 1.5 × 1.5 km input image. This demonstrates that CNNs are expressive enough to encode robust landscape information for geolocalization over large geographic areas. We also discuss methods of providing uncertainty for CNN regression outputs, and future areas of potential improvement for the use of deep neural networks in visual geolocalization.
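Regressing latitude/longitude typically means the network outputs normalized coordinates that are mapped back to the survey region's bounds, after which an equirectangular approximation gives the localization error in metres. A sketch under those assumptions; the region bounds and helper names here are illustrative, not from the paper.

```python
import math

EARTH_R = 6371000.0  # mean Earth radius, metres

def denormalize(pred, lat_min, lat_max, lon_min, lon_max):
    """Map a network output in [0, 1]^2 back to (lat, lon) in degrees."""
    u, v = pred
    return (lat_min + u * (lat_max - lat_min),
            lon_min + v * (lon_max - lon_min))

def error_metres(p, q):
    """Equirectangular ground-distance approximation between two
    (lat, lon) points in degrees; adequate at survey-region scale."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    x = (lon2 - lon1) * math.cos(0.5 * (lat1 + lat2))
    y = lat2 - lat1
    return EARTH_R * math.hypot(x, y)
```

Averaging `error_metres` between predicted and surveyed coordinates over a held-out flight is one way to obtain an accuracy figure of the kind reported (115 m).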
  3. In this work we address the adequacy of two machine learning methods for wind velocity estimation in the lowermost region of the atmosphere using on-board inertial drone data in an outdoor setting. We fed these data, along with accompanying wind-tower measurements, into a K-nearest-neighbor (KNN) algorithm and a long short-term memory (LSTM) neural network to predict future windspeeds, exploiting the stabilization response of two hovering drones in a wind field. Of the two approaches, we found that the LSTM proved the more capable supervised learning model during more capricious wind conditions, making competent windspeed predictions with an average root mean square error of 0.61 m·s⁻¹ across two drones when trained on at least 20 min of flight data. During calmer conditions, a linear regression model demonstrated acceptable performance, but under more variable wind regimes the LSTM performed considerably better than the linear model, and generally comparably to more sophisticated methods. Our approach departs from other multirotor-based windspeed estimation schemes by circumventing the use of complex and specific dynamic models, instead directly learning the relationship between drone attitude and fluctuating windspeeds. This offers utility in a range of otherwise prohibitive environments, such as mountainous terrain or off-shore sites.
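The supervised setup described above (drone attitude history in, future windspeed out) amounts to building sliding windows before the sequences are fed to the LSTM. A sketch of that preprocessing step; the window and horizon lengths are illustrative parameters, not values from the paper.

```python
def make_windows(attitude, windspeed, window, horizon):
    """Pair each `window`-sample attitude history with the windspeed
    `horizon` steps past the end of that window, forming (X, y) pairs
    for supervised training."""
    X, y = [], []
    for t in range(len(attitude) - window - horizon + 1):
        X.append(attitude[t:t + window])
        y.append(windspeed[t + window + horizon - 1])
    return X, y
```

The same pairs can feed either learner: the KNN treats each window as a feature vector, while the LSTM consumes it as an ordered sequence.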
  4. This work presents a multiplicative extended Kalman filter (MEKF) for estimating the relative state of a multirotor vehicle operating in a GPS-denied environment. The filter fuses data from an inertial measurement unit and altimeter with relative-pose updates from a keyframe-based visual odometry or laser scan-matching algorithm. Because the global position and heading states of the vehicle are unobservable in the absence of global measurements such as GPS, the filter in this article estimates the state with respect to a local frame that is colocated with the odometry keyframe. As a result, the odometry update provides nearly direct measurements of the relative vehicle pose, making those states observable. Recent publications have rigorously documented the theoretical advantages of such an observable parameterization, including improved consistency, accuracy, and system robustness, and have demonstrated the effectiveness of such an approach during prolonged multirotor flight tests. This article complements this prior work by providing a complete, self-contained, tutorial derivation of the relative MEKF, which has been thoroughly motivated but only briefly described to date. This article presents several improvements and extensions to the filter while clearly defining all quaternion conventions and properties used, including several new useful properties relating to error quaternions and their Euler-angle decomposition. Finally, this article derives the filter both for traditional dynamics defined with respect to an inertial frame, and for robocentric dynamics defined with respect to the vehicle’s body frame, and provides insights into the subtle differences that arise between the two formulations. 
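The error-quaternion machinery at the heart of an MEKF can be sketched as follows: the attitude error between truth and estimate is itself a quaternion, and for small errors the three error angles carried in the filter state are approximately twice its vector part. The Hamilton, scalar-first convention used here is an assumption for illustration; as the abstract notes, conventions vary and must be pinned down.

```python
def quat_mul(q, r):
    """Hamilton product, scalar-first convention (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def quat_conj(q):
    """Conjugate; equals the inverse for unit quaternions."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def small_angle_error(q_true, q_est):
    """delta = q_true ⊗ q_est⁻¹; for small errors the three error
    angles are approximately twice the vector part of delta."""
    dw, dx, dy, dz = quat_mul(q_true, quat_conj(q_est))
    return (2.0 * dx, 2.0 * dy, 2.0 * dz)
```

Parameterizing the filter state by these three error angles, rather than the four quaternion components, is what keeps the MEKF covariance minimal and well-conditioned.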
  5. Rolling shutter distortion is highly undesirable for photography and for computer vision algorithms (e.g., visual SLAM) because pixels can be captured at different times and poses. In this paper, we propose a deep neural network to predict depth and row-wise pose from a single image for rolling shutter correction. Our contribution in this work is to incorporate inertial measurement unit (IMU) data into the pose refinement process, which, compared to the state of the art, greatly enhances the pose prediction. The improved accuracy and robustness make it possible for numerous vision algorithms to use imagery captured by rolling shutter cameras and produce highly accurate results. We also extend a dataset to include real rolling shutter images, IMU data, depth maps, camera poses, and corresponding global shutter images for rolling shutter correction training. We demonstrate the efficacy of the proposed method by evaluating the performance of the Direct Sparse Odometry (DSO) algorithm on rolling shutter imagery corrected using the proposed approach. Results show marked improvements of the DSO algorithm over runs on uncorrected imagery, validating the proposed approach.
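Row-wise pose prediction rests on the fact that a rolling shutter exposes each image row at a slightly different instant, so each row sees the camera in a slightly different pose. A one-angle sketch of that correction model; the linear interpolation between first-row and last-row poses is a simplifying assumption for illustration, whereas the paper's network predicts per-row pose directly.

```python
def row_pose(theta_first, theta_last, row, num_rows):
    """Interpolate the camera angle (radians) for a given image row,
    assuming rows are exposed at uniform intervals top to bottom."""
    alpha = row / (num_rows - 1)
    return (1.0 - alpha) * theta_first + alpha * theta_last
```

Given a per-row pose and a depth estimate, every pixel can be reprojected into a single common pose, yielding the corrected, global-shutter-like image that downstream algorithms such as DSO consume.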