skip to main content


Title: Deep RC: Enabling Remote Control through Deep Learning
Human remote-control (RC) pilots have the ability to perceive the position and orientation of an aircraft using only third-person-perspective visual sensing. While novice pilots often struggle when learning to control RC aircraft, they can sense the orientation of the aircraft with relative ease. In this paper, we hypothesize and demonstrate that deep learning methods can be used to mimic the human ability to perceive the orientation of an aircraft from monocular imagery. This work uses a neural network to directly sense the aircraft attitude. The network is combined with more conventional image processing methods for visual tracking of the aircraft. The aircraft track and attitude measurements from the convolutional neural network (CNN) are combined in a particle filter that provides a complete state estimate of the aircraft. The network topology, training, and testing results are presented as well as filter development and results. The proposed method was tested in simulation and hardware flight demonstrations.  more » « less
Award ID(s):
1650547
NSF-PAR ID:
10136667
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
International Conference on Unmanned Aircraft Systems
Page Range / eLocation ID:
1161 to 1167
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper demonstrates a feasible method for using a deep neural network as a sensor to estimate the attitude of a flying vehicle using only flight video. A dataset of still images and associated gravity vectors was collected and used to perform supervised learning. The network builds on a previously trained network and was trained to be able to approximate the attitude of the camera with an average error of about 8 degrees. Flight test video was recorded and processed with a relatively simple visual odometry method. The aircraft attitude is then estimated with the visual odometry as the state propagation and network providing the attitude measurement in an extended Kalman filter. Results show that the proposed method of having the neural network provide a gravity vector attitude measurement from the flight imagery reduces the standard deviation of the attitude error by approximately 12 times compared to a baseline approach. 
    more » « less
  2. Spatial ability is the ability to generate, store, retrieve, and transform visual information to mentally represent a space and make sense of it. This ability is a critical facet of human cognition that affects knowledge acquisition, productivity, and workplace safety. Although having improved spatial ability is essential for safely navigating and perceiving a space on earth, it is more critical in altered environments of other planets and deep space, which may pose extreme and unfamiliar visuospatial conditions. Such conditions may range from microgravity settings with the misalignment of body and visual axes to a lack of landmark objects that offer spatial cues to perceive size, distance, and speed. These altered visuospatial conditions may pose challenges to human spatial cognitive processing, which assists humans in locating objects in space, perceiving them visually, and comprehending spatial relationships between the objects and surroundings. The main goal of this paper is to examine if eye-tracking data of gaze pattern can indicate whether such altered conditions may demand more mental efforts and attention. The key dimensions of spatial ability (i.e., spatial visualization, spatial relations, and spatial orientation) are examined under the three simulated conditions: (1) aligned body and visual axes (control group); (2) statically misaligned body and visual axes (experiment group I); and dynamically misaligned body and visual axes (experiment group II). The three conditions were simulated in Virtual Reality (VR) using Unity 3D game engine. Participants were recruited from Texas A&M University student population who wore HTC VIVE Head-Mounted Displays (HMDs) equipped with eye-tracking technology to work on three spatial tests to measure spatial visualization, orientation, and relations. The Purdue Spatial Visualization Test: Rotations (PSVT: R), the Mental Cutting Test (MCT), and the Perspective Taking Ability (PTA) test were used to evaluate the spatial visualization, spatial relations, and spatial orientation of 78 participants, respectively. For each test, gaze data was collected through Tobii eye-tracker integrated in the HTC Vive HMDs. Quick eye movements, known as saccades, were identified by analyzing raw eye-tracking data using the rate of change of gaze position over time as a measure of mental effort. The results showed that the mean number of saccades in MCT and PSVT: R tests was statistically larger in experiment group II than in the control group or experiment group I. However, PTA test data did not meet the required assumptions to compare the mean number of saccades in the three groups. The results suggest that spatial relations and visualization may require more mental effort under dynamically misaligned idiotropic and visual axes than aligned or statically misaligned idiotropic and visual axes. However, the data could not reveal whether spatial orientation requires more/less mental effort under aligned, statically misaligned, and dynamically misaligned idiotropic and visual axes. The results of this study are important to understand how altered visuospatial conditions impact spatial cognition and how simulation- or game-based training tools can be developed to train people in adapting to extreme or altered work environments and working more productively and safely.

     
    more » « less
  3. In this paper, we introduce a neural network (NN)-based symbol detection scheme for Wi-Fi systems and its associated hardware implementation in software radios. To be specific, reservoir computing (RC), a special type of recurrent neural network (RNN), is adopted to conduct the task of symbol detection for Wi-Fi receivers. Instead of introducing extra training overhead/set to facilitate the RC-based symbol detection, a new training framework is introduced to take advantage of the signal structure in existing Wi-Fi protocols (e.g., IEEE 802.11 standards), that is, the introduced RC-based symbol detector will utilize the inherent long/short training sequences and structured pilots sent by the Wi-Fi transmitter to conduct online learning of the transmit symbols. In other words, our introduced NN-based symbol detector does not require any additional training sets compared to existing Wi-Fi systems. The introduced RC-based Wi-Fi symbol detector is implemented on the software-defined radio (SDR) platform to further provide realistic and meaningful performance comparisons against the traditional Wi-Fi receiver. Over the air, experiment results show that the introduced RC based Wi-Fi symbol detector outperforms conventional Wi-Fi symbol detection methods in various environments indicating the significance and the relevance of our work. 
    more » « less
  4. In many applications, data is easy to acquire but expensive and time-consuming to label, prominent examples include medical imaging and NLP. This disparity has only grown in recent years as our ability to collect data improves. Under these constraints, it makes sense to select only the most informative instances from the unlabeled pool and request an oracle (e.g., a human expert) to provide labels for those samples. The goal of active learning is to infer the informativeness of unlabeled samples so as to minimize the number of requests to the oracle. Here, we formulate active learning as an open-set recognition problem. In this paradigm, only some of the inputs belong to known classes; the classifier must identify the rest as unknown . More specifically, we leverage variational neural networks (VNNs), which produce high-confidence (i.e., low-entropy) predictions only for inputs that closely resemble the training data. We use the inverse of this confidence measure to select the samples that the oracle should label. Intuitively, unlabeled samples that the VNN is uncertain about contain features that the network has not been exposed to; thus they are more informative for future training. We carried out an extensive evaluation of our novel, probabilistic formulation of active learning, achieving state-of-the-art results on MNIST, CIFAR-10, CIFAR-100, and FashionMNIST. Additionally, unlike current active learning methods, our algorithm can learn even in the presence of out-of-distribution outliers. As our experiments show, when the unlabeled pool consists of a mixture of samples from multiple datasets, our approach can automatically distinguish between samples from seen vs. unseen datasets. Overall, our results show that high-quality uncertainty measures are key for pool-based active learning. 
    more » « less
  5. ABSTRACT

    Machine learning models can greatly improve the search for strong gravitational lenses in imaging surveys by reducing the amount of human inspection required. In this work, we test the performance of supervised, semi-supervised, and unsupervised learning algorithms trained with the ResNetV2 neural network architecture on their ability to efficiently find strong gravitational lenses in the Deep Lens Survey (DLS). We use galaxy images from the survey, combined with simulated lensed sources, as labeled data in our training data sets. We find that models using semi-supervised learning along with data augmentations (transformations applied to an image during training, e.g. rotation) and Generative Adversarial Network (GAN) generated images yield the best performance. They offer 5 – 10 times better precision across all recall values compared to supervised algorithms. Applying the best performing models to the full 20 deg2 DLS survey, we find 3 Grade-A lens candidates within the top 17 image predictions from the model. This increases to 9 Grade-A and 13 Grade-B candidates when 1 per cent (∼2500 images) of the model predictions are visually inspected. This is ≳ 10 × the sky density of lens candidates compared to current shallower wide-area surveys (such as the Dark Energy Survey), indicating a trove of lenses awaiting discovery in upcoming deeper all-sky surveys. These results suggest that pipelines tasked with finding strong lens systems can be highly efficient, minimizing human effort. We additionally report spectroscopic confirmation of the lensing nature of two Grade-A candidates identified by our model, further validating our methods.

     
    more » « less