Title: AstroVision: Towards autonomous feature detection and description for missions to small bodies using deep learning
Missions to small celestial bodies rely heavily on optical feature tracking for characterization of and relative navigation around the target body. While deep learning has led to great advancements in feature detection and description, training and validating data-driven models for space applications is challenging due to the limited availability of large-scale, annotated datasets. This paper introduces AstroVision, a large-scale dataset comprising 115,970 densely annotated, real images of 16 different small bodies captured during past and ongoing missions. We leverage AstroVision to develop a set of standardized benchmarks and conduct an exhaustive evaluation of both handcrafted and data-driven feature detection and description methods. Next, we employ AstroVision for end-to-end training of a state-of-the-art deep feature detection and description network and demonstrate improved performance on multiple benchmarks. The full benchmarking pipeline and the dataset will be made publicly available to facilitate the advancement of computer vision algorithms for space applications.
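The benchmarks revolve around detecting keypoints in images of a target body and matching their descriptors across views. As a rough illustration of one such primitive (not the paper's actual pipeline), here is a minimal OpenCV sketch using SIFT as a handcrafted baseline; the file names and the simple matching-score metric are illustrative assumptions:

```python
# Minimal sketch of a detect-describe-match primitive on an image pair.
# SIFT stands in for the handcrafted baselines; file names are hypothetical.
import cv2

img1 = cv2.imread("smallbody_frame_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("smallbody_frame_b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kps1, desc1 = sift.detectAndCompute(img1, None)
kps2, desc2 = sift.detectAndCompute(img2, None)

# Two-nearest-neighbour matching with Lowe's ratio test.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(desc1, desc2, k=2)
        if m.distance < 0.8 * n.distance]

# A simple proxy metric: fraction of keypoints that find a match.
print(f"matching score: {len(good) / max(len(kps1), 1):.3f}")
```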
Award ID(s): 2101250
PAR ID: 10471876
Author(s) / Creator(s): ; ; ;
Publisher / Repository: Elsevier
Date Published:
Journal Name: Acta Astronautica
Volume: 210
Issue: C
ISSN: 0094-5765
Page Range / eLocation ID: 393–410
Subject(s) / Keyword(s): Keypoint detection; Feature description; Feature tracking; Deep learning; Computer vision; Spacecraft navigation; Small celestial bodies
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Commercial satellite sensors offer the luxury of mapping individual permafrost features and their change over time. Deep learning convolutional neural networks (CNNs) have demonstrated remarkable success in automated image analysis. The inferential strength of CNN models is driven primarily by the quality and volume of hand-labeled training samples, and producing hand-annotated samples is a daunting task. This is particularly true for regional-scale mapping applications, such as permafrost feature detection across the Arctic. Image augmentation is a strategic data-space solution that synthetically inflates the size and quality of the training set by transforming the color space or geometric shape or by injecting noise. In this study, we systematically investigate the effectiveness of a spectrum of augmentation methods applied to CNN algorithms that recognize ice-wedge polygons in commercial satellite imagery. Our findings suggest that several augmentation methods (such as hue, saturation, and salt-and-pepper noise) can increase model performance.
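As a hedged illustration of the augmentation styles named above (hue/saturation jitter and salt-and-pepper noise), and not the study's actual pipeline, a minimal NumPy/OpenCV sketch might look like this; the parameter values are placeholders:

```python
# Sketch of two augmentations mentioned above: hue/saturation jitter and
# salt-and-pepper noise. Parameters are illustrative, not the study's values.
import cv2
import numpy as np

def jitter_hue_saturation(img_bgr, max_hue=10, max_sat=25, rng=np.random):
    """Randomly shift hue and saturation in HSV space."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV).astype(np.int16)
    hsv[..., 0] = (hsv[..., 0] + rng.randint(-max_hue, max_hue + 1)) % 180
    hsv[..., 1] = np.clip(hsv[..., 1] + rng.randint(-max_sat, max_sat + 1), 0, 255)
    return cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)

def salt_and_pepper(img, amount=0.01, rng=np.random):
    """Set a random fraction of pixels to pure black or white."""
    out = img.copy()
    mask = rng.rand(*img.shape[:2])
    out[mask < amount / 2] = 0          # pepper
    out[mask > 1 - amount / 2] = 255    # salt
    return out
```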
  2. We propose Deep Estimators of Features (DEFs), a learning-based framework for predicting sharp geometric features in sampled 3D shapes. Unlike existing data-driven methods, which reduce this problem to feature classification, we propose to regress a scalar field representing the distance from point samples to the closest feature line on local patches. Our approach is the first that scales to massive point clouds by fusing distance-to-feature estimates obtained on individual patches. We extensively evaluate our approach against related state-of-the-art methods on newly proposed synthetic and real-world 3D CAD model benchmarks. Our approach not only outperforms these (with improvements in recall and false positive rates), but also generalizes to real-world scans after training on synthetic data and fine-tuning on a small dataset of scanned data. We demonstrate a downstream application in which we reconstruct an explicit representation of straight and curved sharp feature lines from range scan data. We make code, pre-trained models, and our training and evaluation datasets available at https://github.com/artonson/def.
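To make the regression target concrete: instead of labeling each point as feature or non-feature, every point in a local patch gets a scalar distance to the nearest sharp feature line. A minimal PyTorch sketch of such a per-point distance regressor is below; the tiny pointwise MLP is an illustrative stand-in, not the DEF architecture:

```python
# Sketch of per-point distance-to-feature regression on local patches.
# The pointwise MLP is an illustrative stand-in for the DEF network.
import torch
import torch.nn as nn

class PatchDistanceRegressor(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar distance per point
        )

    def forward(self, patch_xyz):               # (B, N, 3) point patches
        return self.mlp(patch_xyz).squeeze(-1)  # (B, N) distances

model = PatchDistanceRegressor()
patch = torch.randn(4, 1024, 3)                 # 4 patches of 1024 points
target = torch.rand(4, 1024)                    # ground-truth distances
loss = nn.functional.mse_loss(model(patch), target)
loss.backward()
```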
  3. While day-to-day questions come with a variety of answer types, the current question-answering (QA) literature has failed to adequately address the answer diversity of questions. To this end, we present GooAQ, a large-scale dataset with a variety of answer types, containing over 5 million questions and 3 million answers collected from Google. GooAQ questions are collected semi-automatically from the Google search engine using its autocomplete feature, which yields naturalistic questions of practical interest that are nonetheless short and expressed in simple language. GooAQ answers are mined from Google's responses to the collected questions, specifically from the answer boxes in the search results. This yields a rich space of answer types, containing both textual answers (short and long) and more structured ones such as collections. We benchmark T5 models on GooAQ and observe that (a) in line with recent work, language models' strong performance on GooAQ's short-answer questions benefits heavily from annotated data; however, (b) their quality in generating coherent and accurate responses to questions requiring long answers (such as 'how' and 'why' questions) relies less on annotated data and is mainly supported by pre-training. We release GooAQ to facilitate further research on improving QA with diverse response types.
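For a sense of what benchmarking a generative QA model involves, here is a minimal Hugging Face transformers sketch that answers a question with an off-the-shelf T5 checkpoint; the "question: ..." prompt format is an assumption, since the paper's exact setup is not given here:

```python
# Minimal sketch of answering a question with an off-the-shelf T5 checkpoint.
# The "question: ..." prompt format is an assumption, not necessarily the
# format used for the GooAQ benchmarks.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("question: why is the sky blue?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```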
  4. Additive manufacturing (AM) methods have become mainstream in many industry sectors, especially aeronautics and space structures, where production volume for components is low and designs are highly customized. Space missions are being launched with increasing frequency around the world, and some of these missions send landers and rovers to the Moon, Mars, and other planets. Such space structures require numerous parts that are unique in design or are produced in just one or a very small production run. Parts produced for high-stakes, very expensive missions require complete confidence in the quality of each part. Characterization of parts manufactured by AM is a significant challenge for many existing methods due to the geometric complexity, feature size, and overall size of the part. This paper discusses various challenges in applying current characterization methods to the AM sector. Machine learning (ML) methods are considered promising in the materials and manufacturing fields; however, generating a training dataset by producing a large number of parts is expensive and impractical. New methods are required to train ML algorithms on small datasets, especially for parts of unique geometry that are produced in limited runs, such as space structures. One standard small-data strategy, transfer learning, is sketched below.
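The paper does not prescribe a specific technique, so as a generic illustration of the small-data regime it motivates, here is a minimal PyTorch/torchvision transfer-learning sketch: a pretrained backbone is frozen so that only a small final layer needs the scarce labels. The class count and task framing are hypothetical:

```python
# Sketch of one standard small-data strategy: transfer learning.
# A ResNet pretrained on ImageNet is fine-tuned with its backbone frozen,
# so only the small final layer needs the scarce labeled data.
# This is a generic illustration, not a method from the paper.
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False            # freeze pretrained backbone

num_classes = 2                            # hypothetical: defect vs. no defect
model.fc = nn.Linear(model.fc.in_features, num_classes)  # trainable head
```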
  5. Scientific communities are increasingly adopting machine learning and deep learning models in their applications to accelerate scientific insights. High performance computing systems are pushing the frontiers of performance with a rich diversity of hardware resources and massive scale-out capabilities. There is a critical need to understand fair and effective benchmarking of machine learning applications that are representative of real-world scientific use cases. MLPerf™ is a community-driven standard to benchmark machine learning workloads, focusing on end-to-end performance metrics. In this paper, we introduce MLPerf HPC, a benchmark suite of large-scale scientific machine learning training applications driven by the MLCommons™ Association. We present results from the first submission round, including a diverse set of some of the world's largest HPC systems. We develop a systematic framework for their joint analysis and compare them in terms of data staging, algorithmic convergence, and compute performance. As a result, we gain a quantitative understanding of optimizations on different subsystems, such as staging and on-node loading of data, compute-unit utilization, and communication scheduling, enabling overall >10× (end-to-end) performance improvements through system scaling. Notably, our analysis shows a scale-dependent interplay between the dataset size, a system's memory hierarchy, and training convergence that underlines the importance of near-compute storage. To overcome the data-parallel scalability challenge at large batch sizes, we discuss specific learning techniques and hybrid data-and-model parallelism that are effective on large systems. We conclude by characterizing each benchmark with respect to low-level memory, I/O, and network behaviour to parameterize extended roofline performance models in future rounds.
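Since that abstract closes on roofline models, here is a minimal sketch of the basic (non-extended) roofline bound it alludes to, attainable FLOP/s = min(peak compute, memory bandwidth × arithmetic intensity); the peak numbers are placeholders, not measurements from the paper:

```python
# Minimal sketch of the basic roofline bound:
# attainable FLOP/s = min(peak compute, memory bandwidth * intensity).
# Peak numbers below are placeholders, not figures from the paper.
PEAK_FLOPS = 10e12        # 10 TFLOP/s, hypothetical accelerator
PEAK_BW = 900e9           # 900 GB/s memory bandwidth, hypothetical

def roofline(intensity_flops_per_byte):
    """Attainable performance for a kernel of given arithmetic intensity."""
    return min(PEAK_FLOPS, PEAK_BW * intensity_flops_per_byte)

for ai in (0.5, 2, 8, 32):
    print(f"AI={ai:>4} FLOP/B -> {roofline(ai) / 1e12:.2f} TFLOP/s")
```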