This content will become publicly available on May 14, 2026

Title: Learning Optimal UAV Trajectory for Data Collection in 3D Reconstruction Model
The advancement of 3D modeling applications in various domains has been significantly propelled by innovations in 3D computer vision models. However, the efficacy of these models, particularly in large-scale 3D reconstruction, depends on the quality and coverage of the viewpoints. This paper addresses optimizing the trajectory of an unmanned aerial vehicle (UAV) to collect optimal Next-Best Views (NBVs) for 3D reconstruction models. Unlike traditional methods that rely on predefined criteria or continuous tracking of the 3D model's development, our approach leverages reinforcement learning to select the NBV based solely on single camera images and the relative positions of the UAV with respect to reference points on the target. The UAV is positioned with respect to four reference waypoints at the structure's corners, maintaining its orientation (field of view) towards the structure. Our approach removes the need for constant monitoring of 3D reconstruction accuracy during policy learning, ultimately boosting both the efficiency and autonomy of the data collection process. The implications of this research extend to applications in inspection, surveillance, and mapping, where optimal viewpoint selection is crucial for information gain and operational efficiency.
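The abstract describes a policy that chooses the next viewpoint from a single camera frame plus the UAV's offsets to four corner waypoints. The paper's network architecture and training algorithm are not given here, so the following is only a minimal sketch under assumed names and sizes (NBVPolicy, a discrete set of candidate viewpoints, a small CNN/MLP encoder); the actual method may differ.

```python
# Minimal sketch (not the authors' implementation): a policy that maps a single
# camera image plus the UAV's relative positions to four corner waypoints onto
# a discrete choice among candidate next viewpoints. Network sizes, the action
# set, and the training algorithm (e.g., PPO or DQN) are assumptions.
import torch
import torch.nn as nn

class NBVPolicy(nn.Module):
    def __init__(self, num_candidates: int = 8):
        super().__init__()
        # Lightweight CNN encoder for the single RGB camera frame.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # MLP for relative positions to the 4 reference waypoints (4 x 3 = 12 values).
        self.pos_mlp = nn.Sequential(nn.Linear(12, 32), nn.ReLU())
        # Head producing logits over candidate next viewpoints.
        self.head = nn.Linear(32 + 32, num_candidates)

    def forward(self, image, rel_positions):
        feat = torch.cat([self.encoder(image), self.pos_mlp(rel_positions)], dim=-1)
        return self.head(feat)  # logits over candidate viewpoints

# Example rollout step with dummy observations.
policy = NBVPolicy()
image = torch.rand(1, 3, 128, 128)    # single camera frame
rel_positions = torch.rand(1, 12)     # UAV offsets to the 4 corner waypoints
action = torch.distributions.Categorical(logits=policy(image, rel_positions)).sample()
print("selected candidate viewpoint:", action.item())
```

In a full pipeline, the sampled action would be mapped to one of the candidate waypoints around the structure, and the policy trained with a standard reinforcement learning algorithm on the reward described in the paper.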
Award ID(s):
2137753
PAR ID:
10585389
Author(s) / Creator(s):
Publisher / Repository:
Proceedings of International Conference on Unmanned Aircraft Systems (ICUAS)
Date Published:
Format(s):
Medium: X
Location:
Charlotte, North Carolina
Sponsoring Org:
National Science Foundation
More Like this
  1. With the ever-increasing amount of 3D data being captured and processed, multi-view image compression is essential to various applications, including virtual reality and 3D modeling. Despite the considerable success of learning-based compression models on single images, limited progress has been made in multi-view image compression. In this paper, we propose an efficient approach to multi-view image compression by leveraging the redundant information across different viewpoints without explicitly using warping operations or camera parameters. Our method builds upon the recent advancements in Multi-Reference Entropy Models (MEM), which were initially proposed to capture correlations within an image. We extend the MEM models to employ cross-view correlations in addition to within-image correlations. Specifically, we generate latent representations for each view independently and integrate a cross-view context module within the entropy model. The estimation of entropy parameters for each view follows an autoregressive technique, leveraging correlations with the previous views. We show that adding this view context module further enhances the compression performance when jointly trained with the autoencoder. Experimental results demonstrate superior performance compared to both traditional and learning-based multi-view compression methods. 
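The key mechanism described above is conditioning each view's entropy parameters on previously coded views without warping or camera parameters. The sketch below is a simplified, assumed version of such a cross-view context module (channel counts, the averaging aggregation, and the module names are all assumptions); the paper's MEM-based design is more elaborate.

```python
# Simplified sketch (assumed shapes/names, not the paper's exact module):
# entropy parameters (mean, scale) for view k's latent are predicted from the
# current view's hyperprior features fused with a cross-view context aggregated
# over the latents of previously coded views.
import torch
import torch.nn as nn

class CrossViewEntropyModel(nn.Module):
    def __init__(self, channels: int = 192):
        super().__init__()
        self.param_net = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 2 * channels, 3, padding=1),  # -> (mean, scale)
        )

    def forward(self, hyper_feat, previous_latents):
        # Cross-view context: simple average over already-coded views' latents
        # (the actual aggregation in the paper may differ).
        if previous_latents:
            context = torch.stack(previous_latents, dim=0).mean(dim=0)
        else:
            context = torch.zeros_like(hyper_feat)
        params = self.param_net(torch.cat([hyper_feat, context], dim=1))
        mean, scale = params.chunk(2, dim=1)
        return mean, nn.functional.softplus(scale)  # keep scale positive

# Autoregressive use across views: view k conditions on latents of views < k.
model = CrossViewEntropyModel(channels=192)
latents = [torch.rand(1, 192, 16, 16) for _ in range(3)]  # per-view latents
hyper = [torch.rand(1, 192, 16, 16) for _ in range(3)]    # per-view hyper features
for k in range(3):
    mean, scale = model(hyper[k], latents[:k])
    # mean/scale would parameterize the entropy coder for view k's latent.
```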
  2. This paper introduces a deep neural network based method, DeepOrganNet, to generate and visualize high-fidelity 3D/4D organ geometric models in real time from single-view medical images with complicated backgrounds. Traditional 3D/4D medical image reconstruction requires on the order of hundreds of projections, which incurs substantial computational time and delivers an undesirably high imaging/radiation dose to human subjects. Moreover, it typically requires additional laborious processing to subsequently segment or extract accurate 3D organ models. The computational time and imaging dose can be reduced by decreasing the number of projections, but the reconstructed image quality is degraded accordingly. To our knowledge, no existing method directly and explicitly reconstructs multiple 3D organ meshes from a single 2D medical grayscale image on the fly. Given single-view 2D medical images, e.g., 3D/4D-CT projections or X-ray images, our end-to-end DeepOrganNet framework can efficiently and effectively reconstruct 3D/4D lung models with a variety of geometric shapes by learning smooth deformation fields from multiple templates based on a trivariate tensor-product deformation technique, leveraging an informative latent descriptor extracted from the input 2D images. The proposed method is guaranteed to generate high-quality, high-fidelity manifold meshes for 3D/4D lung models, which current deep learning based approaches to shape reconstruction from a single image cannot. The major contributions of this work are to accurately reconstruct 3D organ shapes from a single 2D projection, significantly reduce the procedure time to allow on-the-fly visualization, and dramatically reduce the imaging dose for human subjects. Experimental results are evaluated and compared against a traditional reconstruction method and the state of the art in deep learning, using extensive 3D and 4D examples, including both synthetic phantom and real patient datasets. The proposed method needs only several milliseconds to generate organ meshes with 10K vertices, giving it great potential for use in real-time image-guided radiation therapy (IGRT).
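The trivariate tensor-product deformation mentioned above is, at its core, a free-form deformation of template vertices by a 3D control grid. The sketch below illustrates that underlying operation with Bernstein basis functions; the grid size, the random offsets, and the function names are assumptions for illustration only, and in DeepOrganNet the control displacements would be predicted by the network rather than sampled at random.

```python
# Illustrative sketch of a trivariate tensor-product (Bernstein FFD) deformation
# applied to template mesh vertices; in DeepOrganNet the control-point offsets
# would be predicted from the 2D image latent. Grid size and the random offsets
# here are assumptions for demonstration only.
import numpy as np
from math import comb

def bernstein(n, i, t):
    return comb(n, i) * (t ** i) * ((1 - t) ** (n - i))

def ffd(vertices, control_points):
    """Deform vertices in the unit cube with an (l+1)x(m+1)x(n+1) control grid."""
    l, m, n = (s - 1 for s in control_points.shape[:3])
    s, t, u = vertices[:, 0], vertices[:, 1], vertices[:, 2]
    out = np.zeros_like(vertices)
    for i in range(l + 1):
        for j in range(m + 1):
            for k in range(n + 1):
                w = bernstein(l, i, s) * bernstein(m, j, t) * bernstein(n, k, u)
                out += w[:, None] * control_points[i, j, k]
    return out

# Template vertices normalized to [0, 1]^3, and a 4x4x4 control grid whose rest
# positions lie on the cube; small offsets produce a smooth deformation.
verts = np.random.rand(1000, 3)
grid = np.stack(np.meshgrid(*([np.linspace(0, 1, 4)] * 3), indexing="ij"), axis=-1)
deformed = ffd(verts, grid + 0.05 * np.random.randn(*grid.shape))
print(deformed.shape)  # (1000, 3)
```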
  3. Unsupervised machine learning algorithms (clustering, genetic algorithms, and principal component analysis) automate Unmanned Aerial Vehicle (UAV) missions as well as the creation and refinement of iterative 3D photogrammetric models with a next best view (NBV) approach. The novel approach uses Structure-from-Motion (SfM) to converge to a specified orthomosaic resolution by identifying edges in the point cloud and planning cameras that "view" the holes identified by those edges, without requiring an initial model. This iterative UAV photogrammetric method successfully runs in various Microsoft AirSim environments. The simulated ground sampling distance (GSD) of the models reaches as low as 3.4 cm per pixel, and successive iterations generally improve resolution. Beyond the simulated environments, a field study of a retired municipal water tank illustrates the practical application and advantages of automated, iterative UAV infrastructure inspection, using 63% fewer photographs than a comparable manual flight while producing point clouds of comparable density with a GSD of less than 3 cm per pixel. Each iteration increases resolution according to a logarithmic regression, reduces holes in the models, and adds detail to model edges.
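The GSD figures quoted above follow from the standard relation between flight altitude, camera intrinsics, and per-pixel ground footprint; a small helper is sketched below with placeholder camera parameters, not the study's hardware.

```python
# Standard ground sampling distance (GSD) relation linking flight altitude and
# camera intrinsics to the per-pixel ground footprint. The camera parameters in
# the example are illustrative placeholders, not the study's hardware.
def gsd_cm_per_px(altitude_m, sensor_width_mm, focal_length_mm, image_width_px):
    # GSD = (altitude * sensor width) / (focal length * image width)
    return (altitude_m * 100.0 * sensor_width_mm) / (focal_length_mm * image_width_px)

# Example: a 13.2 mm-wide sensor, 8.8 mm focal length, and 5472 px image width
# flown at 40 m altitude gives roughly a 1.1 cm/px GSD.
print(round(gsd_cm_per_px(40, 13.2, 8.8, 5472), 2))
```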
  4. In this work, we address the lack of 3D understanding of generative neural networks by introducing a persistent 3D feature embedding for view synthesis. To this end, we propose DeepVoxels, a learned representation that encodes the view-dependent appearance of a 3D scene without having to explicitly model its geometry. At its core, our approach is based on a Cartesian 3D grid of persistent embedded features that learn to make use of the underlying 3D scene structure. Our approach combines insights from 3D geometric computer vision with recent advances in learning image-to-image mappings based on adversarial loss functions. DeepVoxels is supervised with a 2D re-rendering loss, without requiring a 3D reconstruction of the scene, and enforces perspective and multi-view geometry in a principled manner. We apply our persistent 3D scene representation to the problem of novel view synthesis, demonstrating high-quality results for a variety of challenging scenes.
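The central data structure here is a learnable Cartesian grid of features queried at 3D locations for each view. A minimal sketch of that query step is shown below using trilinear sampling; the grid resolution, channel count, and the surrounding lifting/projection and rendering networks are assumptions, not the paper's exact architecture.

```python
# Minimal sketch of the core idea behind a persistent voxel feature embedding:
# a learnable Cartesian grid of features is trilinearly sampled at 3D query
# points (e.g., points lifted along camera rays). Grid resolution, channel
# count, and the surrounding projection / rendering networks are assumptions.
import torch
import torch.nn.functional as F

C, D = 16, 32                                    # feature channels, grid resolution
voxel_features = torch.nn.Parameter(torch.randn(1, C, D, D, D) * 0.01)

def sample_features(points_xyz):
    """Trilinearly sample the grid at 3D points given in [-1, 1]^3 (x, y, z order)."""
    grid = points_xyz.view(1, -1, 1, 1, 3)       # (N, D_out, H_out, W_out, 3)
    feats = F.grid_sample(voxel_features, grid, mode="bilinear",
                          align_corners=True)     # trilinear for 5D inputs
    return feats.reshape(C, -1).t()               # (num_points, C)

# Example query: features for 4096 random points inside the volume; in a full
# pipeline these would feed a 2D rendering network trained with a re-rendering loss.
pts = torch.rand(4096, 3) * 2 - 1
print(sample_features(pts).shape)                 # torch.Size([4096, 16])
```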
  5. Safety and efficiency are primary goals of air traffic management. With the integration of unmanned aerial vehicles (UAVs) into the airspace, UAV traffic management (UTM) has attracted significant interest in the research community to maintain the capacity of three-dimensional (3D) airspace, provide information, and avoid collisions. We propose a new decision-making architecture for UAVs to avoid collision by formulating the problem into a multi-agent game in a 3D airspace. In the proposed game-theoretic approach, the Ego UAV plays a repeated two-player normal-form game, and the payoff functions are designed to capture both the safety and efficiency of feasible actions. An optimal decision in the form of Nash equilibrium (NE) is obtained. Simulation studies are conducted to demonstrate the performance of the proposed game-theoretic collision avoidance approach in several representative multi-UAV scenarios. 
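The decision step described above amounts to solving a two-player normal-form game whose payoffs trade off safety and efficiency. As a toy illustration, the sketch below enumerates pure-strategy Nash equilibria of a small bimatrix game by mutual best-response checks; the payoff values are placeholders, not the paper's payoff design, and the paper's equilibrium may well be in mixed strategies.

```python
# Toy sketch: enumerate pure-strategy Nash equilibria of a two-player
# normal-form game by mutual best-response checks. The payoff matrices below
# are placeholders, not the safety/efficiency payoffs designed in the paper.
import numpy as np

def pure_nash(A, B):
    """Return (row, col) action pairs that are mutual best responses."""
    equilibria = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            row_best = A[i, j] >= A[:, j].max()   # row player cannot improve
            col_best = B[i, j] >= B[i, :].max()   # column player cannot improve
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

# Example 2-action encounter (rows/cols: climb, descend for each UAV); matching
# maneuvers risk collision, so opposite maneuvers are rewarded.
A = np.array([[0, 2], [3, 0]])   # Ego UAV payoffs
B = np.array([[0, 3], [2, 0]])   # intruder payoffs
print(pure_nash(A, B))           # -> [(0, 1), (1, 0)]: the UAVs take opposite maneuvers
```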