skip to main content


Search for: All records

Award ID contains: 1826134

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. null (Ed.)
  2. Abstract. Point density is an important property that dictates the usability of a point cloud data set. This paper introduces an efficient, scalable, parallel algorithm for computing the local point density index, a sophisticated point cloud density metric. Computing the local point density index is non-trivial, because this computation involves a neighbour search that is required for each, individual point in the potentially large, input point cloud. Most existing algorithms and software are incapable of computing point density at scale. Therefore, the algorithm introduced in this paper aims to address both the needed computational efficiency and scalability for considering this factor in large, modern point clouds such as those collected in national or regional scans. The proposed algorithm is composed of two stages. In stage 1, a point-level, parallel processing step is performed to partition an unstructured input point cloud into partially overlapping, buffered tiles. A buffer is provided around each tile so that the data partitioning does not introduce spatial discontinuity into the final results. In stage 2, the buffered tiles are distributed to different processors for computing the local point density index in parallel. That tile-level parallel processing step is performed using a conventional algorithm with an R-tree data structure. While straight-forward, the proposed algorithm is efficient and particularly suitable for processing large point clouds. Experiments conducted using a 1.4 billion point data set acquired over part of Dublin, Ireland demonstrated an efficiency factor of up to 14.8/16. More specifically, the computational time was reduced by 14.8 times when the number of processes (i.e. executors) increased by 16 times. Computing the local point density index for the 1.4 billion point data set took just over 5 minutes with 16 executors and 8 cores per executor. The reduction in computational time was nearly 70 times compared to the 6 hours required without parallelism. 
    more » « less
  3. null (Ed.)
    This paper introduces a novel LiDAR point cloud data encoding solution that is compact, flexible, and fully supports distributed data storage within the Hadoop distributed computing environment. The proposed data encoding solution is developed based on Sequence File and Google Protocol Buffers. Sequence File is a generic splittable binary file format built in the Hadoop framework for storage of arbitrary binary data. The key challenge in adopting the Sequence File format for LiDAR data is in the strategy for effectively encoding the LiDAR data as binary sequences in a way that the data can be represented compactly, while allowing necessary mutation. For that purpose, a data encoding solution, based on Google Protocol Buffers (a language-neutral, cross-platform, extensible data serialisation framework) was developed and evaluated. Since neither of the underlying technologies is sufficient to completely and efficiently represent all necessary point formats for distributed computing, an innovative fusion of them was required to provide a viable data storage solution. This paper presents the details of such a data encoding implementation and rigorously evaluates the efficiency of the proposed data encoding solution. Benchmarking was done against a straightforward, naive text encoding implementation using a high-density aerial LiDAR scan of a portion of Dublin, Ireland. The results demonstrated a 6-times reduction in data volume, a 4-times reduction in database ingestion time, and up to a 5 times reduction in querying time. 
    more » « less
  4. null (Ed.)
    Abstract. Each year, lives are needlessly lost to floods due to residents failing to heed evacuation advisories. Risk communication research suggests that flood warnings need to be more vivid, contextualized, and visualizable, in order to engage the message recipient. This paper makes the case for the development of a low-cost augmented reality tool that enables individuals to visualize, at close range and in three-dimension, their homes, schools, and places of work and worship subjected to flooding (modeled upon a series of federally expected flood hazard levels). This paper also introduces initial tool development in this area and the related data input stream. 
    more » « less
  5. null (Ed.)
    Abstract. The massive amounts of spatio-temporal information often present in LiDAR data sets make their storage, processing, and visualisation computationally demanding. There is an increasing need for systems and tools that support all the spatial and temporal components and the three-dimensional nature of these datasets for effortless retrieval and visualisation. In response to these needs, this paper presents a scalable, distributed database system that is designed explicitly for retrieving and viewing large LiDAR datasets on the web. The ultimate goal of the system is to provide rapid and convenient access to a large repository of LiDAR data hosted in a distributed computing platform. The system is composed of multiple, share-nothing nodes operating in parallel. Namely, each node is autonomous and has a dedicated set of processors and memory. The nodes communicate with each other via an interconnected network. The data management system presented in this paper is implemented based on Apache HBase, a distributed key-value datastore within the Hadoop eco-system. HBase is extended with new data encoding and indexing mechanisms to accommodate both the point cloud and the full waveform components of LiDAR data. The data can be consumed by any desktop or web application that communicates with the data repository using the HTTP protocol. The communication is enabled by a web servlet. In addition to the command line tool used for administration tasks, two web applications are presented to illustrate the types of user-facing applications that can be coupled with the data system. 
    more » « less