Abstract. State-of-the-art remote sensing image management systems adopt scalable databases and employ sophisticated indexing techniques to perform window and containment queries. Many rely on space-filling curve (SFC) based index techniques designed for key-value databases and are predominantly employable for images that are iso-oriented. Critically, these indexes do not consider the high degree of overlap among images that exists in many data sets or the associated storage requirements. Specifically, employing an SFC-based grid cell index approach in concert with the ground footprint coverage of the images requires storing a unique image object identification (IOI) for each image in every grid cell where overlap occurs. Such an approach adversely affects both storage and query response times. In response, this paper presents an optimization technique for SFC-based grid cell space indexing. The optimization is specifically designed for window and containment queries where the region of interest overlaps with at least a 2 × 2 grid of cells. The technique is based on four cell removal steps and is thus called the “four step algorithm” (4SA). Each step employs a unique spatial configuration to check for continuous spatial extent. If present, the IOI of the target cell is omitted from further consideration. Analysis and experiments on real world and synthetic image data demonstrated that 4SA improved storage demands by 41.3% – 47.8%. Furthermore, in the performed querying experiments, only 42% of IOI elements needed to be processed, thus yielding a 58% productivity gain. The reduction of IOI elements in querying also impacted the CPU execution time (3.0% – 5.2%). The 4SA also demonstrated data scalability and concurrent user scalability when querying large regions, completing index searches 1.86% – 3.35% faster than when 4SA was not applied.
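The storage cost described in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a Z-order (Morton) curve as the SFC, axis-aligned rectangular footprints, and square grid cells, and the names `morton_key`, `cells_covered`, and `build_index` are hypothetical. It shows only how one IOI ends up stored in every cell a footprint touches; the 4SA removal steps are not reproduced.

```python
from collections import defaultdict

def morton_key(col, row, bits=16):
    """Interleave the bits of (col, row) into a Z-order key (assumed SFC)."""
    key = 0
    for i in range(bits):
        key |= ((col >> i) & 1) << (2 * i)       # even bit positions: column
        key |= ((row >> i) & 1) << (2 * i + 1)   # odd bit positions: row
    return key

def cells_covered(footprint, cell_size):
    """Return the (col, row) grid cells touched by an axis-aligned footprint."""
    min_x, min_y, max_x, max_y = footprint
    c0, c1 = int(min_x // cell_size), int(max_x // cell_size)
    r0, r1 = int(min_y // cell_size), int(max_y // cell_size)
    return [(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)]

def build_index(footprints, cell_size):
    """Map each cell's SFC key to the set of IOIs whose footprint touches it."""
    index = defaultdict(set)
    for ioi, footprint in footprints.items():
        for col, row in cells_covered(footprint, cell_size):
            index[morton_key(col, row)].add(ioi)
    return index

# Two hypothetical overlapping footprints: (min_x, min_y, max_x, max_y).
footprints = {
    "img-A": (0.0, 0.0, 2.5, 2.5),
    "img-B": (1.0, 1.0, 3.5, 3.5),
}
index = build_index(footprints, cell_size=1.0)
stored_iois = sum(len(iois) for iois in index.values())
print(len(index), "cells hold", stored_iois, "IOI entries for", len(footprints), "images")
```

In this toy case each image covers a 3 × 3 block of cells, so two images already produce 18 stored IOI entries. This per-cell duplication is the overhead that the abstract says 4SA reduces.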
DACMA: Designing space ordering optimizations to scalably manage aerial images
Aerial images are a special class of remote sensing images, as they are intentionally collected with a high degree of overlap. This high degree of overlap complicates existing index strategies such as R-tree and Space Filling Curve (SFC) based index techniques due to complications in space splitting, granularity of the grid cells and excessive duplication of image object identifiers (IOIs). However, SFC based space ordering can be modified to provide scalable management of overlapping aerial images. This involves overcoming similar IOIs in adjacent grid cells, which would naturally occur in SFC based grids with such data. IOI duplication can be minimized by merging adjacent grid cells through the proposed “Designing Adjacent Cell Merge Algorithm” (DACMA). This work focuses on establishing a proper adjacent cell merge metric and merge percentage value. Using a highly scalable, distributed HBase cluster for both a single aerial mapping project and multiple aerial mapping projects, experiments evaluated Jaccard Similarity (JS) and Percentage of Overlap (PO) merge metrics. JS had significant advantages: (i) generating smaller merged regions and (ii) obtaining improvements of over 21% and 36% in query response times compared to PO. As a result, JS is proposed as the merge metric for DACMA. For the merge percentage, two considerations were dominant: (i) substantial storage reductions with respect to both straightforward SFC-based cell space indexing and 4SA based indexing, and (ii) minimal impact on the query response time. The proposed merge percentage value, selected to optimize the storage (i.e. space) needs and response time (i.e. time), is presented and is herein named the “Space-Time Trade-off Optimization Percentage” value (or STOP value).
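The Jaccard Similarity merge metric named above can be sketched as follows. This is a hedged illustration rather than DACMA itself: it applies the standard set definition of Jaccard similarity to the IOI sets of two adjacent cells, and the `should_merge` helper, the threshold values, and the sample IOIs are hypothetical. The abstract does not give the formula for the Percentage of Overlap metric, so it is not shown.

```python
def jaccard_similarity(cell_a, cell_b):
    """Standard Jaccard similarity |A ∩ B| / |A ∪ B| of two IOI sets."""
    a, b = set(cell_a), set(cell_b)
    union = a | b
    if not union:
        return 0.0  # two empty cells: treated as dissimilar by convention here
    return len(a & b) / len(union)

def should_merge(cell_a, cell_b, threshold):
    """Hypothetical merge rule: merge when similarity reaches the threshold."""
    return jaccard_similarity(cell_a, cell_b) >= threshold

# IOI sets of two adjacent grid cells (made-up identifiers).
left = {"img-1", "img-2", "img-3"}
right = {"img-2", "img-3", "img-4"}

js = jaccard_similarity(left, right)  # 2 shared IOIs out of 4 distinct
merged = left | right                 # a merged region stores each IOI once
print(js, should_merge(left, right, threshold=0.5), len(merged))
```

In this example the two cells separately store six IOI entries, while the merged region stores four. The threshold plays the role of the merge percentage, whose chosen value the abstract calls the STOP value.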
- Award ID(s):
- 1826134
- PAR ID:
- 10576378
- Publisher / Repository:
- IEEE
- Date Published:
- ISBN:
- 978-1-6654-8045-1
- Page Range / eLocation ID:
- 4916 to 4925
- Subject(s) / Keyword(s):
- imagery aerial storage querying scalability database
- Format(s):
- Medium: X
- Location:
- Osaka, Japan
- Sponsoring Org:
- National Science Foundation
More Like this
-
State-of-the-art, scalable, indexing techniques in location-based image data retrieval are primarily focused on supporting window and range queries. However, support of these indexes is not well explored when there are multiple spatially similar images to retrieve for a given geographic location. Adoption of existing spatial indexes such as the kD-tree poses major scalability impediments. In response, this work proposes a novel scalable, key-value, database oriented, secondary-memory based, spatial index to retrieve the top k most spatially similar images to a given geographic location. The proposed index introduces a 4-dimensional Hilbert index (4DHI). This space filling curve is implemented atop HBase (a key-value database). Experiments performed on both synthetically generated and real world data demonstrate comparable accuracy with MD-HBase (a state of the art, scalable, multidimensional point data management system) and better performance. Specifically, 4DHI yielded 34% - 39% storage improvements compared to the disk consumption of the original index of MD-HBase. The compactness in 4DHI also yielded up to 3.4 and 4.7 fold gains when retrieving 6400 and 12800 neighbours, respectively, compared to the adoption of the original index of MD-HBase for the respective neighbour searches. An optimization technique termed “Bounding Box Displacement” (BBD) is introduced to improve the accuracy of the top k approximations in relation to the results of an in-memory kD-tree. Finally, a method of reducing row key length is also discussed for the proposed 4DHI to further improve the storage efficiency and scalability in managing large numbers of remotely sensed images.
-
Effective vector representation models, e.g., word2vec and node2vec, embed real-world objects such as images and documents in high dimensional vector space. Meanwhile, the objects are often associated with attributes such as timestamps and prices. Many scenarios need to jointly query the vector representations of the objects together with their attributes. These queries can be formalized as range-filtering approximate nearest neighbor search (ANNS) queries. Specifically, we are given a collection of data vectors, each associated with an attribute value whose domain has a total order. A range-filtering ANNS query consists of a query range and a query vector. It finds the approximate nearest neighbors of the query vector among all the data vectors whose attribute values fall in the query range. Existing approaches suffer from a rapidly degrading query performance when the query range width shifts. The query performance can be optimized by a solution that builds an ANNS index for every possible query range; however, the index time and index size become prohibitive -- the number of query ranges is quadratic in the number n of data vectors. To overcome these challenges, for query ranges that contain all attribute values smaller than a user-provided threshold, we design a structure called the segment graph whose index time and size are the same as a single ANNS index, yet can losslessly compress the n ANNS indexes, reducing the indexing cost by a factor of Ω(n). To handle general range queries, we propose a 2D segment graph with average-case index size O(n log n) to compress n segment graphs, breaking the quadratic barrier. Extensive experiments conducted on real-world datasets show that our proposed structures outperformed existing methods significantly; our index also exhibits superior scalability.
-
Image-based localization has been widely used for autonomous vehicles, robotics, augmented reality, etc., and this is carried out by matching a query image taken from a cell phone or vehicle dashcam to a large-scale set of geo-tagged reference images, such as satellite/aerial images or Google Street Views. However, the problem remains challenging due to the inconsistency between the query images and the large-scale reference datasets regarding various light and weather conditions. To tackle this issue, this work proposes a novel view synthesis framework equipped with deep generative models, which can merge the unique features from the outdated reference dataset with features from the images containing seasonal changes. Our design features a unique scheme to ensure that the synthesized images contain the important features from both reference and patch images, covering seasonal features and minimizing the gap for the image-based localization tasks. The performance evaluation shows that the proposed framework can synthesize the views in various weather and lighting conditions.
-
The unprecedented rise of social media platforms, combined with location-aware technologies, has led to continuously producing a significant amount of geo-social data that flows as a user-generated data stream. This data has been exploited in several important use cases in various application domains. This article supports geo-social personalized queries in streaming data environments. We define temporal geo-social queries that provide users with real-time personalized answers based on their social graph. The new queries allow incorporating keyword search to get personalized results that are relevant to certain topics. To efficiently support these queries, we propose an indexing framework that provides lightweight and effective real-time indexing to digest geo-social data in real time. The framework distinguishes highly dynamic data from relatively stable data and uses appropriate data structures and a storage tier for each. Based on this framework, we propose a novel geo-social index and adopt two baseline indexes to support the addressed queries. The query processor then employs different types of pruning to efficiently access the index content and provide a real-time query response. The extensive experimental evaluation based on real datasets has shown the superiority of our proposed techniques to index real-time data and provide low-latency queries compared to existing competitors.

