NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

GeoGen I: Towards General Geospatial Point Data Generation from Text

https://doi.org/10.1145/3764921.3770154

Saeedan, Majid; Eldawy, Ahmed (November 2025, ACM)

Generating realistic geospatial vector data is important for evaluat-ing algorithms, index structures, and systems under diverse condi-tions. Existing synthetic data generators typically rely on simplestatistical or procedural models that fail to capture the complexityof real-world spatial patterns. This paper introduces GeoGen I, agenerative framework that produces geospatial point distributionsfrom natural language prompts. The system combines contrastivelearning, region context, and a diffusion-based generator to createplausible datasets. In the experiments, we test variations of themodel and provide both qualitative and quantitative evaluations.Our experiments show that it can generate spatial patterns alignedwith different prompts. While the results are promising, many chal-lenges still remain, including in dataset curation and quality, andthe model’s ability to capture subtle geospatial constraints.
more » « less
Free, publicly-accessible full text available November 2, 2026
AgriParcel: Geometry Farmlands Generation Framework for Comprehensive Agricultural Analysis

https://doi.org/10.1145/3764912.3770823

Shang, Zhuocheng; Distor, Adreyan; Eldawy, Ahmed (November 2025, ACM)

Free, publicly-accessible full text available November 3, 2026
LASEK: LLM-Assisted Style Exploration Kit for Geospatial Data

Bahadori, Tarlan; Sarvepalli, Sai Sreekar; Eldawy, Ahmed (July 2025, The VLDB Endowment)

Geospatial data visualization on a map is an essential tool for modern data exploration tools. However, these tools require users to manually configure the visualization style including color scheme and attribute selection, a process that is both complex and domain-specific. Large Language Models (LLMs) provide an opportunity to intelligently assist in styling based on the underlying data distribution and characteristics. This paper demonstrates LASEK, an LLM-assisted visualization framework that automates attribute selection and styling in large-scale spatio-temporal datasets. The system leverages LLMs to determine which attributes should be highlighted for visual distinction and even suggests how to integrate them in styling options improving interpretability and efficiency. We demonstrate our approach through interactive visualization scenarios, showing how LLM-driven attribute selection enhances clarity, reduces manual effort, and provides data-driven justifications for styling decisions.
more » « less
Free, publicly-accessible full text available July 2, 2026
DynoViz: Dynamic Visualization of Large Scale Satellite Data

https://doi.org/10.1145/3681763.3698475

Shang, Zhuocheng; Shivakumar, Suryaa Charan; Eldawy, Ahmed (October 2024, ACM Digital Library)

Full Text Available
Spatial Query Optimization With Learning

Zhang, Xin; Eldawy, Ahmed (September 2024, Proceedings of the VLDB Endowment)

Query optimization is a key component in database management systems (DBMS) and distributed data processing platforms. Re- cent research in the database community incorporated techniques from artificial intelligence to enhance query optimization. Various learning models have been extended and applied to the query optimization tasks, including query execution plan, query rewriting, and cost estimation. The tasks involved in query optimization differ based on the type of data being processed, such as relational data or spatial geometries. This tutorial reviews recent learning-based approaches for spatial query optimization tasks. We go over methods designed specifically for spatial data, as well as solutions proposed for high-dimensional data. Additionally, we present learning-based spatial indexing and spatial partitioning methods, which are also vital components in spatial data processing. We also identify several open research problems in these fields.
more » « less
Full Text Available
QPJVis Demo: Quality-boost Progressive Join Query Processing System

Zhang, Xin; Eldawy, Ahmed (September 2024, Proceedings of the VLDB Endowment)

Progressive query processing enables data scientists to efficiently analyze and explore large datasets. Data scientists can start further analyses earlier if the progressive result can represent the complete results well. Most progressive processing frameworks carefully control which parts of the input to process in order to improve the quality of progressive results. The input control strategies work well when the data are processed uniformly. However, the progressive results will be biased towards the join keys if the processed data are not uniform. A recently proposed input&output framework named QPJ corrects the bias by temporarily hiding some results. The framework dynamically estimates the distribution of the complete result and outputs progressive results with a similar distribution to the estimated complete result. This demo presents QPJVis, which is a progressive query processing system designed to inherently process the progressive queries using the QPJ frame- work. Additionally, we also implement an input control framework, Prism, in QPJVis so that users can compare the difference between the input&output framework and a purely input framework.
more » « less
Full Text Available
RDPro : Distributed Processing of Big Raster Data: [Scalable Data Science]

https://doi.org/10.14778/3712221.3712229

Shang, Zhuocheng; Singla, Samriddhi; Eldawy, Ahmed; Scudiero, Elia (November 2024, Proceedings of the VLDB Endowment)

Advancements in remote sensing technology allowed for collecting vast amounts of satellite and aerial imagery with up to 1 cm pixel resolutions, stored in raster format crucial for various research fields. However, processing this data poses challenges, including resolving data dependencies when location, resolution, and coordinate systems do not align and managing large datasets within memory constraints. This paper introduces RDPro, a novel Spark-based system that efficiently processes and analyzes large raster datasets. RDPro features a new data model tailored for data dependencies in a distributed, shared-nothing environment, complete with tools for loading and writing raster data. It also optimizes core raster operations within Spark, allowing users to integrate complex data science workflows. Comparative analysis shows RDPro outperforms existing systems by up to two orders of magnitude.
more » « less
Full Text Available
QPV: An Input Control Component For Progressive Visualization Analytics [Work-in-progress]

Zhang, Xin; Eldawy, Ahmed (September 2024, Proceedings of the VLDB Endowment)

Full Text Available
QPV: An Input Control Component For Progressive Visualization Analytics [Work-in-progress]

Zhang, Xin; Eldawy, Ahmed (August 2024, The VLDB Endowment)

Progressive visual analytics enable data scientists to efficiently explore large datasets and examine progressive results with low latency. Most progressive visualization frameworks use a progressive query processing module that controls the quality of the results and then feeds these results into a visualization module. The goal is to avoid poor-quality progressive results which could mislead data scientists. This method misses some optimization opportunities as it improves the quality of the intermediate result while ignoring how this result affects the final visualization. This work presents a work-in-progress quality-aware progressive visualization input control component, named QPV. The key idea of the proposed framework is to integrate the visualization module into the progressive query results so that the quality control takes into account the final visualization. With limited computational resources, QPV solves an optimization problem to allocate resources and alleviate the misleading effects in the progressive plots.
more » « less
Full Text Available
QPJVis Demo: Quality-Boost Progressive Join Query Processing System

https://doi.org/10.14778/3685800.3685871

Zhang, Xin; Eldawy, Ahmed (August 2024, Proceedings of the VLDB Endowment)

Progressive query processing enables data scientists to efficiently analyze and explore large datasets. Data scientists can start further analyses earlier if the progressive result can represent the complete results well. Most progressive processing frameworks carefully control which parts of the input to process in order to improve the quality of progressive results. The input control strategies work well when the data are processed uniformly. However, the progressive results will be biased towards the join keys if the processed data are not uniform. A recently proposed input&output framework named QPJ corrects the bias by temporarily hiding some results. The framework dynamically estimates the distribution of the complete result and outputs progressive results with a similar distribution to the estimated complete result. This demo presents QPJVis, which is a progressive query processing system designed to inherently process the progressive queries using the QPJ framework. Additionally, we also implement an input control framework, Prism, in QPJVis so that users can compare the difference between the input&output framework and a purely input framework.
more » « less
Full Text Available

« Prev Next »

Search for: All records