GeoGen I: Towards General Geospatial Point Data Generation from Text

Saeedan, Majid (ORCID:0000000292026894); Eldawy, Ahmed (ORCID:0000000265841455)

doi:10.1145/3764921.3770154

This content will become publicly available on November 2, 2026

Generating realistic geospatial vector data is important for evaluating algorithms, index structures, and systems under diverse conditions. Existing synthetic data generators typically rely on simple statistical or procedural models that fail to capture the complexity of real-world spatial patterns. This paper introduces GeoGen I, a generative framework that produces geospatial point distributions from natural language prompts. The system combines contrastive learning, region context, and a diffusion-based generator to create plausible datasets. In the experiments, we test variations of the model and provide both qualitative and quantitative evaluations. Our experiments show that it can generate spatial patterns aligned with different prompts. While the results are promising, many challenges still remain, including in dataset curation and quality, and the model’s ability to capture subtle geospatial constraints.
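
For context, the kind of simple statistical generator the abstract contrasts against can be sketched in a few lines. This is an illustrative baseline only, not the GeoGen I method or any code from the paper; the function names, the bounding-box convention (min_x, min_y, max_x, max_y), and the parameters are assumptions made for this sketch.

```python
import random


def uniform_points(n, bbox, rng):
    """Draw n points uniformly at random inside a bounding box.

    bbox is (min_x, min_y, max_x, max_y); this layout is an assumption
    of the sketch, not a convention taken from the paper.
    """
    min_x, min_y, max_x, max_y = bbox
    return [
        (rng.uniform(min_x, max_x), rng.uniform(min_y, max_y))
        for _ in range(n)
    ]


def gaussian_cluster_points(n, centers, sigma, rng):
    """Draw n points from isotropic Gaussians around the given centers.

    Each point picks one center uniformly at random, then adds
    independent Gaussian noise with standard deviation sigma per axis.
    """
    points = []
    for _ in range(n):
        cx, cy = rng.choice(centers)
        points.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma)))
    return points


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    background = uniform_points(1000, (-180.0, -90.0, 180.0, 90.0), rng)
    hotspots = gaussian_cluster_points(
        1000, [(-117.4, 33.9), (2.35, 48.86)], 0.5, rng
    )
    print(len(background), len(hotspots))
```

Generators of this kind are controlled only by a handful of numeric parameters, so they cannot follow features such as coastlines or road networks, which is the limitation the abstract attributes to existing tools.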

Award ID(s): 2215705

PAR ID: 10656402

Author(s) / Creator(s): Saeedan, Majid; Eldawy, Ahmed

Publisher / Repository: ACM

Date Published: 2025-11-02

Page Range / eLocation ID: 70 to 73

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Conference Paper:
https://doi.org/10.1145/3764921.3770154