Morphological (e.g., shape, size, and height) and functional (e.g., working, living, and shopping) information about buildings is in high demand for urban planning and management, as well as for other applications such as city-scale building energy use modeling. Because socio-economic geospatial data are of limited availability, building functions are more challenging to map than building morphology, especially over large areas. In this study, we propose an integrated framework for mapping building functions in 50 U.S. cities by integrating multi-source web-based geospatial data. First, a web crawler was developed to extract Points of Interest (POIs) from Tripadvisor.com, and a map crawler was developed to extract POIs and land use parcels from Google Maps. Second, an unsupervised machine learning algorithm, the one-class support vector machine (One-Class SVM), was used to identify residential buildings based on landscape features derived from Microsoft building footprints. Third, the type ratio of POIs and the area ratio of land use parcels were used to identify six non-residential functions (i.e., hospital, hotel, school, shop, restaurant, and office). The accuracy assessment indicates that the proposed framework performed well, with an average overall accuracy of 94% and a kappa coefficient of 0.63. Given the worldwide coverage of Google Maps and Tripadvisor.com, the proposed framework is transferable to other cities around the world. The data products generated in this study are of great use for quantitative city-scale urban studies, such as building energy use modeling at the single-building level over large areas.
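As a rough illustration of the second step above, the sketch below fits scikit-learn's OneClassSVM to a few landscape features computed from building footprint polygons; the feature set, file path, and hyperparameters are assumptions for illustration rather than the configuration used in the study.

```python
# Minimal sketch: flag likely residential buildings with a one-class SVM.
# The input file, features, and hyperparameters are illustrative assumptions.
import geopandas as gpd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

footprints = gpd.read_file("building_footprints.geojson")  # hypothetical path

# Simple landscape features computed from each footprint polygon.
area = footprints.geometry.area
perimeter = footprints.geometry.length
compactness = 4 * np.pi * area / perimeter**2  # 1.0 = perfect circle
X = StandardScaler().fit_transform(np.column_stack([area, perimeter, compactness]))

# nu bounds the fraction of training points treated as outliers.
model = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale")
model.fit(X)

# +1 = consistent with the learned residential profile, -1 = likely non-residential.
footprints["residential_flag"] = model.predict(X)
```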
Predicting building types using OpenStreetMap
Abstract: Having accurate building information is paramount for a plethora of applications, including humanitarian efforts, city planning, scientific studies, and navigation systems. While volunteered geographic information from sources such as OpenStreetMap (OSM) has good building geometry coverage, descriptive attributes such as the type of a building are sparse. To fill this gap, this study proposes a supervised learning-based approach to provide meaningful, semantic information for OSM data without manual intervention. We present a basic demonstration of our approach that classifies buildings into either residential or non-residential types for three study areas: Fairfax County in Virginia (VA), Mecklenburg County in North Carolina (NC), and the City of Boulder in Colorado (CO). The model leverages (i) available OSM tags capturing non-spatial attributes and (ii) geometric and topological properties of the building footprints, including the types of adjacent roads, proximity to parking lots, and building size. The model is trained and tested using ground truth data available for the three study areas. The results show that our approach achieves high accuracy in predicting building types for the selected areas. Additionally, a trained model is transferable with high accuracy to other regions where ground truth data is unavailable. The OSM and data science communities are invited to build upon our approach to further enrich volunteered geographic information in an automated manner.
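A minimal sketch of the kind of supervised residential/non-residential classifier described above, assuming a random forest over a pre-computed footprint feature matrix; the feature layout, file names, and model choice are illustrative assumptions, not necessarily the paper's pipeline.

```python
# Sketch: binary residential vs. non-residential classifier on simple
# footprint features. Feature layout and model choice are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Each row: [footprint_area_m2, n_vertices, dist_to_parking_m, road_class_code]
X = np.load("osm_building_features.npy")  # hypothetical file
y = np.load("osm_building_labels.npy")    # 1 = residential, 0 = non-residential

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))

# A model trained on one county could then be applied to another region's
# feature matrix to test transferability, as the abstract suggests.
```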
- Award ID(s): 2109647
- PAR ID: 10409333
- Date Published:
- Journal Name: Scientific Reports
- Volume: 12
- Issue: 1
- ISSN: 2045-2322
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Recent advances in big spatial data acquisition and deep learning allow novel algorithms that were not possible several years ago. We introduce a novel inverse procedural modeling algorithm for urban areas that addresses the problem of spatial data quality and uncertainty. Our method is fully automatic and produces a 3D approximation of an urban area given satellite imagery and global-scale data, including road network, population, and elevation data. By analyzing the values and the distribution of urban data, e.g., parcels, buildings, population, and elevation, we construct a procedural approximation of a city at large scale. Our approach has three main components: (1) procedural model generation to create parcel and building geometries, (2) parcel area estimation that trains neural networks to provide initial parcel sizes for a segmented satellite image of a city block, and (3) an optional optimization that can use partial knowledge of overall average building footprint area and building counts to improve results. We demonstrate and evaluate our approach on cities around the globe with widely different structures, automatically yielding procedural models with up to 91,000 buildings and spanning up to 150 km². We obtain both a spatial arrangement of parcels and buildings similar to ground truth and a distribution of building sizes similar to ground truth, hence yielding a statistically similar synthetic urban space. We produce procedural models at multiple scales, with less than 1% error in parcel and building areas in the best case compared to ground truth and 5.8% error on average for the tested cities. (A sketch of the optional optimization step appears after this list.)
- Ground truth depth information is necessary for many computer vision tasks. Collecting this information is challenging, especially for outdoor scenes. In this work, we propose utilizing single-view depth prediction neural networks pre-trained on synthetic scenes to generate relative depth, which we call pseudo-depth. This approach is a less expensive option as the pre-trained neural network obtains accurate depth information from synthetic scenes, which does not require any expensive sensor equipment and takes less time. We measure the usefulness of pseudo-depth from pre-trained neural networks by training indoor/outdoor binary classifiers with and without it. We also compare the difference in accuracy between using pseudo-depth and ground truth depth. We experimentally show that adding pseudo-depth to training achieves a 4.4% performance boost over the non-depth baseline model on DIODE, a large standard test dataset, retaining 63.8% of the performance boost achieved from training a classifier on RGB and ground truth depth. It also boosts performance by 1.3% on another dataset, SUN397, for which ground truth depth is not available. Our result shows that it is possible to take information obtained from a model pre-trained on synthetic scenes and successfully apply it beyond the synthetic domain to real-world data. (A pseudo-depth sketch appears after this list.)
- In the growing era of smart cities, data-driven decision-making is pivotal for urban planners and policymakers. Crowd-sourced data is a cost-effective means to collect this information, enabling more efficient urban management. However, ensuring data accuracy and establishing trustworthy “Ground Truth” in smart city sensor data presents unique challenges. Our study contributes by documenting the intricacies and obstacles associated with overcoming MAC randomization, sensor unpredictability, unreliable signal strength, and Wi-Fi probing inconsistencies in smart city data cleaning. We establish a framework for three different types of experiments: Counting, Proximity, and Sensor Range. Our novel approach incorporates the spatial layout of the city, an aspect often overlooked. We propose a database structure and metrics to enhance reproducibility and trust in the system (a sketch of one possible table layout appears after this list). By presenting our findings, we aim to facilitate a deeper understanding of the nuances involved in handling sensor data, ultimately paving the way for more accurate and meaningful data-driven decision-making in smart cities.
- Geographic information systems (GIS) provide accurate maps of terrain, roads, waterways, and building footprints and heights. Aircraft, particularly small unmanned aircraft systems (UAS), can exploit this and additional information such as building roof structure to improve navigation accuracy and safely perform contingency landings, particularly in urban regions. However, building roof structure is not fully provided in maps. This paper proposes a method to automatically label building roof shape from publicly available GIS data. Satellite imagery and airborne LiDAR data are processed and manually labeled to create a diverse annotated roof image dataset for small to large urban cities. Multiple convolutional neural network (CNN) architectures are trained and tested, with the best performing networks providing a condensed feature set for support vector machine and decision tree classifiers. Fusing satellite imagery and LiDAR data is shown to provide greater classification accuracy than using either data type alone. Adjusting model confidence thresholds leads to significant increases in model precision (a sketch of this feature-extraction and thresholding pipeline appears after this list). Networks trained on roof data from Witten, Germany and Manhattan (New York City) are evaluated on independent data from these cities and from Ann Arbor, Michigan.
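For the procedural-modeling item above, this is a minimal sketch of its optional optimization step (component 3), assuming a single parcel-size parameter and a placeholder block generator; neither reflects the paper's actual procedural engine.

```python
# Sketch of component (3): tune a subdivision parameter so a generated city
# block matches known building counts and average footprint area. The
# generator below is a stand-in, not the paper's procedural engine.
import numpy as np
from scipy.optimize import minimize_scalar

def generate_block(split_size: float) -> np.ndarray:
    """Placeholder generator: footprint areas (m^2) for one 10,000 m^2 block."""
    rng = np.random.default_rng(0)
    n = max(1, int(10_000 / split_size))               # block area / parcel size
    return rng.normal(loc=0.6 * split_size, scale=20.0, size=n).clip(min=30)

target_count, target_mean_area = 85, 140.0             # assumed known statistics

def objective(split_size: float) -> float:
    areas = generate_block(split_size)
    return ((len(areas) - target_count) / target_count) ** 2 + \
           ((areas.mean() - target_mean_area) / target_mean_area) ** 2

best = minimize_scalar(objective, bounds=(50.0, 1000.0), method="bounded")
print("chosen parcel size:", round(best.x, 1), "m^2")
```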
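For the pseudo-depth item, the sketch below uses the publicly available MiDaS model from torch.hub as a stand-in for a depth network pre-trained on synthetic scenes, and appends the predicted relative depth to the RGB image as a fourth channel; the model choice and the channel-stacking fusion are assumptions, not the paper's exact setup.

```python
# Sketch: generate "pseudo-depth" with a pre-trained single-view depth model
# (MiDaS, used here as a stand-in) and append it as a 4th image channel.
import cv2
import torch

midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")

img = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)  # hypothetical image
batch = transforms.small_transform(img)

with torch.no_grad():
    pred = midas(batch)                                  # (1, H', W') relative depth
    depth = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=img.shape[:2], mode="bicubic", align_corners=False
    ).squeeze()                                          # resize to the input image

rgb = torch.from_numpy(img).permute(2, 0, 1).float() / 255.0
depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
rgbd = torch.cat([rgb, depth.unsqueeze(0)], dim=0)       # 4-channel input tensor
# `rgbd` can now feed a small CNN whose first conv layer accepts 4 channels.
```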
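For the smart-city sensing item, the sketch below shows one plausible shape for the database structure it mentions: a table of Wi-Fi probe observations storing a salted hash of the (possibly randomized) MAC address, the signal strength, and the observing sensor's position in the city layout. The schema and field names are illustrative assumptions.

```python
# Sketch: a possible table layout for crowd-sensing observations, keeping
# only a salted hash of the (often randomized) MAC address.
import hashlib
import sqlite3

conn = sqlite3.connect("smart_city.db")
conn.execute("""
CREATE TABLE IF NOT EXISTS wifi_probe (
    observed_at   TEXT NOT NULL,      -- ISO-8601 timestamp
    sensor_id     TEXT NOT NULL,      -- which street-level sensor saw it
    sensor_lat    REAL NOT NULL,      -- sensor position in the city layout
    sensor_lon    REAL NOT NULL,
    mac_hash      TEXT NOT NULL,      -- salted hash, never the raw MAC
    is_randomized INTEGER NOT NULL,   -- locally administered bit of the MAC
    rssi_dbm      INTEGER             -- received signal strength
)""")

def mac_hash(mac: str, salt: str = "rotate-this-salt") -> str:
    return hashlib.sha256((salt + mac.lower()).encode()).hexdigest()

def is_randomized(mac: str) -> bool:
    # Randomized MACs set the locally administered bit of the first octet.
    return bool(int(mac.split(":")[0], 16) & 0x02)

row = ("2024-05-01T12:00:00Z", "sensor-17", 40.4406, -79.9959,
       mac_hash("DA:A1:19:00:11:22"), int(is_randomized("DA:A1:19:00:11:22")), -67)
conn.execute("INSERT INTO wifi_probe VALUES (?, ?, ?, ?, ?, ?, ?)", row)
conn.commit()
```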
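Finally, for the roof-shape item, this is a small-scale sketch of the described pipeline: a pre-trained CNN yields a condensed feature vector per roof image chip, an SVM classifies it, and a confidence threshold abstains on low-confidence predictions to raise precision. The backbone, threshold value, and synthetic stand-in data are assumptions.

```python
# Sketch: CNN features -> SVM roof-shape classifier with a confidence
# threshold. Backbone, threshold, and stand-in data are assumptions.
import numpy as np
import torch
from sklearn.svm import SVC
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()      # expose the 512-d feature vector
backbone.eval()

def features(batch: torch.Tensor) -> np.ndarray:
    """batch: (N, 3, 224, 224) normalized roof image chips."""
    with torch.no_grad():
        return backbone(batch).numpy()

X_train = features(torch.randn(64, 3, 224, 224))   # stand-in for real chips
y_train = np.random.randint(0, 4, size=64)         # e.g. flat/gable/hip/complex

svm = SVC(probability=True).fit(X_train, y_train)

probs = svm.predict_proba(features(torch.randn(8, 3, 224, 224)))
confident = probs.max(axis=1) >= 0.8               # trades recall for precision
labels = np.where(confident, probs.argmax(axis=1), -1)   # -1 = abstain
```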