
Title: Urban Building Type Mapping Using Geospatial Data: A Case Study of Beijing, China
Building type information is essential for urban planning and management, especially for high-resolution building modeling in which buildings are the basic spatial unit. However, in many parts of the world, this information is still missing. In this paper, we propose a framework to derive building type information from geospatial data, including point-of-interest (POI) data, building footprints, land use polygons, and roads, obtained from Gaode and Baidu Maps. First, we used natural language processing (NLP)-based approaches (i.e., text similarity measurement and topic modeling) to automatically reclassify POI categories into categories that can be used to directly infer building types. Second, based on the relationship between building footprints and POIs, we identified building types using two indicators: type ratio and area ratio. The proposed framework was tested using over 440,000 building footprints in Beijing, China. Our NLP-based approaches and building type identification methods achieved overall accuracies of 89.0% and 78.2% and kappa coefficients of 0.83 and 0.71, respectively. The proposed framework is transferable to other Chinese cities for deriving building type information from web mapping platforms. The data products generated from this study are of great use for quantitative urban studies at the building level.
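The abstract does not define the type ratio indicator, so the following is only a minimal sketch under stated assumptions: that POIs have already been spatially joined to building footprints, that the type ratio of a building is the share of its POIs falling in each reclassified category, and that a building takes its dominant category when that share reaches a cutoff. The function names, the `min_ratio` threshold, and the `"mixed"` fallback label are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def type_ratios(poi_records):
    """Return {building_id: {poi_type: share of that building's POIs}}.

    poi_records is an iterable of (building_id, poi_type) pairs.
    Assumption: the spatial join of POIs to footprints is already done.
    """
    counts = defaultdict(Counter)
    for building_id, poi_type in poi_records:
        counts[building_id][poi_type] += 1

    ratios = {}
    for building_id, counter in counts.items():
        total = sum(counter.values())
        ratios[building_id] = {t: n / total for t, n in counter.items()}
    return ratios


def assign_type(ratio_by_type, min_ratio=0.5):
    """Pick the dominant type if its share reaches min_ratio (hypothetical cutoff)."""
    best_type, best_ratio = max(ratio_by_type.items(), key=lambda kv: kv[1])
    return best_type if best_ratio >= min_ratio else "mixed"


# Toy input: building b1 is dominated by shops, b2 has no dominant type.
pois = [
    ("b1", "shop"), ("b1", "shop"), ("b1", "restaurant"),
    ("b2", "office"), ("b2", "hotel"), ("b2", "school"),
]
ratios = type_ratios(pois)
labels = {building_id: assign_type(r) for building_id, r in ratios.items()}
```

An area ratio could be sketched the same way by replacing POI counts with the overlap area between each footprint and the land use polygons of each category, which again is an assumption about how the indicator is built.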
Award ID(s):
1854502 1803920
Journal Name:
Remote Sensing
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Morphological (e.g. shape, size, and height) and functional (e.g. working, living, and shopping) information about buildings is highly needed for urban planning and management, as well as for other applications such as city-scale building energy use modeling. Because socio-economic geospatial data are of limited availability, mapping building functions is more challenging than mapping building morphology, especially over large areas. In this study, we proposed an integrated framework to map building functions in 50 U.S. cities by integrating multi-source web-based geospatial data. First, a web crawler was developed to extract Points of Interest (POIs) from, and a map crawler was developed to extract POIs and land use parcels from Google Maps. Second, an unsupervised machine learning algorithm named OneClassSVM was used to identify residential buildings based on landscape features derived from Microsoft building footprints. Third, the type ratio of POIs and the area ratio of land use parcels were used to identify six non-residential functions (i.e. hospital, hotel, school, shop, restaurant, and office). The accuracy assessment indicates that the proposed framework performed well, with an average overall accuracy of 94% and a kappa coefficient of 0.63. With the worldwide coverage of Google Maps and, the proposed framework is transferable to other cities around the world. The data products generated from this study are of great use for quantitative city-scale urban studies, such as building energy use modeling at the single-building level over large areas.
  2. Image/sketch completion is a core task that addresses the problem of completing the missing regions of an image/sketch with realistic and semantically consistent content. We address one type of completion: producing a tentative completion of an aerial view of the remnants of a building structure. The inference process may start with as little as 10% of the structure and thus is fundamentally pluralistic (i.e., multiple completions are possible). We present a novel pluralistic building contour completion framework. A feature suggestion component uses an entropy-based model to request information from the user for the next most informative location in the image. Then, an image completion component trained using self-supervision and procedurally generated content produces a partial or full completion. In our synthetic and real-world experiments for archaeological sites in Turkey, we complete building footprints in at most 4 iterations when only 10-15% of the ancient structure is initially visible. We also compare against various state-of-the-art methods and show superior quantitative and qualitative performance. While we show results for archaeology, we anticipate that our method can be used for restoring highly incomplete historical sketches and for modern-day urban reconstruction despite occlusions.
  3. Image segmentation is a fundamental task that has benefited from recent advances in machine learning. One type of segmentation, of particular interest to computer vision, is urban segmentation. Although recent solutions have leveraged deep neural networks, these approaches usually do not consider regularities appearing in facade structures (e.g., windows are often in groups of similar alignment, size, or spacing patterns) or additional urban structures such as building footprints and roofs. Moreover, both satellite and street-view images are often noisy and occluded, so recovering the complete structure segmentation from a partial observation is difficult. Our key observations are that facades and other urban structures exhibit regular structures, and that additional views are often available. In this paper, we present a novel framework (RFCNet) that consists of three modules to achieve multiple goals. Specifically, we propose Regularization to improve the regularities given an initial segmentation, Fusion to fuse multiple views of the segmentation, and Completion to infer the complete structure if necessary. Experimental results show that our method outperforms previous state-of-the-art methods quantitatively and qualitatively on multiple facade datasets. Furthermore, by applying our framework to other urban structures (e.g., building footprints and roofs), we demonstrate that our approach generalizes to various pattern types.
  4. Due to the mixed distribution of buildings and vegetation, wildland-urban interface (WUI) areas are characterized by complex fuel distributions and geographical environments. Wildfires occurring in the WUI often lead to severe hazards and significant damage to man-made structures. Therefore, WUI areas warrant more attention during the wildfire season. Because California's population and housing change continually, the update frequency and resolution of the WUI maps currently in use can no longer meet the needs and challenges of wildfire management and resource allocation for suppression and mitigation efforts. Recent developments in remote sensing technology and data analysis algorithms offer new opportunities for improving WUI mapping methods. In this study, WUI areas in California were directly mapped using building footprints extracted from remote sensing data by Microsoft along with the fuel vegetation cover from the LANDFIRE dataset. To accommodate the new type of datasets, we developed threshold criteria for mapping WUI based on statistical analysis, as opposed to the more ad hoc criteria used in previous mapping approaches. This method removes the reliance on census data in WUI mapping and does not require the calculation of housing density. Moreover, this approach designates as WUI the areas adjacent to each building that contain large and dense parcels of vegetation, which not only refines the scope and resolution of the WUI areas to individual buildings but also avoids zoning issues and uncertainties in housing density calculation. In addition, the new method is capable of updating the WUI map in real time according to operational needs. Therefore, this method is suitable for local governments to map local WUI areas and to formulate detailed wildfire emergency plans, evacuation routes, and management measures.
  5. Frasch, Martin G. (Ed.)
    With the wider availability of healthcare data such as Electronic Health Records (EHR), more and more data-driven approaches have been proposed to improve the quality of care delivery. Predictive modeling, which aims at building computational models for predicting clinical risk, is a popular research topic in healthcare analytics. However, concerns about the privacy of healthcare data may hinder the development of effective, generalizable predictive models, because such models often require rich, diverse data from multiple clinical institutions. Recently, federated learning (FL) has demonstrated promise in addressing this concern. However, data heterogeneity across local participating sites may affect the prediction performance of federated models. Because acute kidney injury (AKI) and sepsis are highly prevalent among patients admitted to intensive care units (ICU), the early AI-based prediction of these conditions is an important topic in critical care medicine. In this study, we take AKI and sepsis onset risk prediction in the ICU as two examples to explore the impact of data heterogeneity in the FL framework and to compare performance across frameworks. We built predictive models based on local, pooled, and FL frameworks using EHR data across multiple hospitals. The local framework used only data from each site itself. The pooled framework combined data from all sites. In the FL framework, each local site did not have access to other sites' data. A model was updated locally, and its parameters were shared with a central aggregator, which used them to update the federated model's parameters and then shared the updated parameters with each site. We found that models built within the FL framework outperformed their local counterparts. Then, we analyzed variable importance discrepancies across sites and frameworks. Finally, we explored potential sources of the heterogeneity within the EHR data. The different distributions of demographic profiles, medication use, and site information contributed to data heterogeneity.