Urban and environmental researchers seek to obtain building features (e.g., building shapes, counts, and areas) at large scales. However, blurriness, occlusions, and noise in prevailing satellite images severely hinder the performance of image segmentation, super-resolution, or deep-learning-based translation networks. In this article, we combine globally available satellite images and spatial geometric feature datasets to create a generative modeling framework that significantly improves the accuracy of per-building feature estimation and generates visually plausible building footprints. Our approach compensates for the degradation present in satellite images with a novel deep network setup that combines segmentation, generative modeling, and adversarial learning for instance-level building features. We demonstrate its robustness through large-scale prototypical experiments covering heterogeneous scenarios from dense urban to sparse rural. Results show better quality than advanced segmentation networks for urban and environmental planning, and show promise for future continental-scale urban applications.
RFCNet: Enhancing urban segmentation using regularization, fusion, and completion
Image segmentation is a fundamental task that has benefited from recent advances in machine learning. One type of segmentation, of particular interest to computer vision, is urban segmentation. Although recent solutions have leveraged deep neural networks, these approaches usually do not consider the regularities that appear in facade structures (e.g., windows often occur in groups with similar alignment, size, or spacing patterns) or in additional urban structures such as building footprints and roofs. Moreover, both satellite and street-view images are often noisy and occluded, so recovering the complete structure segmentation from a partial observation is difficult. Our key observations are that facades and other urban structures exhibit regular structures, and that additional views are often available. In this paper, we present a novel framework (RFCNet) that consists of three modules to achieve multiple goals. Specifically, we propose Regularization, which improves the regularities given an initial segmentation; Fusion, which fuses multiple views of the segmentation; and Completion, which can infer the complete structure if necessary. Experimental results show that our method outperforms previous state-of-the-art methods quantitatively and qualitatively on multiple facade datasets. Furthermore, by applying our framework to other urban structures (e.g., building footprints and roofs), we demonstrate that our approach can be generalized to various pattern types.
- PAR ID: 10377193
- Date Published:
- Journal Name: Computer vision and image understanding
- Volume: 220
- Issue: 1077-3142
- ISSN: 1077-3142
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Automatic satellite-based reconstruction enables large and widespread creation of urban areas. However, satellite imagery is often noisy and incomplete, and is not suitable for reconstructing detailed building facades. We present a machine learning-based inverse procedural modeling method to automatically create synthetic facades from satellite imagery. Our key observation is that building facades exhibit regular, grid-like structures. Hence, we can overcome the low-resolution, noisy, and partial building data obtained from satellite imagery by synthesizing the underlying facade layout. Our method infers regular facade details from satellite-based image fragments of a building and applies them to occluded or under-sampled parts of the building, resulting in plausible, crisp facades. Using urban areas from six cities, we compare our approach to several state-of-the-art image completion/in-filling methods, and our approach consistently creates better facade images.
- Image/sketch completion is a core task that addresses the problem of completing the missing regions of an image/sketch with realistic and semantically consistent content. We address one type of completion: producing a tentative completion of an aerial view of the remnants of a building structure. The inference process may start with as little as 10% of the structure and thus is fundamentally pluralistic (i.e., multiple completions are possible). We present a novel pluralistic building contour completion framework. A feature suggestion component uses an entropy-based model to request information from the user for the next most informative location in the image. Then, an image completion component trained using self-supervision and procedurally generated content produces a partial or full completion. In our synthetic and real-world experiments for archaeological sites in Turkey, within at most 4 iterations, we complete building footprints having only 10-15% of the ancient structure initially visible. We also compare to various state-of-the-art methods and show superior quantitative and qualitative performance. While we show results for archaeology, we anticipate our method can be used for restoring highly incomplete historical sketches and for modern-day urban reconstruction despite occlusions.
- Information on building types is highly needed for urban planning and management, especially in high-resolution building modeling, in which buildings are the basic spatial unit. However, in many parts of the world, this information is still missing. In this paper, we propose a framework to derive building type information using geospatial data, including point-of-interest (POI) data, building footprints, land use polygons, and roads, from Gaode and Baidu Maps. First, we use natural language processing (NLP)-based approaches (i.e., text similarity measurement and topic modeling) to automatically reclassify POI categories into categories that can be used to directly infer building types. Second, based on the relationship between building footprints and POIs, we identify building types using two indicators: type ratio and area ratio. The proposed framework was tested using over 440,000 building footprints in Beijing, China. Our NLP-based approaches and building type identification methods show overall accuracies of 89.0% and 78.2%, and kappa coefficients of 0.83 and 0.71, respectively. The proposed framework is transferable to other Chinese cities for deriving building type information from web mapping platforms. The data products generated from this study are of great use for quantitative urban studies at the building level.
- Geographic information systems (GIS) provide accurate maps of terrain, roads, waterways, and building footprints and heights. Aircraft, particularly small unmanned aircraft systems (UAS), can exploit this and additional information, such as building roof structure, to improve navigation accuracy and safely perform contingency landings, particularly in urban regions. However, building roof structure is not fully provided in maps. This paper proposes a method to automatically label building roof shape from publicly available GIS data. Satellite imagery and airborne LiDAR data are processed and manually labeled to create a diverse annotated roof image dataset for small to large urban cities. Multiple convolutional neural network (CNN) architectures are trained and tested, with the best-performing networks providing a condensed feature set for support vector machine and decision tree classifiers. Satellite image and LiDAR data fusion is shown to provide greater classification accuracy than using either data type alone. Model confidence thresholds are adjusted, leading to significant increases in model precision. Networks trained on roof data from Witten, Germany and Manhattan (New York City) are evaluated on independent data from these cities and from Ann Arbor, Michigan.

