Title: House Price Prediction via Visual Cues and Estate Attributes
The price of a house depends on many factors, such as its size, location, amenities, surrounding establishments, and the season in which it is sold, to name a few. A seller must price the property competitively, or else it will not attract buyers. This problem has given rise to multiple companies, as well as past research, that try to improve the predictability of property prices using mathematical models and machine learning techniques. In this research, we investigate the use of machine learning to predict house prices from estate attributes and visual images. To this end, we collect a dataset of 2,000 houses across different cities in the United States. For each house, we annotate 14 estate attributes and five visual images: exterior, interior living room, kitchen, bedroom, and bathroom. Following the dataset collection, different features are extracted from the input data. A multi-kernel regression approach is then used to predict the house price from both visual cues and estate attributes. Extensive experiments demonstrate the superiority of the proposed method over the baselines.
Award ID(s):
2025234
NSF-PAR ID:
10428263
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
ISVC 2022
Page Range / eLocation ID:
91-103
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Climate change-induced sea level rise (SLR) will affect a range of coastal assets and prompt difficult decisions about coastal land use across the world. Several recent studies find that current and projected SLR is associated with relatively lower property values. We contribute to this growing body of research with a case study of O‘ahu, Hawai‘i, which is famed for its beaches as well as valuable coastal real estate. We leverage a dataset that unpacks multiple types of SLR exposure and coastal parcel attributes. We apply property transaction data for the island of O‘ahu through 2019 to investigate the effect of current and expected SLR exposure on residential property prices. We find that exposed properties have already experienced declines in transaction prices, at 9 to 14%, attributed to expectations of exposure to chronic inundation (as opposed to seasonal flooding). The price declines are mainly for multi-dwelling homes as opposed to single family homes. The market response of residential properties to SLR has important implications for coastal management strategies, in particular the viability and timing of programs for retreat. 
  2. The rapid urbanisation of China has drawn growing attention to its urban residential environments. In this article, we model the spatial heterogeneity of housing prices and explore the spatial discrepancy of landscape effects on property values in Shenzhen, a large Chinese city. In contrast to previous studies, this paper integrates official housing transaction records and housing attributes from open data with field surveys. The results of the hedonic price model (HPM), geographically weighted regression (GWR) without landscape metrics, and GWR with landscape metrics are then compared. The results show that GWR with landscape metrics outperforms the other two models. In summary, this research provides new insights into landscape metrics in real estate studies and can guide decision-makers in planning and designing cities, while also providing guidance for regulating and controlling urban property values based on local conditions.
  3. Work in computer vision and natural language processing involving images and text has been experiencing explosive growth over the past decade, with a particular boost coming from the neural network revolution. The present volume brings together five research articles from several different corners of the area: multilingual multimodal image description (Frank et al.), multimodal machine translation (Madhyastha et al., Frank et al.), image caption generation (Madhyastha et al., Tanti et al.), visual scene understanding (Silberer et al.), and multimodal learning of high-level attributes (Sorodoc et al.). In this article, we touch upon all of these topics as we review work involving images and text under the three main headings of image description (Section 2), visually grounded referring expression generation (REG) and comprehension (Section 3), and visual question answering (VQA) (Section 4).
  4. Introduction: Vaso-occlusive crises (VOCs) are a leading cause of morbidity and early mortality in individuals with sickle cell disease (SCD). These crises are triggered by sickle red blood cell (sRBC) aggregation in blood vessels and are influenced by factors such as enhanced sRBC and white blood cell (WBC) adhesion to inflamed endothelium. Advances in microfluidic biomarker assays (i.e., SCD Biochip systems) have led to clinical studies of blood cell adhesion onto endothelial proteins, including fibronectin, laminin, P-selectin, and ICAM-1, functionalized in microchannels. These microfluidic assays mimic the physiological aspects of human microvasculature and help characterize the biomechanical properties of adhered sRBCs under flow. However, analysis of the microfluidic biomarker assay data has so far relied on manual cell counting and exhaustive visual morphological characterization of cells by trained personnel. Integrating deep learning algorithms with microscopic imaging of adhesion protein functionalized microfluidic channels can accelerate and standardize accurate classification of blood cells in microfluidic biomarker assays. Here we present a deep learning approach as a general-purpose analytical tool covering a wide range of conditions: channels functionalized with different proteins (laminin or P-selectin), with varying degrees of adhesion by both sRBCs and WBCs, and in both normoxic and hypoxic environments. Methods: Our neural networks were trained on a repository of manually labeled SCD Biochip microfluidic biomarker assay whole channel images. Each channel contained adhered cells pertaining to clinical whole blood under constant shear stress of 0.1 Pa, mimicking physiological levels in post-capillary venules.
The machine learning (ML) framework consists of two phases: Phase I segments pixels belonging to blood cells adhered to the microfluidic channel surface, while Phase II associates pixel clusters with specific cell types (sRBCs or WBCs). Phase I is implemented through an ensemble of seven generative fully convolutional neural networks, and Phase II is an ensemble of five neural networks based on a Resnet50 backbone. Each pixel cluster is given a probability of belonging to one of three classes: adhered sRBC, adhered WBC, or non-adhered / other. Results and Discussion: We applied our trained ML framework to 107 novel whole channel images not used during training and compared the results against counts from human experts. As seen in Fig. 1A, there was excellent agreement in counts across all protein and cell types investigated: sRBCs adhered to laminin, sRBCs adhered to P-selectin, and WBCs adhered to P-selectin. Not only was the approach able to handle surfaces functionalized with different proteins, but it also performed well for high cell density images (up to 5000 cells per image) in both normoxic and hypoxic conditions (Fig. 1B). The average uncertainty for the ML counts, obtained from accuracy metrics on the test dataset, was 3%. This uncertainty is a significant improvement on the 20% average uncertainty of the human counts, estimated from the variance in repeated manual analyses of the images. Moreover, manual classification of each image may take up to 2 hours, versus about 6 minutes per image for the ML analysis. Thus, ML provides greater consistency in the classification at a fraction of the processing time. To assess which features the network used to distinguish adhered cells, we generated class activation maps (Fig. 1C-E). These heat maps indicate the regions of focus for the algorithm in making each classification decision. 
Intriguingly, the highlighted features were similar to those used by human experts: the dimple in partially sickled RBCs, the sharp endpoints for highly sickled RBCs, and the uniform curvature of the WBCs. Overall, the robust performance of the ML approach in our study sets the stage for generalizing it to other endothelial proteins and experimental conditions, a first step toward a universal microfluidic ML framework targeting blood disorders. Such a framework would not only be able to integrate advanced biophysical characterization into fast, point-of-care diagnostic devices, but also provide a standardized and reliable way of monitoring patients undergoing targeted therapies and curative interventions, including stem cell and gene-based therapies for SCD. Disclosures: Gurkan: Dx Now Inc.: Patents & Royalties; Xatek Inc.: Patents & Royalties; BioChip Labs: Patents & Royalties; Hemex Health, Inc.: Consultancy, Current Employment, Patents & Royalties, Research Funding.
  5. Images can give us insights into the contextual meanings of words, but current image-text grounding approaches require detailed annotations. Such granular annotation is rare, expensive, and unavailable in most domain-specific contexts. In contrast, unlabeled multi-image, multi-sentence documents are abundant. Can lexical grounding be learned from such documents, even though they have significant lexical and visual overlap? Working with a case study dataset of real estate listings, we demonstrate the challenge of distinguishing highly correlated grounded terms, such as “kitchen” and “bedroom”, and introduce metrics to assess this document similarity. We present a simple unsupervised clustering-based method that increases precision and recall beyond object detection and image tagging baselines when evaluated on labeled subsets of the dataset. The proposed method is particularly effective for local contextual meanings of a word, for example associating “granite” with countertops in the real estate dataset and with rocky landscapes in a Wikipedia dataset. 