Title: Application of Machine Learning for Predicting Building Energy Use at Different Temporal and Spatial Resolution under Climate Change in USA
Given the urgency of climate change, fast and reliable methods are essential for understanding urban building energy use, in a sector that accounts for 40% of total energy use in the USA. Although machine learning (ML) methods may offer promise and are less difficult to develop, discrepancies in methods, results, and recommendations have emerged that require attention. Existing research also shows inconsistencies in how climate change models are integrated into energy modeling. To address these challenges, four models, random forest (RF), extreme gradient boosting (XGBoost), single regression tree, and multiple linear regression (MLR), were developed using the Commercial Building Energy Consumption Survey dataset to predict energy use intensity (EUI) under heating and cooling degree days projected by the Intergovernmental Panel on Climate Change (IPCC) across the USA during the 21st century. The RF model performed best, reducing the mean absolute error by 4%, 11%, and 12% compared to XGBoost, single regression tree, and MLR, respectively. Moreover, using the RF model for climate change analysis showed that office buildings' EUI will increase by 8.9% to 63.1% relative to the 2012 baseline across different geographic regions between 2030 and 2080. One region is projected to experience an EUI reduction of almost 1.5%. Finally, good data enhance the predictive ability of ML; therefore, comprehensive regional building datasets are crucial to assess the counteraction of building energy use in the face of climate change at finer spatial scales.
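The model comparison in the abstract rests on mean absolute error (MAE). The following is a minimal sketch of how MAE and a percentage reduction between two models might be computed. It assumes the reported reductions are relative differences in MAE, which the abstract does not state explicitly. The function names and the EUI values are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must have the same length")
    return fmean([abs(a - p) for a, p in zip(actual, predicted)])


def relative_mae_reduction(mae_model, mae_baseline):
    """Percentage by which mae_model is lower than mae_baseline (assumed definition)."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline


# Hypothetical EUI observations and predictions, for illustration only.
observed = [80.0, 95.0, 60.0, 120.0]
model_a = [78.0, 99.0, 63.0, 115.0]
model_b = [75.0, 101.0, 55.0, 128.0]

mae_a = mean_absolute_error(observed, model_a)
mae_b = mean_absolute_error(observed, model_b)
reduction = relative_mae_reduction(mae_a, mae_b)
print(f"MAE A={mae_a:.2f}, MAE B={mae_b:.2f}, reduction={reduction:.1f}%")
```

A positive reduction means the first model has the lower error; the same helper would report a negative value if the baseline were more accurate.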
Award ID(s):
1934824
PAR ID:
10193363
Author(s) / Creator(s):
;
Date Published:
Journal Name:
Buildings
Volume:
10
Issue:
8
ISSN:
2075-5309
Page Range / eLocation ID:
139
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Basal area is a key measure of forest stocking and an important proxy of forest productivity in the face of climate change. Black walnut (Juglans nigra) is one of the most valuable timber species in North America. However, little is known about how the stocking of black walnut would change with different bioclimatic conditions under climate change. In this study, we projected the current and future basal area of black walnut. We trained different machine learning models using more than 1.4 million tree records from 10,162 Forest Inventory and Analysis (FIA) sample plots and 42 spatially explicit bioclimate and other environmental attributes. We selected random forests (RF) as the final model to estimate the basal area of black walnut under climate change because RF had a higher coefficient of determination (R²), lower root mean square error (RMSE), and lower mean absolute error (MAE) than the other two models (XGBoost and linear regression). The most important variables to predict basal area were the mean annual temperature and precipitation, potential evapotranspiration, topology, and human footprint. Under two emission scenarios (Representative Concentration Pathway 4.5 and 8.5), the RF model projected that black walnut stocking would increase in the northern part of the current range in the USA by 2080, with a potential shift of the species distribution range, although uncertainty still exists due to unpredictable events, including extreme abiotic (heat, drought) and biotic (pests, disease) occurrences. Our models can be adapted to other hardwood tree species to predict changes in basal area based on future climate scenarios.
  2. Surface tension is a critical property that influences polymer behavior at interfaces and affects applications ranging from coatings to biomedical devices. Traditional experimental methods for measuring polymer surface tension are time-consuming, costly, and sensitive to environmental conditions. Computational approaches such as molecular dynamics (MD) simulations are valuable but computationally intensive, especially for polymers with long chains. This study investigates the use of machine learning (ML) techniques to predict polymer surface tension using different levels of molecular representation, focusing on multilinear regression (MLR), random forest (RF), and graph neural networks (GNNs). A data set of 317 homopolymers collected from the PolyInfo database is used to train and evaluate these models. Descriptors are derived at various levels of complexity, ranging from manually calculated features to graph-based representations. The GNN approach captures the intrinsic connectivity of polymer structures, while the MLR and RF models rely on manually crafted descriptors. The performance of these models is compared with experimental data, with the GNN model demonstrating superior accuracy due to its ability to directly learn from molecular graphs. Our results show that GNNs can better capture complex nonlinear relationships in polymer structures than traditional descriptor-based methods, suggesting their significant potential for accelerating polymer design and development. The study also includes validation of model predictions against molecular dynamics simulations, highlighting the potential of GNNs to accurately model polymer interfacial properties.
  3. Buildings are subject to significant stresses due to climate change, and design strategies for climate resilient buildings are rife with uncertainties, which could make interpreting energy use distributions difficult and questionable. This study intends to enhance a robust and credible estimate of the uncertainties and interpretations of building energy performance under climate change. A four-step climate uncertainty propagation approach, which propagates downscaled future weather file uncertainties into building energy use, is examined. The four-step approach integrates dynamic building simulation, fitting a distribution to average annual weather variables, a regression model (between average annual weather variables and energy use), and random sampling. The impact of fitting different distributions to the weather variables (such as Normal, Beta, Weibull, etc.) and of different regression models (Multiple Linear and Principal Component Regression) in the uncertainty propagation method on the cooling and heating energy use distribution for a sample reference office building is evaluated. Results show that selecting a full principal component regression model following a best-fit distribution for each principal component of the weather variables can reduce the variation of the output energy distribution compared to simulated data. The results offer a way of understanding compound building energy use distributions and parsing the uncertain nature of climate projections.
  4. In the past decades, data-driven machine-learning (ML) models have emerged as promising tools for short-term streamflow forecasting. Among other qualities, the popularity of ML models for such applications is due to their relative ease of implementation, less strict distributional assumptions, and competitive computational and predictive performance. Despite the encouraging results, most applications of ML for streamflow forecasting have been limited to watersheds in which rainfall is the major source of runoff. In this study, we evaluate the potential of random forests (RFs), a popular ML method, to make streamflow forecasts at 1 d of lead time at 86 watersheds in the Pacific Northwest. These watersheds cover diverse climatic conditions and physiographic settings and exhibit varied contributions of rainfall and snowmelt to their streamflow. Watersheds are classified into three hydrologic regimes based on the timing of center-of-annual flow volume: rainfall-dominated, transient, and snowmelt-dominated. RF performance is benchmarked against naïve and multiple linear regression (MLR) models and evaluated using four criteria: coefficient of determination, root mean squared error, mean absolute error, and Kling–Gupta efficiency (KGE). Model evaluation scores suggest that the RF performs better in snowmelt-driven watersheds compared to rainfall-driven watersheds. The largest improvements in forecasts compared to benchmark models are found among rainfall-driven watersheds. RF performance deteriorates with increases in catchment slope and soil sandiness. We note disagreement between two popular measures of RF variable importance and recommend jointly considering these measures with the physical processes under study. These and other results presented provide new insights for effective application of RF-based streamflow forecasting.
  5. Aim: Rarity and geographic aspects of species distributions mediate their vulnerability to global change. We explore the relationships between species rarity and geography and their exposure to climate and land use change in a biodiversity hotspot. Location: California, USA. Taxa: One hundred and six terrestrial plants. Methods: We estimated four rarity traits: range size, niche breadth, number of habitat patches, and patch isolation; and three geographic traits: mean elevation, topographic heterogeneity, and distance to coast. We used species distribution models to measure species exposure (predicted change in continuous habitat suitability within currently occupied habitat) under climate and land use change scenarios. Using regression models, decision-tree models and variance partitioning, we assessed the relationships between species rarity, geography, and exposure to climate and land use change. Results: Rarity, geography and greenhouse gas emissions scenario explained >35% of variance in climate change exposure and >61% for land use change exposure. While rarity traits (range size and number of habitat patches) were most important for explaining species exposure to climate change, geographic traits (elevation and topographic heterogeneity) were more strongly associated with species' exposure to land use change. Main conclusions: Species with restricted range sizes and low topographic heterogeneity across their distributions were predicted to be the most exposed to climate change, while species at low elevations were the most exposed to habitat loss via land use change. However, even some broadly distributed species were projected to lose >70% of their currently suitable habitat due to climate and land use change if they are in geographically vulnerable areas, emphasizing the need to consider both species rarity traits and geography in vulnerability assessments.
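The streamflow entry in the list above names the Kling–Gupta efficiency (KGE) as one of its four evaluation criteria. As a minimal sketch, the commonly used 2009 formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), can be written as follows. It is an assumption that the cited study uses this exact variant; the function name is hypothetical.

```python
from math import sqrt
from statistics import fmean, pstdev


def kling_gupta_efficiency(observed, simulated):
    """KGE from correlation (r), variability ratio (alpha), and bias ratio (beta)."""
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mu_o, mu_s = fmean(observed), fmean(simulated)
    sd_o, sd_s = pstdev(observed), pstdev(simulated)
    if sd_o == 0 or sd_s == 0 or mu_o == 0:
        raise ValueError("KGE is undefined for constant series or zero observed mean")
    # Population covariance, matching the population standard deviations above.
    cov = fmean([(o - mu_o) * (s - mu_s) for o, s in zip(observed, simulated)])
    r = cov / (sd_o * sd_s)   # linear correlation
    alpha = sd_s / sd_o       # ratio of standard deviations
    beta = mu_s / mu_o        # ratio of means
    return 1.0 - sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

A perfect forecast gives a KGE of 1; a forecast that is perfectly correlated but doubles every flow value has alpha = beta = 2 and therefore scores 1 - sqrt(2).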