Title: Weatherman: Exposing weather-based privacy threats in big energy data
Smart energy meters record electricity consumption and generation at fine-grained intervals, and are among the most widely deployed sensors in the world. Energy data embeds detailed information about a building's energy efficiency, as well as the behavior of its occupants, which academia and industry are actively working to extract. In many cases, either inadvertently or by design, these third parties only have access to anonymous energy data without an associated location. The location of energy data is both highly useful and highly sensitive: it can provide important contextual information to improve big data analytics or interpret their results, but it can also enable third parties to link private behavior derived from energy data with a particular location. In this paper, we present Weatherman, which leverages a suite of analytics techniques to localize the source of anonymous energy data. Our key insight is that energy consumption data, as well as wind and solar generation data, largely correlates with weather, e.g., temperature, wind speed, and cloud cover, and that every location on Earth has a distinct weather signature that uniquely identifies it. Weatherman represents a serious privacy threat, but also a potentially useful tool for researchers working with anonymous smart meter data. We evaluate Weatherman's potential in both areas by localizing data from over one hundred smart meters using a weather database that includes data from over 35,000 locations. Our results show that Weatherman localizes coarse (one-hour resolution) energy consumption, wind, and solar data to within 16.68km, 9.84km, and 5.12km, respectively, on average. This is more accurate, while using much coarser-resolution data, than prior work that localizes only anonymous solar data using solar signatures.
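The key insight above, that energy data correlates with a location-specific weather signature, can be illustrated with a minimal sketch. This is not Weatherman's actual method, which the abstract does not detail. It assumes a hypothetical dictionary `weather_by_location` mapping (latitude, longitude) pairs to hourly temperature series aligned with the energy trace, and it simply picks the candidate whose weather has the strongest absolute Pearson correlation with the energy data. The helper names `pearson`, `localize`, and `haversine_km` are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def localize(energy, weather_by_location):
    """Return the candidate location whose weather series best matches the energy trace.

    Assumption: absolute correlation is used because consumption may rise with
    temperature (cooling) or fall with it (heating).
    """
    return max(
        weather_by_location,
        key=lambda loc: abs(pearson(energy, weather_by_location[loc])),
    )


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


if __name__ == "__main__":
    # Synthetic, made-up data purely for illustration.
    weather = {
        (42.39, -72.53): [10, 12, 15, 20, 18, 11],
        (34.05, -118.24): [20, 20, 21, 19, 20, 21],
    }
    # A heating-dominated load that falls as the first site's temperature rises.
    energy = [50 - 2 * t for t in weather[(42.39, -72.53)]]
    guess = localize(energy, weather)
    print("estimated location:", guess)
    print("error (km):", haversine_km(guess, (42.39, -72.53)))
```

A localization error in kilometers, like the averages reported above, could then be computed with `haversine_km` between the estimated and true locations.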
Award ID(s):
1505422 1405826 1645952 1253063 1534080
PAR ID:
10062543
Author(s) / Creator(s):
Date Published:
Journal Name:
IEEE International Conference on Big Data
Page Range / eLocation ID:
1079 to 1086
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We analyze 36 years of global, hourly weather data (1980–2015) to quantify the covariability of solar and wind resources as a function of time and location, over multi-decadal time scales and up to continental length scales. Assuming minimal excess generation, lossless transmission, and no other generation sources, the analysis indicates that wind-heavy or solar-heavy U.S.-scale power generation portfolios could in principle provide ∼80% of recent total annual U.S. electricity demand. However, to reliably meet 100% of total annual electricity demand, seasonal cycles and unpredictable weather events require several weeks’ worth of energy storage and/or the installation of much more capacity of solar and wind power than is routinely necessary to meet peak demand. To obtain ∼80% reliability, solar-heavy wind/solar generation mixes require sufficient energy storage to overcome the daily solar cycle, whereas wind-heavy wind/solar generation mixes require continental-scale transmission to exploit the geographic diversity of wind. Policy and planning aimed at providing a reliable electricity supply must therefore rigorously consider constraints associated with the geophysical variability of the solar and wind resource—even over continental scales. 
  2. Solar energy capacity is continuing to increase. The key challenge with integrating solar into buildings and the electric grid is its high power generation variability, which is a function of many factors, including a site's location, time, weather, and numerous physical attributes. There has been significant prior work on solar performance modeling and forecasting that infers a site's current and future solar generation based on these factors. Accurate solar performance models and forecasts are also a pre-requisite for conducting a wide range of building and grid energy-efficiency research. Unfortunately, much of the prior work is not accessible to researchers, either because it has not been released as open source, is time-consuming to re-implement, or requires access to proprietary data sources. To address the problem, we present Solar-TK, a data-driven toolkit for solar performance modeling and forecasting that is simple, extensible, and publicly accessible. Solar-TK's simple approach models and forecasts a site's solar output given only its location and a small amount of historical generation data. Solar-TK's extensible design includes a small collection of independent modules that connect together to implement basic modeling and forecasting, while also enabling users to implement new energy analytics. We plan to release Solar-TK as open source to enable research that requires realistic solar models and forecasts, and to serve as a baseline for comparing new solar modeling and forecasting techniques. We compare Solar-TK's simple approach with PVlib and show that it yields comparable accuracy. We present three case studies showing how Solar-TK can advance energy-efficiency research. 
  3. Traditional smart meters, which measure energy usage every 15 minutes or more and report it at least a few hours later, lack the granularity needed for real-time decision-making. To address this practical problem, we introduce a new method using generative adversarial networks (GAN) that enforces temporal consistency on its high-resolution outputs via hard inequality constraints using convex optimization. A unique feature of our GAN model is that it is trained solely on slow timescale aggregated historical energy data obtained from smart meters. The results demonstrate that the model can successfully create minute-by-minute temporally correlated profiles of power usage from 15-minute interval average power consumption information. This innovative approach, emphasizing inter-neuron constraints, offers a promising avenue for improved high-speed state estimation in distribution systems and enhances the applicability of data-driven solutions for monitoring and subsequently controlling such systems. 
  4. In this notes paper, we present an open problem to the BuildSys community: energy data super-resolution, the task of estimating a home's power consumption at a higher resolution given its low-resolution power consumption. Super-resolution is especially useful when smart meters collect data at a very low sampling rate owing to issues such as bandwidth, pricing, and old hardware, among others. The problem is motivated by the success of image super-resolution in the computer vision community. In this paper, we formally introduce the problem and present baseline methods and the algorithms we used to "solve" this problem. We evaluate the performance of the algorithms on a real-world dataset and discuss the results. We also discuss what makes this problem hard and why a trivial baseline is hard to beat. 
  5. Advanced metering infrastructure (AMI) is a critical part of a modern smart grid that carries the bidirectional flow of sensitive power information such as smart metering data and control commands. The real-time monitoring and control of the grid are ensured through AMI. While smart meter data helps to improve the overall performance of the grid in terms of efficient energy management, it has also made the AMI an attractive target for cyber attackers whose goal is to steal energy. This is performed through the physical or cyber tampering of the meters, as well as by manipulating the network infrastructure to alter collected data. Proper technology is required for the identification of energy fraud. In this paper, we propose a novel technique to detect fraudulent data from smart meters based on the energy consumption patterns of the consumers by utilizing deep learning techniques. We also propose a method for detecting the suspicious relay nodes in the AMI infrastructure that may manipulate the data while forwarding it to the aggregators. We present the performance of our proposed technique, which shows the correctness of the models in identifying the suspicious smart meter data. 