Performance Assessment of Data Sampling Strategies for Neural Network-Based Voltage Approximations

Asiamah, Richard; Gupta, Rahul K; Haider, Rabab; Molzahn, Daniel K

doi:10.1109/NAPS61145.2024.10741775

Machine learning models have been developed for a wide variety of power system applications. The accuracy of a machine learning model strongly depends on the selection of training data. In many settings where real data are limited or unavailable, machine learning models are trained using synthetic data sampled via different strategies. Using the task of approximating the voltage magnitudes associated with specified complex power injections as an illustrative application, this paper compares the performance of neural networks trained on four different sampling strategies: (i) correlated loads at fixed power factor, (ii) correlated loads at varying power factor, (iii) uncorrelated loads at fixed power factor, and (iv) uncorrelated loads at varying power factor. A new sampling strategy that combines these four strategies into one training dataset is also introduced and assessed. Results from transmission and distribution test cases of varying sizes show that these strategies for creating synthetic training data yield varied neural network accuracy, with accuracy differing across strategies by up to a factor of four. While none of the first four strategies outperforms the others across all test cases, neural networks trained with the combined dataset perform the best overall, maintaining high accuracy and low error spread.
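
To make the four strategies concrete, the following is a minimal sketch of how such load samples might be generated. It is not the authors' implementation, because the abstract does not specify the sampling distributions. The sketch assumes that "correlated" means one scaling factor shared by all loads, that "uncorrelated" means an independent factor per load, and that scaling factors and power factors are drawn uniformly. The function names (`sample_loads`, `combined_dataset`) and the ranges `scale_range` and `pf_range` are hypothetical. Reactive power follows from the standard relation Q = P tan(arccos(pf)) for a lagging load.

```python
import math
import random


def sample_loads(base_p, base_pf, correlated, vary_pf, rng,
                 scale_range=(0.5, 1.5), pf_range=(0.8, 1.0)):
    """Draw one synthetic sample: a list of (P, Q) pairs, one per load.

    All distributions and ranges here are illustrative assumptions,
    not values taken from the paper.
    """
    # Assumption: correlated loads share a single scaling factor.
    shared_scale = rng.uniform(*scale_range)
    sample = []
    for p0 in base_p:
        # Assumption: uncorrelated loads each get an independent factor.
        scale = shared_scale if correlated else rng.uniform(*scale_range)
        # Fixed power factor uses base_pf; varying draws one per load.
        pf = rng.uniform(*pf_range) if vary_pf else base_pf
        p = scale * p0
        q = p * math.tan(math.acos(pf))  # lagging load: Q = P tan(arccos(pf))
        sample.append((p, q))
    return sample


def combined_dataset(base_p, base_pf, n_per_strategy, seed=0):
    """Pool equal numbers of samples from all four strategies."""
    rng = random.Random(seed)
    data = []
    for correlated in (True, False):
        for vary_pf in (False, True):
            for _ in range(n_per_strategy):
                data.append(
                    sample_loads(base_p, base_pf, correlated, vary_pf, rng)
                )
    return data
```

In a full pipeline, each sample would be passed to a power flow solver to obtain the voltage magnitudes used as training targets; that step is omitted here. Pooling equal shares from each strategy is one simple way to build a combined dataset, and the paper may weight the strategies differently.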
