This paper develops a generalized Copula-polynomial chaos expansion (PCE) framework for probabilistic power flow in power systems that can handle both linear and nonlinear correlations among uncertain power injections, such as wind and PV generation. A data-driven Copula statistical model captures the correlations among the uncertain power injections. This allows the Rosenblatt transformation to be used to map the correlated variables to independent ones while preserving the dependence structure. This in turn paves the way for using the PCE for surrogate modeling and uncertainty quantification of power flow results, i.e., obtaining the probability distributions of the power flows. Simulations carried out on the IEEE 57-bus system show that the proposed framework yields much more accurate results than the alternatives under different linear and nonlinear power injection correlations.
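The Rosenblatt step described above can be illustrated with a minimal sketch. It assumes a bivariate Gaussian copula with known correlation `rho`, which is a simplification chosen for illustration and not necessarily the data-driven copula used in the paper; the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def rosenblatt_gaussian_copula(u1, u2, rho):
    """Map dependent uniforms (u1, u2) to independent uniforms (v1, v2).

    Assumption: the dependence of (u1, u2) is a Gaussian copula with
    correlation rho. The Rosenblatt transform is v1 = F(u1) = u1 and
    v2 = F(u2 | u1), the conditional cdf of u2 given u1.
    """
    z1 = STD_NORMAL.inv_cdf(u1)
    z2 = STD_NORMAL.inv_cdf(u2)
    v2 = STD_NORMAL.cdf((z2 - rho * z1) / math.sqrt(1.0 - rho ** 2))
    return u1, v2


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n dependent uniform pairs from a Gaussian copula (illustrative data)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1, z2 = a, rho * a + math.sqrt(1.0 - rho ** 2) * b
        pairs.append((STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)))
    return pairs


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


dependent = sample_gaussian_copula(5000, rho=0.8)
independent = [rosenblatt_gaussian_copula(u1, u2, 0.8) for u1, u2 in dependent]
corr_before = pearson([p[0] for p in dependent], [p[1] for p in dependent])
corr_after = pearson([p[0] for p in independent], [p[1] for p in independent])
```

In this sketch `corr_before` is large and `corr_after` is close to zero, which is the property that lets a PCE with independent inputs be built on the transformed variables.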
Triangular Flows for Generative Modeling: Statistical Consistency, Smoothness Classes, and Fast Rates
Triangular flows, also known as Knöthe-Rosenblatt measure couplings, comprise an important building block of normalizing flow models for generative modeling and density estimation, including popular autoregressive flows such as real-valued non-volume preserving transformation models (Real NVP). We present statistical guarantees and sample complexity bounds for triangular flow statistical models. In particular, we establish the statistical consistency and the finite sample convergence rates of the minimum Kullback-Leibler divergence statistical estimator of the Knöthe-Rosenblatt measure coupling using tools from empirical process theory. Our results highlight the anisotropic geometry of function classes at play in triangular flows, shed light on optimal coordinate ordering, and lead to statistical guarantees for Jacobian flows. We conduct numerical experiments to illustrate the practical implications of our theoretical findings.
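For readers unfamiliar with the objects named in this abstract, the following is a sketch in assumed notation (the symbols d, g, X_i, and the direction of the map are illustrative choices, not taken from the paper). A triangular map on R^d has components that each depend only on the leading coordinates and are increasing in their last argument, so its Jacobian determinant is a product of diagonal partial derivatives. With a reference density g and samples X_1, ..., X_n from the target, minimizing the empirical Kullback-Leibler divergence over a class of such maps is equivalent, up to a term that does not depend on T, to the negative log-likelihood problem in the last line.

```latex
T(x) =
\begin{pmatrix}
  T_1(x_1) \\
  T_2(x_1, x_2) \\
  \vdots \\
  T_d(x_1, \dots, x_d)
\end{pmatrix},
\qquad
\partial_{x_k} T_k(x_1, \dots, x_k) > 0,
\qquad
\det J_T(x) = \prod_{k=1}^{d} \partial_{x_k} T_k(x_1, \dots, x_k),

\hat{T}_n \in \arg\min_{T \in \mathcal{T}}
  \; -\frac{1}{n} \sum_{i=1}^{n}
  \Big[ \log g\big(T(X_i)\big)
      + \sum_{k=1}^{d} \log \partial_{x_k} T_k\big(X_{i,1}, \dots, X_{i,k}\big) \Big].
```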
- PAR ID: 10349861
- Date Published:
- Journal Name: Proceedings of Machine Learning Research
- Volume: 151
- ISSN: 2640-3498
- Page Range / eLocation ID: 10161-10195
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Advanced measurement techniques and high-performance computing have made large data sets available for a range of turbulent flows in engineering applications. Drawing on this abundance of data, dynamical models that reproduce structural and statistical features of turbulent flows enable effective model-based flow control strategies. This review describes a framework for completing second-order statistics of turbulent flows using models based on the Navier–Stokes equations linearized around the turbulent mean velocity. Dynamical couplings between states of the linearized model dictate structural constraints on the statistics of flow fluctuations. Colored-in-time stochastic forcing that drives the linearized model is then sought to account for and reconcile dynamics with available data (that is, partially known statistics). The number of dynamical degrees of freedom that are directly affected by stochastic excitation is minimized as a measure of model parsimony. The spectral content of the resulting colored-in-time stochastic contribution can alternatively arise from a low-rank structural perturbation of the linearized dynamical generator, pointing to suitable dynamical corrections that may account for the absence of the nonlinear interactions in the linearized model.
-
Measuring flow spread in real time from large, high-rate data streams has numerous practical applications, where a data stream is modeled as a sequence of data items from different flows and the spread of a flow is the number of distinct items in the flow. Past decades have witnessed tremendous performance improvement for single-flow spread estimation. However, when dealing with numerous flows in a data stream, it remains a significant challenge to measure per-flow spread accurately while reducing memory footprint. The goal of this paper is to introduce new multi-flow spread estimation designs that incur much smaller processing overhead and query overhead than the state of the art, yet achieve significant accuracy improvement in spread estimation. We formally analyze the performance of these new designs. We implement them in both hardware and software, and use real-world data traces to evaluate their performance in comparison with the state of the art. The experimental results show that our best sketch significantly improves over the best existing work in terms of estimation accuracy, data item processing throughput, and online query throughput.
-
Abstract. Lava flows present a significant natural hazard to communities around volcanoes and are typically slow-moving (<1 to 5 cm s−1) and laminar. Recent lava flows during the 2018 eruption of Kīlauea volcano, Hawai'i, however, reached speeds as high as 11 m s−1 and were transitional to turbulent. The Kīlauea flows formed a complex network of braided channels departing from the classic rectangular channel geometry often employed by lava flow models. To investigate these extreme dynamics, we develop a new lava flow model that incorporates nonlinear advection and a nonlinear expression for the fluid viscosity. The model makes use of novel discontinuous Galerkin (DG) finite-element methods and resolves complex channel geometry through the use of unstructured triangular meshes. We verify the model against an analytic test case and demonstrate convergence rates of 𝒫+1/2 for polynomials of degree 𝒫. Direct observations recorded by unoccupied aerial systems (UASs) during the Kīlauea eruption provide inlet conditions, constrain input parameters, and serve as a benchmark for model evaluation.
-
Many generative models have to combat missing modes. The conventional approach is to reduce, through training, a statistical distance (such as an f-divergence) between the generated distribution and the provided data distribution. But this is more of a heuristic than a guarantee. The statistical distance measures a global, but not local, similarity between two distributions. Even if it is small, it does not imply a plausible mode coverage. Rethinking this problem from a game-theoretic perspective, we show that a complete mode coverage is firmly attainable. If a generative model can approximate a data distribution moderately well under a global statistical distance measure, then we will be able to find a mixture of generators that collectively covers every data point and thus every mode, with a lower-bounded generation probability. Constructing the generator mixture has a connection to the multiplicative weights update rule, upon which we propose our algorithm. We prove that our algorithm guarantees complete mode coverage. Our experiments on real and synthetic datasets confirm better mode coverage than recent approaches that also use generator mixtures but rely on global statistical distances.