
Search for: All records

Creators/Authors contains: "Li, Xiang"


  1. Machine learning is beginning to provide state-of-the-art performance in a range of environmental applications, such as streamflow prediction in a hydrologic basin. However, building accurate broad-scale models for streamflow remains challenging in practice due to the variability in the dominant hydrologic processes, which are best captured by sets of process-related basin characteristics. Existing basin characteristics suffer from noise and uncertainty, among other issues, which adversely impact model performance. To tackle these challenges, we propose a novel Knowledge-guided Self-Supervised Learning (KGSSL) inverse framework to extract system characteristics from driver (input) and response (output) data. This first-of-its-kind framework achieves robust performance even when characteristics are corrupted or missing. We evaluate the KGSSL framework in the context of streamflow modeling using CAMELS (Catchment Attributes and MEteorology for Large-sample Studies), a widely used hydrology benchmark dataset. Specifically, KGSSL outperforms the baseline by 16% in predicting missing characteristics. Furthermore, in the context of forward modeling, KGSSL-inferred characteristics provide a 35% improvement in performance over a standard baseline when the static characteristics are unknown.
    Free, publicly-accessible full text available August 14, 2023
  2. Free, publicly-accessible full text available May 15, 2023
  3. The volume of a lake is a crucial component in understanding environmental and hydrologic processes. The State of Minnesota (USA) has tens of thousands of lakes, but only a small fraction has readily available bathymetric information. In this paper we develop and test methods for predicting water volume in the lake-rich region of Central Minnesota. We used three different published regression models for predicting lake volume using available data. The first model utilized lake surface area as the sole independent variable. The second model utilized lake surface area but also included an additional independent variable, the average change in land surface area in a designated buffer area surrounding a lake. The third model also utilized lake surface area but assumed the land surface to be a self-affine surface, thus allowing the surface area-lake volume relationship to be governed by a scale defined by the Hurst coefficient. These models all utilized bathymetric data available for 816 lakes across the region of study. The models explained over 80% of the variation in lake volumes. The sum difference between the total predicted lake volume and known volumes was <2%. We applied these models to predicting lake volumes using available independent variables for over 40,000 lakes within the study region. The total lake volumes for the methods ranged from 1,180,000 to 1,200,000 hectare-meters. We also investigated machine learning models for estimating the individual lake volumes and found that they achieved comparable or slightly better predictive performance than the three regression analysis methods. A 15-year time series of satellite data for the study region was used to develop a time series of lake surface areas, and those were used, with the first regression model, to calculate individual lake volumes and temporal variation in the total lake volume of the study region. The time series of lake volumes quantified the effect on water volume of a dry period that occurred from 2011 to 2012. These models are important for estimating lake volume, and they also provide critical information for scaling up different ecosystem processes that are sensitive to lake bathymetry.
    Free, publicly-accessible full text available June 16, 2023
  4. Free, publicly-accessible full text available March 1, 2023
  5. Free, publicly-accessible full text available January 1, 2023
  6. Smart grids can be vulnerable to attacks and accidents, and any initial failures in smart grids can grow to a large blackout because of cascading failure. Because of the importance of smart grids in modern society, it is crucial to protect them against cascading failures. Simulation of cascading failures can help identify the most vulnerable transmission lines and guide prioritization in protection planning; hence, it is an effective approach to protecting smart grids from cascading failures. However, due to the enormous number of ways that smart grids may fail initially, it is infeasible to simulate cascading failures at a large scale or to identify the most vulnerable lines efficiently. In this paper, we aim at 1) developing a method to run cascading failure simulations at scale and 2) building simplified, diffusion-based cascading failure models to support efficient and theoretically bounded identification of the most vulnerable lines. The goals are achieved by first constructing a novel connection between cascading failures and natural languages, and then adapting the powerful transformer model in NLP to learn from cascading failure data. Our trained transformer models have good accuracy in predicting the total number of failed lines in a cascade and identifying the most vulnerable lines. We also constructed independent cascade (IC) diffusion models based on the attention matrices of the transformer models, to support efficient vulnerability analysis with performance bounds.
    Free, publicly-accessible full text available December 4, 2022
  7. Free, publicly-accessible full text available February 1, 2023