

Search for: All records

Award ID contains: 1854655


  1. Abstract

    In unconventional reservoirs, optimal completion controls are essential to improving well productivity and reducing costs. In this article, we propose a statistical model to investigate associations between shale oil production and completion parameters (e.g., completion lateral length, total proppant, number of hydraulic fracturing stages), while accounting for the influence of spatially heterogeneous geological conditions on hydrocarbon production. We develop a non-parametric regression method that combines a generalized additive model with a fused LASSO regularization for geological homogeneity pursuit. We present an alternating augmented Lagrangian method for model parameter estimation. The novelty and advantages of our method over published ones are that (a) it can control for or remove heterogeneous non-completion effects and (b) it can account for and analyze the interactions among the completion parameters. We apply our method to a real case from a Permian Basin US onshore field and show how our model can account for the interaction between the completion parameters. Our results provide key findings on how completion parameters affect oil production that can lead to optimal well completion designs.

     
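As a rough illustration of the fused LASSO idea mentioned in the abstract above, the sketch below computes a generic graph fused LASSO penalty, `lam * sum |beta[i] - beta[j]|` over neighboring pairs. Driving such differences to zero is what merges neighboring locations into homogeneous groups. The function name, the neighbor graph, and the numbers are hypothetical; this is not the authors' estimator, which also involves a generalized additive model and an augmented Lagrangian solver.

```python
def fused_lasso_penalty(beta, edges, lam):
    """Return lam * sum over edges (i, j) of |beta[i] - beta[j]|."""
    return lam * sum(abs(beta[i] - beta[j]) for i, j in edges)


# Three hypothetical wells: wells 0 and 1 share a geological effect, well 2 differs.
well_effects = [1.0, 1.0, 3.0]
neighbor_edges = [(0, 1), (1, 2)]  # assumed spatial neighbor pairs
penalty = fused_lasso_penalty(well_effects, neighbor_edges, lam=0.5)
```

Only the edge between wells 1 and 2 contributes here, since wells 0 and 1 already share the same effect.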
  2. Arctic sea ice extent (SIE) has drawn increasing attention from scientists in recent years because of its fast decline in the boreal summer and early fall. The measurement of SIE is derived from remote sensing data and is both a lagged and leading indicator of climate change. To characterize the decline in SIE at a local level, we use remote-sensing data at 25 km resolution to fit a spatio-temporal logistic autoregressive model of the sea-ice evolution in the Arctic region. The model incorporates last year’s ice/water binary observations at nearby grid cells in an autoregressive manner with autoregressive coefficients that vary both in space and time. Using the model-based estimates of ice/water probabilities in the Arctic region, we propose several graphical summaries to visualize the spatio-temporal changes in Arctic sea ice beyond what can be visualized with the single time series of SIE. In ever-higher latitude bands, we observe a consistently declining temporal trend of sea ice in the early fall. We also observe a clear decline in and contraction of the sea ice’s distribution between 70°N–75°N, and of most concern is that this may reflect the future behavior of sea ice at ever-higher latitudes under climate change.

     
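The logistic autoregressive structure described in the abstract above can be sketched as a logistic function applied to a weighted sum of last year's ice/water indicators at nearby cells. The function name, the number of neighbors, and the coefficient values are assumptions for illustration. In the paper the coefficients vary over space and time, which this sketch does not model.

```python
import math


def ice_probability(prev_neighbors, intercept, coefs):
    """P(ice this year) from last year's 0/1 ice indicators at nearby cells."""
    eta = intercept + sum(c * y for c, y in zip(coefs, prev_neighbors))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical cell with three neighbors; 1 = ice last year, 0 = water.
p_mostly_ice = ice_probability([1, 1, 0], intercept=-1.0, coefs=[0.8, 0.8, 0.8])
p_all_water = ice_probability([0, 0, 0], intercept=-1.0, coefs=[0.8, 0.8, 0.8])
```

With positive coefficients, more ice among last year's neighbors raises the estimated ice probability for the cell.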
  3. Random partition models are widely used in Bayesian methods for various clustering tasks, such as mixture models, topic models, and community detection problems. While the number of clusters induced by random partition models has been studied extensively, another important model property, the balancedness of partitions, has been largely neglected. We formulate a framework to define and theoretically study the balancedness of exchangeable random partition models, by analyzing how a model assigns probabilities to partitions with different levels of balancedness. We demonstrate that the "rich-get-richer" characteristic of many existing popular random partition models is an inevitable consequence of two common assumptions: product-form exchangeability and projectivity. We propose a principled way to compare the balancedness of random partition models, giving a better understanding of which models work well for different applications and which do not. We also introduce the "rich-get-poorer" random partition models and illustrate their application to entity resolution tasks.
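To make the "rich-get-richer" behavior concrete, here is a minimal simulation of the Chinese restaurant process, a standard random partition model. The abstract does not name it, so it is used here only as a familiar example. Each new item joins an existing cluster with probability proportional to that cluster's current size, or opens a new cluster with probability proportional to `alpha`. The function name and parameter values are illustrative.

```python
import random


def sample_crp(n, alpha, seed=0):
    """Draw cluster sizes for n items from a Chinese restaurant process."""
    rng = random.Random(seed)
    sizes = []
    for i in range(n):
        # Total weight: alpha for a new cluster plus i for the existing items.
        u = rng.random() * (i + alpha)
        if u < alpha:
            sizes.append(1)  # open a new cluster
            continue
        u -= alpha
        for k, s in enumerate(sizes):
            if u < s:
                sizes[k] += 1  # join cluster k, chosen proportionally to its size
                break
            u -= s
        else:
            sizes[-1] += 1  # guard against floating-point round-off
    return sizes


cluster_sizes = sample_crp(n=200, alpha=1.0)
```

Sorting `cluster_sizes` in decreasing order typically shows a few large clusters and many small ones, which is the imbalance the abstract refers to.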
  4. The COVID-19 pandemic has limited people’s visitation to public places because of social distancing and shelter-in-place orders. According to Google’s community mobility reports, some countries showed a decrease in park visitation during the pandemic, while others showed an increase. Although government responses played a significant role in this variation, little is known about park visitation changes and the park attributes that are associated with these changes. Therefore, we aimed to examine the associations between park characteristics and percent changes in park visitation in Harris County, TX, for three time periods: before, during, and after the shelter-in-place order of Harris County. We utilized SafeGraph’s point-of-interest data to extract weekly park visitation counts for the Harris County area. This dataset included the size of each park and its weekly number of visits from 2 March to 31 May 2020. In addition, we measured park characteristics, including greenness density, using the normalized difference vegetation index; park type (mini, neighborhood, community, regional/metropolitan); presence of sidewalks and bikeways; sidewalk and bikeway quantity; and bikeway quality. Results showed that park visitation decreased after the shelter-in-place order was issued and increased after the order was lifted. Results from linear regression models indicated that the higher the greenness density of the park, the smaller the decrease in park visitation during the shelter-in-place period compared to before the shelter-in-place order. This relationship also appeared after the shelter-in-place order. The presence of more sidewalks was related to a smaller increase in visitation after the shelter-in-place order. These findings can guide planners and designers to implement parks that promote public visitation during pandemics and potentially benefit people’s physical and mental health.
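Two quantities in the abstract above have simple closed forms: the normalized difference vegetation index, NDVI = (NIR - Red) / (NIR + Red), and the percent change in visitation between two periods. The sketch below computes both. The function names and input values are hypothetical and are not taken from the study's data.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def percent_change(before, after):
    """Percent change in visits relative to the earlier period."""
    return (after - before) / before * 100.0


# Hypothetical park: reflectance values and weekly visit counts are made up.
greenness = ndvi(nir=0.5, red=0.1)
visit_change = percent_change(before=200, after=150)
```

NDVI ranges from -1 to 1, with higher values indicating denser green vegetation. A negative percent change indicates a drop in visits.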
  5. Structured point process data harvested from various platforms poses new challenges to the machine learning community. To cluster repeatedly observed marked point processes, we propose a novel mixture model of multi-level marked point processes for identifying potential heterogeneity in the observed data. Specifically, we study a matrix whose entries are marked log-Gaussian Cox processes and cluster rows of such a matrix. An efficient semi-parametric Expectation-Solution (ES) algorithm combined with functional principal component analysis (FPCA) of point processes is proposed for model estimation. The effectiveness of the proposed framework is demonstrated through simulation studies and real data analyses. 
  6. Graphs have been commonly used to represent complex data structures. In models dealing with graph-structured data, multivariate parameters may not only exhibit sparse patterns but have structured sparsity and smoothness in the sense that both zero and non-zero parameters tend to cluster together. We propose a new prior for high-dimensional parameters with graphical relations, referred to as the Tree-based Low-rank Horseshoe (T-LoHo) model, that generalizes the popular univariate Bayesian horseshoe shrinkage prior to the multivariate setting to detect structured sparsity and smoothness simultaneously. The T-LoHo prior can be embedded in many high-dimensional hierarchical models. To illustrate its utility, we apply it to regularize a Bayesian high-dimensional regression problem where the regression coefficients are linked by a graph, so that the resulting clusters have flexible shapes and satisfy the cluster contiguity constraint with respect to the graph. We design an efficient Markov chain Monte Carlo algorithm that delivers full Bayesian inference with uncertainty measures for model parameters such as the number of clusters. We offer theoretical investigations of the clustering effects and posterior concentration results. Finally, we illustrate the performance of the model with simulation studies and a real data application for anomaly detection on a road network. The results indicate substantial improvements over other competing methods such as the sparse fused lasso. 
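For reference, the univariate horseshoe prior that the abstract above says T-LoHo generalizes is usually written as below, with local scales \lambda_j, a global scale \tau, and C^{+}(0, 1) denoting the standard half-Cauchy distribution. This is the standard textbook form only. The tree-based, low-rank multivariate construction of T-LoHo is not reproduced here.

```latex
\beta_j \mid \lambda_j, \tau \sim \mathcal{N}\!\left(0,\; \lambda_j^{2}\tau^{2}\right),
\qquad
\lambda_j \sim \mathrm{C}^{+}(0, 1),
\qquad j = 1, \dots, p .
```

The heavy-tailed local scales let large coefficients escape shrinkage, while a small global scale pulls the remaining coefficients toward zero.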
  7. Nonparametric regression on complex domains has been a challenging task as most existing methods, such as ensemble models based on binary decision trees, are not designed to account for intrinsic geometries and domain boundaries. This article proposes a Bayesian additive regression spanning trees (BAST) model for nonparametric regression on manifolds, with an emphasis on complex constrained domains or irregularly shaped spaces embedded in Euclidean spaces. Our model uses a random spanning tree manifold partition model as each weak learner, which is capable of capturing any irregularly shaped spatially contiguous partitions while respecting intrinsic geometries and domain boundary constraints. Utilizing favorable properties of spanning tree structures, we design an efficient Bayesian inference algorithm. Equipped with a soft prediction scheme, BAST is demonstrated to significantly outperform other competing methods in simulation experiments and in an application to the chlorophyll data in the Aral Sea, due to its strong local adaptivity to different levels of smoothness.
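The link between spanning trees and contiguous partitions can be shown in a few lines: removing k - 1 edges from a spanning tree leaves k connected components, each of which is contiguous with respect to the tree. The sketch below illustrates that general fact with a union-find structure. The function name and the toy graph are hypothetical, and the code is not the BAST sampler.

```python
def partition_from_tree(n_nodes, tree_edges, cut_edges):
    """Clusters obtained by deleting cut_edges from a spanning tree."""
    cut = {frozenset(e) for e in cut_edges}
    parent = list(range(n_nodes))

    def find(x):
        # Find the root of x, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the endpoints of every tree edge that was not cut.
    for u, v in tree_edges:
        if frozenset((u, v)) not in cut:
            parent[find(u)] = find(v)

    clusters = {}
    for node in range(n_nodes):
        clusters.setdefault(find(node), []).append(node)
    return sorted(clusters.values())


# Toy spanning tree: a path 0-1-2-3-4; cutting edge (1, 2) gives two clusters.
path_tree = [(0, 1), (1, 2), (2, 3), (3, 4)]
clusters = partition_from_tree(5, path_tree, cut_edges=[(1, 2)])
```

Because every cluster is a connected piece of the tree, a partition built this way cannot join locations that are only close in straight-line distance but separated by a domain boundary, provided the tree's edges respect that boundary.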