Search for: All records

Award ID contains: 2100729

« Prev Next »

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Energy Demand Forecasting Using Temporal Variational Residual Network

https://doi.org/10.3390/forecast7030042

Ashebir, Simachew; Kim, Seongtae (September 2025, Forecasting)

The growing demand for efficient energy management has become essential for achieving sustainable development across social, economic, and environmental sectors. Accurate energy demand forecasting plays a pivotal role in energy management. However, energy demand data present unique challenges due to their complex characteristics, such as multi-seasonality, hidden structures, long-range dependency, irregularities, volatilities, and nonlinear patterns, making energy demand forecasting challenging. We propose a hybrid dimension reduction deep learning algorithm, Temporal Variational Residual Network (TVRN), to address these challenges and enhance forecasting performance. This model integrates variational autoencoders (VAEs), Residual Neural Networks (ResNets), and Bidirectional Long Short-Term Memory (BiLSTM) networks. TVRN employs VAEs for dimensionality reduction and noise filtering, ResNets to capture local, mid-level, and global features while tackling gradient vanishing issues in deeper networks, and BiLSTM to leverage past and future contexts for dynamic and accurate predictions. The performance of the proposed model is evaluated using energy consumption data, showing a significant improvement over traditional deep learning and hybrid models. For hourly forecasting, TVRN reduces root mean square error and mean absolute error, ranging from 19% to 86% compared to other models. Similarly, for daily energy consumption forecasting, this method outperforms existing models with an improvement in root mean square error and mean absolute error ranging from 30% to 95%. The proposed model significantly enhances the accuracy of energy demand forecasting by effectively addressing the complexities of multi-seasonality, hidden structures, and nonlinearity.
more » « less
Free, publicly-accessible full text available September 1, 2026
Asymptotic Inference for Multi-Stage Stationary Treatment Policy with Variable Selection

Gao, Daiqi; Liu, Yufeng; Zeng, Donglin (August 2025, Journal of machine learning)
zhu, Ji (Ed.)
Dynamic treatment regimes or policies are a sequence of decision functions over multiple stages that are tailored to individual features. One important class of treatment policies in practice, namely multi-stage stationary treatment policies, prescribes treatment assignment probabilities using the same decision function across stages, where the decision is based on the same set of features consisting of time-evolving variables (e.g., routinely collected disease biomarkers). Although there has been extensive literature on constructing valid inference for the value function associated with dynamic treatment policies, little work has focused on the policies themselves, especially in the presence of high-dimensional feature variables. We aim to fill the gap in this work. Specifically, we first obtain the multi-stage stationary treatment policy by minimizing the negative augmented inverse probability weighted estimator of the value function to increase asymptotic efficiency. A penalty is applied on the policy parameters to select important feature variables. We then construct one-step improvements of the policy parameter estimators for valid inference. Theoretically, we show that the improved estimators are asymptotically normal, even if nuisance parameters are estimated at a slow convergence rate and the dimension of the feature variables increases with the sample size. Our numerical studies demonstrate that the proposed method estimates a sparse policy with a near-optimal value function and conducts valid inference for the policy parameters.
more » « less
Free, publicly-accessible full text available August 25, 2026
Variable selection and prediction performance of penalized two-part regression with community-based crime data application

https://doi.org/10.29220/CSAM.2024.31.4.441

Kim, Seong-Tae; Park, Man Sik (July 2024, Communications for Statistical Applications and Methods)

Full Text Available
Interaction Selection and Prediction Performance in High-Dimensional Data: A Comparative Study of Statistical and Tree-Based Methods

https://doi.org/10.6339/24-JDS1127

Nzekwe, Chinedu J; Kim, Seongtae; Mostafa, Sayed A (May 2024, Journal of Data Science)

Predictive modeling often ignores interaction effects among predictors in high-dimensional data because of analytical and computational challenges. Research in interaction selection has been galvanized along with methodological and computational advances. In this study, we aim to investigate the performance of two types of predictive algorithms that can perform interaction selection. Specifically, we compare the predictive performance and interaction selection accuracy of both penalty-based and tree-based predictive algorithms. Penalty-based algorithms included in our comparative study are the regularization path algorithm under the marginality principle (RAMP), the least absolute shrinkage selector operator (LASSO), the smoothed clipped absolute deviance (SCAD), and the minimax concave penalty (MCP). The tree-based algorithms considered are random forest (RF) and iterative random forest (iRF). We evaluate the effectiveness of these algorithms under various regression and classification models with varying structures and dimensions. We assess predictive performance using the mean squared error for regression and accuracy, sensitivity, specificity, balanced accuracy, and F1 score for classification. We use interaction coverage to judge the algorithm’s efficacy for interaction selection. Our findings reveal that the effectiveness of the selected algorithms varies depending on the number of predictors (data dimension) and the structure of the data-generating model, i.e., linear or nonlinear, hierarchical or non-hierarchical. There were at least one or more scenarios that favored each of the algorithms included in this study. However, from the general pattern, we are able to recommend one or more specific algorithm(s) for some specific scenarios. Our analysis helps clarify each algorithm’s strengths and limitations, offering guidance to researchers and data analysts in choosing an appropriate algorithm for their predictive modeling task based on their data structure.
more » « less
Full Text Available
Statistical Significance of Clustering with Multidimensional Scaling

https://doi.org/10.1080/10618600.2023.2219708

Shen, Hui; Bhamidi, Shankar; Liu, Yufeng (January 2024, Journal of Computational and Graphical Statistics)

Clustering is a fundamental tool for exploratory data analysis. One central problem in clustering is deciding if the clusters discovered by clustering methods are reliable as opposed to being artifacts of natural sampling variation. Statistical significance of clustering (SigClust) is a recently developed cluster evaluation tool for high-dimension, low-sample size data. Despite its successful application to many scientific problems, there are cases where the original SigClust may not work well. Furthermore, for specific applications, researchers may not have access to the original data and only have the dissimilarity matrix. In this case, clustering is still a valuable exploratory tool, but the original SigClust is not applicable. To address these issues, we propose a new SigClust method using multidimensional scaling (MDS). The underlying idea behind MDS-based SigClust is that one can achieve low-dimensional representations of the original data via MDS using only the dissimilarity matrix and then apply SigClust on the low-dimensional MDS space. The proposed MDS-based SigClust can circumvent the challenge of parameter estimation of the original method in high-dimensional spaces while keeping the essential clustering structure in the MDS space. Both simulations and real data applications demonstrate that the proposed method works remarkably well for assessing the statistical significance of clustering. Supplementary materials for this article are available online.
more » « less
Full Text Available
Multi-response Regression for Block-missing Multi-modal Data without Imputation

https://doi.org/10.5705/ss.202021.0170

Wang, Haodong; Li, Quefeng; Liu, Yufeng (January 2024, Statistica Sinica)

Multi-modal data are prevalent in many scientific fields. In this study, we consider the parameter estimation and variable selection for a multi-response regression using block-missing multi-modal data. Our method allows the dimensions of both the responses and the predictors to be large, and the responses to be incomplete and correlated, a common practical problem in high-dimensional settings. Our proposed method uses two steps to make a prediction from a multi-response linear regression model with block-missing multi-modal predictors. In the first step, without imputing missing data, we use all available data to estimate the covariance matrix of the predictors and the cross-covariance matrix between the predictors and the responses. In the second step, we use these matrices and a penalized method to simultaneously estimate the precision matrix of the response vector, given the predictors, and the sparse regression parameter matrix. Lastly, we demonstrate the effectiveness of the proposed method using theoretical studies, simulated examples, and an analysis of a multi-modal imaging data set from the Alzheimer’s Disease Neuroimaging Initiative.
more » « less
Full Text Available
Learning Optimal Group-structured Individualized Treatment Rules with Many Treatments

Ma, H.; Zeng, D.; Liu, Y. (January 2023, Journal of machine learning research)

Data driven individualized decision making problems have received a lot of attentions in recent years. In particular, decision makers aim to determine the optimal Individualized Treatment Rule (ITR) so that the expected speci ed outcome averaging over heterogeneous patient-speci c characteristics is maximized. Many existing methods deal with binary or a moderate number of treatment arms and may not take potential treatment e ect structure into account. However, the e ectiveness of these methods may deteriorate when the number of treatment arms becomes large. In this article, we propose GRoup Outcome Weighted Learning (GROWL) to estimate the latent structure in the treatment space and the op- timal group-structured ITRs through a single optimization. In particular, for estimating group-structured ITRs, we utilize the Reinforced Angle based Multicategory Support Vec- tor Machines (RAMSVM) to learn group-based decision rules under the weighted angle based multi-class classi cation framework. Fisher consistency, the excess risk bound, and the convergence rate of the value function are established to provide a theoretical guaran- tee for GROWL. Extensive empirical results in simulation studies and real data analysis demonstrate that GROWL enjoys better performance than several other existing methods.
more » « less
Full Text Available
Feature Reduction Method Comparison Towards Explainability and Efficiency in Cybersecurity Intrusion Detection Systems

https://doi.org/10.1109/ICMLA55696.2022.00211

Lehavi, Adam; Kim, Seongtae (December 2022, 21st IEEE ICMLA (International Conference on Machine Learning and Applications))

In the realm of cybersecurity, intrusion detection systems (IDS) detect and prevent attacks based on collected computer and network data. In recent research, IDS models have been constructed using machine learning (ML) and deep learning (DL) methods such as Random Forest (RF) and deep neural networks (DNN). Feature selection (FS) can be used to construct faster, more interpretable, and more accurate models. We look at three different FS techniques; RF information gain (RF-IG), correlation feature selection using the Bat Algorithm (CFS-BA), and CFS using the Aquila Optimizer (CFS-AO). Our results show CFS-BA to be the most efficient of the FS methods, building in 55% of the time of the best RF-IG model while achieving 99.99% of its accuracy. This reinforces prior contributions attesting to CFS-BA’s accuracy while building upon the relationship between subset size, CFS score, and RF-IG score in final results.
more » « less
Full Text Available
High Dimensional Change Point Inference: Recent Developments and Extensions

https://doi.org/10.1016/j.jmva.2021.104833

Liu, B. (January 2022, Journal of Multivariate Analysis)

Change point analysis aims to detect structural changes in a data sequence. It has always been an active research area since it was introduced in the 1950s. In modern statistical applications, however, high-throughput data with increasing dimensions are ubiquitous in fields ranging from economics, finance to genetics and engineering. For those problems, the earlier works are typically no longer applicable. As a result, the problem of testing a change point for high dimensional data sequences has been an important yet challenging task. In this paper, we first focus on models for at most one change point, and review recent state-of-art techniques for change point testing of high dimensional mean vectors and compare their theoretical properties. Based on that, we provide a survey of some extensions to general high dimensional parameters beyond mean vectors as well as strategies for testing multiple change points in high dimensions. Finally, we discuss some open problems for possible future research directions.
more » « less
Full Text Available
Non-asymptotic properties of individualized treatment rules from sequentially rule-adaptive trials

Gao, D.; Liu, Y.; Zeng, D. (January 2022, Journal of machine learning research)

Learning optimal individualized treatment rules (ITRs) has become increasingly important in the modern era of precision medicine. Many statistical and machine learning methods for learning optimal ITRs have been developed in the literature. However, most existing methods are based on data collected from traditional randomized controlled trials and thus cannot take advantage of the accumulative evidence when patients enter the trials sequentially. It is also ethically important that future patients should have a high probability to be treated optimally based on the updated knowledge so far. In this work, we propose a new design called sequentially rule-adaptive trials to learn optimal ITRs based on the contextual bandit framework, in contrast to the response-adaptive design in traditional adaptive trials. In our design, each entering patient will be allocated with a high probability to the current best treatment for this patient, which is estimated using the past data based on some machine learning algorithm (for example, outcome weighted learning in our implementation). We explore the tradeoff between training and test values of the estimated ITR in single-stage problems by proving theoretically that for a higher probability of following the estimated ITR, the training value converges to the optimal value at a faster rate, while the test value converges at a slower rate. This problem is different from traditional decision problems in the sense that the training data are generated sequentially and are dependent. We also develop a tool that combines martingale with empirical process to tackle the problem that cannot be solved by previous techniques for i.i.d. data. We show by numerical examples that without much loss of the test value, our proposed algorithm can improve the training value significantly as compared to existing methods. Finally, we use a real data study to illustrate the performance of the proposed method.
more » « less
Full Text Available

« Prev Next »