Imaging data-based prognostic models focus on using an asset’s degradation images to predict its time to failure (TTF). Most image-based prognostic models have two common limitations. First, they require degradation images to be complete (i.e., images are observed continuously and regularly over time). Second, they usually employ an unsupervised dimension reduction method to extract low-dimensional features and then use those features for TTF prediction. Because unsupervised dimension reduction is conducted on the degradation images without involving the TTFs, there is no guarantee that the extracted features are effective for failure time prediction. To address these challenges, this article develops a supervised tensor dimension reduction-based prognostic model. The model first employs a supervised dimension reduction method for tensor data, which uses historical TTFs to guide the search for a tensor subspace that extracts low-dimensional features from high-dimensional, incomplete degradation imaging data. Next, the extracted features are used to construct a prognostic model based on (log)-location-scale regression. An optimization algorithm for parameter estimation is proposed, and analytical solutions are discussed. Simulated data and a real-world data set are used to validate the performance of the proposed model. History: Bianca Maria Colosimo served as the senior editor for this article. Funding: This work was supported by the National Science Foundation [Grant 2229245]. Data Ethics & Reproducibility Note: The code capsule is available on Code Ocean at https://github.com/czhou9/Code-and-Data-for-IJDS and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2022.x022).
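The second stage of the pipeline described above, a (log)-location-scale regression relating low-dimensional features to TTF, can be illustrated with a small sketch. This is not the authors' implementation: the lognormal TTF form, the variable names, and the use of a crude SVD-based feature extractor (standing in for the paper's supervised tensor dimension reduction) are all assumptions made only for illustration.

```python
# Minimal sketch: (log)-location-scale regression of TTF on extracted features.
# Assumes log T = beta0 + z @ beta + sigma * eps with standard-normal eps.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# mock data: n assets, each with a vectorized degradation image
n, p, d = 200, 400, 3                      # assets, pixels per image, latent features
X = rng.normal(size=(n, p))                # stand-in for (incomplete) imaging data
true_beta = np.array([1.0, -0.5, 0.3])

# crude unsupervised feature extraction, a placeholder for the paper's
# supervised tensor dimension reduction
U, _, _ = np.linalg.svd(X, full_matrices=False)
Z = U[:, :d]                               # low-dimensional features
log_ttf = 2.0 + Z @ true_beta + 0.2 * rng.normal(size=n)

def neg_log_lik(theta):
    """Negative log-likelihood (up to constants) of the lognormal TTF model."""
    beta0, beta, log_sigma = theta[0], theta[1:1 + d], theta[-1]
    sigma = np.exp(log_sigma)
    resid = (log_ttf - beta0 - Z @ beta) / sigma
    return np.sum(0.5 * resid**2 + log_sigma)

fit = minimize(neg_log_lik, np.zeros(d + 2), method="BFGS")
beta0_hat, beta_hat = fit.x[0], fit.x[1:1 + d]

# predicted median TTF for an asset with features z_new
z_new = Z[0]
print("predicted median TTF:", np.exp(beta0_hat + z_new @ beta_hat))
```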
                            Sequential Adversarial Anomaly Detection for One-Class Event Data
                        
                    
    
We consider the sequential anomaly detection problem in the one-class setting, when only the anomalous sequences are available, and propose an adversarial sequential detector by solving a minimax problem to find an optimal detector against the worst-case sequences from a generator. The generator captures the dependence in sequential events using the marked point process model. The detector sequentially evaluates the likelihood of a test sequence and compares it with a time-varying threshold, also learned from data through the minimax problem. We demonstrate the good performance of our proposed method in numerical experiments on simulations and on proprietary large-scale credit card fraud data sets. The proposed method applies generally to detecting anomalous sequences. History: W. Nick Street served as the senior editor for this article. Funding: This work is partially supported by the National Science Foundation [Grants CAREER CCF-1650913, DMS-1938106, and DMS-1830210] and by grant support from Macy’s Technology. Data Ethics & Reproducibility Note: The code capsule is available on Code Ocean at https://doi.org/10.24433/CO.2329910.v1 and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2023.0026).
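The detection rule itself, scoring a test sequence event by event and comparing against a time-varying threshold, can be sketched compactly. In the paper both the point-process generator and the thresholds are learned through the minimax problem; in the sketch below they are fixed by hand, and the deliberately simple exponential inter-arrival plus i.i.d. mark model is an illustrative assumption, not the authors' generator.

```python
# Minimal sketch: sequential likelihood scoring of a marked event sequence
# against a time-varying threshold.
import numpy as np

def sequential_detect(event_times, marks, rate=1.0, mark_probs=None, thresholds=None):
    """Return the index of the first alarm, or None if no alarm is raised."""
    marks = np.asarray(marks)
    if mark_probs is None:
        mark_probs = np.full(marks.max() + 1, 1.0 / (marks.max() + 1))
    inter_arrivals = np.diff(np.concatenate(([0.0], event_times)))

    # per-event log-likelihood under exponential inter-arrivals + i.i.d. marks
    ll = (np.log(rate) - rate * inter_arrivals) + np.log(mark_probs[marks])
    running_avg = np.cumsum(ll) / np.arange(1, len(ll) + 1)

    if thresholds is None:                       # crude, hand-set threshold path
        thresholds = np.full(len(ll), -3.0)
    alarms = np.where(running_avg < thresholds)[0]
    return int(alarms[0]) if alarms.size else None

# toy usage: score a simulated sequence of 50 marked events
rng = np.random.default_rng(1)
times = np.cumsum(rng.exponential(0.2, size=50))
marks = rng.integers(0, 3, size=50)
print(sequential_detect(times, marks, rate=1.0))
```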
- Award ID(s): 2134037
- PAR ID: 10434379
- Publisher / Repository: INFORMS
- Date Published:
- Journal Name: INFORMS Journal on Data Science
- Volume: 2
- Issue: 1
- ISSN: 2694-4022
- Page Range / eLocation ID: 45 to 59
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Problem definition: We study a feature-based pricing problem with demand censoring in an offline, data-driven setting. In this problem, a firm is endowed with a finite amount of inventory and faces a random demand that depends on the offered price and on features (from products, customers, or both). Any unsatisfied demand that exceeds the inventory level is lost and unobservable. The firm does not know the demand function but has access to an offline data set consisting of quadruplets of historical features, inventory, price, and potentially censored sales quantity. Our objective is to use the offline data set to find the optimal feature-based pricing rule so as to maximize the expected profit. Methodology/results: Through the lens of causal inference, we propose a novel data-driven algorithm that is motivated by survival analysis and doubly robust estimation. We derive a finite-sample regret bound to justify the proposed offline learning algorithm and prove its robustness. Numerical experiments demonstrate the robust performance of our proposed algorithm in accurately estimating optimal prices on both training and testing data. Managerial implications: The work provides practitioners with an innovative modeling and algorithmic framework for the feature-based pricing problem with demand censoring through the lens of causal inference. Our numerical experiments underscore the value of accounting for demand censoring in feature-based pricing. (A minimal sketch of censored-demand estimation appears after this listing.) Funding: The research of E. Fang is partially supported by the National Science Foundation [Grants DMS-2346292 and DMS-2434666] and the Whitehead Scholarship. The research of C. Shi is partially supported by the Amazon Research Award. Supplemental Material: The online appendix is available at https://doi.org/10.1287/msom.2024.1061.
- A central problem of materials science is to determine whether a hypothetical material is stable without being synthesized, which is mathematically equivalent to a global optimization problem on a highly nonlinear and multimodal potential energy surface (PES). This optimization problem poses multiple outstanding challenges, including the exceedingly high dimensionality of the PES and the fact that the PES must be constructed from a reliable, sophisticated, parameter-free, and thus very expensive computational method, such as density functional theory (DFT). DFT is a quantum mechanics-based method that can predict, among other things, the total potential energy of a given configuration of atoms; although accurate, it is computationally expensive. In this work, we propose a novel expansion-exploration-exploitation framework to find the global minimum of the PES. Starting from a few atomic configurations, this “known” space is expanded to construct a large candidate set. The expansion begins in a nonadaptive manner, in which new configurations are added without their potential energy being considered. A novel feature of this step is that it tends to generate a space-filling design without knowledge of the boundaries of the domain space. If needed, the nonadaptive expansion of the space of configurations is followed by adaptive expansion, in which “promising regions” of the domain space (those with low-energy configurations) are further expanded. Once a candidate set of configurations is obtained, it is simultaneously explored and exploited using Bayesian optimization to find the global minimum. The methodology is demonstrated using the problem of finding the most stable crystal structure of aluminum. (A minimal Bayesian optimization sketch appears after this listing.) History: Kwok Tsui served as the senior editor for this article. Funding: The authors acknowledge U.S. National Science Foundation Grant DMREF-1921873 and XSEDE support through Grant DMR170031. Data Ethics & Reproducibility Note: The code capsule is available on Code Ocean at https://codeocean.com/capsule/3366149/tree and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2023.0028).
- Parameter calibration aims to estimate unobservable parameters used in a computer model by using physical process responses and computer model outputs. In the literature, existing studies calibrate all parameters simultaneously using the entire data set. However, in certain applications, some parameters are associated with only a subset of the data. For example, in building energy simulation, cooling (heating) season parameters should be calibrated using data collected during the cooling (heating) season only. This study provides a new multiblock calibration approach that accounts for such heterogeneity. Unlike existing studies that build emulators for the computer model response, such as the widely used Bayesian calibration approach, we consider multiple loss functions to be minimized, each for a block of parameters that uses the corresponding data set, and estimate the parameters using a nonlinear optimization technique. We present convergence properties under certain conditions and quantify the parameter estimation uncertainties. The superiority of our approach is demonstrated through numerical studies and a real-world building energy simulation case study. (A minimal block-wise calibration sketch appears after this listing.) History: Bianca Maria Colosimo served as the senior editor for this article. Funding: This work was partially supported by the National Science Foundation [Grants CMMI-1662553, CMMI-2226348, and CBET-1804321]. Data Ethics & Reproducibility Note: The code capsule is available on Code Ocean at https://codeocean.com/capsule/8623151/tree/v1 and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2023.0029).
- We consider a common problem that occurs after using a statistical process control (SPC) method based on three-dimensional measurements: locating where on the surface of the part that triggered an out-of-control alarm there is a significant shape difference with respect to either an in-control part or its nominal computer-aided design (CAD) model. In the past, only registration-based solutions existed for this problem; these first orient and locate the part and its nominal design under the same frame of reference. Recently, spectral Laplacian methods have been proposed for the SPC of discrete parts and their measured surface meshes. These techniques provide an intrinsic solution to the SPC problem, that is, a solution based exclusively on data whose coordinates lie on the surfaces without reference to their ambient space, thus avoiding registration. Registration-free methods avoid the computationally expensive, nonconvex registration step needed to align the parts, eliminate registration errors, and are important in industry because of the increasing use of portable noncontact scanners. In this paper, we first present a new registration-free solution to the post-SPC part defect localization problem. The approach uses a spectral decomposition of the Laplace–Beltrami operator to construct a functional map between the CAD and measured manifolds and thereby locate defects on the suspected part. A computational complexity analysis demonstrates that the approach scales better with the mesh size and is more stable than a registration-based approach. To reduce computational expense, a new mesh-partitioning algorithm is presented to find a region of interest on the surface of the part where defects are more likely to exist. Because the functional map method involves a large number of point-to-point comparisons based on noisy measurements, a new statistical thresholding method is also provided to filter the false positives in the underlying massive multiple comparisons problem. (A minimal spectral sketch appears after this listing.) Funding: This research was partially funded by the National Science Foundation [Grant CMMI-2121625]. Data Ethics & Reproducibility Note: There are no data ethics considerations. The code capsule is available on Code Ocean at https://codeocean.com/capsule/4615101/tree/v1 and in the e-Companion to this article (available at https://doi.org/10.1287/ijds.2023.0030).
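For the feature-based pricing item above, the sketch below illustrates only the demand-censoring issue: sales are min(demand, inventory), so demand above the inventory level is right-censored, and a censored (Tobit-style) likelihood can still recover the demand model. The linear-Gaussian demand form, variable names, and the grid search over prices are illustrative assumptions; the paper's doubly robust, causal-inference-based algorithm is not reproduced here.

```python
# Minimal sketch: censored-demand estimation and a simple feature-based pricing rule.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 2000
feature = rng.normal(size=n)
price = rng.uniform(1, 10, size=n)
inventory = rng.integers(5, 15, size=n).astype(float)
demand = 12 + 2 * feature - 1.0 * price + rng.normal(0, 1.5, size=n)
sales = np.minimum(demand, inventory)              # observed, potentially censored
censored = sales >= inventory                      # demand exceeded the inventory level

def neg_log_lik(theta):
    b0, b_feat, b_price, log_s = theta
    s = np.exp(log_s)
    mu = b0 + b_feat * feature + b_price * price
    ll_obs = norm.logpdf(sales, mu, s)             # exactly observed sales
    ll_cens = norm.logsf(inventory, mu, s)         # right-censored observations
    return -np.sum(np.where(censored, ll_cens, ll_obs))

fit = minimize(neg_log_lik, x0=np.zeros(4), method="BFGS")
b0, b_feat, b_price, _ = fit.x

def best_price(x, inv, grid=np.linspace(1, 10, 200)):
    """Pick the grid price maximizing a crude estimate of expected profit."""
    exp_demand = b0 + b_feat * x + b_price * grid
    profit = grid * np.minimum(np.maximum(exp_demand, 0), inv)
    return grid[np.argmax(profit)]

print("suggested price for feature=0.5, inventory=8:", best_price(0.5, 8.0))
```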
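For the materials stability item above, here is a minimal sketch of the final explore-exploit step: Bayesian optimization over a fixed candidate set with a Gaussian process surrogate and an expected-improvement acquisition. The descriptor array and the `evaluate_energy` function (a stand-in for an expensive DFT call) are hypothetical placeholders, and the expansion steps that build the candidate set are not shown.

```python
# Minimal sketch: Bayesian optimization over a discrete candidate set of configurations.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def evaluate_energy(x):                  # placeholder for a DFT total-energy call
    return float(np.sum((x - 0.3) ** 2))

def expected_improvement(mu, sigma, best):
    z = (best - mu) / np.maximum(sigma, 1e-12)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
candidates = rng.uniform(0, 1, size=(500, 6))    # candidate configuration descriptors

# start from a few evaluated configurations
idx = list(rng.choice(len(candidates), size=5, replace=False))
energies = [evaluate_energy(candidates[i]) for i in idx]

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
for _ in range(30):                               # sequential explore/exploit loop
    gp.fit(candidates[idx], energies)
    mu, sigma = gp.predict(candidates, return_std=True)
    ei = expected_improvement(mu, sigma, best=min(energies))
    ei[idx] = -np.inf                             # do not re-evaluate known points
    nxt = int(np.argmax(ei))
    idx.append(nxt)
    energies.append(evaluate_energy(candidates[nxt]))

print("lowest energy found:", min(energies))
```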
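For the multiblock calibration item above, the sketch below shows the block-wise idea: each block of calibration parameters is estimated by minimizing its own loss against the data subset associated with that block (e.g., cooling-season versus heating-season data). The `energy_model` simulator, the squared-error loss, and the toy data are assumptions for illustration only; they are not the authors' building energy model or convergence analysis.

```python
# Minimal sketch: block-wise parameter calibration via separate nonlinear least squares.
import numpy as np
from scipy.optimize import minimize

def energy_model(inputs, theta):
    """Stand-in computer model: response depends on inputs and block parameters."""
    return theta[0] * inputs + theta[1]

def calibrate_blocks(blocks, theta_inits):
    """blocks: list of (inputs, observed) pairs, one per parameter block."""
    estimates = []
    for (x, y), theta0 in zip(blocks, theta_inits):
        loss = lambda th, x=x, y=y: np.mean((y - energy_model(x, th)) ** 2)
        res = minimize(loss, theta0, method="Nelder-Mead")
        estimates.append(res.x)
    return estimates

# toy usage: cooling-season and heating-season blocks with different true parameters
rng = np.random.default_rng(3)
x_cool, x_heat = rng.uniform(20, 35, 100), rng.uniform(-5, 15, 100)
y_cool = 2.0 * x_cool + 5.0 + rng.normal(0, 0.5, 100)
y_heat = -1.5 * x_heat + 30.0 + rng.normal(0, 0.5, 100)
theta_cool, theta_heat = calibrate_blocks(
    [(x_cool, y_cool), (x_heat, y_heat)], [np.zeros(2), np.zeros(2)])
print(theta_cool, theta_heat)
```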
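For the registration-free defect localization item above, the sketch below only illustrates the spectral ingredient: building a graph Laplacian (a discrete stand-in for the Laplace–Beltrami operator), computing a low-frequency per-vertex signature, and flagging vertices whose signatures differ most between the nominal and measured meshes. It assumes the two meshes share vertex indexing, a simplification that the paper's functional map machinery specifically avoids, and the edge-list input and thresholding rule are illustrative assumptions.

```python
# Minimal sketch: spectral signatures from a mesh graph Laplacian and a crude defect flag.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def graph_laplacian(n_vertices, edges):
    """Unnormalized graph Laplacian built from a mesh edge list [(i, j), ...]."""
    i, j = np.asarray(edges).T
    w = np.ones(len(i))
    A = sp.coo_matrix((np.r_[w, w], (np.r_[i, j], np.r_[j, i])),
                      shape=(n_vertices, n_vertices)).tocsr()
    return sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A

def spectral_signature(L, k=10, t=0.1):
    """Heat-kernel-style per-vertex signature from the k smallest nonzero eigenpairs."""
    vals, vecs = eigsh(L.asfptype(), k=k + 1, sigma=-1e-6, which="LM")  # shift-invert near 0
    vals, vecs = vals[1:], vecs[:, 1:]          # drop the (near-)constant mode
    return (vecs ** 2) * np.exp(-t * vals)      # shape: (n_vertices, k)

def flag_defect_vertices(sig_cad, sig_meas, quantile=0.99):
    """Flag vertices whose signatures differ most between nominal and measured meshes.
    Assumes corresponding vertex indexing, an illustrative simplification that the
    paper's functional map construction avoids."""
    score = np.linalg.norm(sig_meas - sig_cad, axis=1)
    return np.where(score > np.quantile(score, quantile))[0]
```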