Title: Einstein–Roscoe regression for the slag viscosity prediction problem in steelmaking
Abstract

In classical machine learning, regressors are trained without attempting to gain insight into the mechanism connecting inputs and outputs. The natural sciences, however, are interested in finding a robust, interpretable function for the target phenomenon that can return predictions even outside the training domain. This paper focuses on the viscosity prediction problem in steelmaking and proposes Einstein–Roscoe regression (ERR), which learns the coefficients of the Einstein–Roscoe equation and is able to extrapolate to unseen domains. In addition, it is often the case in the natural sciences that some measurements are unavailable or more expensive than others due to physical constraints. To address this, we employ a transfer learning framework based on Gaussian processes, which allows us to estimate the regression parameters using auxiliary measurements available at a reasonable cost. In experiments using viscosity measurements in a high-temperature slag suspension system, ERR compares favorably with various machine learning approaches in interpolation settings and outperforms all of them in extrapolation settings. Furthermore, after estimating parameters using the auxiliary dataset obtained at room temperature, an increase in accuracy is observed on the high-temperature dataset, which corroborates the effectiveness of the proposed approach.
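To make the idea of "learning the coefficients of the Einstein–Roscoe equation" concrete, the following is a minimal sketch, not the paper's method. It assumes the commonly cited form of the equation, eta = eta0 * (1 - a*phi)**(-n), where phi is the solid volume fraction, eta0 is the viscosity of the liquid phase, and a and n are the coefficients to be learned. The function names, the grid-search fitting procedure, and the synthetic data are all illustrative assumptions; the abstract's Gaussian-process transfer learning is not reproduced here.

```python
import math


def einstein_roscoe(phi, eta0, a, n):
    """Suspension viscosity under the assumed form eta = eta0 * (1 - a*phi)**(-n)."""
    return eta0 * (1.0 - a * phi) ** (-n)


def fit_einstein_roscoe(phis, etas, eta0, a_grid):
    """Illustrative fit: grid-search a; for each a, solve n by least squares.

    In log space the model is log(eta/eta0) = n * (-log(1 - a*phi)),
    a line through the origin, so n has a closed form for fixed a.
    """
    ys = [math.log(eta / eta0) for eta in etas]
    best = None
    for a in a_grid:
        # Skip values of a for which the equation is undefined on the data.
        if any(1.0 - a * phi <= 0.0 for phi in phis):
            continue
        xs = [-math.log(1.0 - a * phi) for phi in phis]
        sxx = sum(x * x for x in xs)
        if sxx == 0.0:
            continue
        n = sum(x * y for x, y in zip(xs, ys)) / sxx
        sse = sum((y - n * x) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, a, n)
    if best is None:
        raise ValueError("no valid value of a in a_grid")
    return best[1], best[2]


# Synthetic, noise-free data (hypothetical values, not from the paper).
# a = 1.35 and n = 2.5 are the classical Roscoe values for rigid spheres,
# used here only as a ground truth to recover.
phis = [0.05, 0.10, 0.15, 0.20, 0.25]
eta0 = 0.5
etas = [einstein_roscoe(p, eta0, 1.35, 2.5) for p in phis]

a_grid = [0.5 + 0.01 * i for i in range(251)]  # candidate a from 0.50 to 3.00
a_hat, n_hat = fit_einstein_roscoe(phis, etas, eta0, a_grid)

# Extrapolate to a solid fraction beyond the largest one used for fitting.
eta_extrap = einstein_roscoe(0.30, eta0, a_hat, n_hat)
print(a_hat, n_hat, eta_extrap)
```

Because the fitted model keeps the functional form of the physical equation, a prediction at a solid fraction outside the fitted range is still governed by that form, which is the kind of extrapolation the abstract refers to. Real measurements are noisy, so a practical fit would need uncertainty estimates rather than this bare grid search.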

 
NSF-PAR ID:
10381821
Publisher / Repository:
Nature Publishing Group
Journal Name:
Scientific Reports
Volume:
12
Issue:
1
ISSN:
2045-2322
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract

    Advances in visual perceptual tasks have been mainly driven by the amount and types of annotations in large-scale datasets. Researchers have focused on fully supervised settings to train models using offline epoch-based schemes. Despite the evident advancements, the limitations and cost of manually annotated datasets have hindered further development for event perceptual tasks, such as detection and localization of objects and events in videos. The problem is more apparent in zoological applications due to the scarcity of annotations and the length of videos; most videos are at most ten minutes long. Inspired by cognitive theories, we present a self-supervised perceptual prediction framework to tackle the problem of temporal event segmentation by building a stable representation of event-related objects. The approach is simple but effective. We rely on LSTM predictions of high-level features computed by a standard deep learning backbone. For spatial segmentation, the stable representation of the object is used by an attention mechanism to filter the input features before the prediction step. The self-learned attention maps effectively localize the object as a side effect of perceptual prediction. We demonstrate our approach on long videos from continuous wildlife video monitoring, spanning multiple days at 25 FPS. We aim to facilitate automated ethogramming by detecting and localizing events without the need for labels. Our approach is trained in an online manner on streaming input and requires only a single pass through the video, with no separate training set. Given the lack of long and realistic datasets (i.e., datasets that include real-world challenges), we introduce a new wildlife video dataset, nest monitoring of the Kagu (a flightless bird from New Caledonia), to benchmark our approach. Our dataset features a video from 10 days (over 23 million frames) of continuous monitoring of the Kagu in its natural habitat. We annotate every frame with bounding boxes and event labels. Additionally, each frame is annotated with time-of-day and illumination conditions. We will make the dataset, which is the first of its kind, and the code available to the research community. We find that the approach significantly outperforms other self-supervised, traditional (e.g., Optical Flow, Background Subtraction) and NN-based (e.g., PA-DPC, DINO, iBOT) baselines and performs on par with supervised boundary detection approaches (i.e., PC). At a recall rate of 80%, our best performing model detects one false positive activity every 50 min of training. On average, we at least double the performance of self-supervised approaches for spatial segmentation. Additionally, we show that our approach is robust to various environmental conditions (e.g., moving shadows). We also benchmark the framework on other datasets (i.e., Kinetics-GEBD, TAPOS) from different domains to demonstrate its generalizability. The data and code are available on our project page: https://aix.eng.usf.edu/research_automated_ethogramming.html

     
  2. Abstract

    Machine learning techniques were used to predict tensile properties of material extrusion-based additively manufactured parts made with Technomelt PA 6910, a hot melt adhesive. An adaptive data generation technique, specifically an active learning process based on the Gaussian process regression algorithm, was employed to enable prediction with limited training data. After three rounds of data collection, machine learning models based on linear regression, ridge regression, Gaussian process regression, and K-nearest neighbors were tasked with predicting properties for the test dataset, which consisted of parts fabricated with five processing parameters chosen using a random number generator. Overall, linear regression and ridge regression successfully predicted output parameters, with < 10% error for 56% of predictions. K-nearest neighbors performed worse than linear regression and ridge regression, with < 10% error for 32% of predictions and 10–20% error for 60% of predictions. While Gaussian process regression performed with the lowest accuracy (< 10% error for 32% of prediction cases and 10–20% error for 40% of predictions), it benefited most from the adaptive data generation technique. This work demonstrates that machine learning models using adaptive data generation techniques can efficiently predict properties of additively manufactured structures with limited training data.

     
  3. We study the problem of transfer learning, observing that previous efforts to understand its information-theoretic limits do not fully exploit the geometric structure of the source and target domains. In contrast, our study first illustrates the benefits of incorporating a natural geometric structure within a linear regression model, which corresponds to the generalized eigenvalue problem formed by the Gram matrices of both domains. We next establish a finite-sample minimax lower bound, propose a refined model interpolation estimator that enjoys a matching upper bound, and then extend our framework to multiple source domains and generalized linear models. Surprisingly, as long as information is available on the distance between the source and target parameters, negative-transfer does not occur. Simulation studies show that our proposed interpolation estimator outperforms state-of-the-art transfer learning methods in both moderate- and high-dimensional settings. 
  4. Abstract

    Opioid use disorder is one of the most pressing public health problems of our time. Mobile health tools, including wearable sensors, have great potential in this space, but have been underutilized. Of specific interest are digital biomarkers, or end-user generated physiologic or behavioral measurements that correlate with health or pathology. The current manuscript describes a longitudinal, observational study of adult patients receiving opioid analgesics for acute painful conditions. Participants in the study are monitored with a wrist-worn E4 sensor, during which time physiologic parameters (heart rate/variability, electrodermal activity, skin temperature, and accelerometry) are collected continuously. Opioid use events are recorded via electronic medical record and self-report. Three hundred thirty-nine discrete-dose opioid events from 36 participants are analyzed across 2070 h of sensor data. Fifty-one features are extracted from the data and initially compared pre- and post-opioid administration, and subsequently are used to generate machine learning models. Model performance is compared based on individual and treatment characteristics. The best performing machine learning model to detect opioid administration is a Channel-Temporal Attention-Temporal Convolutional Network (CTA-TCN) model using raw data from the wearable sensor. History of intravenous drug use is associated with better model performance, while middle age and co-administration of non-narcotic analgesia or sedative drugs are associated with worse model performance. These characteristics may be candidate input features for future opioid detection model iterations. Once mature, this technology could provide clinicians with actionable data on opioid use patterns in real-world settings, and predictive analytics for early identification of opioid use disorder risk.

     
  5. Abstract

    This paper considers estimation and prediction of a high-dimensional linear regression in the setting of transfer learning where, in addition to observations from the target model, auxiliary samples from different but possibly related regression models are available. When the set of informative auxiliary studies is known, an estimator and a predictor are proposed and their optimality is established. The optimal rates of convergence for prediction and estimation are faster than the corresponding rates without using the auxiliary samples. This implies that knowledge from the informative auxiliary samples can be transferred to improve the learning performance of the target problem. When the set of informative auxiliary samples is unknown, we propose a data-driven procedure for transfer learning, called Trans-Lasso, and show its robustness to non-informative auxiliary samples and its efficiency in knowledge transfer. The proposed procedures are demonstrated in numerical studies and are applied to a dataset concerning the associations among gene expressions. It is shown that Trans-Lasso leads to improved performance in gene expression prediction in a target tissue by incorporating data from multiple different tissues as auxiliary samples.

     