skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Cross-study Learning for Generalist and Specialist Predictions
Jointly using data from multiple similar sources for the training of prediction models is increasingly becoming an important task in many fields of science. In this paper, we propose a framework for {\it generalist and specialist} predictions that leverages multiple datasets, with potential heterogenity in the relationships between predictors and outcomes. Our framework uses ensembling with stacking, and includes three major components: 1) training of the ensemble members using one or more datasets, 2) a no-data-reuse technique for stacking weights estimation and 3) task-specific utility functions. We prove that under certain regularity conditions, our framework produces a stacked prediction function with oracle property. We also provide analytically the conditions under which the proposed no-data-reuse technique will increase the prediction accuracy of the stacked prediction function compared to using the full data. We perform a simulation study to numerically verify and illustrate these results and apply our framework to predicting mortality based on a collection of variables including long-term exposure to common air pollutants.  more » « less
Award ID(s):
1810829
PAR ID:
10176140
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
ArXivorg
ISSN:
2331-8422
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. It is increasingly common to encounter prediction tasks in the biomedical sciences for which multiple datasets are available for model training. Common approaches such as pooling datasets before model fitting can produce poor out‐of‐study prediction performance when datasets are heterogeneous. Theoretical and applied work has shownmultistudy ensemblingto be a viable alternative that leverages the variability across datasets in a manner that promotes model generalizability. Multistudy ensembling uses a two‐stagestackingstrategy which fits study‐specific models and estimates ensemble weights separately. This approach ignores, however, the ensemble properties at the model‐fitting stage, potentially resulting in performance losses. Motivated by challenges in the estimation of COVID‐attributable mortality, we proposeoptimal ensemble construction, an approach to multistudy stacking whereby we jointly estimate ensemble weights and parameters associated with study‐specific models. We prove that limiting cases of our approach yield existing methods such as multistudy stacking and pooling datasets before model fitting. We propose an efficient block coordinate descent algorithm to optimize the loss function. We use our method to perform multicountry COVID‐19 baseline mortality prediction. We show that when little data is available for a country before the onset of the pandemic, leveraging data from other countries can substantially improve prediction accuracy. We further compare and characterize the method's performance in data‐driven simulations and other numerical experiments. Our method remains competitive with or outperforms multistudy stacking and other earlier methods in the COVID‐19 data application and in a range of simulation settings. 
    more » « less
  2. A robot can now grasp an object more effectively than ever before, but once it has the object what happens next? We show that a mild relaxation of the task and workspace constraints implicit in existing object grasping datasets can cause neural network based grasping algorithms to fail on even a simple block stacking task when executed under more realistic circumstances. To address this, we introduce the JHU CoSTAR Block Stacking Dataset (BSD), where a robot interacts with 5.1 cm colored blocks to complete an order-fulfillment style block stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets. There are nearly 12,000 stacking attempts and over 2 million frames of real data. We discuss the ways in which this dataset provides a valuable resource for a broad range of other topics of investigation. We find that hand-designed neural networks that work on prior datasets do not generalize to this task. Thus, to establish a baseline for this dataset, we demonstrate an automated search of neural network based models using a novel multiple-input HyperTree MetaModel, and find a final model which makes reasonable 3D pose predictions for grasping and stacking on our dataset. The CoSTAR BSD, code, and instructions are available at sites.google.com/site/costardataset 
    more » « less
  3. Raman, B.; Murala, S.; Chowdhury, A.; Dhall, A.; Goyal, P. (Ed.)
    Using offline training schemes, researchers have tackled the event segmentation problem by providing full or weak-supervision through manually annotated labels or self-supervised epoch-based training. Most works consider videos that are at most 10’s of minutes long. We present a self-supervised perceptual prediction framework capable of temporal event segmentation by building stable representations of objects over time and demonstrate it on long videos, spanning several days at 25 FPS. The approach is deceptively simple but quite effective. We rely on predictions of high-level features computed by a standard deep learning backbone. For prediction, we use an LSTM, augmented with an attention mechanism, trained in a self-supervised manner using the prediction error. The self-learned attention maps effectively localize and track the event-related objects in each frame. The proposed approach does not require labels. It requires only a single pass through the video, with no separate training set. Given the lack of datasets of very long videos, we demonstrate our method on video from 10 d (254 h) of continuous wildlife monitoring data that we had collected with required permissions. We find that the approach is robust to various environmental conditions such as day/night conditions, rain, sharp shadows, and windy conditions. For the task of temporally locating events at the activity level, we had an 80% activity recall rate for one false activity detection every 50 min. We will make the dataset, which is the first of its kind, and the code available to the research community. Project page is available at https://ramymounir.com/publications/EventSegmentation/. 
    more » « less
  4. Abstract Predicting failure in solids has broad applications including earthquake prediction which remains an unattainable goal. However, recent machine learning work shows that laboratory earthquakes can be predicted using micro-failure events and temporal evolution of fault zone elastic properties. Remarkably, these results come from purely data-driven models trained with large datasets. Such data are equivalent to centuries of fault motion rendering application to tectonic faulting unclear. In addition, the underlying physics of such predictions is poorly understood. Here, we address scalability using a novel Physics-Informed Neural Network (PINN). Our model encodes fault physics in the deep learning loss function using time-lapse ultrasonic data. PINN models outperform data-driven models and significantly improve transfer learning for small training datasets and conditions outside those used in training. Our work suggests that PINN offers a promising path for machine learning-based failure prediction and, ultimately for improving our understanding of earthquake physics and prediction. 
    more » « less
  5. Abstract Quantum federated learning (QFL) can facilitate collaborative learning across multiple clients using quantum machine learning (QML) models, while preserving data privacy. Although recent advances in QFL span different tasks like classification while leveraging several data types, no prior work has focused on developing a QFL framework that utilizes temporal data to approximate functions useful to analyze the performance of distributed quantum sensing networks. In this paper, a novel QFL framework that is the first to integrate quantum long short-term memory (QLSTM) models with temporal data is proposed. The proposedfederated QLSTM (FedQLSTM)framework is exploited for performing the task of function approximation. In this regard, three key use cases are presented: Bessel function approximation, sinusoidal delayed quantum feedback control function approximation, and Struve function approximation. Simulation results confirm that, for all considered use cases, the proposed FedQLSTM framework achieves a faster convergence rate under one local training epoch, minimizing the overall computations, and saving 25–33% of the number of communication rounds needed until convergence compared to an FL framework with classical LSTM models. 
    more » « less