skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Symphony in the Latent Space: Provably Integrating High-dimensional Techniques with Non-linear Machine Learning Models
This paper revisits building machine learning algorithms that involve interactions between entities, such as those between financial assets in an actively managed portfolio, or interactions between users in a social network. Our goal is to forecast the future evolution of ensembles of multivariate time series in such applications (e.g., the future return of a financial asset or the future popularity of a Twitter account). Designing ML algorithms for such systems requires addressing the challenges of high-dimensional interactions and non-linearity. Existing approaches usually adopt an ad-hoc approach to integrating high-dimensional techniques into non-linear models and re- cent studies have shown these approaches have questionable efficacy in time-evolving interacting systems. To this end, we propose a novel framework, which we dub as the additive influence model. Under our modeling assump- tion, we show that it is possible to decouple the learning of high-dimensional interactions from the learning of non-linear feature interactions. To learn the high-dimensional interac- tions, we leverage kernel-based techniques, with provable guarantees, to embed the entities in a low-dimensional latent space. To learn the non-linear feature-response interactions, we generalize prominent machine learning techniques, includ- ing designing a new statistically sound non-parametric method and an ensemble learning algorithm optimized for vector re- gressions. Extensive experiments on two common applica- tions demonstrate that our new algorithms deliver significantly stronger forecasting power compared to standard and recently proposed methods.  more » « less
Award ID(s):
2008557
PAR ID:
10468112
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
AAAI
Date Published:
Format(s):
Medium: X
Location:
Washington DC
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper revisits building machine learning algorithms that involve interactions between entities, such as those between financial assets in an actively managed portfolio, or interac- tions between users in a social network. Our goal is to forecast the future evolution of ensembles of multivariate time series in such applications (e.g., the future return of a financial asset or the future popularity of a Twitter account). Designing ML algorithms for such systems requires addressing the challenges of high-dimensional interactions and non-linearity. Existing approaches usually adopt an ad-hoc approach to integrating high-dimensional techniques into non-linear models and re- cent studies have shown these approaches have questionable efficacy in time-evolving interacting systems. To this end, we propose a novel framework, which we dub as the additive influence model. Under our modeling assump- tion, we show that it is possible to decouple the learning of high-dimensional interactions from the learning of non-linear feature interactions. To learn the high-dimensional interac- tions, we leverage kernel-based techniques, with provable guarantees, to embed the entities in a low-dimensional latent space. To learn the non-linear feature-response interactions, we generalize prominent machine learning techniques, includ- ing designing a new statistically sound non-parametric method and an ensemble learning algorithm optimized for vector re- gressions. Extensive experiments on two common applica- tions demonstrate that our new algorithms deliver significantly stronger forecasting power compared to standard and recently proposed methods. 
    more » « less
  2. An increasingly popular machine learning paradigm is to pretrain a neural network (NN) on many tasks offline, then adapt it to downstream tasks, often by re-training only the last linear layer of the network. This approach yields strong downstream performance in a variety of contexts, demonstrating that multitask pretraining leads to effective feature learning. Although several recent theoretical studies have shown that shallow NNs learn meaningful features when either (i) they are trained on a single task or (ii) they are linear, very little is known about the closer-to-practice case of nonlinear NNs trained on multiple tasks. In this work, we present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks. Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks. Using this observation, we show that when the tasks are binary classification tasks with labels depending on the projection of the data onto an r-dimensional subspace within the d k r-dimensional input space, a simple gradient-based multitask learning algorithm on a two-layer ReLU NN recovers this projection, allowing for generalization to downstream tasks with sample and neuron complexity independent of d. In contrast, we show that with high probability over the draw of a single task, training on this single task cannot guarantee to learn all r ground-truth features. 
    more » « less
  3. We propose DEER (Descriptive Knowledge Graph for Explaining Entity Relationships) - an open and informative form of modeling entity relationships. In DEER, relationships between entities are represented by free-text relation descriptions. For instance, the relationship between entities of machine learning and algorithm can be represented as “Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.” To construct DEER, we propose a self-supervised learning method to extract relation descriptions with the analysis of dependency patterns and generate relation descriptions with a transformer-based relation description synthesizing model, where no human labeling is required. Experiments demonstrate that our system can extract and generate high-quality relation descriptions for explaining entity relationships. The results suggest that we can build an open and informative knowledge graph without human annotation. 
    more » « less
  4. We propose DEER (Descriptive Knowledge Graph for Explaining Entity Relationships) - an open and informative form of modeling entity relationships. In DEER, relationships between entities are represented by free-text relation descriptions. For instance, the relationship between entities of machine learning and algorithm can be represented as {``}Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.{''} To construct DEER, we propose a self-supervised learning method to extract relation descriptions with the analysis of dependency patterns and generate relation descriptions with a transformer-based relation description synthesizing model, where no human labeling is required. Experiments demonstrate that our system can extract and generate high-quality relation descriptions for explaining entity relationships. The results suggest that we can build an open and informative knowledge graph without human annotation. 
    more » « less
  5. De Lorenzis, L; Papadrakakis, M; Zohdi T.I. (Ed.)
    This paper introduces a neural kernel method to generate machine learning plasticity models for micropolar and micromorphic materials that lack material symmetry and have internal structures. Since these complex materials often require higher-dimensional parametric space to be precisely characterized, we introduce a representation learning step where we first learn a feature vector space isomorphic to a finite-dimensional subspace of the original parametric function space from the augmented labeled data expanded from the narrow band of the yield data. This approach simplifies the data augmentation step and enables us to constitute the high-dimensional yield surface in a feature space spanned by the feature kernels. In the numerical examples, we first verified the implementations with data generated from known models, then tested the capacity of the models to discover feature spaces from meso-scale simulation data generated from representative elementary volume (RVE) of heterogeneous materials with internal structures. The neural kernel plasticity model and other alternative machine learning approaches are compared in a computational homogenization problem for layered geomaterials. The results indicate that the neural kernel feature space may lead to more robust forward predictions against sparse and high-dimensional data. 
    more » « less