Feature acquisition in predictive modeling is an important task in many practical applications. For example, in patient health prediction, we do not fully observe a patient's personal features and need to dynamically select which features to acquire. Our goal is to acquire a small subset of features that maximizes prediction performance. Recently, some works have reformulated feature acquisition as a Markov decision process and applied reinforcement learning (RL) algorithms, where the reward reflects both prediction performance and feature acquisition cost. However, RL algorithms use only zeroth-order information on the reward, which leads to slow empirical convergence, especially when the number of actions (i.e., the number of features) is large. For predictive modeling, it is possible to use first-order information on the reward, i.e., gradients, since we are often given an already collected dataset. Therefore, we propose differentiable feature acquisition (DiFA), which uses a differentiable representation of the feature selection policy to enable gradients to flow from the prediction loss to the policy parameters. We conduct extensive experiments on various real-world datasets and show that DiFA significantly outperforms existing feature acquisition methods when the number of features is large.
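To illustrate what letting gradients flow from the prediction loss to the policy parameters can look like, here is a minimal sketch. It is not the DiFA method itself: the abstract does not specify the differentiable representation, so this sketch assumes a sigmoid "soft gate" per feature, a fixed linear predictor, and a per-feature cost `lam`. All names (`train_gates`, `loss_and_grad`, `W`, `DATA`) are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grad(theta, w, data, lam):
    """Mean squared error plus gate cost, and its gradient w.r.t. theta.

    Assumption: feature j is softly 'acquired' with gate g_j = sigmoid(theta_j),
    and the fixed predictor is y_hat = sum_j w_j * g_j * x_j.
    """
    gates = [sigmoid(t) for t in theta]
    n = len(data)
    loss = 0.0
    grad = [0.0] * len(theta)
    for x, y in data:
        y_hat = sum(wj * gj * xj for wj, gj, xj in zip(w, gates, x))
        err = y_hat - y
        loss += err * err / n
        for j, g in enumerate(gates):
            # Chain rule: d(err^2)/d(theta_j) = 2*err * w_j*x_j * g_j*(1-g_j)
            grad[j] += 2.0 * err * w[j] * x[j] * g * (1.0 - g) / n
    for j, g in enumerate(gates):
        # Acquisition cost: lam per (softly) acquired feature
        loss += lam * g
        grad[j] += lam * g * (1.0 - g)
    return loss, grad

def train_gates(w, data, lam=0.05, lr=0.5, steps=2000):
    """Plain gradient descent on the gate parameters; returns the final gates."""
    theta = [0.0] * len(w)
    for _ in range(steps):
        _, grad = loss_and_grad(theta, w, data, lam)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return [sigmoid(t) for t in theta]

# Toy data (hypothetical): feature 0 is informative, feature 1 is only
# weakly informative, and feature 2 is irrelevant to the label.
W = [1.0, 0.1, 0.0]
X = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
DATA = [(x, sum(wj * xj for wj, xj in zip(W, x))) for x in X]

GATES = train_gates(W, DATA)
print([round(g, 3) for g in GATES])
```

In this toy setting the gate for the informative feature stays open, while the gates for the weakly informative and irrelevant features close because their cost outweighs their contribution to the prediction loss. An RL formulation would instead have to estimate this trade-off from sampled rewards.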
Acquisition Conditioned Oracle for Nongreedy Active Feature Acquisition
We develop a novel methodology for active feature acquisition (AFA), the study of sequentially acquiring a dynamic subset of features that minimizes acquisition costs while still yielding accurate inference. The AFA framework can be useful in a myriad of domains, including health care applications where the cost of acquiring additional features for a patient (in terms of time, money, risk, etc.) can be weighed against the expected improvement in diagnostic performance. Previous approaches to AFA have employed one of the following: deep learning RL techniques, which have difficulty training policies due to a complicated state and action space; deep learning surrogate generative models, which require modeling complicated multidimensional conditional distributions; or greedy policies, which cannot account for jointly informative feature acquisitions. We show that we can bypass many of these challenges with a novel, nonparametric oracle-based approach, which we term the acquisition conditioned oracle (ACO). Extensive experiments show the superiority of the ACO to state-of-the-art AFA methods when acquiring features for both predictions and general decision-making.
- Award ID(s): 2133595
- PAR ID: 10543467
- Publisher / Repository: ICML 2024
- Date Published:
- Volume: 41
- Page Range / eLocation ID: 48957-48975
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Humans have the remarkable ability to recognize and acquire novel visual concepts in a zero-shot manner. Given a high-level, symbolic description of a novel concept in terms of previously learned visual concepts and their relations, humans can recognize novel concepts without seeing any examples. Moreover, they can acquire new concepts by parsing and communicating symbolic structures using learned visual concepts and relations. Endowing machines with these capabilities is pivotal in improving their generalization capability at inference time. We introduced Zero-shot Concept Recognition and Acquisition (ZeroC), a neuro-symbolic architecture that can recognize and acquire novel concepts in a zero-shot way. ZeroC represents concepts as graphs of constituent concept models (as nodes) and their relations (as edges). To allow inference-time composition, we employed energy-based models (EBMs) to model concepts and relations. We designed the ZeroC architecture so that it allows a one-to-one mapping between the symbolic graph structure of a concept and its corresponding EBM, which, for the first time, allows acquiring new concepts, communicating their graph structures, and applying them to classification and detection tasks (even across domains) at inference time. We introduced algorithms for learning and inference with ZeroC. We evaluated ZeroC on a challenging grid-world dataset that is designed to probe zero-shot concept recognition and acquisition, and demonstrated its capability.
-
Deep neural networks are efficient learning machines that leverage large amounts of manually labeled data to learn discriminative features. However, acquiring a substantial amount of supervised data, especially for videos, can be tedious across various computer vision tasks. This necessitates learning visual features from videos in an unsupervised setting. In this paper, we propose a computationally simple, yet effective, framework to learn spatio-temporal feature embeddings from unlabeled videos. We train a Convolutional 3D Siamese network using positive and negative pairs mined from videos under certain probabilistic assumptions. Experimental results on three datasets demonstrate that our proposed framework is able to learn weights that can be used for the same dataset and task as well as across datasets and tasks.
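As an illustration of how positive and negative pairs can drive a Siamese embedding, here is a minimal sketch of a standard margin-based contrastive loss on two embedding vectors. The abstract does not state which loss the authors use, so this particular loss, the `margin` value, and the function names are assumptions made for illustration only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, is_positive, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    Positive pairs are pulled together (loss = d^2); negative pairs are
    pushed apart until they are at least `margin` away
    (loss = max(0, margin - d)^2).
    """
    d = euclidean(emb_a, emb_b)
    if is_positive:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Usage: identical embeddings cost nothing as a positive pair,
# but incur the full margin penalty as a negative pair.
print(contrastive_loss([0.0, 0.0], [0.0, 0.0], is_positive=True))
print(contrastive_loss([0.0, 0.0], [0.0, 0.0], is_positive=False))
```

In a Siamese setup, both embeddings would come from the same network applied to two video clips, and this per-pair loss would be averaged over the mined pairs.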
-
Cascadilla Press (Ed.) The morphosyntactic information in grammatical number marking may be a useful cue for children in the process of acquiring number words. A language with dual marking, like Slovenian, may help children to bootstrap the meaning of the word “two” by drawing their attention to sets of two as a referent of language. If the dual marker indeed facilitates number learning, we hypothesized that “two” should be acquired earlier in populations exposed to the dual marker; the dual should be learned before “two”; and knowledge of the dual form should be correlated with knowledge of “two”. We tested these hypotheses by having Slovenian and English-speaking children complete the Give-a-Number and Give-Morphology tasks. We analyzed the Give-Morphology task in a new way, using stricter criteria than simple percent correct to determine that children “know” the morphological markers. In this sample, Slovenian children exposed to the dual marker did not show evidence of knowing “two” (i.e., being 2-knowers) at very young ages or earlier than English-speaking children. Knowledge of the dual marker neither preceded nor correlated with the acquisition of “two”. Indeed, the dual form was acquired only after the singular and plural. Parallel analyses were also conducted using an open data set with more Slovenian 2-knowers, yielding similar results. These findings present challenges for the claim that grammatical number plays a role in number acquisition. Specifically, this theory requires better articulation of how a dual-marked language can facilitate number acquisition if children do not notice or learn the dual form.
-
Avidan, S. (Ed.) The subpopulation shifting challenge, in which some subpopulations of a category are not seen during training, severely limits the classification performance of state-of-the-art convolutional neural networks. To mitigate this practical issue, we explore incremental subpopulation learning (ISL) to adapt the original model by incrementally learning the unseen subpopulations without retaining the seen population data. Striking a good balance between subpopulation learning and seen population forgetting is the main challenge in ISL, but it is not well studied by existing approaches. These incremental learners simply use a pre-defined and fixed hyperparameter to balance the learning objective and forgetting regularization, but their learning is usually biased towards either side in the long run. In this paper, we propose a novel two-stage learning scheme that explicitly disentangles acquisition and forgetting to achieve a better balance between subpopulation learning and seen population forgetting: in the first “gain-acquisition” stage, we progressively learn a new classifier based on the margin-enforce loss, which gives hard samples and populations a larger weight in classifier updating and avoids updating all populations uniformly; in the second “counter-forgetting” stage, we search for the proper combination of the new and old classifiers by optimizing a novel objective based on proxies of forgetting and acquisition. We benchmark the representative and state-of-the-art non-exemplar-based incremental learning methods on a large-scale subpopulation shifting dataset for the first time. Under almost all the challenging ISL protocols, we outperform other methods by a large margin, demonstrating our method's ability to alleviate the subpopulation shifting problem (code is released at https://github.com/wuyujack/ISL).

