NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Towards Cost Sensitive Decision Making

Li, Yang; Oliva, Junier (May 2025, International Conference on Artificial Intelligence and Statistics (AISTATS) 2025)

Many real-world situations allow for the acquisition of additional relevant information when making decisions with limited or uncertain data. However, traditional RL approaches either require all features to be acquired beforehand (e.g. in a MDP) or regard part of them as missing data that cannot be acquired (e.g. in a POMDP). In this work, we consider RL models that may actively acquire features from the environment to improve the decision quality and certainty, while automatically balancing the cost of feature acquisition process and the reward of task decision process. We propose the Active-Acquisition POMDP and identify two types of the acquisition process for different application domains. In order to assist the agent in the actively-acquired partially-observed environment and alleviate the exploration-exploitation dilemma, we develop a model-based approach, where a deep generative model is utilized to capture the dependencies of the features and impute the unobserved features. The imputations essentially represent the beliefs of the agent. Equipped with the dynamics model, we develop hierarchical RL algorithms to resolve both types of the AA-POMDPs. Empirical results demonstrate that our approach achieves considerably better performance than existing POMDP-RL solutions.
more » « less
Free, publicly-accessible full text available May 5, 2026
Unsupervised Imputation of Non-Ignorably Missing Data Using Importance-Weighted Autoencoders

https://doi.org/10.1080/19466315.2024.2368787

Lim, David K; Rashid, Naim U; Oliva, Junier B; Ibrahim, Joseph G (July 2024, Statistics in Biopharmaceutical Research)

Deep Learning (DL) methods have dramatically increased in popularity in recent years. While its initial success was demonstrated in the classification and manipulation of image data, there has been significant growth in the application of DL methods to problems in the biomedical sciences. However, the greater prevalence and complexity of missing data in biomedical datasets present significant challenges for DL methods. Here, we provide a formal treatment of missing data in the context of Variational Autoencoders (VAEs), a popular unsupervised DL architecture commonly used for dimension reduction, imputation, and learning latent representations of complex data. We propose a new VAE architecture, NIMIWAE, that is one of the first to flexibly account for both ignorable and non-ignorable patterns of missingness in input features at training time. Following training, samples can be drawn from the approximate posterior distribution of the missing data can be used for multiple imputation, facilitating downstream analyses on high dimensional incomplete datasets. We demonstrate through statistical simulation that our method outperforms existing approaches for unsupervised learning tasks and imputation accuracy. We conclude with a case study of an EHR dataset pertaining to 12,000 ICU patients containing a large number of diagnostic measurements and clinical outcomes, where many features are only partially observed.
more » « less
Full Text Available
Acquisition Conditioned Oracle for Nongreedy Active Feature Acquisition

Valancius, M; Lennon, M; Oliva, J (July 2024, ICML 2024)

We develop novel methodology for active feature acquisition (AFA), the study of sequentially acquiring a dynamic subset of features that minimizes acquisition costs whilst still yielding accurate inference. The AFA framework can be useful in a myriad of domains, including health care applications where the cost of acquiring additional features for a patient (in terms of time, money, risk, etc.) can be weighed against the expected improvement to diagnostic performance. Previous approaches for AFA have employed either: deep learning RL techniques, which have difficulty training policies due to a complicated state and action space; deep learning surrogate generative models, which require modeling complicated multidimensional conditional distributions; or greedy policies, which cannot account for jointly informative feature acquisitions. We show that we can bypass many of these challenges with a novel, nonparametric oracle based approach, which we coin the acquisition conditioned oracle (ACO). Extensive experiments show the superiority of the ACO to state-of-the-art AFA methods when acquiring features for both predictions and general decision-making.
more » « less
Full Text Available
Phoneme Hallucinator: One-Shot Voice Conversion via Set Expansion

https://doi.org/10.1609/aaai.v38i13.29411

Shan, Siyuan; Li, Yang; Banerjee, Amartya; Oliva, Junier B (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Voice conversion (VC) aims at altering a person's voice to make it sound similar to the voice of another person while preserving linguistic content. Existing methods suffer from a dilemma between content intelligibility and speaker similarity; i.e., methods with higher intelligibility usually have a lower speaker similarity, while methods with higher speaker similarity usually require plenty of target speaker voice data to achieve high intelligibility. In this work, we propose a novel method Phoneme Hallucinator that achieves the best of both worlds. Phoneme Hallucinator is a one-shot VC model; it adopts a novel model to hallucinate diversified and high-fidelity target speaker phonemes based just on a short target speaker voice (e.g. 3 seconds). The hallucinated phonemes are then exploited to perform neighbor-based voice conversion. Our model is a text-free, any-to-any VC model that requires no text annotations and supports conversion to any unseen speaker. Quantitative and qualitative evaluations show that Phoneme Hallucinator outperforms existing VC methods for both intelligibility and speaker similarity.
more » « less
Full Text Available
NRTSI: Non-Recurrent Time Series Imputation

https://doi.org/10.1109/ICASSP49357.2023.10095054

Shan, Siyuan; Li, Yang; Oliva, Junier B. (June 2023, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))

Full Text Available
Continuously Parameterized Mixture Models

Bender, Christopher M (January 2023, International Conference on Machine Learning)

Full Text Available
Learning to Retrieve Videos by Asking Questions

https://doi.org/10.1145/3503161.3548361

Madasu, Avinash; Oliva, Junier; Bertasius, Gedas (October 2022, Proceedings of the 30th ACM International Conference on Multimedia)

Full Text Available
Transparent single-cell set classification with kernel mean embeddings

https://doi.org/10.1145/3535508.3545538

Shan, Siyuan; Baskaran, Vishal Athreya; Yi, Haidong; Ranek, Jolene; Stanley, Natalie; Oliva, Junier B. (August 2022, Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics)

Full Text Available
Distribution-based sketching of single-cell samples

https://doi.org/10.1145/3535508.3545539

Baskaran, Vishal Athreya; Ranek, Jolene; Shan, Siyuan; Stanley, Natalie; Oliva, Junier B. (August 2022, Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics)

Full Text Available
Practical Integration via Separable Bijective Networks

Bender, Christopher M.; Emmanuel, Patrick; Reiter, Michael K.; Oliva, Junier B. (April 2022, International Conference on Learning Representations)

Neural networks have enabled learning over examples that contain thousands of dimensions. However, most of these models are limited to training and evaluating on a finite collection of points and do not consider the hypervolume in which the data resides. Any analysis of the model’s local or global behavior is therefore limited to very expensive or imprecise estimators. We propose to formulate neural networks as a composition of a bijective (flow) network followed by a learnable, separable network. This construction allows for learning (or assessing) over full hypervolumes with precise estimators at tractable computational cost via integration over the input space. We develop the necessary machinery, propose several practical integrals to use during training, and demonstrate their utility.
more » « less
Full Text Available

« Prev Next »

Search for: All records