# Search for:All records

Editors contains: "Shpitser, Ilya"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

1. ; (Ed.)
We consider the problem of maximizing submodular functions under submodular constraints by formulating the problem in two ways: \SCSKC and \DiffC. Given two submodular functions $f$ and $g$ where $f$ is monotone, the objective of \SCSKC problem is to find a set $S$ of size at most $k$ that maximizes $f(S)$ under the constraint that $g(S)\leq \theta$, for a given value of $\theta$. The problem of \DiffC focuses on finding a set $S$ of size at most $k$ such that $h(S) = f(S)-g(S)$ is maximized. It is known that these problems are highly inapproximable and do not admit any constant factor multiplicative approximation algorithms unless NP is easy. Known approximation algorithms involve data-dependent approximation factors that are not efficiently computable. We initiate a study of the design of approximation algorithms where the approximation factors are efficiently computable. For the problem of \SCSKC, we prove that the greedy algorithm produces a solution whose value is at least $(1-1/e)f(\OPT) - A$, where $A$ is the data-dependent additive error. For the \DiffC problem, we design an algorithm that uses the \SCSKC greedy algorithm as a subroutine. This algorithm produces a solution whose value is at least $(1-1/e)h(\OPT)-B$, where $B$ is also a data-dependent additive error. A salient feature of our approach is that the additive error terms can be computed efficiently, thus enabling us to ascertain the quality of the solutions produced.
more » « less
Free, publicly-accessible full text available July 31, 2024
2. ; (Ed.)
Just-in-Time Adaptive Interventions (JITAIs) are a class of personalized health interventions developed within the behavioral science community. JITAIs aim to provide the right type and amount of support by iteratively selecting a sequence of intervention options from a pre-defined set of components in response to each individual's time varying state. In this work, we explore the application of reinforcement learning methods to the problem of learning intervention option selection policies. We study the effect of context inference error and partial observability on the ability to learn effective policies. Our results show that the propagation of uncertainty from context inferences is critical to improving intervention efficacy as context uncertainty increases, while policy gradient algorithms can provide remarkable robustness to partially observed behavioral state information.
more » « less
Free, publicly-accessible full text available July 31, 2024
3. ; (Ed.)
Crowdsourcing is an effective and efficient paradigm for obtaining labels for unlabeled corpus employing crowd workers. This work considers the budget allocation problem for a generalized setting on a graph of instances to be labeled where edges encode instance dependencies. Specifically, given a graph and a labeling budget, we propose an optimal policy to allocate the budget among the instances to maximize the overall labeling accuracy. We formulate the problem as a Bayesian Markov Decision Process (MDP), where we define our task as an optimization problem that maximizes the overall label accuracy under budget constraints. Then, we propose a novel stage-wise reward function that considers the effect of worker labels on the whole graph at each timestamp. This reward function is utilized to find an optimal policy for the optimization problem. Theoretically, we show that our proposed policies are consistent when the budget is infinite. We conduct extensive experiments on five real-world graph datasets and demonstrate the effectiveness of the proposed policies to achieve a higher label accuracy under budget constraints.
more » « less
Free, publicly-accessible full text available July 31, 2024
4. ; (Ed.)
Most existing approaches of differentially private (DP) machine learning focus on private training. Despite its many advantages, private training lacks the flexibility in adapting to incremental changes to the training dataset such as deletion requests from exercising GDPR’s right to be forgotten. We revisit a long-forgotten alternative, known as private prediction, and propose a new algorithm named Individual Kernelized Nearest Neighbor (Ind-KNN). Ind-KNN is easily updatable over dataset changes and it allows precise control of the Rényi DP at an individual user level — a user’s privacy loss is measured by the exact amount of her contribution to predictions; and a user is removed if her prescribed privacy budget runs out. Our results show that Ind-KNN consistently improves the accuracy over existing private prediction methods for a wide range of epsilon on four vision and language tasks. We also illustrate several cases under which Ind-KNN is preferable over private training with NoisySGD.
more » « less
Free, publicly-accessible full text available July 31, 2024