NSF PAR Search | NSF Public Access Repository

Learning to Help in Multi-Class Settings

Wu, Wu; Li, Yansong; Dong, Zeyu; Sathyavageeswaran, Nitya; Sarwate, Anand D (March 2025, The Thirteenth International Conference on Learning Representations)

Deploying complex machine learning models on resource-constrained devices is challenging due to limited computational power, memory, and model retrainability. To address these limitations, a hybrid system can be established by augmenting the local model with a server-side model, where samples are selectively deferred by a rejector and then sent to the server for processing. The hybrid system enables efficient use of computational resources while minimizing the overhead associated with server usage. The recently proposed Learning to Help (L2H) model trains a server model given a fixed local (client) model. This differs from the Learning to Defer (L2D) framework, which trains the client for a fixed (expert) server. In both L2D and L2H, the training includes learning a rejector at the client to determine when to query the server. In this work, we extend the L2H model from binary to multi-class classification problems and demonstrate its applicability in a number of different scenarios of practical interest in which access to the server may be limited by cost, availability, or policy. We derive a stage-switching surrogate loss function that is differentiable, convex, and consistent with the Bayes rule corresponding to the 0-1 loss for the L2H model. Experiments show that our proposed methods offer an efficient and practical solution for multi-class classification in resource-constrained environments.
Free, publicly-accessible full text available March 1, 2026
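
As a rough illustration of the client/server deferral described in the abstract above, the Python sketch below routes a sample to the server only when a rejector fires. It is a minimal sketch, not the paper's method: the names `hybrid_predict` and `confidence_rejector`, the fixed confidence threshold, and the toy models are all assumptions made for illustration, whereas L2H learns the rejector together with the server model through a surrogate loss.

```python
from typing import Callable, Sequence, Tuple

Probs = Sequence[float]


def confidence_rejector(threshold: float) -> Callable[[Probs], bool]:
    """Hypothetical rejector: defer when the client's top probability is low."""
    def reject(client_probs: Probs) -> bool:
        return max(client_probs) < threshold
    return reject


def hybrid_predict(
    x: object,
    client: Callable[[object], Probs],
    server: Callable[[object], int],
    rejector: Callable[[Probs], bool],
) -> Tuple[int, bool]:
    """Return (predicted class, whether the server was queried)."""
    probs = client(x)
    if rejector(probs):
        # Deferred sample: pay the server cost and use its prediction.
        return server(x), True
    # Kept locally: take the client's most probable class.
    label = max(range(len(probs)), key=lambda k: probs[k])
    return label, False


# Toy stand-ins for the two models (assumptions, not trained models).
def toy_client(x: object) -> Probs:
    return [0.9, 0.05, 0.05] if x == "easy" else [0.4, 0.35, 0.25]


def toy_server(x: object) -> int:
    return 2


reject = confidence_rejector(threshold=0.6)
print(hybrid_predict("easy", toy_client, toy_server, reject))  # (0, False)
print(hybrid_predict("hard", toy_client, toy_server, reject))  # (2, True)
```

Raising the threshold sends more samples to the server, which is the cost/accuracy trade-off the abstract refers to.
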
Timely Offloading in Mobile Edge Cloud Systems

https://doi.org/10.1109/ITW61385.2024.10806971

Sathyavageeswaran, Nitya; Yates, Roy D; Sarwate, Anand D; Mandayam, Narayan (November 2024, IEEE)

Future real-time applications like smart cities will use complex Machine Learning (ML) models for a variety of tasks. Timely status information is required for these applications to be reliable. Offloading computation to a mobile edge cloud (MEC) can reduce the completion time of these tasks. However, using the MEC may come at a cost, such as one related to the use of a cloud service or to privacy. In this paper, we consider a source that generates time-stamped status updates for delivery to a monitor after processing by the mobile device or MEC. We study how a scheduler must forward these updates to achieve timely updates at the monitor while also limiting MEC usage. We measure timeliness at the monitor using the age of information (AoI) metric. We formulate this problem as an infinite horizon Markov decision process (MDP) with an average cost criterion. We prove that an optimal scheduling policy has an age-threshold structure that depends on how long an update has been in service.
Free, publicly-accessible full text available November 24, 2025
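
To make the age of information metric concrete, the Python sketch below computes the age at the monitor as the current time minus the generation timestamp of the freshest delivered update, and then applies a threshold rule for offloading. This is a sketch under stated assumptions: the function names, the direction of the rule (offload once the age reaches the threshold), and the threshold table are hypothetical. In the paper the thresholds follow from the MDP analysis, which is not reproduced here.

```python
from typing import Sequence


def age_of_information(now: float, freshest_timestamp: float) -> float:
    """Age at the monitor: time elapsed since the freshest update was generated."""
    if now < freshest_timestamp:
        raise ValueError("an update cannot be generated in the future")
    return now - freshest_timestamp


def should_offload(age: float, time_in_service: int,
                   thresholds: Sequence[float]) -> bool:
    """Hypothetical age-threshold rule indexed by time already spent in service.

    thresholds[s] is an assumed age threshold for an update that has been
    in service for s slots; the last entry is reused for larger s.
    """
    idx = min(time_in_service, len(thresholds) - 1)
    return age >= thresholds[idx]


# Made-up thresholds for illustration only.
thresholds = [8.0, 6.0, 4.0]
age = age_of_information(now=10.0, freshest_timestamp=5.0)
print(age)                                  # 5.0
print(should_offload(age, 0, thresholds))   # False
print(should_offload(age, 2, thresholds))   # True
```
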
Machine Learning with Differential Privacy

https://doi.org/10.1201/9781003185284-9

Sarwate, Anand D (August 2024, Chapman and Hall/CRC)

Full Text Available
Structured Low-Rank Tensors for Generalized Linear Models

Taki, Batoul A; Sarwate, Anand; Bajwa, Waheed U (August 2023, Transactions on machine learning research)

Recent works have shown that imposing tensor structures on the coefficient tensor in regression problems can lead to more reliable parameter estimation and lower sample complexity compared to vector-based methods. This work investigates a new low-rank tensor model, called Low Separation Rank (LSR), in Generalized Linear Model (GLM) problems. The LSR model – which generalizes the well-known Tucker and CANDECOMP/PARAFAC (CP) models, and is a special case of the Block Tensor Decomposition (BTD) model – is imposed onto the coefficient tensor in the GLM model. This work proposes a block coordinate descent algorithm for parameter estimation in LSR-structured tensor GLMs. Most importantly, it derives a minimax lower bound on the error threshold on estimating the coefficient tensor in LSR tensor GLM problems. The minimax bound is proportional to the intrinsic degrees of freedom in the LSR tensor GLM problem, suggesting that its sample complexity may be significantly lower than that of vectorized GLMs. This result can also be specialized to lower bound the estimation error in CP and Tucker-structured GLMs. The derived bounds are comparable to tight bounds in the literature for Tucker linear regression, and the tightness of the minimax lower bound is further assessed numerically. Finally, numerical experiments on synthetic datasets demonstrate the efficacy of the proposed LSR tensor model for three regression types (linear, logistic and Poisson). Experiments on a collection of medical imaging datasets demonstrate the usefulness of the LSR model over other tensor models (Tucker and CP) on real, imbalanced data with limited available samples.
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Full Text Available
Computationally Efficient Codes for Adversarial Binary-Erasure Channels

https://doi.org/10.1109/ISIT54713.2023.10206731

Li, Sijie; Krishnan, Prasad; Jaggi, Sidharth; Langberg, Michael; Sarwate, Anand D. (June 2023, IEEE)

We study communication models for channels with erasures in which the erasure pattern can be controlled by an adversary with partial knowledge of the transmitted codeword. In particular, we design block codes for binary-input channels in which an adversary can erase a fraction p of the transmitted bits. We consider causal adversaries, who must choose to erase an input bit using knowledge of that bit and previously transmitted bits, and myopic adversaries, who can choose an erasure pattern based on observing the transmitted codeword through a binary erasure channel with random erasures. For both settings we design efficient (polynomial time) encoding and decoding algorithms that use randomization at the encoder only. Our constructions achieve capacity for the causal and “sufficiently myopic” models. For the “insufficiently myopic” adversary, the capacity is unknown, but existing converses show the capacity is zero for a range of parameters. For all parameters outside of that range, our construction achieves positive rates.
Approximating Functions with Approximate Privacy for Applications in Signal Estimation and Learning

https://doi.org/10.3390/e25050825

Tasnim, Naima; Mohammadi, Jafar; Sarwate, Anand D.; Imtiaz, Hafiz (May 2023, Entropy)

Large corporations, government entities and institutions such as hospitals and census bureaus routinely collect our personal and sensitive information for providing services. A key technological challenge is designing algorithms for these services that provide useful results, while simultaneously maintaining the privacy of the individuals whose data are being shared. Differential privacy (DP) is a cryptographically motivated and mathematically rigorous approach for addressing this challenge. Under DP, a randomized algorithm provides privacy guarantees by approximating the desired functionality, leading to a privacy–utility trade-off. Strong (pure DP) privacy guarantees are often costly in terms of utility. Motivated by the need for a more efficient mechanism with better privacy–utility trade-off, we propose Gaussian FM, an improvement to the functional mechanism (FM) that offers higher utility at the expense of a weakened (approximate) DP guarantee. We analytically show that the proposed Gaussian FM algorithm can offer orders of magnitude smaller noise compared to the existing FM algorithms. We further extend our Gaussian FM algorithm to decentralized-data settings by incorporating the CAPE protocol and propose capeFM. Our method can offer the same level of utility as its centralized counterparts for a range of parameter choices. We empirically show that our proposed algorithms outperform existing state-of-the-art approaches on synthetic and real datasets.
Full Text Available
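
The move from pure to approximate DP mentioned in the abstract above can be illustrated with the classical Gaussian mechanism, which adds zero-mean Gaussian noise scaled to a query's L2 sensitivity. The Python sketch below is not the Gaussian FM or capeFM algorithm from the paper. It uses the standard calibration sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon, which holds for 0 < epsilon < 1, and the function names are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Optional, Sequence


def gaussian_sigma(l2_sensitivity: float, epsilon: float, delta: float) -> float:
    """Noise scale of the classical Gaussian mechanism for (epsilon, delta)-DP."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this calibration assumes 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if l2_sensitivity <= 0.0:
        raise ValueError("sensitivity must be positive")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon


def gaussian_mechanism(values: Sequence[float], l2_sensitivity: float,
                       epsilon: float, delta: float,
                       rng: Optional[random.Random] = None) -> List[float]:
    """Release `values` with independent Gaussian noise added to each entry."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Example: perturb a two-dimensional statistic (made-up numbers).
noisy = gaussian_mechanism([0.2, -1.3], l2_sensitivity=1.0,
                           epsilon=0.5, delta=1e-5, rng=random.Random(0))
print(noisy)
```

Smaller epsilon or delta gives a larger sigma, which is the privacy-utility trade-off the abstract describes.
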
Differential Fairness: An Intersectional Framework for Fair AI

https://doi.org/10.3390/e25040660

Islam, Rashidul; Keya, Kamrun Naher; Pan, Shimei; Sarwate, Anand D.; Foulds, James R. (April 2023, Entropy)

We propose definitions of fairness in machine learning and artificial intelligence systems that are informed by the framework of intersectionality, a critical lens from the legal, social science, and humanities literature which analyzes how interlocking systems of power and oppression affect individuals along overlapping dimensions including gender, race, sexual orientation, class, and disability. We show that our criteria behave sensibly for any subset of the set of protected attributes, and we prove economic, privacy, and generalization guarantees. Our theoretical results show that our criteria meaningfully operationalize AI fairness in terms of real-world harms, making the measurements interpretable in a manner analogous to differential privacy. We provide a simple learning algorithm using deterministic gradient methods, which respects our intersectional fairness criteria. The measurement of fairness becomes statistically challenging in the minibatch setting due to data sparsity, which increases rapidly in the number of protected attributes and in the values per protected attribute. To address this, we further develop a practical learning algorithm using stochastic gradient methods which incorporates stochastic estimation of the intersectional fairness criteria on minibatches to scale up to big data. Case studies on census data, the COMPAS criminal recidivism dataset, the HHP hospitalization data, and a loan application dataset from HMDA demonstrate the utility of our methods.
Full Text Available
Quadratically Constrained Myopic Adversarial Channels

https://doi.org/10.1109/TIT.2022.3167554

Zhang, Yihan; Vatedka, Shashank; Jaggi, Sidharth; Sarwate, Anand D. (August 2022, IEEE Transactions on Information Theory)

Full Text Available
The Capacity of Causal Adversarial Channels

https://doi.org/10.1109/ISIT50566.2022.9834709

Zhang, Yihan; Jaggi, Sidharth; Langberg, Michael; Sarwate, Anand D. (June 2022, IEEE International Symposium on Information Theory (ISIT))

Full Text Available
Low-Rank Phase Retrieval with Structured Tensor Models

https://doi.org/10.1109/ICASSP43922.2022.9746452

Kwon, Soo Min; Li, Xin; Sarwate, Anand D. (May 2022, 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))

We study the low-rank phase retrieval problem, where the objective is to recover a sequence of signals (typically images) given the magnitude of linear measurements of those signals. Existing solutions involve recovering a matrix constructed by vectorizing and stacking each image. These solutions model this matrix as low-rank and leverage the low-rank property to decrease the sample complexity required for accurate recovery. However, when the number of available measurements is more limited, these low-rank matrix models can often fail. We propose an algorithm called Tucker-Structured Phase Retrieval (TSPR) that models the sequence of images as a tensor, rather than a matrix, and factorizes it using the Tucker decomposition. This factorization reduces the number of parameters that need to be estimated, allowing for a more accurate reconstruction. We demonstrate the effectiveness of our approach on real video datasets under several different measurement models.
Full Text Available
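
The parameter saving from a Tucker factorization can be checked with simple arithmetic. The Python sketch below compares the number of entries in a full tensor with the number of parameters in a Tucker model, counted as one core tensor plus one factor matrix per mode. The dimensions and ranks are made-up values for illustration, not settings from the paper, and the count ignores the redundancy that comes from rotating the factors.

```python
from math import prod
from typing import Sequence


def full_param_count(dims: Sequence[int]) -> int:
    """Number of entries in an unstructured tensor with the given dimensions."""
    return prod(dims)


def tucker_param_count(dims: Sequence[int], ranks: Sequence[int]) -> int:
    """Core tensor entries plus entries of one (n_i x r_i) factor per mode."""
    if len(dims) != len(ranks):
        raise ValueError("need exactly one rank per tensor mode")
    core = prod(ranks)
    factors = sum(n * r for n, r in zip(dims, ranks))
    return core + factors


# Hypothetical example: 20 frames of 32 x 32 images with small Tucker ranks.
dims = (32, 32, 20)
ranks = (5, 5, 3)
print(full_param_count(dims))           # 20480
print(tucker_param_count(dims, ranks))  # 455
```
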
