skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Adversarially-Robust Inference on Trees via Belief Propagation
We introduce and study the problem of posterior inference on tree-structured graphical models in the presence of a malicious adversary who can corrupt some observed nodes. In the well-studied broadcasting on trees model, corresponding to the ferromagnetic Ising model on a d-regular tree with zero external field, when a natural signal-to-noise ratio exceeds one (the celebrated Kesten-Stigum threshold), the posterior distribution of the root given the leaves is bounded away from Ber(1/2), and carries nontrivial information about the sign of the root. This posterior distribution can be computed exactly via dynamic programming, also known as belief propagation. We first confirm a folklore belief that a malicious adversary who can corrupt an inverse-polynomial fraction of the leaves of their choosing makes this inference impossible. Our main result is that accurate posterior inference about the root vertex given the leaves is possible when the adversary is constrained to make corruptions at a ρ-fraction of randomly-chosen leaf vertices, so long as the signal-to-noise ratio exceeds O(logd) and ρ≤cε for some universal c>0. Since inference becomes information-theoretically impossible when ρ≫ε, this amounts to an information-theoretically optimal fraction of corruptions, up to a constant multiplicative factor. Furthermore, we show that the canonical belief propagation algorithm performs this inference.  more » « less
Award ID(s):
2238080
PAR ID:
10568955
Author(s) / Creator(s):
;
Publisher / Repository:
Conference on Learning Theory. The 37th Annual Conference on Learning Theory (COLT 2024)
Date Published:
Format(s):
Medium: X
Location:
Edmonton, Canada
Sponsoring Org:
National Science Foundation
More Like this
  1. The dueling bandits problem has received a lot of attention in recent years due to its applications in recommendation systems and information retrieval. However, due to the prevalence of malicious users in these systems, it is becoming increasingly important to design dueling bandit algorithms that are robust to corruptions introduced by these malicious users. In this paper we study dueling bandits in the presence of an adversary that can corrupt some of the feedback received by the learner. We propose an algorithm for this problem that is agnostic to the amount of corruption introduced by the adversary: its regret degrades gracefully with the amount of corruption, and in case of no corruption, it essentially matches the optimal regret bounds achievable in the purely stochastic dueling bandits setting. 
    more » « less
  2. Rank aggregation from pairwise preferences has widespread applications in recommendation systems and information retrieval. Given the enormous economic and societal impact of these applications, and the consequent incentives for malicious players to manipulate ranking outcomes in their favor, an important challenge is to make rank aggregation algorithms robust to adversarial manipulations in data. In this paper, we initiate the study of robustness in rank aggregation under the popular Bradley-Terry-Luce (BTL) model for pairwise comparisons. We consider a setting where pairwise comparisons are initially generated according to a BTL model, but a fraction of these comparisons are corrupted by an adversary prior to being reported to us. We consider a strong contamination model, where an adversary having complete knowledge of the initial truthful data and the underlying true BTL parameters, can subsequently corrupt the truthful data by inserting, deleting, or changing data points. The goal is to estimate the true score/weight of each item under the BTL model, even in the presence of these corruptions. We characterize the extent of adversarial corruption under which the true BTL parameters are uniquely identifiable. We also provide a novel pruning algorithm that provably cleans the data of adversarial corruption under reasonable conditions on data generation and corruption. We corroborate our theory with experiments on both synthetic as well as real data showing that previous algorithms are vulnerable to even small amounts of corruption, whereas our algorithm can clean a reasonably high amount of corruption. 
    more » « less
  3. Oshman, Rotem (Ed.)
    Broadcast protocols enable a set of n parties to agree on the input of a designated sender, even in the face of malicious parties who collude to attack the protocol. In the honest-majority setting, a fruitful line of work harnessed randomization and cryptography to achieve low-communication broadcast protocols with sub-quadratic total communication and with "balanced" sub-linear communication cost per party. However, comparatively little is known in the dishonest-majority setting. Here, the most communication-efficient constructions are based on the protocol of Dolev and Strong (SICOMP '83), and sub-quadratic broadcast has not been achieved even using randomization and cryptography. On the other hand, the only nontrivial ω(n) communication lower bounds are restricted to deterministic protocols, or against strong adaptive adversaries that can perform "after the fact" removal of messages. We provide communication lower bounds in this space, which hold against arbitrary cryptography and setup assumptions, as well as a simple protocol showing near tightness of our first bound. - Static adversary. We demonstrate a tradeoff between resiliency and communication for randomized protocols secure against n-o(n) static corruptions. For example, Ω(n⋅ polylog(n)) messages are needed when the number of honest parties is n/polylog(n); Ω(n√n) messages are needed for O(√n) honest parties; and Ω(n²) messages are needed for O(1) honest parties. Complementarily, we demonstrate broadcast with O(n⋅polylog(n)) total communication and balanced polylog(n) per-party cost, facing any constant fraction of static corruptions. - Weakly adaptive adversary. Our second bound considers n/2 + k corruptions and a weakly adaptive adversary that cannot remove messages "after the fact." We show that any broadcast protocol within this setting can be attacked to force an arbitrary party to send messages to k other parties. Our bound implies limitations on the feasibility of balanced low-communication protocols: For example, ruling out broadcast facing 51% corruptions, in which all non-sender parties have sublinear communication locality. 
    more » « less
  4. We present a general Bernoulli Gaussian scale mixture based approach for modeling priors that can represent a large class of random signals. For inference, we introduce belief propagation (BP) to multi-snapshot signal recovery based on the minimum mean square error estimation criteria. Our method relies on intra-snapshot messages that update the signal vector for each snapshot and inter-snapshot messages that share probabilistic information related to the common sparsity structure across snapshots. Despite the very general model, our BP method can efficiently compute accurate approximations of marginal posterior PDFs. Preliminary numerical results illustrate the superior convergence rate and improved performance of the proposed method compared to approaches based on sparse Bayesian learning (SBL). 
    more » « less
  5. We present a general Bernoulli Gaussian scale mixture based approach for modeling priors that can represent a large class of random signals. For inference, we introduce belief propagation (BP) to multi-snapshot signal recovery based on the minimum mean square error estimation criteria. Our method relies on intra-snapshot messages that update the signal vector for each snapshot and inter-snapshot messages that share probabilistic information related to the common sparsity structure across snapshots. Despite the very general model, our BP method can efficiently compute accurate approximations of marginal posterior PDFs. Preliminary numerical results illustrate the superior convergence rate and improved performance of the proposed method compared to approaches based on sparse Bayesian learning (SBL). 
    more » « less