Title: Conformal Frequency Estimation with Sketched Data
A flexible conformal inference method is developed to construct confidence intervals for the frequencies of queried objects in very large data sets, based on a much smaller sketch of those data. The approach is data-adaptive and requires no knowledge of the data distribution or of the details of the sketching algorithm; instead, it constructs provably valid frequentist confidence intervals under the sole assumption of data exchangeability. Although our solution is broadly applicable, this paper focuses on applications involving the count-min sketch algorithm and a non-linear variation thereof. The performance is compared to that of frequentist and Bayesian alternatives through simulations and experiments with data sets of SARS-CoV-2 DNA sequences and classic English literature.
Award ID(s):
2210637
PAR ID:
10408706
Author(s) / Creator(s):
Date Published:
Journal Name:
Advances in Neural Information Processing Systems
Volume:
35
ISSN:
1049-5258
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
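
The abstract above centers on the count-min sketch, which stores a stream in a small table of hashed counters and answers frequency queries with an estimate that can only overcount. Below is a minimal Python sketch of that data structure; the width/depth parameters, the salted built-in hash, and the toy stream are illustrative assumptions, not the configuration used in the paper.

```python
# Illustrative count-min sketch (CMS); a generic version, not the paper's exact setup.
import random
from collections import Counter

class CountMinSketch:
    def __init__(self, width=2048, depth=4, seed=0):
        rng = random.Random(seed)
        self.width, self.depth = width, depth
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Salted built-in hash as a stand-in for a pairwise-independent hash family.
        return hash((self.salts[row], item)) % self.width

    def update(self, item, count=1):
        for r in range(self.depth):
            self.table[r][self._index(r, item)] += count

    def query(self, item):
        # The CMS estimate is an upper bound on the true frequency (it never undercounts).
        return min(self.table[r][self._index(r, item)] for r in range(self.depth))

# Toy usage: sketch a stream and compare the (over)estimate to the exact count.
stream = ["cat", "dog", "cat", "bird", "cat", "dog"]
cms = CountMinSketch()
for token in stream:
    cms.update(token)
truth = Counter(stream)
print(cms.query("cat"), truth["cat"])   # estimate >= 3
```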
More Like this
  1. Michael Mahoney (Ed.)
    This paper develops conformal inference methods to construct a confidence interval for the frequency of a queried object in a very large discrete data set, based on a sketch with a lower memory footprint. This approach requires no knowledge of the data distribution and can be combined with any sketching algorithm, including but not limited to the renowned count-min sketch, the count-sketch, and variations thereof. After explaining how to achieve marginal coverage for exchangeable random queries, we extend our solution to provide stronger inferences that can account for the discreteness of the data and for heterogeneous query frequencies, increasing also robustness to possible distribution shifts. These results are facilitated by a novel conformal calibration technique that guarantees valid coverage for a large fraction of distinct random queries. Finally, we show our methods have improved empirical performance compared to existing frequentist and Bayesian alternatives in simulations as well as in examples of text and SARS-CoV-2 DNA data. 
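To fix ideas on how conformal calibration can wrap any such sketch, the following snippet applies a generic split-conformal recipe: score held-out calibration queries by how much the sketch overcounts them, then subtract the appropriate empirical quantile from future estimates to get a lower confidence bound. The helper names and the calibration setup are illustrative assumptions rather than the exact procedures developed in these papers.

```python
# A minimal split-conformal sketch, assuming calibration items with known exact
# frequencies that are exchangeable with future queries; helper names are hypothetical.
import math

def conformal_lower_bounds(sketch_estimate, calib_items, true_freq, alpha=0.1):
    """Return a function mapping a query item to a (1 - alpha) lower confidence bound.

    sketch_estimate: callable item -> upper-bound estimate from the sketch
    calib_items:     iterable of calibration items
    true_freq:       callable item -> exact frequency of a calibration item
    """
    # Nonconformity score = how much the sketch overcounts each calibration item.
    scores = sorted(sketch_estimate(x) - true_freq(x) for x in calib_items)
    n = len(scores)
    # Conformal quantile: the ceil((1 - alpha) * (n + 1))-th smallest score, clipped to n.
    k = min(n, math.ceil((1 - alpha) * (n + 1)))
    q = scores[k - 1]
    # For an exchangeable query, estimate - q undershoots the true frequency
    # with probability at most alpha.
    return lambda item: max(0, sketch_estimate(item) - q)
```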
  2. Approximate confidence distribution computing (ACDC) offers a new take on the rapidly developing field of likelihood-free inference from within a frequentist framework. The appeal of this computational method for statistical inference hinges upon the concept of a confidence distribution, a special type of estimator which is defined with respect to the repeated sampling principle. An ACDC method provides frequentist validation for computational inference in problems with unknown or intractable likelihoods. The main theoretical contribution of this work is the identification of a matching condition necessary for frequentist validity of inference from this method. In addition to providing an example of how a modern understanding of confidence distribution theory can be used to connect Bayesian and frequentist inferential paradigms, we present a case to expand the current scope of so-called approximate Bayesian inference to include non-Bayesian inference by targeting a confidence distribution rather than a posterior. The main practical contribution of this work is the development of a data-driven approach to driving ACDC in both Bayesian and frequentist contexts. The ACDC algorithm is data-driven through the selection of a data-dependent proposal function, whose structure is quite general and adaptable to many settings. We explore three numerical examples that both verify the theoretical arguments in the development of ACDC and suggest instances in which ACDC outperforms approximate Bayesian computing methods computationally.
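For readers unfamiliar with the likelihood-free setting that ACDC extends, here is a bare-bones accept/reject approximate Bayesian computing loop: draw a parameter, run the simulator, and keep the draw if a summary statistic lands close to the observed one. This is generic ABC for a toy Gaussian location model, offered only as background; it is not the ACDC algorithm, and the tolerance, prior range, and summary statistic are arbitrary illustrative choices.

```python
# Generic ABC rejection sampler on a toy model (assumed example, not ACDC itself).
import numpy as np

rng = np.random.default_rng(0)
observed = rng.normal(loc=2.0, scale=1.0, size=50)   # pretend the likelihood is unknown
obs_summary = observed.mean()

def simulate(theta, size=50):
    # Forward simulator: the only access to the model we assume.
    return rng.normal(loc=theta, scale=1.0, size=size)

accepted = []
tolerance = 0.1
for _ in range(20000):
    theta = rng.uniform(-5.0, 5.0)                   # draw from a flat proposal/prior
    if abs(simulate(theta).mean() - obs_summary) < tolerance:
        accepted.append(theta)

# The accepted draws form a crude approximate posterior/confidence summary for theta.
print(len(accepted), np.mean(accepted))
```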
  3. Unfolding is an ill-posed inverse problem in particle physics aiming to infer a true particle-level spectrum from smeared detector-level data. For computational and practical reasons, these spaces are typically discretized using histograms, and the smearing is modeled through a response matrix corresponding to a discretized smearing kernel of the particle detector. This response matrix depends on the unknown shape of the true spectrum, leading to a fundamental systematic uncertainty in the unfolding problem. To handle the ill-posed nature of the problem, common approaches regularize it either directly, via methods such as Tikhonov regularization, or implicitly, by using wide bins in the true space that match the resolution of the detector. Unfortunately, both of these methods lead to a non-trivial bias in the unfolded estimator, thereby hampering frequentist coverage guarantees for confidence intervals constructed from these methods. We propose two new approaches to addressing the bias in the wide-bin setting through methods called One-at-a-time Strict Bounds (OSB) and Prior-Optimized (PO) intervals. The OSB intervals are a bin-wise modification of an existing guaranteed-coverage procedure, while the PO intervals are based on a decision-theoretic view of the problem. Importantly, both approaches provide well-calibrated frequentist confidence intervals even in constrained and rank-deficient settings. These methods are built upon a more general answer to the wide-bin bias problem, involving unfolding with fine bins first, followed by constructing confidence intervals for linear functionals of the fine-bin counts. We test and compare these methods to other available methodologies in a wide-bin deconvolution example and a realistic particle physics simulation of unfolding a steeply falling particle spectrum.
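The regularized setup this abstract critiques can be reproduced in a few lines: smear a steeply falling truth through a Gaussian-band response matrix, add Poisson noise, and invert with a Tikhonov (ridge) penalty. The bin count, smearing width, and penalty strength below are arbitrary illustrative choices, and the snippet does not implement the OSB or PO interval constructions.

```python
# Toy Tikhonov-regularized unfolding with an assumed response matrix K.
import numpy as np

rng = np.random.default_rng(1)
n_bins = 20
true_spectrum = np.exp(-0.3 * np.arange(n_bins)) * 1000       # steeply falling truth

# Band-diagonal response matrix modelling Gaussian detector smearing, rows normalized.
K = np.exp(-0.5 * (np.arange(n_bins)[:, None] - np.arange(n_bins)[None, :]) ** 2 / 1.5 ** 2)
K /= K.sum(axis=1, keepdims=True)

observed = rng.poisson(K @ true_spectrum)                      # smeared, noisy counts

# Ridge / Tikhonov solution: argmin ||K x - y||^2 + tau ||x||^2.
tau = 5.0
unfolded = np.linalg.solve(K.T @ K + tau * np.eye(n_bins), K.T @ observed)
print(np.round(unfolded[:5], 1), np.round(true_spectrum[:5], 1))
```

The regularization stabilizes the inversion but biases the estimate, which is the coverage problem the OSB and PO intervals are designed to address.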
  4. Many practical tasks involve sampling sequentially without replacement (WoR) from a finite population of size $N$, in an attempt to estimate some parameter $\theta^\star$. Accurately quantifying uncertainty throughout this process is a nontrivial task, but is necessary because it often determines when we stop collecting samples and confidently report a result. We present a suite of tools for designing confidence sequences (CS) for $\theta^\star$. A CS is a sequence of confidence sets $(C_n)_{n=1}^N$ that shrink in size and all contain $\theta^\star$ simultaneously with high probability. We present a generic approach to constructing a frequentist CS using Bayesian tools, based on the fact that the ratio of a prior to the posterior at the ground truth is a martingale. We then present Hoeffding- and empirical-Bernstein-type time-uniform CSs and fixed-time confidence intervals for sampling WoR, which improve on previous bounds in the literature and explicitly quantify the benefit of WoR sampling.
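As a point of reference for what such confidence sequences look like, the snippet below builds a deliberately crude time-uniform CS for the mean of observations bounded in [0, 1]: a Hoeffding interval at each sample size combined with a union bound over the $N$ possible stopping times. Hoeffding's inequality remains valid under sampling without replacement, so the sequence is valid, but it ignores the WoR structure entirely; the tighter, WoR-aware bounds are exactly what the paper above contributes.

```python
# Crude Hoeffding + union-bound confidence sequence (a baseline, not the paper's bounds).
import math
import random

def hoeffding_cs(samples, N, alpha=0.05):
    """Yield (n, lower, upper), valid simultaneously over n = 1..N, for values in [0, 1]."""
    running_sum = 0.0
    for n, x in enumerate(samples, start=1):
        running_sum += x
        mean = running_sum / n
        # Union bound over the N possible sample sizes gives time-uniform validity.
        radius = math.sqrt(math.log(2 * N / alpha) / (2 * n))
        yield n, max(0.0, mean - radius), min(1.0, mean + radius)

# Usage: observe 100 values drawn without replacement from a population of size 1000.
population = [random.random() for _ in range(1000)]
sample = random.sample(population, 100)
for n, lo, hi in hoeffding_cs(sample, N=1000):
    pass                                   # in practice, stop once (hi - lo) is small enough
print(n, round(lo, 3), round(hi, 3))
```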
  5. Construction of tight confidence sets and intervals is central to statistical inference and decision making. This paper develops new theory for constructing minimum average volume confidence sets for categorical data. More precisely, consider an empirical distribution $\hat{p}$ generated from $n$ iid realizations of a random variable that takes one of $k$ possible values according to an unknown distribution $p$. This is analogous to a single draw from a multinomial distribution. A confidence set is a subset of the probability simplex that depends on $\hat{p}$ and contains the unknown $p$ with a specified confidence. This paper shows how one can construct minimum average volume confidence sets. The optimality of the sets translates to improved sample complexity for adaptive machine learning algorithms that rely on confidence sets, regions, and intervals.
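For contrast with the minimum-average-volume sets described above, here is the kind of simple baseline such theory improves on: a box-shaped confidence set obtained by combining exact Clopper-Pearson intervals for each category with a Bonferroni correction. It is valid but conservative, and it is an assumed illustration rather than the construction from the paper; SciPy is used only for the Beta quantiles.

```python
# Bonferroni-combined Clopper-Pearson box for a multinomial p (conservative baseline).
from scipy.stats import beta

def simultaneous_cp_box(counts, alpha=0.05):
    """Return per-category (lower, upper) bounds whose product box, intersected with
    the probability simplex, contains the true p with probability at least 1 - alpha."""
    n, k = sum(counts), len(counts)
    a = alpha / k                                     # Bonferroni split over k categories
    box = []
    for c in counts:
        lo = beta.ppf(a / 2, c, n - c + 1) if c > 0 else 0.0
        hi = beta.ppf(1 - a / 2, c + 1, n - c) if c < n else 1.0
        box.append((lo, hi))
    return box

# Usage with observed counts (12, 30, 58) out of n = 100 draws.
print(simultaneous_cp_box([12, 30, 58]))
```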