Consider a setting with N independent individuals, each with an unknown parameter, p_i in [0,1] drawn from some unknown distribution P*. After observing the outcomes of t independent Bernoulli trials, i.e., Xi ~ Binomial(t, p_i) per individual, our objective is to accurately estimate P*. This problem arises in numerous domains, including the social sciences, psychology, healthcare, and biology, where the size of the population under study is usually large while the number of observations per individual is often limited. Our main result shows that, in the regime where t << N , the maximum likelihood estimator (MLE) is both statisticallyminimax optimal and efficiently computable. Precisely, for sufficiently large N , the MLE achieves the information theoretic optimal error bound of O(1/sqrt(t log N)) for N< exp(t), and O(1/t) for N> exp(t), with regards to the L1 distance between the true cdf and the estimated cdf.
more »
« less
Maximum Likelihood Estimation for Learning Populations of Parameters
Consider a setting with N independent individuals, each with an unknown parameter, p_i in [0,1] drawn from some unknown distribution P*. After observing the outcomes of t independent Bernoulli trials, i.e., Xi ~ Binomial(t, p_i) per individual, our objective is to accurately estimate P*. This problem arises in numerous domains, including the social sciences, psychology, healthcare, and biology, where the size of the population under study is usually large while the number of observations per individual is often limited. Our main result shows that, in the regime where t << N , the maximum likelihood estimator (MLE) is both statistically minimax optimal and efficiently computable. Precisely, for sufficiently large N , the MLE achieves the information theoretic optimal error bound of O(1/sqrt(t log N)) for N< exp(t), and O(1/t) for N> exp(t), with regards to the L1 distance between the true cdf and the estimated cdf.
more »
« less
- Award ID(s):
- 1704417
- PAR ID:
- 10098844
- Date Published:
- Journal Name:
- International Conference on Machine Learning (ICML)
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Bringmann, Karl; Grohe, Martin; Puppis, Gabriele; Svensson, Ola (Ed.)We revisit the noisy binary search model of [Karp and Kleinberg, 2007], in which we have n coins with unknown probabilities p_i that we can flip. The coins are sorted by increasing p_i, and we would like to find where the probability crosses (to within ε) of a target value τ. This generalized the fixed-noise model of [Burnashev and Zigangirov, 1974], in which p_i = 1/2 ± ε, to a setting where coins near the target may be indistinguishable from it. It was shown in [Karp and Kleinberg, 2007] that Θ(1/ε² log n) samples are necessary and sufficient for this task. We produce a practical algorithm by solving two theoretical challenges: high-probability behavior and sharp constants. We give an algorithm that succeeds with probability 1-δ from 1/C_{τ, ε} ⋅ (log₂ n + O(log^{2/3} n log^{1/3} 1/(δ) + log 1/(δ))) samples, where C_{τ, ε} is the optimal such constant achievable. For δ > n^{-o(1)} this is within 1 + o(1) of optimal, and for δ ≪ 1 it is the first bound within constant factors of optimal.more » « less
-
The goal of the trace reconstruction problem is to recover a string x E {0, 1} given many independent traces of x, where a trace is a subsequence obtained from deleting bits of x independently with some given probability. In this paper we consider two kinds of algorithms for the trace reconstruction problem. We first observe that the state-of-the-art result of Chase (STOC 2021), which is based on statistics of arbitrary length-k subsequences, can also be obtained by considering the “k-mer statistics”, i.e., statistics regarding occurrences of contiguous k-bit strings (a.k.a, k-mers) in the initial string x, for k = Mazooji and Shomorony (ISIT 2023) show that such statistics (called k-mer density map) can be estimated within accuracy from poly(n, 2k, l/e) traces. We call an algorithm to be k-mer-based if it reconstructs x given estimates of the k-mer density map. Such algorithms essentially capture all the analyses in the worst-case and smoothed-complexity models of the trace reconstruction problem we know of so far. Our first, and technically more involved, result shows that any k-mer-based algorithm for trace reconstruction must use exp n)) traces, under the assumption that the estimator requires poly(2k, 1 e) traces, thus establishing the optimality of this number of traces. Our analysis also shows that the analysis technique used by Chase is essentially tight, and hence new techniques are needed in order to improve the worst-case upper bound. Our second, simple, result considers the performance of the Maximum Likelihood Estimator (MLE), which specifically picks the source string that has the maximum likelihood to generate the samples (traces). We show that the MLE algorithm uses a nearly optimal number of traces, i.e., up to a factor of n in the number of samples needed for an optimal algorithm, and show that this factor of n loss may be necessary under general “model estimation” settings.more » « less
-
The goal of the trace reconstruction problem is to recover a string x E {0, 1} given many independent traces of x, where a trace is a subsequence obtained from deleting bits of x independently with some given probability. In this paper we consider two kinds of algorithms for the trace reconstruction problem. We first observe that the state-of-the-art result of Chase (STOC 2021), which is based on statistics of arbitrary length-k subsequences, can also be obtained by considering the “k-mer statistics”, i.e., statistics regarding occurrences of contiguous k-bit strings (a.k.a, k-mers) in the initial string x, for k = Mazooji and Shomorony (ISIT 2023) show that such statistics (called k-mer density map) can be estimated within accuracy from poly(n, 2k, l/e) traces. We call an algorithm to be k-mer-based if it reconstructs x given estimates of the k-mer density map. Such algorithms essentially capture all the analyses in the worst-case and smoothed-complexity models of the trace reconstruction problem we know of so far. Our first, and technically more involved, result shows that any k-mer-based algorithm for trace reconstruction must use exp n)) traces, under the assumption that the estimator requires poly(2k, 1 e) traces, thus establishing the optimality of this number of traces. Our analysis also shows that the analysis technique used by Chase is essentially tight, and hence new techniques are needed in order to improve the worst-case upper bound. Our second, simple, result considers the performance of the Maximum Likelihood Estimator (MLE), which specifically picks the source string that has the maximum likelihood to generate the samples (traces). We show that the MLE algorithm uses a nearly optimal number of traces, i.e., up to a factor of n in the number of samples needed for an optimal algorithm, and show that this factor of n loss may be necessary under general “model estimation” settings.more » « less
-
Abstract Singmaster’s conjecture asserts that every natural number greater than one occurs at most a bounded number of times in Pascal’s triangle; that is, for any natural number $$t \geq 2$$, the number of solutions to the equation $$\binom{n}{m} = t$$ for natural numbers $$1 \leq m \lt n$$ is bounded. In this paper we establish this result in the interior region $$\exp(\log^{2/3+\varepsilon} n) \leq m \leq n - \exp(\log^{2/3+\varepsilon} n)$$ for any fixed ɛ > 0. Indeed, when t is sufficiently large depending on ɛ, we show that there are at most four solutions (or at most two in either half of Pascal’s triangle) in this region. We also establish analogous results for the equation $$(n)_m = t$$, where $$(n)_m := n(n-1) \dots (n-m+1)$$ denotes the falling factorial.more » « less
An official website of the United States government

