We revisit the problem of finding B-block-long collisions in Merkle-Damg˚ard Hash Functions in the auxiliary-input random oracle model, in which an attacker gets a piece of S-bit advice about the random oracle and makes T oracle queries. Akshima, Cash, Drucker and Wee (CRYPTO 2020), based on the work of Coretti, Dodis, Guo and Steinberger (EUROCRYPT 2018), showed a simple attack for 2 ≤ B ≤ T (with respect to a random salt). The attack achieves advantage Ω( e ST B/2 n + T 2/2 n) where n is the output length of the random oracle. They conjectured that this attack is optimal. However, this so-called STB conjecture was only proved for B ≈ T and B = 2. Very recently, Ghoshal and Komargodski (CRYPTO 22) confirmed STB conjecture for all constant values of B, and provided an Oe(S 4T B2/2 n + T 2/2 n) bound for all choices of B. In this work, we prove an Oe((ST B/2 n)· max{1, ST2/2 n}+T 2/2 n) bound for every 2 < B < T. Our bound confirms the STB conjecture for ST2 ≤ 2 n, and is optimal up to a factor of S for ST2 > 2 n (note as T 2 is always at most 2n, otherwise finding a collision is trivial by the birthday attack). Our result subsumes all previous upper bounds for all ranges of parameters except for B = Oe(1) and ST2 > 2 n. We obtain our results by adopting and refining the technique of Chung, Guo, Liu, and Qian (FOCS 2020). Our approach yields more modular proofs and sheds light on how to bypass the limitations of prior techniques. Along the way, we obtain a considerably simpler and illuminating proof for B = 2, recovering the main result of Akshima, Cash, Drucker and Wee. 
                        more » 
                        « less   
                    
                            
                            Neural Network-based Estimation of the MMSE
                        
                    
    
            The minimum mean-square error (MMSE) achievable by optimal estimation of a random variable S given another random variable T is of much interest in a variety of statistical contexts. Motivated by a growing interest in auditing machine learning models for unintended information leakage, we propose a neural network-based estimator of this MMSE. We derive a lower bound for the MMSE based on the proposed estimator and the Barron constant associated with the conditional expectation of S given T . Since the latter is typically unknown in practice, we derive a general bound for the Barron constant that produces order optimal estimates for canonical distribution models. 
        more » 
        « less   
        
    
    
                            - PAR ID:
- 10297195
- Date Published:
- Journal Name:
- International Symposium on Information Theory
- Page Range / eLocation ID:
- 1023 to 1028
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
- 
            
- 
            null (Ed.)We study collision-finding against Merkle-Damgård hashing in the random-oracle model by adversaries with an arbitrary S-bit auxiliary advice input about the random oracle and T queries. Recent work showed that such adversaries can find collisions (with respect to a random IV) with advantage 𝛺(𝑆𝑇2/2𝑛) , where n is the output length, beating the birthday bound by a factor of S. These attacks were shown to be optimal. We observe that the collisions produced are very long, on the order of T blocks, which would limit their practical relevance. We prove several results related to improving these attacks to find shorter collisions. We first exhibit a simple attack for finding B-block-long collisions achieving advantage 𝛺̃ (𝑆𝑇𝐵/2𝑛) . We then study if this attack is optimal. We show that the prior technique based on the bit-fixing model (used for the 𝑆𝑇2/2𝑛 bound) provably cannot reach this bound, and towards a general result we prove there are qualitative jumps in the optimal attacks for finding length 1, length 2, and unbounded-length collisions. Namely, the optimal attacks achieve (up to logarithmic factors) on the order of (𝑆+𝑇)/2𝑛 , 𝑆𝑇/2𝑛 and 𝑆𝑇2/2𝑛 advantage. We also give an upper bound on the advantage of a restricted class of short-collision finding attacks via a new analysis on the growth of trees in random functional graphs that may be of independent interest.more » « less
- 
            null (Ed.)Summary We investigate optimal subsampling for quantile regression. We derive the asymptotic distribution of a general subsampling estimator and then derive two versions of optimal subsampling probabilities. One version minimizes the trace of the asymptotic variance-covariance matrix for a linearly transformed parameter estimator and the other minimizes that of the original parameter estimator. The former does not depend on the densities of the responses given covariates and is easy to implement. Algorithms based on optimal subsampling probabilities are proposed and asymptotic distributions, and the asymptotic optimality of the resulting estimators are established. Furthermore, we propose an iterative subsampling procedure based on the optimal subsampling probabilities in the linearly transformed parameter estimation which has great scalability to utilize available computational resources. In addition, this procedure yields standard errors for parameter estimators without estimating the densities of the responses given the covariates. We provide numerical examples based on both simulated and real data to illustrate the proposed method.more » « less
- 
            We study algorithms using randomized value functions for exploration in reinforcement learning. This type of algorithms enjoys appealing empirical performance. We show that when we use 1) a single random seed in each episode, and 2) a Bernstein-type magnitude of noise, we obtain a worst-case O~(H√SAT) regret bound for episodic time-inhomogeneous Markov Decision Process where S is the size of state space, A is the size of action space, H is the planning horizon and T is the number of interactions. This bound polynomially improves all existing bounds for algorithms based on randomized value functions, and for the first time, matches the Ω(H√SAT) lower bound up to logarithmic factors. Our result highlights that randomized exploration can be near-optimal, which was previously achieved only by optimistic algorithms. To achieve the desired result, we develop 1) a new clipping operation to ensure both the probability of being optimistic and the probability of being pessimistic are lower bounded by a constant, and 2) a new recursive formula for the absolute value of estimation errors to analyze the regret.more » « less
- 
            Conditional Mutual Information (CMI) is a measure of conditional dependence between random variables X and Y, given another random variable Z. It can be used to quantify conditional dependence among variables in many data-driven inference problems such as graphical models, causal learning, feature selection and time-series analysis. While k-nearest neighbor (kNN) based estimators as well as kernel-based methods have been widely used for CMI estimation, they suffer severely from the curse of dimensionality. In this paper, we leverage advances in classifiers and generative models to design methods for CMI estimation. Specifically, we introduce an estimator for KL-Divergence based on the likelihood ratio by training a classifier to distinguish the observed joint distribution from the product distribution. We then show how to construct several CMI estimators using this basic divergence estimator by drawing ideas from conditional generative models. We demonstrate that the estimates from our proposed approaches do not degrade in performance with increasing dimension and obtain significant improvement over the widely used KSG estimator. Finally, as an application of accurate CMI estimation, we use our best estimator for conditional independence testing and achieve superior performance than the state-of-the-art tester on both simulated and real data-sets.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                    