skip to main content


The NSF Public Access Repository (NSF-PAR) system and access will be unavailable from 11:00 PM ET on Thursday, May 23 until 2:00 AM ET on Friday, May 24 due to maintenance. We apologize for the inconvenience.

Title: A Shannon-Theoretic Approach to the Storage–Retrieval Trade-Off in PIR Systems
We consider the storage–retrieval rate trade-off in private information retrieval (PIR) systems using a Shannon-theoretic approach. Our focus is mostly on the canonical two-message two-database case, for which a coding scheme based on random codebook generation and the binning technique is proposed. This coding scheme reveals a hidden connection between PIR and the classic multiple description source coding problem. We first show that when the retrieval rate is kept optimal, the proposed non-linear scheme can achieve better performance over any linear scheme. Moreover, a non-trivial storage-retrieval rate trade-off can be achieved beyond space-sharing between this extreme point and the other optimal extreme point, achieved by the retrieve-everything strategy. We further show that with a method akin to the expurgation technique, one can extract a zero-error PIR code from the random code. Outer bounds are also studied and compared to establish the superiority of the non-linear codes over linear codes.  more » « less
Award ID(s):
2045656 2007108 1816546 2007067
Author(s) / Creator(s):
; ;
Publisher / Repository:
Date Published:
Journal Name:
Page Range / eLocation ID:
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Private information retrieval (PIR) allows a user to retrieve a desired message out of K possible messages from N databases without revealing the identity of the desired message. There has been significant recent progress on understanding fundamental information-theoretic limits of PIR, and in particular the download cost of PIR for several variations. Majority of existing works however, assume the presence of replicated databases, each storing all the K messages. In this work, we consider the problem of PIR from storage constrained databases. Each database has a storage capacity of μKL bits, where K is the number of messages, L is the size of each message in bits, and μ ∈ [1/N, 1] is the normalized storage. In the storage constrained PIR problem, there are two key design questions: a) how to store content across each database under storage constraints; and b) construction of schemes that allow efficient PIR through storage constrained databases. The main contribution of this work is a general achievable scheme for PIR from storage constrained databases for any value of storage. In particular, for any (N, K), with normalized storage μ = t/N, where the parameter t can take integer values t ∈ {1, 2, ..., N}, we show that our proposed PIR scheme achieves a download cost of (1 + 1/t + 1/2 + ⋯ + 1/t K-1 ). The extreme case when μ = 1 (i.e., t = N) corresponds to the setting of replicated databases with full storage. For this extremal setting, our scheme recovers the information-theoretically optimal download cost characterized by Sun and Jafar as (1 + 1/N + ⋯ +1/N K-1 ). For the other extreme, when μ = 1/N (i.e., t = 1), the proposed scheme achieves a download cost of K. The most interesting aspect of the result is that for intermediate values of storage, i.e., 1/N <; μ <; 1, the proposed scheme can strictly outperform memory-sharing between extreme values of storage. 
    more » « less
  2. This paper treats point-to-point, multiple access and random access lossless source coding in the finite-blocklength regime. A random coding technique is developed, and its power in analyzing the third-order coding performance is demonstrated in all three scenarios. Results include a third-order-optimal characterization of the Slepian-Wolf rate region and a proof showing that for dependent sources, the independent encoders used by Slepian-Wolf codes can achieve the same third-order- optimal performance as a single joint encoder. The concept of random access source coding, which generalizes the multiple access scenario to allow for a subset of participating encoders that is unknown a priori to both the encoders and the decoder, is introduced. Contributions include a new definition of the probabilistic model for a random access-discrete multiple source, a general random access source coding scheme that employs a rateless code with sporadic feedback, and an analysis demonstrating via a random coding argument that there exists a deterministic code of the proposed structure that simultaneously achieves the third- order-optimal performance of Slepian-Wolf codes for all possible subsets of encoders. 
    more » « less
  3. null (Ed.)
    We consider the private information retrieval (PIR) problem from decentralized uncoded caching databases. There are two phases in our problem setting, a caching phase, and a retrieval phase. In the caching phase, a data center containing all the K files, where each file is of size L bits, and several databases with storage size constraint μ K L bits exist in the system. Each database independently chooses μ K L bits out of the total K L bits from the data center to cache through the same probability distribution in a decentralized manner. In the retrieval phase, a user (retriever) accesses N databases in addition to the data center, and wishes to retrieve a desired file privately. We characterize the optimal normalized download cost to be D * = ∑ n = 1 N + 1 N n - 1 μ n - 1 ( 1 - μ ) N + 1 - n 1 + 1 n + ⋯ + 1 n K - 1 . We show that uniform and random caching scheme which is originally proposed for decentralized coded caching by Maddah-Ali and Niesen, along with Sun and Jafar retrieval scheme which is originally proposed for PIR from replicated databases surprisingly results in the lowest normalized download cost. This is the decentralized counterpart of the recent result of Attia, Kumar, and Tandon for the centralized case. The converse proof contains several ingredients such as interference lower bound, induction lemma, replacing queries and answering string random variables with the content of distributed databases, the nature of decentralized uncoded caching databases, and bit marginalization of joint caching distributions. 
    more » « less
  4. This paper formulates the cache-aided multi-user Private Information Retrieval (MuPIR) problem, including K u cache-equipped users, each of which wishes to retrieve a desired message efficiently from N distributed databases with access to K independent messages. Privacy of the users’ demands requires that any individual database can not learn anything about the demands of the users. The load of this problem is defined as the average number of downloaded bits per desired message bit. The goal is to find the optimal memory-load trade-off while preserving the demand privacy. Besides the formulation of the MuPIR problem, the contribution of this paper is two-fold. First, we characterize the optimal memory-load trade-off for a system with N = 2 databases, K = 2 messages and K u = 2 users demanding distinct messages; Second, a product design with order optimality guarantee is proposed. In addition, the product design can achieve the optimal load when the cache memory is large enough. The product design embeds the well-known Sun-Jafar PIR scheme into coded caching, in order to benefit from the coded caching gain while preserving the privacy of the users’ demands. 
    more » « less
  5. Underwater networks of wireless sensors deployed along the coast or in the deep water are the most promising solution for the development of underwater monitoring, exploration and surveillance applications. A key feature of underwater networks that can significantly enhance current monitoring applications is the ability to accommodate real-time video information on an underwater communication link. In fact, while today monitoring relies on the exchange of simple discrete information, e.g., water temperature, and particle concentration, among others, by introducing real-time streaming capability of non-static images between wireless underwater nodes one can completely revolutionize the whole underwater monitoring scenario. To achieve this goal, underwater links are required to support a sufficiently high data rate, compatible with the streaming rates of the transmitted video sequence. Unfortunately, the intrinsic characteristic of the underwater propagation medium has made this objective extremely challenging. In this paper, we present the first physical layer transmission scheme for short-range and high-data rate ultrasonic underwater communications. The proposed solution, which we will refer to as Underwater UltraSonar (U2S), is based on the idea of transmitting short information-bearing carrierless ultrasonic signals, e.g., pulses, following a pseudo-random adaptive time-hopping pattern with a superimposed rate-adaptive Reed-Solomon forward error correction (FEC) channel coding. We also present the design of the first prototype of a software-defined underwater ultrasonic transceiver that implements U2S PHY transmission scheme through which we evaluate the U2S performance in real-scenario underwater experiments at the PHY layer, i.e., Bit Error Rate (BER) extensively, and at the application layer, i.e., structural similarity (SSIM) index. Results show that U2S links can support point-to-point data rate up to 1.38 Mbps and that by leveraging the flexibility of the adaptive time-hopping and adaptive channel coding techniques, one can trade between link throughput and energy consumption, still satisfying application layer requirements. 
    more » « less