skip to main content


Title: The Capacity of Private Information Retrieval from Decentralized Uncoded Caching Databases
We consider the private information retrieval (PIR) problem from decentralized uncoded caching databases. There are two phases in our problem setting, a caching phase, and a retrieval phase. In the caching phase, a data center containing all the K files, where each file is of size L bits, and several databases with storage size constraint μ K L bits exist in the system. Each database independently chooses μ K L bits out of the total K L bits from the data center to cache through the same probability distribution in a decentralized manner. In the retrieval phase, a user (retriever) accesses N databases in addition to the data center, and wishes to retrieve a desired file privately. We characterize the optimal normalized download cost to be D * = ∑ n = 1 N + 1 N n - 1 μ n - 1 ( 1 - μ ) N + 1 - n 1 + 1 n + ⋯ + 1 n K - 1 . We show that uniform and random caching scheme which is originally proposed for decentralized coded caching by Maddah-Ali and Niesen, along with Sun and Jafar retrieval scheme which is originally proposed for PIR from replicated databases surprisingly results in the lowest normalized download cost. This is the decentralized counterpart of the recent result of Attia, Kumar, and Tandon for the centralized case. The converse proof contains several ingredients such as interference lower bound, induction lemma, replacing queries and answering string random variables with the content of distributed databases, the nature of decentralized uncoded caching databases, and bit marginalization of joint caching distributions.  more » « less
Award ID(s):
1713977
NSF-PAR ID:
10201781
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
Information
Volume:
10
Issue:
12
ISSN:
2078-2489
Page Range / eLocation ID:
372
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Private information retrieval (PIR) allows a user to retrieve a desired message out of K possible messages from N databases without revealing the identity of the desired message. There has been significant recent progress on understanding fundamental information-theoretic limits of PIR, and in particular the download cost of PIR for several variations. Majority of existing works however, assume the presence of replicated databases, each storing all the K messages. In this work, we consider the problem of PIR from storage constrained databases. Each database has a storage capacity of μKL bits, where K is the number of messages, L is the size of each message in bits, and μ ∈ [1/N, 1] is the normalized storage. In the storage constrained PIR problem, there are two key design questions: a) how to store content across each database under storage constraints; and b) construction of schemes that allow efficient PIR through storage constrained databases. The main contribution of this work is a general achievable scheme for PIR from storage constrained databases for any value of storage. In particular, for any (N, K), with normalized storage μ = t/N, where the parameter t can take integer values t ∈ {1, 2, ..., N}, we show that our proposed PIR scheme achieves a download cost of (1 + 1/t + 1/2 + ⋯ + 1/t K-1 ). The extreme case when μ = 1 (i.e., t = N) corresponds to the setting of replicated databases with full storage. For this extremal setting, our scheme recovers the information-theoretically optimal download cost characterized by Sun and Jafar as (1 + 1/N + ⋯ +1/N K-1 ). For the other extreme, when μ = 1/N (i.e., t = 1), the proposed scheme achieves a download cost of K. The most interesting aspect of the result is that for intermediate values of storage, i.e., 1/N <; μ <; 1, the proposed scheme can strictly outperform memory-sharing between extreme values of storage. 
    more » « less
  2. Private information retrieval (PIR) allows a user to retrieve a desired message out of K possible messages from N databases (DBs) without revealing the identity of the desired message. In this work, we consider the problem of PIR from uncoded storage constrained DBs. Each DB has a storage capacity of μKL bits, where L is the size of each message in bits, and μ ∈ [1/N, 1] is the normalized storage. In the storage constrained PIR problem, there are two key challenges: a) construction of communication efficient schemes through storage content design at each DB that allow download efficient PIR; and b characterizing the optimal download cost via information-theoretic lower bounds. The novel aspect of this work is to characterize the optimum download cost of PIR with storage constrained DBs for any value of storage. In particular, for any (N, K), we show that the optimal tradeoff between storage (μ) and the download cost (D(μ)) is given by the lower convex hull of the pairs ([t/N](1+[1/t]+[1/(t 2 )]+...+[1/(t K-1 )])) for t = 1,2, ..., N. The main contribution of this paper is the converse proof, i.e., obtaining lower bounds on the download cost for PIR as a function of the available storage. 
    more » « less
  3. In the problem of cache-aided Multiuser Private Information Retrieval (MuPIR), a set of Ku cache-aided users wish to download their desired messages from a set of N distributed non-colluding databases each holding a library of K independent messages. The communication load of this problem is defined as the total number of bits downloaded (normalized by the message length) by the users. The goal is to find the optimal memory-load trade-off under the constraint of user demand privacy, which ensures that any individual database learns nothing about the demands of the users. In this paper, for the MuPIR problem with K = 2 messages, K u = 2 users and N ≥ 2 databases, we provide achievability for the memory-load pairs (N-1/2N, N+1/N) and(2(N-1)/2N-1,N+1/2N-1) by constructing specific achievable schemes based on the novel idea of Private Cacheaided Interference Alignment (PCIA). We prove that the proposed scheme is optimal if the cache placement is uncoded (i.e., users directly cache a subset of the library bits). Computer-aided investigation also shows that the proposed schemes are optimal in general when N = 2, 3. 
    more » « less
  4. In the problem of cache-aided Multiuser Private Information Retrieval (MuPIR), a set of K u cache-aided users wish to download their desired messages from a set of N distributed non-colluding databases each holding a library of K independent messages. The communication load of this problem is defined as the total number of bits downloaded (normalized by the message length) by the users. The goal is to find the optimal memory-load trade-off under the constraint of user demand privacy, which ensures that any individual database learns nothing about the demands of the users. In this paper, for the MuPIR problem with K=2 messages, Ku=2 users and N≥2 databases, we provide achievability for the memory-load pairs (N−12N,N+1N) and (2(N−1)2N−1,N+12N−1) by constructing specific achievable schemes based on the novel idea of Private Cache-aided Interference Alignment (PCIA). We prove that the proposed scheme is optimal if the cache placement is uncoded (i.e., users directly cache a subset of the library bits). Computer-aided investigation also shows that the proposed schemes are optimal in general when N=2,3. 
    more » « less
  5. We propose capacity-achieving schemes for private information retrieval (PIR) from uncoded databases (DBs) with both homogeneous and heterogeneous storage constraints. In the PIR setting, a user queries a set of DBs to privately download a message, where privacy implies that no one DB can infer which message the user desires. In general, a PIR scheme is comprised of storage placement and delivery designs. Previous works have derived the capacity, or infimum download cost, of PIR with uncoded storage placement and sufficient conditions of storage placement to meet capacity. However, the currently proposed storage placement designs require splitting each message into an exponential number of sub-messages with respect to the number of DBs. In this work, when DBs have the same storage constraint, we propose two simple storage placement designs that satisfy the capacity conditions. Then, for more general heterogeneous storage constraints, we translate the storage placement design process into a “filling problem”. We design an iterative algorithm to solve the filling problem where, in each iteration, messages are partitioned into sub-messages and stored at subsets of DBs. All of our proposed storage placement designs require a number of sub-messages per message at most equal to the number of DBs. 
    more » « less