skip to main content


Title: Batched Differentially Private Information Retrieval
Private Information Retrieval (PIR) allows several clients to query a database held by one or more servers, such that the contents of their queries remain private. Prior PIR schemes have achieved sublinear communication and computation by leveraging computational assumptions, federating trust among many servers, relaxing security to permit differentially private leakage, refactoring effort into an offline stage to reduce online costs, or amortizing costs over a large batch of queries. In this work, we present an efficient PIR protocol that combines all of the above techniques to achieve constant amortized communication and computation complexity in the size of the database and constant client work. We leverage differentially private leakage in order to provide better trade-offs between privacy and efficiency. Our protocol achieves speedups up to and exceeding 10x in practical settings compared to state of the art PIR protocols, and can scale to batches with hundreds of millions of queries on cheap commodity AWS machines. Our protocol builds upon a new secret sharing scheme that is both incremental and non-malleable, which may be of interest to a wider audience. Our protocol provides security up to abort against malicious adversaries that can corrupt all but one party.  more » « less
Award ID(s):
1718135 1931714
NSF-PAR ID:
10358604
Author(s) / Creator(s):
; ; ;
Date Published:
Journal Name:
31st USENIX Security Symposium
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Nissim, K. ; Waters, B. (Ed.)
    Recent new constructions of rate-1 OT [Döttling, Garg, Ishai, Malavolta, Mour, and Ostrovsky, CRYPTO 2019] have brought this primitive under the spotlight and the techniques have led to new feasibility results for private-information retrieval, and homomorphic encryption for branching programs. The receiver communication of this construction consists of a quadratic (in the sender's input size) number of group elements for a single instance of rate-1 OT. Recently [Garg, Hajiabadi, Ostrovsky, TCC 2020] improved the receiver communication to a linear number of group elements for a single string-OT. However, most applications of rate-1 OT require executing it multiple times, resulting in large communication costs for the receiver. In this work, we introduce a new technique for amortizing the cost of multiple rate-1 OTs. Specifically, based on standard pairing assumptions, we obtain a two-message rate-1 OT protocol for which the amortized cost per string-OT is asymptotically reduced to only four group elements. Our results lead to significant communication improvements in PSI and PIR, special cases of SFE for branching programs. - PIR: We obtain a rate-1 PIR scheme with client communication cost of $O(\lambda\cdot\log N)$ group elements for security parameter $\lambda$ and database size $N$. Notably, after a one-time setup (or one PIR instance), any following PIR instance only requires communication cost $O(\log N)$ number of group elements. - PSI with unbalanced inputs: We apply our techniques to private set intersection with unbalanced set sizes (where the receiver has a smaller set) and achieve receiver communication of $O((m+\lambda) \log N)$ group elements where $m, N$ are the sizes of the receiver and sender sets, respectively. Similarly, after a one-time setup (or one PSI instance), any following PSI instance only requires communication cost $O(m \cdot \log N)$ number of group elements. All previous sublinear-communication non-FHE based PSI protocols for the above unbalanced setting were also based on rate-1 OT, but incurred at least $O(\lambda^2 m \log N)$ group elements. 
    more » « less
  2. Private information retrieval (PIR) enables clients to query and retrieve data from untrusted servers without the untrusted servers learning which data was retrieved. In this paper, we present a new class of multi-server PIR protocols, which we call heterogeneous PIR (HPIR). In such multi-server PIR protocols, the computation and communication overheads imposed on the PIR servers are non-uniform, i.e., some servers handle higher computation/communication burdens than the others. This enables heterogeneous PIR protocols to be suitable for a range of new PIR applications. What enables us to enforce such heterogeneity is a unique PIR-tailored secret sharing algorithm that we leverage in building our PIR protocol. We have implemented our HPIR protocol and evaluated its performance in comparison with regular (i.e., homogenous) PIR protocols. Our evaluations demonstrate that a querying client can trade off the computation and communication loads of the (heterogeneous) PIR servers by adjusting some parameters. For example in a two-server scenario with a heterogeneity degree of 4/1, to retrieve a 456KB file from a 0.2GB database, the rich (i.e., resourceful) PIR server will do 1.1 seconds worth of computation compared to 0.3 seconds by the poor (resource-constrained) PIR server; this is while each of the servers would do the same 1 seconds of computation in a homogeneous setting. Also, for this given example, our HPIR protocol will impose a 912KB communication bandwidth on the rich server compared to 228KB on the poor server (by contrast to 456KB overheads on each of the servers for a traditional homogeneous design). 
    more » « less
  3. Abstract We study both the practical and theoretical efficiency of private information retrieval (PIR) protocols in a model wherein several untrusted servers work to obliviously service remote clients’ requests for data and yet no pair of servers colludes in a bid to violate said obliviousness. In exchange for such a strong security assumption, we obtain new PIR protocols exhibiting remarkable efficiency with respect to every cost metric—download, upload, computation, and round complexity—typically considered in the PIR literature. The new constructions extend a multiserver PIR protocol of Shah, Rashmi, and Ramchandran (ISIT 2014), which exhibits a remarkable property of its own: to fetch a b -bit record from a collection of r such records, the client need only download b + 1 bits total. We find that allowing “a bit more” download (and optionally introducing computational assumptions) yields a family of protocols offering very attractive trade-offs. In addition to Shah et al.’s protocol, this family includes as special cases (2-server instances of) the seminal protocol of Chor, Goldreich, Kushilevitz, and Sudan (FOCS 1995) and the recent DPF-based protocol of Boyle, Gilboa, and Ishai (CCS 2016). An implicit “folklore” axiom that dogmatically permeates the research literature on multiserver PIR posits that the latter protocols are the “most efficient” protocols possible in the perfectly and computationally private settings, respectively. Yet our findings soundly refute this supposed axiom: These special cases are (by far) the least performant representatives of our family, with essentially all other parameter settings yielding instances that are significantly faster. 
    more » « less
  4. Many relational data in our daily life are represented as graphs, making graph application an important workload. Because of the large scale of graph datasets, moving graph data to the cloud becomes a popular option. To keep the confidential and private graph secure from an untrusted cloud server, many cryptographic techniques are leveraged to hide the content of the data. However, protecting only the data content is not enough for a graph database. Because the structural information of the graph can be revealed through the database accessing track.

    In this work, we study the graph neural network (GNN), an important graph workload to mine information from a graph database. We find that the server is able to infer which node is processing during the edge retrieving phase and also learn its neighbor indices during GNN's aggregation phase. This leads to the leakage of the information of graph structure data. In this work, we present SPG, a structure-private graph database with SqueezePIR. Our SPG is built on top of Private Information Retrieval (PIR), which securely hides which nodes/neighbors are accessed. In addition, we propose SqueezePIR, a compression technique to overcome the computation overhead of PIR. Based on our evaluation, our SqueezePIR achieves 11.85× speedup on average with less than 2% accuracy loss when compared to the state-of-the-art FastPIR protocol.

     
    more » « less
  5. We present SimplePIR, the fastest single-server private information retrieval scheme known to date. SimplePIR’s security holds under the learning-with-errors assumption. To answer a client’s query, the SimplePIR server performs fewer than one 32-bit multiplication and one 32-bit addition per database byte. SimplePIR achieves 10 GB/s/core server throughput, which approaches the memory bandwidth of the machine and the performance of the fastest two-server private-information-retrieval schemes (which require non-colluding servers). SimplePIR has relatively large communication costs: to make queries to a 1 GB database, the client must download a 121 MB "hint" about the database contents; thereafter, the client may make an unbounded number of queries, each requiring 242 KB of communication. We present a second single-server scheme, DoublePIR, that shrinks the hint to 16 MB at the cost of slightly higher per-query communication (345 KB) and slightly lower throughput (7.4 GB/s/core). Finally, we apply our new private-information-retrieval schemes, together with a novel data structure for approximate set membership, to the task of private auditing in Certificate Transparency. We achieve a strictly stronger notion of privacy than Google Chrome’s current approach with modest communication overheads: 16 MB of download per month, along with 150 bytes per TLS connection. 
    more » « less