Database-driven dynamic spectrum sharing is one of the most promising dynamic spectrum access (DSA) solutions to the spectrum scarcity problem. In such a database-driven DSA system, the centralized spectrum management infrastructure, called the spectrum access system (SAS), makes spectrum allocation decisions for secondary users (SUs) based on sensitive operational data of incumbent users (IUs). Since neither SAS nor SUs are necessarily fully trusted, privacy protection against an untrusted SAS and untrusted SUs becomes critical for IUs with high operational privacy requirements. To address this problem, many IU privacy-preserving solutions have emerged recently. However, there is a lack of understanding of, and comparison between, the protection these existing approaches provide for IU operational privacy. In this paper, we fill this void with a comparative study that investigates existing solutions and explores several existing metrics for evaluating the strength of privacy protection. Moreover, we propose two general metrics for evaluating the level of privacy preservation and use them to evaluate existing works.
PReVer: Towards Private Regulated Verified Data
Data privacy has garnered significant attention recently due to diverse applications that store sensitive data in untrusted infrastructure. From a data management point of view, the focus has been on the privacy of stored data and the privacy of querying data at a large scale. However, databases are not solely query engines on static data; they must also support updates on dynamically evolving datasets. In this paper, we lay out a vision for privacy-preserving dynamic data. In particular, we focus on dynamic data that might be stored remotely on untrusted providers. Updates arrive at a provider and are verified and incorporated into the database based on predefined constraints. Depending on the application, the content of the stored data, the content of the updates, and the constraints may be private or public. We then propose PReVer, a universal framework for managing regulated dynamic data in a privacy-preserving manner. We explore a set of research challenges that PReVer needs to address in order to guarantee the privacy of data, updates, and/or constraints and to support the consistent and verifiable execution of updates. This opens the space of privacy-preserving data management from the narrow perspective of private queries on static datasets to the larger space of private management of dynamic data.
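As a deliberately simplified point of reference, the sketch below shows regulated updates in the clear: an update is incorporated only if every predefined constraint admits it. The names (Constraint, Provider, apply_update) are hypothetical rather than PReVer's API, and the cryptographic machinery PReVer envisions for keeping data, updates, and constraints private and for verifying execution is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, int]

@dataclass
class Constraint:
    name: str
    # check(current_row, proposed_update) -> True if the update is allowed
    check: Callable[[Record, Record], bool]

@dataclass
class Provider:
    data: Dict[str, Record] = field(default_factory=dict)
    constraints: List[Constraint] = field(default_factory=list)

    def apply_update(self, key: str, update: Record) -> bool:
        """Incorporate the update only if all predefined constraints hold."""
        current = self.data.get(key, {})
        if all(c.check(current, update) for c in self.constraints):
            self.data[key] = {**current, **update}
            return True
        return False

# Example constraint: an account balance may never become negative.
non_negative = Constraint(
    "non_negative",
    lambda cur, upd: upd.get("balance", cur.get("balance", 0)) >= 0,
)
provider = Provider(constraints=[non_negative])
assert provider.apply_update("alice", {"balance": 10})       # accepted
assert not provider.apply_update("alice", {"balance": -5})   # rejected by the constraint
```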
- Award ID(s): 1703560
- PAR ID: 10331213
- Date Published:
- Journal Name: Proceedings of the 25th International Conference on Extending Database Technology, (EDBT'2022)
- Page Range / eLocation ID: 454-461
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In this paper, we consider privacy-preserving update strategies for secure outsourced growing databases. Such databases allow append-only data updates on the outsourced data structure while analysis is ongoing. Despite a plethora of solutions for securely outsourcing database computation, existing techniques do not consider the information that can be leaked via update patterns. To address this problem, we design a novel secure outsourced database framework for growing data, DP-Sync, which interoperates with a large class of existing encrypted databases and supports efficient updates while providing differentially private guarantees for any single update. We demonstrate DP-Sync's practical feasibility in terms of performance and accuracy with extensive empirical evaluations on real-world datasets. (An illustrative sketch of this update-pattern-hiding idea appears after this list.)
- This paper studies a distributed optimization problem in the federated learning (FL) framework under differential privacy constraints, whereby a set of clients having local samples are connected to an untrusted server that wants to learn a global model while preserving the privacy of the clients' local datasets. We propose a new client-sampling scheme, called self-sampling, that reflects the random availability of clients in the FL learning process. We analyze the differential privacy of SGD with client self-sampling by composing amplification by subsampling with amplification by shuffling. Furthermore, we analyze the convergence of the proposed SGD algorithm, showing that reasonable learning performance can be achieved while preserving the privacy of the clients' data even with client self-sampling. (An illustrative sketch of one such round appears after this list.)
- We perform a rigorous study of private matrix analysis when only the last W updates to matrices are considered useful for analysis. We show that the existing framework in the non-private setting is not robust to the noise required for privacy. We then propose a framework that is robust to noise and use it to give the first efficient o(W)-space differentially private algorithms for spectral approximation, principal component analysis (PCA), multi-response linear regression, sparse PCA, and non-negative PCA. Prior to our work, no such result was known for sparse and non-negative differentially private PCA, even in the static-data setting. We also give a lower bound to demonstrate the cost of privacy in the sliding window model. (A baseline sketch of differentially private PCA appears after this list.)
- Tensor factorization has been demonstrated as an efficient approach for computational phenotyping, where massive electronic health records (EHRs) are converted into concise and meaningful clinical concepts. While distributing the tensor factorization tasks to local sites avoids direct data sharing, it still requires the exchange of intermediary results, which could reveal sensitive patient information. The challenge, therefore, is how to jointly decompose the tensor under rigorous and principled privacy constraints while still supporting the model's interpretability. We propose DPFact, a privacy-preserving collaborative tensor factorization method for computational phenotyping using EHR data. It embeds advanced privacy-preserving mechanisms within collaborative learning. Hospitals can keep their EHR databases private while collaboratively learning meaningful clinical concepts by sharing differentially private intermediary results. Moreover, DPFact addresses heterogeneous patient populations using a structured sparsity term. In our framework, each hospital decomposes its local tensors and, every several iterations, sends the updated intermediary results with output perturbation to a semi-trusted server, which generates the phenotypes. Evaluation on both real-world and synthetic datasets demonstrated that, under strict privacy constraints, our method is more accurate and communication-efficient than state-of-the-art baseline methods. (An illustrative sketch of the output-perturbation step appears after this list.)
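First, a minimal, hedged sketch of the update-pattern-hiding idea described in the DP-Sync item above (not DP-Sync's actual synchronization strategy): real records are flushed once per epoch, the record count the server observes is perturbed with Laplace noise, and the batch is padded with dummy records (or real records are deferred) so the observed size matches the noisy count. All names and parameters are illustrative.

```python
# Illustrative sketch only; not DP-Sync's actual synchronization algorithm.
import numpy as np

rng = np.random.default_rng(0)

def sync_epoch(pending: list, epsilon: float):
    """Return (batch sent to the untrusted server, records deferred to the next epoch).

    The count of real records has sensitivity 1, so Laplace(1/epsilon) noise on the
    count makes the observed batch size epsilon-differentially private per epoch."""
    noisy_count = max(0, int(round(len(pending) + rng.laplace(scale=1.0 / epsilon))))
    sent_real = pending[:noisy_count]             # real records released this epoch
    deferred = pending[noisy_count:]              # held back if the noisy count is low
    dummies = [{"dummy": True}] * (noisy_count - len(sent_real))
    return sent_real + dummies, deferred
```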
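Second, a hedged sketch of one round of the federated-learning flow from the client self-sampling item above, under illustrative, assumed parameters (clipping threshold, noise multiplier, learning rate): each client independently flips a coin to decide whether to participate, and participating clients contribute clipped, Gaussian-noised gradients. The item's privacy analysis (amplification by subsampling composed with amplification by shuffling) is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def dp_local_gradient(grad: np.ndarray, clip: float, noise_mult: float) -> np.ndarray:
    """Clip the local gradient to L2 norm `clip`, then add Gaussian noise."""
    scale = min(1.0, clip / (np.linalg.norm(grad) + 1e-12))
    return grad * scale + rng.normal(0.0, noise_mult * clip, size=grad.shape)

def federated_round(model: np.ndarray, client_grads: list, p: float,
                    clip: float = 1.0, noise_mult: float = 1.0, lr: float = 0.1) -> np.ndarray:
    # Self-sampling: each client decides on its own, with probability p, to join the round.
    participating = [g for g in client_grads if rng.random() < p]
    if not participating:
        return model
    noisy = [dp_local_gradient(g, clip, noise_mult) for g in participating]
    return model - lr * np.mean(noisy, axis=0)
```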
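Third, for the private matrix analysis item, the sketch below shows only a standard static-data baseline for differentially private PCA via input perturbation (symmetric Gaussian noise added to the Gram matrix), assuming each row has L2 norm at most 1; the item's o(W)-space sliding-window algorithms are substantially more involved and are not reproduced here.

```python
import numpy as np

def dp_pca(X: np.ndarray, k: int, sigma: float, seed: int = 0) -> np.ndarray:
    """Baseline DP PCA: assumes each row of X has L2 norm <= 1, so one row changes
    the Gram matrix X^T X by a term of bounded norm; releasing the Gram matrix with
    symmetric Gaussian noise of scale sigma yields a differentially private estimate."""
    rng = np.random.default_rng(seed)
    gram = X.T @ X
    noise = rng.normal(0.0, sigma, size=gram.shape)
    noise = (noise + noise.T) / 2.0                  # keep the perturbation symmetric
    eigvals, eigvecs = np.linalg.eigh(gram + noise)
    top = np.argsort(eigvals)[::-1][:k]              # indices of the k largest eigenvalues
    return eigvecs[:, top]                           # top-k private principal directions
```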
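Finally, for the DPFact item, a sketch of the sharing step only, with hypothetical parameter names: a hospital clips and Gaussian-perturbs its local factor-matrix update (output perturbation) before releasing it to the semi-trusted server. The full method, including the collaborative decomposition, the structured sparsity term, and the calibrated noise scale, is not captured here.

```python
import numpy as np

def share_factor(local_factor: np.ndarray, clip: float, sigma: float,
                 seed: int = 0) -> np.ndarray:
    """Output perturbation of an intermediary result: clip each row of the local
    factor matrix to L2 norm `clip`, then add Gaussian noise before the matrix
    leaves the hospital, so the released update is differentially private."""
    rng = np.random.default_rng(seed)
    row_norms = np.linalg.norm(local_factor, axis=1, keepdims=True)
    clipped = local_factor * np.minimum(1.0, clip / (row_norms + 1e-12))
    return clipped + rng.normal(0.0, sigma * clip, size=local_factor.shape)
```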