skip to main content


Title: Rényi Differentially Private ADMM for Non-Smooth Regularized Optimization
In this paper we consider the problem of minimizing composite objective functions consisting of a convex differentiable loss function plus a non-smooth regularization term, such as $L_1$ norm or nuclear norm, under Rényi differential privacy (RDP). To solve the problem, we propose two stochastic alternating direction method of multipliers (ADMM) algorithms: ssADMM based on gradient perturbation and mpADMM based on output perturbation. Both algorithms decompose the original problem into sub-problems that have closed-form solutions. The first algorithm, ssADMM, applies the recent privacy amplification result for RDP to reduce the amount of noise to add. The second algorithm, mpADMM, numerically computes the sensitivity of ADMM variable updates and releases the updated parameter vector at the end of each epoch. We compare the performance of our algorithms with several baseline algorithms on both real and simulated datasets. Experimental results show that, in high privacy regimes (small ε), ssADMM and mpADMM outperform baseline algorithms in terms of classification and feature selection performance, respectively.  more » « less
Award ID(s):
1943046
NSF-PAR ID:
10215646
Author(s) / Creator(s):
;
Date Published:
Journal Name:
CODASPY '20: Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy
Page Range / eLocation ID:
319 to 328
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Estimation of Markov Random Field and covariance models from high-dimensional data represents a canonical problem that has received a lot of attention in the literature. A key assumption, widely employed, is that of sparsity of the underlying model. In this paper, we study the problem of estimating such models exhibiting a more intricate structure comprising simultaneously of sparse, structured sparse and dense components. Such structures naturally arise in several scientific fields, including molecular biology, finance and political science. We introduce a general framework based on a novel structured norm that enables us to estimate such complex structures from high-dimensional data. The resulting optimization problem is convex and we introduce a linearized multi-block alternating direction method of multipliers (ADMM) algorithm to solve it efficiently. We illustrate the superior performance of the proposed framework on a number of synthetic data sets generated from both random and structured networks. Further, we apply the method to a number of real data sets and discuss the results. 
    more » « less
  2. Alternating direction method of multiplier (ADMM) is a popular method used to design distributed versions of a machine learning algorithm, whereby local computations are performed on local data with the output exchanged among neighbors in an iterative fashion. During this iterative process the leakage of data privacy arises. A differentially private ADMM was proposed in prior work (Zhang & Zhu, 2017) where only the privacy loss of a single node during one iteration was bounded, a method that makes it difficult to balance the tradeoff between the utility attained through distributed computation and privacy guarantees when considering the total privacy loss of all nodes over the entire iterative process. We propose a perturbation method for ADMM where the perturbed term is correlated with the penalty parameters; this is shown to improve the utility and privacy simultaneously. The method is based on a modified ADMM where each node independently determines its own penalty parameter in every iteration and decouples it from the dual updating step size. The condition for convergence of the modified ADMM and the lower bound on the convergence rate are also derived. 
    more » « less
  3. We consider the problem of subspace clustering with data that is potentially corrupted by both dense noise and sparse gross errors. In particular, we study a recently proposed low rank subspace clustering approach based on a nonconvex modeling formulation. This formulation includes a nonconvex spectral function in the objective function that makes the optimization task challenging, e.g., it is unknown whether the alternating direction method of multipliers (ADMM) framework proposed to solve the nonconvex model formulation is provably convergent. In this paper, we establish that the spectral function is differentiable and give a formula for computing the derivative. Moreover, we show that the derivative of the spectral function is Lipschitz continuous and provide an explicit value for the Lipschitz constant. These facts are then used to provide a lower bound for how the penalty parameter in the ADMM method should be chosen. As long as the penalty parameter is chosen according to this bound, we show that the ADMM algorithm computes iterates that have a limit point satisfying first-order optimality conditions. We also present a second strategy for solving the nonconvex problem that is based on proximal gradient calculations. The convergence and performance of the algorithms is verified through experiments on real data from face and digit clustering and motion segmentation. 
    more » « less
  4. While embracing various machine learning techniques to make effective decisions in the big data era, preserving the privacy of sensitive data poses significant challenges. In this paper, we develop a privacy-preserving distributed machine learning algorithm to address this issue. Given the assumption that each data provider owns a dataset with different sample size, our goal is to learn a common classifier over the union of all the local datasets in a distributed way without leaking any sensitive information of the data samples. Such an algorithm needs to jointly consider efficient distributed learning and effective privacy preservation. In the proposed algorithm, we extend stochastic alternating direction method of multipliers (ADMM) in a distributed setting to do distributed learning. For preserving privacy during the iterative process, we combine differential privacy and stochastic ADMM together. In particular, we propose a novel stochastic ADMM based privacy-preserving distributed machine learning (PS-ADMM) algorithm by perturbing the updating gradients, that provide differential privacy guarantee and have a low computational cost. We theoretically demonstrate the convergence rate and utility bound of our proposed PS-ADMM under strongly convex objective. Through our experiments performed on real-world datasets, we show that PS-ADMM outperforms other differentially private ADMM algorithms under the same differential privacy guarantee. 
    more » « less
  5. SUMMARY

    Repeatedly recording seismic data over a period of months or years is one way to identify trapped oil and gas and to monitor CO2 injection in underground storage reservoirs and saline aquifers. This process of recording data over time and then differencing the images assumes the recording of the data over a particular subsurface region is repeatable. In other words, the hope is that one can recover changes in the Earth when the survey parameters are held fixed between data collection times. Unfortunately, perfect experimental repeatability almost never occurs. Acquisition inconsistencies such as changes in weather (currents, wind) for marine seismic data are inevitable, resulting in source and receiver location differences between surveys at the very least. Thus, data processing aimed at improving repeatability between baseline and monitor surveys is extremely useful. One such processing tool is regularization (or binning) that aligns multiple surveys with different source or receiver configurations onto a common grid. Data binned onto a regular grid can be stored in a high-dimensional data structure called a tensor with, for example, x and y receiver coordinates and time as indices of the tensor. Such a higher-order data structure describing a subsection of the Earth often exhibits redundancies which one can exploit to fill in gaps caused by sampling the surveys onto the common grid. In fact, since data gaps and noise increase the rank of the tensor, seeking to recover the original data by reducing the rank (low-rank tensor-based completion) successfully fills in gaps caused by binning. The tensor nuclear norm (TNN) is defined by the tensor singular value decomposition (tSVD) which generalizes the matrix SVD. In this work we complete missing time-lapse data caused by binning using the alternating direction method of multipliers (or ADMM) to minimize the TNN. For a synthetic experiment with three parabolic events in which the time-lapse difference involves an amplitude increase in one of these events between baseline and monitor data sets, the binning and reconstruction algorithm (TNN-ADMM) correctly recovers this time-lapse change. We also apply this workflow of binning and TNN-ADMM reconstruction to a real marine survey from offshore Western Australia in which the binning onto a regular grid results in significant data gaps. The data after reconstruction varies continuously without the large gaps caused by the binning process.

     
    more » « less