In networks consisting of agents communicating with a central coordinator and working together to solve a global optimization problem in a distributed manner, the agents are often required to solve private proximal minimization subproblems. Such a setting often requires a further decomposition method to solve the global distributed problem, resulting in extensive communication overhead. In networks where communication is expensive, it is crucial to reduce the communication overhead of the distributed optimization scheme. Integrating Gaussian processes (GP) as a learning component into the Alternating Direction Method of Multipliers (ADMM) has proven effective in learning each agent's local proximal operator and thereby reducing the required communication exchange. In this work, we combine this learning method with adaptive uniform quantization in a hybrid approach that achieves further communication reduction when solving a distributed optimization problem with ADMM. The adaptive quantization first sets the mid-value and window length according to the mean and covariance provided by the GP; in a later stage of our study, the adaptation is extended to also vary the quantization bit resolution. In addition, we derive a convergence analysis of this setting, which yields convergence conditions and, in the cases where convergence cannot be formally proven, error bounds. Furthermore, we study the coordinator's communication decision-making and propose several query strategies that use the agents' uncertainty measures provided by the regression process. Extensive numerical experiments on a distributed sharing problem with quadratic agent cost functions have been conducted throughout this study. The results demonstrate that the proposed algorithms achieve their primary goal of minimizing the overall communication overhead while keeping the global solutions at satisfactory levels of accuracy. The favorable accuracy observed in the numerical experiments is consistent with the derived convergence analysis. In instances where a convergence proof is lacking, we show that the overall ADMM residual remains bounded by a diminishing threshold, so our algorithmic solutions can be expected to closely approximate the true solution, validating the reliability of our approaches.
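As a concrete illustration, the following is a minimal Python sketch of one GP-informed adaptive uniform quantization step of the kind described above: the window is centered at the GP posterior mean mu with half-width proportional to the posterior standard deviation sigma. The names kappa and bits are illustrative tuning parameters, not values from the paper, and sigma > 0 is assumed.

```python
import numpy as np

def adaptive_uniform_quantize(x, mu, sigma, bits=4, kappa=3.0):
    """Uniformly quantize x on a window adapted to the GP posterior.

    mid-value  : mu (GP posterior mean)
    window     : 2 * kappa * sigma (kappa standard deviations each side)
    resolution : 2**bits quantization levels
    """
    lo, hi = mu - kappa * sigma, mu + kappa * sigma
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    idx = int(np.clip(round((x - lo) / step), 0, levels - 1))
    # Only idx (bits wide) needs to be transmitted; the receiver
    # reconstructs lo + idx * step from its own copy of the GP posterior.
    return lo + idx * step, idx
```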
Distributed and Asynchronous Coordination of a Mixed-Integer Linear System via Surrogate Lagrangian Relaxation
With the emergence of the Internet of Things, which allows communications and local computations, and with the vision of Industry 4.0, a foreseeable transition is from centralized system planning and operation toward decentralization with interacting components and subsystems, e.g., self-optimizing factories. In this article, a new "price-based" decomposition and coordination methodology is developed to efficiently coordinate, in an asynchronous way, a system consisting of distributed subsystems such as machines and parts, which are described by mixed-integer linear programming (MILP) formulations. The novel method is a dual approach, whereby coordination is performed by updating Lagrangian multipliers based on the economic principles of "supply and demand." To ensure low communication requirements within the method, exchanges between the "coordinator" and subsystems are limited to "prices" (Lagrangian multipliers) broadcast by the coordinator and to subsystem solutions sent to the coordinator. Asynchronous coordination, however, may lead to convergence difficulties since the order in which subsystem solutions arrive at the coordinator is not predefined, as a result of uncertainties in communication and solving times. Under realistic assumptions of finite communication and solving times, the convergence of our method is proven by innovatively extending Lyapunov stability theory. Numerical testing of generalized assignment problems through simulation demonstrates that the method converges fast and provides near-optimal results, paving the way for self-optimizing factories in the future. Accompanying CPLEX codes and data are included.
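A schematic sketch of the price update at the heart of such a dual method follows, assuming a coupling constraint of the form "total resource usage ≤ capacity". The Subsystem interface (solve, resource_use) and the diminishing step-size rule are illustrative placeholders, not the paper's exact surrogate update; in the paper the subproblems are MILPs solved, e.g., with CPLEX.

```python
import random
import numpy as np

def price_update(lam, violation, step):
    """Raise "prices" on over-demanded resources and keep them nonnegative."""
    return np.maximum(0.0, lam + step * violation)

def coordinate(subsystems, capacity, lam, n_rounds=100, step0=1.0):
    # Each subsystem solves its own subproblem given the current prices.
    solutions = {s: s.solve(lam) for s in subsystems}
    for k in range(1, n_rounds + 1):
        s = random.choice(subsystems)      # stand-in for "whichever solution arrives next"
        solutions[s] = s.solve(lam)        # asynchronous: others keep their last solution
        violation = sum(sol.resource_use for sol in solutions.values()) - capacity
        lam = price_update(lam, violation, step0 / k)   # diminishing step size
    return lam, solutions
```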
- Award ID(s): 1810108
- PAR ID: 10175097
- Date Published:
- Journal Name: IEEE Transactions on Automation Science and Engineering
- Volume: Early Access
- ISSN: 1545-5955
- Page Range / eLocation ID: 1 to 15
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
We propose a new distributed learning-based framework for stability assessment of a class of networked nonlinear systems, where each subsystem is dissipative. The aim is to learn, in a distributed manner, a Lyapunov function and an associated region of attraction for the networked system. We begin by using a neural network function approximation to learn a storage function for each subsystem such that the subsystem satisfies a local dissipativity property. We next use a satisfiability modulo theories (SMT) solver based falsifier that verifies the local dissipativity of each subsystem by determining an absence of counterexamples that violate the local dissipativity property, as established by the neural network approximation. Finally, we verify network-level stability by using an alternating direction method of multipliers (ADMM) approach to update the storage function of each subsystem in a distributed manner until a global stability condition for the network of dissipative subsystems is satisfied. This step also leads to a network-level Lyapunov function that we then use to estimate a region of attraction. We illustrate the proposed algorithm and its advantages on a microgrid interconnection with power electronics interfaces.
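To make the first step concrete, here is one plausible training loss for learning a neural storage function V that enforces the local dissipativity inequality dV/dt ≤ s(u, y) on sampled data. This is a sketch under stated assumptions: V, the dynamics f, and supply_rate are assumed callables, and the hinge-penalty form is illustrative rather than the paper's exact formulation.

```python
import torch

def dissipativity_loss(V, f, supply_rate, x, u, y):
    """Penalize violations of the local dissipativity inequality
    dV/dt = ∇V(x)·f(x, u) <= s(u, y) on a batch of sampled states."""
    x = x.clone().requires_grad_(True)
    grad_V = torch.autograd.grad(V(x).sum(), x, create_graph=True)[0]
    v_dot = (grad_V * f(x, u)).sum(dim=1)     # dV/dt along the dynamics
    residual = v_dot - supply_rate(u, y)      # should be nonpositive
    return torch.relu(residual).mean()        # zero only if no violations in batch
```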
Federated learning (FL) involves training a model over massive distributed devices, while keeping the training data localized and private. This form of collaborative learning exposes new tradeoffs among model convergence speed, model accuracy, balance across clients, and communication cost, with new challenges including: (1) straggler problem—where clients lag due to data or (computing and network) resource heterogeneity, and (2) communication bottleneck—where a large number of clients communicate their local updates to a central server and bottleneck the server. Many existing FL methods focus on optimizing along only one single dimension of the tradeoff space. Existing solutions use asynchronous model updating or tiering-based, synchronous mechanisms to tackle the straggler problem. However, asynchronous methods can easily create a communication bottleneck, while tiering may introduce biases that favor faster tiers with shorter response latencies. To address these issues, we present FedAT, a novel Federated learning system with Asynchronous Tiers under Non-i.i.d. training data. FedAT synergistically combines synchronous, intra-tier training and asynchronous, cross-tier training. By bridging the synchronous and asynchronous training through tiering, FedAT minimizes the straggler effect with improved convergence speed and test accuracy. FedAT uses a straggler-aware, weighted aggregation heuristic to steer and balance the training across clients for further accuracy improvement. FedAT compresses uplink and downlink communications using an efficient, polyline-encoding-based compression algorithm, which minimizes the communication cost. Results show that FedAT improves the prediction performance by up to 21.09% and reduces the communication cost by up to 8.5×, compared to state-of-the-art FL methods.
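The following is one illustrative form of such a straggler-aware cross-tier weighting, not FedAT's exact heuristic: tiers that have contributed fewer updates so far receive larger aggregation weights, so fast tiers do not dominate the global model.

```python
import numpy as np

def cross_tier_aggregate(tier_models, tier_update_counts):
    """Weighted average of per-tier models; slower tiers (fewer updates
    so far) get larger weights to counteract the bias toward fast tiers."""
    counts = np.asarray(tier_update_counts, dtype=float)
    inv = 1.0 / counts
    weights = inv / inv.sum()
    return sum(w * m for w, m in zip(weights, tier_models))

# Example: three tiers; the fastest has produced 10 updates, the slowest 2,
# so the slowest tier's model carries the largest weight in the average.
models = [np.ones(4) * v for v in (1.0, 2.0, 3.0)]
print(cross_tier_aggregate(models, [10, 5, 2]))
```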
In distributed optimization schemes consisting of a group of agents connected to a central coordinator, the optimization algorithm often involves the agents solving private local sub-problems and exchanging data frequently with the coordinator to solve the global distributed problem. In those cases, the query-response mechanism usually causes excessive communication costs to the system, necessitating communication reduction in scenarios where communication is costly. Integrating Gaussian processes (GP) as a learning component to the Alternating Direction Method of Multipliers (ADMM) has proven effective in learning each agent’s local proximal operator to reduce the required communication exchange. A key element for integrating GP into the ADMM algorithm is the querying mechanism upon which the coordinator decides when communication with an agent is required. In this paper, we formulate a general querying decision framework as an optimization problem that balances reducing the communication cost and decreasing the prediction error. Under this framework, we propose a joint query strategy that takes into account the joint statistics of the query and ADMM variables and the total communication cost of all agents in the presence of uncertainty caused by the GP regression. In addition, we derive three different decision mechanisms that simplify the general framework by making the communication decision for each agent individually. We integrate multiple measures to quantify the trade-off between the communication cost reduction and the optimization solution’s accuracy/optimality. The proposed methods can achieve significant communication reduction and good optimization solution accuracy for distributed optimization, as demonstrated by extensive simulations of a distributed sharing problem.
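A minimal sketch of one such per-agent decision rule follows: a query is issued only when the GP posterior variance (a proxy for the expected prediction error) outweighs a per-query communication cost. The agent and GP interfaces (solve_prox, predict, update) and the threshold form are hypothetical names for illustration, not the paper's exact mechanisms.

```python
def coordinator_round(agents, gps, z, c_query):
    """One coordinator round: query an agent only if its GP surrogate is
    too uncertain; otherwise substitute the GP mean for the response."""
    responses = {}
    for i, (agent, gp) in enumerate(zip(agents, gps)):
        mean, std = gp.predict(z)              # GP posterior at the query point
        if std ** 2 > c_query:                 # expected error vs. communication cost
            responses[i] = agent.solve_prox(z) # costly query: exact proximal step
            gp.update(z, responses[i])         # refine this agent's surrogate
        else:
            responses[i] = mean                # free: trust the GP prediction
    return responses
```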
Resource control in heterogeneous computers built with subsystems from different vendors is challenging. There is a tension between the need to quickly generate local decisions in each subsystem and the desire to coordinate the different subsystems for global optimization. In practice, global coordination among subsystems is considered hard, and current commercial systems use centralized controllers. The result is high response time and high design cost due to lack of modularity. To control emerging heterogeneous computers effectively, we propose a new control framework called Tangram that is fast, globally coordinated, and modular. Tangram introduces a new formal controller that combines multiple engines for optimization and safety, and has a standard interface. Building the controller for a subsystem requires knowing only about that subsystem. As a heterogeneous computer is assembled, the controllers in the different subsystems are connected hierarchically, exchanging standard coordination signals. To demonstrate Tangram, we prototype it in a heterogeneous server that we assemble using components from multiple vendors. Compared to state-of-the-art control, Tangram reduces, on average, the execution time of heterogeneous applications by 31% and their energy-delay product by 39%.
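A toy sketch of the modular, hierarchical idea, with illustrative class and method names rather than Tangram's actual API: every subsystem controller exposes the same narrow interface, and a parent coordinates its children only through a standard signal, so assembling a machine amounts to wiring controllers together.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SubsystemController:
    """Per-subsystem controller behind a standard interface; building it
    requires knowing only this subsystem's own safety limit."""
    name: str
    budget: float                       # local safety limit (e.g., watts)
    children: List["SubsystemController"] = field(default_factory=list)

    def allocate(self, demand: float) -> float:
        """Standard coordination signal: demand flows down the hierarchy;
        each level clips to its own safe budget and splits among children."""
        grant = min(demand, self.budget)
        if self.children:
            share = grant / len(self.children)
            grant = sum(child.allocate(share) for child in self.children)
        return grant

# Assembling a heterogeneous machine = wiring vendor controllers together.
server = SubsystemController("server", 500.0, [
    SubsystemController("cpu", 200.0),
    SubsystemController("accelerator", 350.0),
])
print(server.allocate(600.0))   # -> 450.0 (250 requested per child, cpu clips to 200)
```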