
Title: Accelerated consensus in multi-agent networks via memory of local averages
Classical mathematical models of information sharing and updating in multi-agent networks use linear operators. In the paradigmatic DeGroot model, for example, agents update their states with linear combinations of their neighbors’ current states. In prior work, an accelerated averaging model employing memory was proposed to speed convergence to a consensus state for undirected networks. There, the DeGroot update on the current states is followed by a linear combination with the previous states. We propose a modification in which the DeGroot update is applied to both the current and previous states and is then followed by a linear combination step. We show that this simple modification, applied to undirected networks, permits convergence even for periodic networks. Further, it allows for faster convergence than the DeGroot and accelerated averaging models for suitable networks and model parameters.
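The three update rules can be sketched concretely. In this toy example, W is a symmetric row-stochastic averaging matrix for a three-node path graph; the parameters gamma and alpha, and the exact form of the memory update, are illustrative assumptions based on the abstract's description, not the paper's definitions.

```python
import numpy as np

# Symmetric, row-stochastic averaging matrix for an undirected
# 3-node path graph (W = I - 0.3*L). Values are illustrative.
W = np.array([[0.7, 0.3, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.3, 0.7]])

def degroot(x, steps):
    # Classical DeGroot: each agent averages its neighbors' current states.
    for _ in range(steps):
        x = W @ x
    return x

def accelerated(x, steps, gamma=0.3):
    # Prior accelerated-averaging model: DeGroot update on the current
    # state, then a linear combination with the previous state.
    prev = x.copy()
    for _ in range(steps):
        x, prev = (1 + gamma) * (W @ x) - gamma * prev, x
    return x

def memory_of_averages(x, steps, alpha=0.7):
    # Proposed variant, as sketched in the abstract: the DeGroot update
    # is applied to BOTH current and previous states, then combined.
    prev = x.copy()
    for _ in range(steps):
        x, prev = alpha * (W @ x) + (1 - alpha) * (W @ prev), x
    return x

x0 = np.array([1.0, 0.0, 0.0])
print(degroot(x0.copy(), 300))            # entries near a common value
print(memory_of_averages(x0.copy(), 300)) # likewise reaches consensus
```

Which variant converges fastest depends on the spectrum of W and the chosen gamma/alpha, matching the abstract's "for suitable networks and model parameters" caveat.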
Journal Name:
Proceedings of the 60th IEEE Conference on Decision and Control (CDC 2021)
Sponsoring Org:
National Science Foundation
More Like this
  1. The effective resistance between a pair of nodes in a weighted undirected graph is defined as the potential difference induced between them when a unit current is injected at the first node and extracted at the second node, treating edge weights as the conductance values of edges. The effective resistance is a key quantity of interest in many applications and fields, including the solution of linear systems, Markov chains, and continuous-time averaging networks. We develop an efficient linearly convergent distributed algorithm for computing effective resistances and demonstrate its performance through numerical studies. We also apply our algorithm to the consensus problem, where the aim is to compute the average of node values in a distributed manner. We show that the distributed algorithm we developed for effective resistances can be used to accelerate the convergence of the classical consensus iterations considerably, by a factor depending on the network structure.
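To make the definition concrete, effective resistance can be computed centrally from the graph Laplacian's pseudoinverse via R_uv = (e_u - e_v)^T L^+ (e_u - e_v). This is the standard textbook formula, shown only to illustrate the quantity; it is not the distributed algorithm the paper develops.

```python
import numpy as np

def effective_resistance(L, u, v):
    # R_uv = (e_u - e_v)^T L^+ (e_u - e_v), with L^+ the
    # Moore-Penrose pseudoinverse of the graph Laplacian.
    Lp = np.linalg.pinv(L)
    e = np.zeros(L.shape[0])
    e[u], e[v] = 1.0, -1.0
    return e @ Lp @ e

# Unit-weight triangle: between any two nodes, a direct 1-ohm edge
# in parallel with a 2-ohm two-edge path, so R_eff = (1*2)/(1+2) = 2/3.
L = np.array([[ 2, -1, -1],
              [-1,  2, -1],
              [-1, -1,  2]], dtype=float)
print(effective_resistance(L, 0, 1))  # ≈ 0.6667
```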
  2. The significance of the water-side gas transfer velocity for air–sea CO2 gas exchange (k) and its non-linear dependence on wind speed (U) is well accepted. What remains a subject of inquiry are biases associated with the form of the non-linear relation linking k to U (hereafter labeled f(U), where f(.) stands for an arbitrary function of U), the distributional properties of U (treated as a random variable) along with other external factors influencing k, and the time-averaging period used to determine k from U. To address the latter issue, a Taylor series expansion is applied to separate f(U) into a term derived from time-averaged wind speed (labeled ⟨U⟩, where ⟨.⟩ indicates averaging over a monthly time scale) as currently employed in climate models, and additive bias corrections that vary with the statistics of U. The method was explored for nine widely used f(U) parameterizations based on remotely sensed 6-hourly global wind products at 10 m above the sea surface. The bias in k of monthly estimates compared to the reference 6-hourly product was shown to be mainly associated with wind variability, captured by the standard deviation σ_U around ⟨U⟩ or, preferably, a dimensionless coefficient of variation I_u = σ_U/⟨U⟩. The proposed correction outperforms previous methodologies that adjusted k when using ⟨U⟩ only. An unexpected outcome was that upon setting I_u^2 = 0.15 to correct biases when using monthly wind speed averages, the new model produced superior results at the global and regional scale compared to prior correction methodologies. Finally, an equation relating I_u^2 to the time-averaging interval (spanning from 6 h to a month) is presented to enable other sub-monthly averaging periods to be used. While the focus here is on CO2, the theoretical tactic employed can be applied to other slightly soluble gases.
As monthly and climatological wind data are often used in climate models for gas transfer estimates, the proposed approach provides a robust scheme that can be readily implemented in current climate models.
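The Taylor-expansion idea is easy to see for a quadratic parameterization f(U) = a·U², for which ⟨f(U)⟩ = a⟨U⟩²(1 + I_u²) holds exactly. The sketch below uses synthetic winds and a placeholder coefficient a, not any of the nine published parameterizations or real wind products.

```python
import numpy as np

# Synthetic stand-in for a month of 6-hourly wind speeds (m/s).
rng = np.random.default_rng(0)
U = rng.gamma(shape=4.0, scale=2.0, size=120)

a = 0.266                       # placeholder coefficient, not a published value
k_true = np.mean(a * U**2)      # average of the 6-hourly k values (reference)
U_bar = U.mean()
k_monthly = a * U_bar**2        # k from the time-averaged wind: biased low

# Second-order Taylor correction using the coefficient of variation
# I_u = sigma_U / <U>, as in the abstract.
Iu2 = U.var() / U_bar**2
k_corrected = a * U_bar**2 * (1 + Iu2)

print(k_monthly < k_true)                # True: using <U> underestimates k
print(np.isclose(k_true, k_corrected))   # True: exact for a quadratic f(U)
```

For non-quadratic f(U) the correction is approximate rather than exact, which is why the abstract's fitted constant I_u² = 0.15 and higher-order statistics matter in practice.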
  3. When training machine learning models using stochastic gradient descent (SGD) with a large number of nodes or massive edge devices, the communication cost of synchronizing gradients at every iteration is a key bottleneck that limits the scalability of the system and hinders the benefit of parallel computation. Local-update SGD algorithms, where worker nodes perform local iterations of SGD and periodically synchronize their local models, can effectively reduce the communication frequency and the associated delay. In this paper, we propose a powerful framework, named Cooperative SGD, that subsumes a variety of local-update SGD algorithms (such as local SGD, elastic averaging SGD, and decentralized parallel SGD) and provides a unified convergence analysis. Notably, special cases of the unified convergence analysis provided by the cooperative SGD framework yield 1) the first convergence analysis of elastic averaging SGD for general non-convex objectives, and 2) improvements upon previous analyses of local SGD and decentralized parallel SGD. Moreover, we design new algorithms, such as elastic averaging SGD with overlapped computation and communication, and decentralized periodic averaging, which are shown to be 4x or more faster than the baseline in reaching the same training loss.
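A minimal sketch of the local-update pattern: K workers each take tau local gradient steps between synchronizations, then average their models. The quadratic objective, deterministic gradients, and all hyperparameters here are illustrative assumptions, not the Cooperative SGD framework itself.

```python
import numpy as np

def grad(w, A, b):
    # Gradient of the toy objective 0.5 * w^T A w - b^T w.
    return A @ w - b

K, tau, lr, rounds = 4, 5, 0.05, 100   # workers, local steps, step size, sync rounds
A = np.diag([1.0, 2.0, 3.0])
b = np.array([1.0, 1.0, 1.0])
w_star = np.linalg.solve(A, b)         # known optimum for checking

rng = np.random.default_rng(1)
workers = [rng.normal(size=3) for _ in range(K)]
for _ in range(rounds):
    for k in range(K):
        for _ in range(tau):           # local iterations: no communication
            workers[k] = workers[k] - lr * grad(workers[k], A, b)
    avg = sum(workers) / K             # periodic model averaging (synchronization)
    workers = [avg.copy() for _ in range(K)]

print(np.linalg.norm(avg - w_star))    # small: near the optimum
```

Communication happens once per tau gradient steps instead of every step, which is the frequency reduction the abstract describes.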
  4. Incremental task learning (ITL) is a category of continual learning that seeks to train a single network for multiple tasks (one after another), where training data for each task is only available during the training of that task. Neural networks tend to forget older tasks when they are trained for newer tasks; this property is often known as catastrophic forgetting. To address this issue, ITL methods use episodic memory, parameter regularization, masking and pruning, or extensible network structures. In this paper, we propose a new incremental task learning framework based on low-rank factorization. In particular, we represent the network weights for each layer as a linear combination of several rank-1 matrices. To update the network for a new task, we learn a rank-1 (or low-rank) matrix and add that to the weights of every layer. We also introduce an additional selector vector that assigns different weights to the low-rank matrices learned for the previous tasks. We show that our approach performs better than the current state-of-the-art methods in terms of accuracy and forgetting. Our method also offers better memory efficiency compared to episodic memory- and mask-based approaches. Our code will be available at
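The core representation can be sketched in a few lines: a layer's weight matrix is a selector-weighted sum of rank-1 factors, one added per task. Names like `layer_weights` and the random factors are hypothetical illustrations of the idea, not the authors' training code.

```python
import numpy as np

d_out, d_in = 4, 3
rank1_factors = []           # list of (u, v) pairs; each u v^T is a rank-1 matrix

def add_task_factor(rng):
    # Learning a new task adds one rank-1 factor per layer
    # (here just drawn at random for illustration).
    u = rng.normal(size=d_out)
    v = rng.normal(size=d_in)
    rank1_factors.append((u, v))

def layer_weights(selector):
    # The per-task selector vector weights the rank-1 matrices
    # accumulated from this and previous tasks.
    Wsum = np.zeros((d_out, d_in))
    for s, (u, v) in zip(selector, rank1_factors):
        Wsum += s * np.outer(u, v)
    return Wsum

rng = np.random.default_rng(2)
for _ in range(3):                                   # three tasks learned so far
    add_task_factor(rng)

W_task3 = layer_weights(np.array([0.2, 0.5, 1.0]))   # hypothetical selector
print(np.linalg.matrix_rank(W_task3))                # at most 3
```

Storage per task is O(d_out + d_in) per layer instead of O(d_out · d_in), which is the memory-efficiency advantage the abstract claims over episodic-memory and mask-based approaches.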
  5. We introduce a model for ant trail formation, building upon previous work on biologically feasible local algorithms that plausibly describe how ants maintain trail networks. The model is a variant of a reinforced random walk on a directed graph, where ants lay pheromone on edges as they traverse them and the next edge to traverse is chosen based on the level of pheromone; this pheromone decays with time. There is a bidirectional flow of ants in the network: the forward flow proceeds along forward edges from source (e.g. the nest) to sink (e.g. a food source), and the backward flow in the opposite direction. Some fraction of ants are lost as they pass through each node (modeling the loss of ants due to exploration observed in the field). We initiate a theoretical study of this model. We note that ant navigation has inspired the field of ant colony optimization, heuristics that have been applied to several combinatorial optimization problems; however, the algorithms developed there are considerably more complex and not constrained to being biologically feasible. We first consider the linear decision rule, where the flow divides itself among the next set of edges in proportion to their pheromone level. Here, we show that the process converges to the path with minimum leakage when the forward and backward flows do not change over time. On the other hand, when the forward and backward flows increase over time (caused by positive reinforcement from the discovery of a food source, for example), we show that the process converges to the shortest path. These results are for graphs consisting of two parallel paths (a case that has been investigated before in experiments). Through simulations, we show that these results hold for more general graphs drawn from various random graph models; proving this convergence in the general case is an interesting open problem.
Further, to understand the behaviour of other decision rules beyond the linear rule, we consider a general family of decision rules. For this family, we show that there is no advantage to using a non-linear decision rule if the goal is to find the shortest or the minimum-leakage path. We also show that bidirectional flow is necessary for convergence to such paths. Our results provide a plausible explanation for field observations, and open up new avenues for further theoretical and experimental investigation.
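A crude deterministic "fluid" sketch of the linear decision rule on two parallel paths: flow splits in proportion to pheromone, a fraction of ants leaks at each node, the surviving flow deposits pheromone, and pheromone decays. All parameters and the exact update form are illustrative assumptions, not the paper's model; the stochastic reinforced random walk is replaced by its expected flow.

```python
# Two parallel paths from nest to food; path 0 has fewer nodes,
# hence less cumulative leakage, and should attract the flow.
loss = 0.1          # fraction of ants lost at each node
lengths = [2, 5]    # number of nodes on each path
decay = 0.9         # per-step pheromone retention
F = 10.0            # constant forward flow leaving the nest
p = [1.0, 1.0]      # initial pheromone on each path

for _ in range(200):
    total = p[0] + p[1]
    new_p = []
    for i in range(2):
        f_in = F * p[i] / total                  # linear decision rule
        survived = f_in * (1 - loss) ** lengths[i]
        new_p.append(decay * p[i] + survived)    # decay + reinforcement
    p = new_p

print(p[0] / (p[0] + p[1]))   # close to 1: the minimum-leakage path wins
```

With constant flows the fraction on the low-leakage path grows monotonically toward 1, consistent with the abstract's convergence claim for the two-parallel-paths case.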