skip to main content


Title: Distributed Corruption Detection in Networks
We consider the problem of distributed corruption detection in networks. In this model each node of a directed graph is either truthful or corrupt. Each node reports the type (truthful or corrupt) of each of its outneighbors. If it is truthful, it reports the truth, whereas if it is corrupt, it reports adversarially. This model, first considered by Preparata, Metze and Chien in 1967, motivated by the desire to identify the faulty components of a digital system by having the other components checking them, became known as the PMC model. The main known results for this model characterize networks in which all corrupt (that is, faulty) nodes can be identified, when there is a known upper bound on their number. We are interested in networks in which a large fraction of the nodes can be classified. It is known that in the PMC model, in order to identify all corrupt nodes when their number is t, all in-degrees have to be at least t. In contrast, we show that in d regular-graphs with strong expansion properties, a 1 - O(1/d) fraction of the corrupt nodes, and a 1 - O(1/d) fraction of the truthful nodes can be identified, whenever there is a majority of truthful nodes. We also observe that if the graph is very far from being a good expander, namely, if the deletion of a small set of nodes splits the graph into small components, then no corruption detection is possible even if most of the nodes are truthful. Finally we discuss the algorithmic aspects and the computational hardness of the problem.  more » « less
Award ID(s):
1855464 1737944
NSF-PAR ID:
10144273
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Theory of Computing
Volume:
16
Issue:
1
ISSN:
1557-2862
Page Range / eLocation ID:
1 to 23
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Gilbert, Seth (Ed.)
    Byzantine consensus is a classical problem in distributed computing. Each node in a synchronous system starts with a binary input. The goal is to reach agreement in the presence of Byzantine faulty nodes. We consider the setting where communication between nodes is modelled via an undirected communication graph. In the classical point-to-point communication model all messages sent on an edge are private between the two endpoints of the edge. This allows a faulty node to equivocate, i.e., lie differently to its different neighbors. Different models have been proposed in the literature that weaken equivocation. In the local broadcast model, every message transmitted by a node is received identically and correctly by all of its neighbors. In the hypergraph model, every message transmitted by a node on a hyperedge is received identically and correctly by all nodes on the hyperedge. Tight network conditions are known for each of the three cases. We introduce a more general model that encompasses all three of these models. In the local multicast model, each node u has one or more local multicast channels. Each channel consists of multiple neighbors of u in the communication graph. When node u sends a message on a channel, it is received identically by all of its neighbors on the channel. For this model, we identify tight network conditions for consensus. We observe how the local multicast model reduces to each of the three models above under specific conditions. In each of the three cases, we relate our network condition to the corresponding known tight conditions. The local multicast model also encompasses other practical network models of interest that have not been explored previously, as elaborated in the paper. 
    more » « less
  2. null (Ed.)
    Motivated by kidney exchange, we study the following mechanism-design problem: On a directed graph (of transplant compatibilities among patient--donor pairs), the mechanism must select a simple path (a chain of transplantations) starting at a distinguished vertex (an altruistic donor) such that the total length of this path is as large as possible (a maximum number of patients receive a kidney). However, the mechanism does not have direct access to the graph. Instead, the vertices are partitioned over multiple players (hospitals), and each player reports a subset of her vertices to the mechanism. In particular, a player may strategically omit vertices to increase how many of her vertices lie on the path returned by the mechanism. Our objective is to find mechanisms that limit incentives for such manipulation while producing long paths. Unfortunately, in worst-case instances, competing with the overall longest path is impossible while incentivizing (approximate) truthfulness, i.e., requiring that hiding nodes cannot increase a player's utility by more than a factor of 1 + o(1). We therefore adopt a semi-random model where o(n) random edges are added to worst-case instances. While it remains impossible for truthful mechanisms to compete with the overall longest path, we give a truthful mechanism that competes with a weaker but non-trivial benchmark: the length of any path whose subpaths within each player have a minimum average length. In fact, our mechanism satisfies even a stronger notion of truthfulness, which we call matching-time incentive compatibility. This notion of truthfulness requires that each player not only reports her nodes truthfully but also does not stop the returned path at any of her nodes in order to divert it to a continuation inside her own subgraph. 
    more » « less
  3. This paper considers the Byzantine consensus problem for nodes with binary inputs. The nodes are interconnected by a network represented as an undirected graph, and the system is assumed to be synchronous. Under the classical point-to-point communication model, it is well-known that the following two conditions are both necessary and sufficient to achieve Byzantine consensus among n nodes in the presence of up to ƒ Byzantine faulty nodes: n & 3 #8805; 3 ≥ ƒ+ 1 and vertex connectivity at least 2 ƒ + 1. In the classical point-to-point communication model, it is possible for a faulty node to equivocate, i.e., transmit conflicting information to different neighbors. Such equivocation is possible because messages sent by a node to one of its neighbors are not overheard by other neighbors. This paper considers the local broadcast model. In contrast to the point-to-point communication model, in the local broadcast model, messages sent by a node are received identically by all of its neighbors. Thus, under the local broadcast model, attempts by a node to send conflicting information can be detected by its neighbors. Under this model, we show that the following two conditions are both necessary and sufficient for Byzantine consensus: vertex connectivity at least ⌋ 3 fƒ / 2 ⌊ + 1 and minimum node degree at least 2 ƒ. Observe that the local broadcast model results in a lower requirement for connectivity and the number of nodes n, as compared to the point-to-point communication model. We extend the above results to a hybrid model that allows some of the Byzantine faulty nodes to equivocate. The hybrid model bridges the gap between the point-to-point and local broadcast models, and helps to precisely characterize the trade-off between equivocation and network requirements. 
    more » « less
  4. null (Ed.)
    We study the communication cost (or message complexity) of fundamental distributed symmetry breaking problems, namely, coloring and MIS. While significant progress has been made in understanding and improving the running time of such problems, much less is known about the message complexity of these problems. In fact, all known algorithms need at least Ω(m) communication for these problems, where m is the number of edges in the graph. We addressthe following question in this paper: can we solve problems such as coloring and MIS using sublinear, i.e., o(m) communication, and if sounder what conditions? In a classical result, Awerbuch, Goldreich, Peleg, and Vainish [JACM 1990] showed that fundamental global problems such asbroadcast and spanning tree construction require at least o(m) messages in the KT-1 Congest model (i.e., Congest model in which nodes have initial knowledge of the neighbors' ID's) when algorithms are restricted to be comparison-based (i.e., algorithms inwhich node ID's can only be compared). Thirty five years after this result, King, Kutten, and Thorup [PODC 2015] showed that onecan solve the above problems using Õ(n) messages (n is the number of nodes in the graph) in Õ(n) rounds in the KT-1 Congest model if non-comparison-based algorithms are permitted. An important implication of this result is that one can use the synchronous nature of the KT-1 Congest model, using silence to convey information,and solve any graph problem using non-comparison-based algorithms with Õ(n) messages, but this takes an exponential number of rounds. In the asynchronous model, even this is not possible. In contrast, much less is known about the message complexity of local symmetry breaking problems such as coloring and MIS. Our paper fills this gap by presenting the following results. Lower bounds: In the KT-1 CONGEST model, we show that any comparison-based algorithm, even a randomized Monte Carlo algorithm with constant success probability, requires Ω(n 2) messages in the worst case to solve either (△ + 1)-coloring or MIS, regardless of the number of rounds. We also show that Ω(n) is a lower bound on the number ofmessages for any (△ + 1)-coloring or MIS algorithm, even non-comparison-based, and even with nodes having initial knowledge of up to a constant radius. Upper bounds: In the KT-1 CONGEST model, we present the following randomized non-comparison-based algorithms for coloring that, with high probability, use o(m) messages and run in polynomially many rounds.(a) A (△ + 1)-coloring algorithm that uses Õ(n1.5) messages, while running in Õ(D + √ n) rounds, where D is the graph diameter. Our result also implies an asynchronous algorithm for (△ + 1)-coloring with the same message bound but running in Õ(n) rounds. (b) For any constantε > 0, a (1+ε)△-coloring algorithm that uses Õ(n/ε 2 ) messages, while running in Õ(n) rounds. If we increase our input knowledge slightly to radius 2, i.e.,in the KT-2 CONGEST model, we obtain:(c) A randomized comparison-based MIS algorithm that uses Õ(n 1.5) messages. while running in Õ( √n) rounds. While our lower bound results can be viewed as counterparts to the classical result of Awerbuch, Goldreich, Peleg, and Vainish [JACM 90], but for local problems, our algorithms are the first-known algorithms for coloring and MIS that take o(m) messages and run in polynomially many rounds. 
    more » « less
  5. null (Ed.)
    In many network applications, it may be desirable to conceal certain target nodes from detection by a data collector, who is using a crawling algorithm to explore a network. For example, in a computer network, the network administrator may wish to protect those computers (target nodes) with sensitive information from discovery by a hacker who has exploited vulnerable machines and entered the network. These networks are often protected by hiding the machines (nodes) from external access, and allow only fixed entry points into the system (protection against external attacks). However, in this protection scheme, once one of the entry points is breached, the safety of all internal machines is jeopardized (i.e., the external attack turns into an internal attack). In this paper, we view this problem from the perspective of the data protector. We propose the Node Protection Problem: given a network with known entry points, which edges should be removed/added so as to protect as many target nodes from the data collector as possible? A trivial way to solve this problem would be to simply disconnect either the entry points or the target nodes – but that would make the network non-functional. Accordingly, we impose certain constraints: for each node, only (1 − r) fraction of its edges can be removed, and the resulting network must not be disconnected. We propose two novel scoring mechanisms - the Frequent Path Score and the Shortest Path Score. Using these scores, we propose NetProtect, an algorithm that selects edges to be removed or added so as to best impede the progress of the data collector. We show experimentally that NetProtect outperforms baseline node protection algorithms across several real-world networks. In some datasets, With 1% of the edges removed by NetProtect, we found that the data collector requires up to 6 (4) times the budget compared to the next best baseline in order to discover 5 (50) nodes. 
    more » « less