skip to main content

Title: Intrusion-Tolerant and Confidentiality-Preserving Publish/Subscribe Messaging
We present Chios, an intrusion-tolerant publish/subscribe system which protects against Byzantine failures. Chios is the first publish/subscribe system achieving decentralized confidentiality with fine-grained access control and strong publication order guarantees. This is in contrast to existing publish/subscribe systems achieving much weaker security and reliability properties. Chios is flexible and modular, consisting of four fully-fledged publish/subscribe configurations (each designed to meet different goals). We have deployed and evaluated our system on Amazon EC2. We compare Chios with various publish/subscribe systems. Chios is as efficient as an unreplicated, single-broker publish/subscribe implementation, only marginally slower than Kafka and Kafka with passive replication, and at least an order of magnitude faster than all Hyperledger Fabric modules and publish/subscribe systems using Fabric.
Authors:
; ; ; ; ; ;
Award ID(s):
1919159
Publication Date:
NSF-PAR ID:
10272155
Journal Name:
2020 International Symposium on Reliable Distributed Systems (SRDS)
Page Range or eLocation-ID:
319 to 328
Sponsoring Org:
National Science Foundation
More Like this
  1. Applications and middleware services, such as data placement engines, I/O scheduling, and prefetching engines, require low-latency access to telemetry data in order to make optimal decisions. However, typical monitoring services store their telemetry data in a database in order to allow applications to query them, resulting in significant latency penalties. This work presents Apollo: a low-latency monitoring service that aims to provide applications and middleware libraries with direct access to relational telemetry data. Monitoring the system can create interference and overhead, slowing down raw performance of the resources for the job. However, having a current view of the system can aid middleware services in making more optimal decisions which can ultimately improve the overall performance. Apollo has been designed from the ground up to provide low latency, using Publish–Subscribe (Pub-Sub) semantics, and low overhead, using adaptive intervals in order to change the length of time between polling the resource for telemetry data and machine learning in order to predict changes to the telemetry data between actual resource polling. This work also provides some high level abstractions called I/O curators, which can further aid middleware libraries and applications to make optimal decisions. Evaluations showcase that Apollo can achieve sub-millisecond latency formore »acquiring complex insights with a memory overhead of ~57MB and CPU overhead being only 7% more than existing state-of-the-art systems.« less
  2. Autonomous vehicle (AV) software systems are emerging to enable rapidly developed self-driving functionalities. Since such systems are responsible for safety-critical decisions, it is necessary to secure them in face of cyber attacks. Through an empirical study of representative AV software systems Baidu Apollo and Autoware, we discover a common over-privilege problem with the publish-subscribe communication model widely adopted by AV systems: due to the coarse-grained message design for the publish-subscribe communication, some message fields are over-granted with publish/subscribe permissions. To comply with the least-privilege principle and reduce the attack surface resulting from such problem, we argue that the publish/subscribe permissions should be defined and enforced at the granularity of message fields instead of messages. To systematically address such publish-subscribe over-privilege problems, we present AVGuardian, a system that includes (1) a static analysis tool that detects over-privilege instances in AV software and generates the corresponding access control policies at the message field granularity, and (2) a low-overhead, module-transparent, runtime publish/subscribe permission policy enforcement mechanism to perform online policy violation detection and prevention. Using our detection tool, we are able to automatically detect 581 over-privilege instances in total in Baidu Apollo. To demonstrate the severity, we further constructed several concrete exploits thatmore »can lead to vehicle collision and identity theft for AV owners, which have been reported to Baidu Apollo and confirmed as valid. For defense, we prototype and evaluate the policy enforcement mechanism, and find that it has very low overhead, does not affect original AV decision logic, and also is resilient to message replay attacks.« less
  3. Power grids are undergoing major changes due to rapid growth in renewable energy and improvements in battery technology. Prompted by the increasing complexity of power systems, decentralized IoT solutions are emerging, which arrange local communities into transactive microgrids. The core functionality of these solutions is to provide mechanisms for matching producers with consumers while ensuring system safety. However, there are multiple challenges that these solutions still face: privacy, trust, and resilience. The privacy challenge arises because the time series of production and consumption data for each participant is sensitive and may be used to infer personal information. Trust is an issue because a producer or consumer can renege on the promised energy transfer. Providing resilience is challenging due to the possibility of failures in the infrastructure that is required to support these market based solutions. In this paper, we develop a rigorous solution for transactive microgrids that addresses all three challenges by providing an innovative combination of MILP solvers, smart contracts, and publish-subscribe middleware within a framework of a novel distributed application platform, called Resilient Information Architecture Platform for Smart Grid. Towards this purpose, we describe the key architectural concepts, including fault tolerance, and show the trade-off between market efficiencymore »and resource requirements.« less
  4. Graph-based namespaces are being increasingly used to represent the organization of complex and ever-growing information eco-systems and individual user roles. Timely and accurate information dissemination requires an architecture with appropriate naming frameworks, adaptable to changing roles, focused on content rather than network addresses. Today's complex information organization structures make such dissemination very challenging. To address this, we propose POISE, a name-based publish/subscribe architecture for efficient topic-based and recipient-based content dissemination. POISE proposes an information layer, improving on state-of-the-art Information-Centric Networking solutions in two major ways: 1) support for complex graph-based namespaces, and 2) automatic name-based load-splitting. POISE supports in-network graph-based naming, leveraged in a dissemination protocol which exploits information layer rendezvous points (RPs) that perform name expansions. For improved robustness and scalability, POISE supports adaptive load-sharing via multiple RPs, each managing a dynamically chosen subset of the namespace graph. Excessive workload may cause one RP to turn into a ``hot spot'', impeding performance and reliability. To eliminate such traffic concentration, we propose an automated load-splitting mechanism, consisting of an enhanced, namespace graph partitioning complemented by a seamless, loss-less core migration procedure. Due to the nature of our graph partitioning and its complex objectives, off-the-shelf graph partitioning, e.g., METIS, is inadequate.more »We propose a hybrid, iterative bi-partitioning solution, consisting of an initial and a refinement phase. We also implemented POISE on a DPDK-based platform. Using the important application of emergency response, our experimental results show that POISE outperforms state-of-the-art solutions, demonstrating its effectiveness in timely delivery and load-sharing.« less
  5. Modern intelligent urban mobility applications are underpinned by large-scale, multivariate, spatiotemporal data streams. Working with this data presents unique challenges of data management, processing and presentation that is often overlooked by researchers. Therefore, in this work we present an integrated data management and processing framework for intelligent urban mobility systems currently in use by our partner transit agencies. We discuss the available data sources and outline our cloud-centric data management and stream processing architecture built upon open-source publish-subscribe and NoSQL data stores. We then describe our data-integrity monitoring methods. We then present a set of visualization dashboards designed for our transit agency partners. Lastly, we discuss how these tools are currently being used for AI-driven urban mobility applications that use these tools.