Title: Benchmarking Message Queues
Message queues are a way for different software components or applications to communicate with each other asynchronously by passing messages through a shared buffer. This allows a sender to send a message without waiting for an immediate response from the receiver, which can improve system performance, reduce latency, and allow components to operate independently. In this paper, we compared and evaluated the performance of four popular message queues: Redis, ActiveMQ Artemis, RabbitMQ, and Apache Kafka. The aim of this study was to provide insights into the strengths and weaknesses of each technology and to help practitioners choose the most appropriate solution for their use case. We primarily evaluated each technology in terms of latency and throughput. Our experiments were conducted using a diverse array of workloads to test the message queues under various scenarios, enabling practitioners to evaluate the performance of each system and choose the one that best meets their needs. The results show that each technology has its own strengths and weaknesses. Specifically, Redis performed the best in terms of latency, whereas Kafka significantly outperformed the other three technologies in terms of throughput; the optimal choice depends on the specific requirements of the use case. This paper presents valuable insights for practitioners and researchers working with message queues. Furthermore, the results of our experiments are provided in JSON format as a supplement to this paper.
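The two metrics named in the abstract, per-message latency and end-to-end throughput, can be measured with the same simple harness. The sketch below uses an in-process `queue.Queue` purely as a stand-in broker to illustrate the measurement methodology; the paper's actual experiments ran against Redis, ActiveMQ Artemis, RabbitMQ, and Apache Kafka, and the function name and parameters here are illustrative, not taken from the paper.

```python
import queue
import threading
import time

def benchmark(q, n_messages=10000):
    """Send n_messages through a queue; report mean latency and throughput.

    Each message payload is its own send timestamp, so the consumer can
    compute per-message latency on receipt. The queue is an in-process
    stand-in for a real broker; only the methodology is illustrated.
    """
    latencies = []

    def consumer():
        for _ in range(n_messages):
            sent_at = q.get()  # payload is the producer-side timestamp
            latencies.append(time.perf_counter() - sent_at)

    t = threading.Thread(target=consumer)
    t.start()
    start = time.perf_counter()
    for _ in range(n_messages):
        q.put(time.perf_counter())  # producer: timestamp as payload
    t.join()
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "throughput_msg_per_s": n_messages / elapsed,
    }

stats = benchmark(queue.Queue())
```

Against a real broker, the producer and consumer would run on separate hosts and the timestamps would need clock synchronization or a round-trip design, but the latency/throughput bookkeeping stays the same.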
Award ID(s): 1854049
PAR ID: 10438973
Author(s) / Creator(s): ; ; ;
Date Published:
Journal Name: Telecom
Volume: 4
Issue: 2
ISSN: 2673-4001
Page Range / eLocation ID: 298 to 312
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. In-memory key-value caches are widely used as a performance-critical layer in web applications, disk-based storage, and distributed systems. The Least Recently Used (LRU) replacement policy has become the de facto standard in those systems since it exploits workload locality well. However, an LRU implementation can be costly due to the rigid data structure needed to maintain object priority, as well as the locks required for updating object order. Redis, one of the most effective and widely deployed commercial systems, adopts an approximated LRU policy, where the least recently used item from a small, randomly sampled set of items is chosen for eviction. This random sampling-based policy is lightweight and flexible. We observe that a significant miss ratio gap can exist between exact LRU and random sampling-based LRU under different sampling sizes K. Therefore, existing LRU miss ratio curve (MRC) construction techniques cannot be directly applied without loss of accuracy. In this paper, we introduce a new probabilistic stack algorithm named KRR to accurately model random sampling-based LRU, and extend it to handle both fixed- and variable-size objects in key-value caches. We present an efficient stack update algorithm that significantly reduces the expected running time of KRR. To improve the performance of in-memory multi-tenant key-value caches that use random sampling-based replacement, we propose kRedis, a reference locality- and latency-aware memory partitioning scheme. kRedis guides memory allocation among the tenants and dynamically customizes K to better exploit the locality of each individual tenant. Evaluation results over diverse workloads show that our model generates accurate miss ratio curves for both fixed and variable object size workloads, and enables practical, low-overhead online MRC prediction. Equipped with KRR, kRedis delivers up to a 50.2% average access latency reduction and up to a 262.8% throughput improvement compared to Redis. Furthermore, compared with pRedis, a state-of-the-art memory allocation design for Redis, kRedis shows up to 24.8% and 61.8% improvements in average access latency and throughput, respectively.
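The random sampling-based eviction this abstract models can be sketched in a few lines: on overflow, sample K keys at random and evict the least recently used one among the sample. The class below is a minimal illustration of that policy (the default K=5 mirrors Redis's `maxmemory-samples` default); class and method names are illustrative, and it omits the variable-size-object handling KRR addresses.

```python
import random

class SampledLRUCache:
    """Cache with Redis-style approximated (sampled) LRU eviction."""

    def __init__(self, capacity, k=5):
        self.capacity = capacity
        self.k = k               # sample size K
        self.data = {}           # key -> value
        self.last_used = {}      # key -> logical access time
        self.clock = 0

    def _touch(self, key):
        self.clock += 1
        self.last_used[key] = self.clock

    def get(self, key):
        if key in self.data:
            self._touch(key)
            return self.data[key]
        return None

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            # Sample K resident keys; evict the least recently used of them.
            sample = random.sample(list(self.data), min(self.k, len(self.data)))
            victim = min(sample, key=self.last_used.__getitem__)
            del self.data[victim]
            del self.last_used[victim]
        self.data[key] = value
        self._touch(key)
```

Because the victim is the minimum over a sample rather than over all keys, the miss ratio differs from exact LRU, and the gap grows as K shrinks; that gap is precisely what motivates a dedicated model such as KRR.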
  2. Fault-tolerant coordination services have been widely used in distributed applications in cloud environments. Recent years have witnessed the emergence of time-sensitive applications deployed in edge computing environments, which introduces both challenges and opportunities for coordination services. On one hand, coordination services must recover from failures in a timely manner. On the other hand, edge computing employs local networked platforms that can be exploited to achieve timely recovery. In this work, we first identify the limitations of the leader election and recovery protocols underlying Apache ZooKeeper, the prevailing open-source coordination service. To reduce recovery latency from leader failures, we then design RT-ZooKeeper with a set of novel features including a fast-convergence election protocol, a quorum channel notification mechanism, and a distributed epoch persistence protocol. We have implemented RT-ZooKeeper based on ZooKeeper version 3.5.8. Empirical evaluation shows that RT-ZooKeeper achieves a 91% reduction in maximum recovery latency in comparison to ZooKeeper. Furthermore, a case study demonstrates that fast failure recovery in RT-ZooKeeper can benefit a common messaging service like Kafka in terms of message latency.
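The election rule underlying ZooKeeper's fast leader election is simple to state even though converging on it quickly is hard: prefer the server with the most up-to-date committed state (highest zxid), breaking ties by server id. The snippet below sketches only that selection rule; RT-ZooKeeper's contribution is in how fast servers converge on the winner, not in the rule itself, and the function name and vote tuple shape here are illustrative.

```python
def elect_leader(votes):
    """Pick a leader from (server_id, last_zxid) vote tuples.

    ZooKeeper-style rule: highest zxid wins (most up-to-date log);
    ties are broken by the highest server id.
    """
    winner_id, _ = max(votes, key=lambda v: (v[1], v[0]))
    return winner_id

# Server 3 wins: it ties server 2 on zxid but has the larger id.
leader = elect_leader([(1, 10), (2, 12), (3, 12)])
```

A real election additionally exchanges votes over multiple rounds until a quorum agrees, which is where recovery latency accumulates and where a fast-convergence protocol pays off.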
  3. Virtual switches, used for end-host networking, drop packets when the receiving application is not fast enough to consume them. This is called the slow receiver problem, and it is important because packet loss hurts tail communication latency and wastes CPU cycles, resulting in application-level performance degradation. Further, solving this problem is challenging because application throughput is highly variable over short timescales, as it depends on workload, memory contention, and OS thread scheduling. This paper presents Backdraft, a new lossless virtual switch that addresses the slow receiver problem by combining three new components: (1) Dynamic Per-Flow Queuing (DPFQ) to prevent head-of-line (HOL) blocking and provide on-demand memory usage; (2) doorbell queues to reduce CPU overheads; (3) a new overlay network to avoid congestion spreading. We implemented Backdraft on top of BESS and conducted experiments with real applications on a 100 Gbps cluster with both DCTCP and Homa, a state-of-the-art congestion control scheme. We show that an application with Backdraft can achieve up to 20x lower tail latency at the 99th percentile.
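The head-of-line blocking that per-flow queuing prevents is easy to see in a toy model: with a single shared FIFO, one packet destined for a slow receiver stalls every packet behind it, while a queue per flow lets the switch keep serving fast receivers. The sketch below models only that separation; Backdraft's DPFQ additionally creates and sizes these queues on demand, which is not modeled here, and the class and method names are illustrative.

```python
from collections import deque

class PerFlowPort:
    """Toy model of per-flow queuing at a virtual switch port.

    Each flow gets its own deque, so a backlogged flow cannot block
    delivery for any other flow (no head-of-line blocking).
    """

    def __init__(self):
        self.queues = {}  # flow_id -> deque of packets

    def enqueue(self, flow_id, packet):
        self.queues.setdefault(flow_id, deque()).append(packet)

    def dequeue(self, flow_id):
        """Deliver this flow's next packet, independent of other flows."""
        q = self.queues.get(flow_id)
        return q.popleft() if q else None
```

With a single shared queue, draining the "fast" flow would first require consuming every packet of the "slow" flow ahead of it; here the two are independent, at the cost of per-flow memory, which is why on-demand queue allocation matters.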
  4. Distributed key-value stores today require frequent key-value shard migration between nodes to react to dynamic workload changes for load balancing, data locality, and service elasticity. In this paper, we propose NetMigrate, a live migration approach for in-memory key-value stores based on programmable network data planes. NetMigrate migrates shards between nodes with zero service interruption and minimal performance impact. During migration, the switch data plane monitors the migration process in a fine-grained manner and directs client queries to the right server in real time, eliminating the overhead of pulling data between nodes. We implement a NetMigrate prototype on a testbed consisting of a programmable switch and several commodity servers running Redis and evaluate it under YCSB workloads. Our experiments demonstrate that NetMigrate improves query throughput by 6.5% to 416% and maintains low access latency during migration, compared to state-of-the-art migration approaches.
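The core routing idea, tracking per key whether it has finished migrating and steering each client query to whichever node currently owns it, can be sketched in a few lines. This is a deliberately simplified model: NetMigrate implements this tracking in the switch data plane with a more involved policy for in-flight keys, and the class and method names below are illustrative, not from the paper.

```python
class MigrationRouter:
    """Simplified model of query steering during a live shard migration.

    Keys already copied to the destination are served there; all other
    keys are still served by the source, so clients never see a gap.
    """

    def __init__(self, source, destination):
        self.source = source
        self.destination = destination
        self.migrated = set()  # keys known complete at the destination

    def mark_migrated(self, key):
        """Called as the migration process finishes copying a key."""
        self.migrated.add(key)

    def route(self, key):
        """Return the node that should serve a query for this key."""
        return self.destination if key in self.migrated else self.source
```

Doing this lookup in the switch rather than at the servers is what removes the indirection overhead: queries for migrated keys never touch the source node at all.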
  5. Abstract The new wave of Industry 4.0 is transforming manufacturing factories into data-rich environments. This provides an unprecedented opportunity to feed large amounts of sensing data collected from the physical factory into the construction of digital twins (DTs) in cyberspace. However, little has been done to fully utilize DT technology to improve the smartness and autonomy of small and medium-sized manufacturing factories. Indeed, only a small fraction of small and medium-sized manufacturers (SMMs) has considered implementing DT technology. There is an urgent need to exploit the full potential of data analytics and simulation-enabled DTs for advanced manufacturing. Hence, this paper presents the design and development of DT models for simulation optimization of manufacturing process flows. First, we develop a multi-agent simulation model that describes nonlinear and stochastic dynamics among a network of interactive manufacturing things, including customers, machines, automated guided vehicles (AGVs), queues, and jobs. Second, we propose a statistical metamodeling approach to design sequential computer experiments to optimize the utilization of AGVs under uncertainty. Third, we construct two new graph models, a job flow graph and an AGV traveling graph, to track and monitor the real-time performance of manufacturing jobshops. The proposed simulation-enabled DT approach is evaluated and validated with experimental studies for the representation of a real-world manufacturing factory. Experimental results show that the proposed methodology effectively transforms a manufacturing jobshop into a new generation of DT-enabled smart factories. The sequential design of experiments effectively reduces the computation overhead of expensive simulations while optimally scheduling the AGVs to achieve production throughput cost-effectively. This research holds strong promise for helping SMMs fully utilize big data and DT technology to gain competitive advantages in the global marketplace.
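A simulation-enabled DT of the kind described produces summary outputs, such as AGV utilization and job waiting time, that the optimization loop consumes. The sketch below is a far simpler stand-in for the paper's multi-agent model: a single AGV serving randomly arriving transport jobs first-come first-served; all names and parameter values are illustrative assumptions, not the paper's model.

```python
import random

def simulate_agv(n_jobs=200, mean_interarrival=2.0, transport_time=1.5, seed=0):
    """Single-AGV transport simulation.

    Jobs arrive with exponential interarrival times; one AGV serves them
    FCFS with a fixed transport time. Returns AGV utilization and mean
    job waiting time, the kind of outputs a simulation-based DT would
    feed into a sequential design-of-experiments optimization.
    """
    rng = random.Random(seed)
    t_arrival = 0.0      # arrival time of the current job
    agv_free_at = 0.0    # time the AGV finishes its current task
    busy_time = 0.0
    total_wait = 0.0
    for _ in range(n_jobs):
        t_arrival += rng.expovariate(1.0 / mean_interarrival)
        start = max(t_arrival, agv_free_at)   # wait if the AGV is busy
        total_wait += start - t_arrival
        agv_free_at = start + transport_time
        busy_time += transport_time
    return {
        "utilization": busy_time / agv_free_at,
        "mean_wait": total_wait / n_jobs,
    }

result = simulate_agv()
```

Sweeping parameters such as the number of AGVs or the dispatch policy over repeated runs of a model like this, and fitting a statistical metamodel to the outputs, is the general shape of the sequential-experiment optimization the abstract describes.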