NSF PAR Search | NSF Public Access Repository

Sora: A Latency Sensitive Approach for Microservice Soft Resource Adaptation

https://doi.org/10.1145/3590140.3592851

Liu, Jianshu; Wang, Qingyang; Zhang, Shungeng; Hu, Liting; Da Silva, Dilma (November 2023, Proceedings of the 24th International Middleware Conference (Middleware '23))

Full Text Available
A BlackBox Approach to Profile Runtime Execution Dependencies in Microservices

https://doi.org/10.1109/CIC58953.2023.00024

Gu, Xuhang; Liu, Jianshu; Wang, Qingyang (November 2023, IEEE 9th International Conference on Collaboration and Internet Computing (CIC))

Loosely coupled, lightweight microservices running in containers are likely to form complex execution dependencies inside the system. An execution dependency arises when two execution paths partially share component microservices, resulting in potential runtime performance interference. In this paper, we present a blackbox approach that utilizes legitimate HTTP requests to accurately profile the internal pairwise dependencies of all supported execution paths in the target microservices application. Concretely, we profile the pairwise dependency of two execution paths through performance interference analysis by sending bursts of two types of requests simultaneously. By characterizing and grouping all the execution paths based on their pairwise dependencies, the blackbox approach can derive clear dependency graph(s) of the entire backend of the microservices application. We validate the effectiveness of the blackbox approach through experiments on open-source microservices benchmark applications running on real clouds (e.g., EC2, Azure).
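The grouping step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the interference test (a latency slowdown ratio against a hypothetical `threshold`), the function names, the endpoint names, and the latency figures are all assumptions made for illustration. Paths that interfere pairwise are merged with a union-find structure, so each resulting group is one connected component of the interference graph.

```python
from itertools import combinations


def make_interference_test(solo_latency, paired_latency, threshold=1.5):
    """Build a predicate that flags two paths as dependent (assumed rule).

    Two paths count as dependent when either one slows down by more than
    `threshold` times its solo latency while both are driven together.
    """
    def interferes(a, b):
        pair = paired_latency[frozenset((a, b))]
        return (pair[a] > threshold * solo_latency[a]
                or pair[b] > threshold * solo_latency[b])
    return interferes


def group_paths(paths, interferes):
    """Group paths into connected components of the interference graph."""
    parent = {p: p for p in paths}

    def find(p):
        # Follow parent links to the root, halving the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(paths, 2):
        if interferes(a, b):
            parent[find(a)] = find(b)

    groups = {}
    for p in paths:
        groups.setdefault(find(p), []).append(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical measurements in milliseconds, invented for this example.
paths = ["/cart", "/checkout", "/search", "/profile"]
solo = {"/cart": 10, "/checkout": 12, "/search": 8, "/profile": 9}
paired = {
    frozenset(("/cart", "/checkout")): {"/cart": 25, "/checkout": 30},
    frozenset(("/cart", "/search")): {"/cart": 10.5, "/search": 8.2},
    frozenset(("/cart", "/profile")): {"/cart": 11, "/profile": 9},
    frozenset(("/checkout", "/search")): {"/checkout": 12, "/search": 8},
    frozenset(("/checkout", "/profile")): {"/checkout": 13, "/profile": 9.5},
    frozenset(("/search", "/profile")): {"/search": 8, "/profile": 20},
}
groups = group_paths(paths, make_interference_test(solo, paired))
```

With these invented numbers, `/cart` and `/checkout` end up in one group and `/profile` and `/search` in another, because only those pairs exceed the assumed slowdown threshold.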
Full Text Available
μConAdapter: Reinforcement Learning-based Fast Concurrency Adaptation for Microservices in Cloud

https://doi.org/10.1145/3620678.3624980

Liu, Jianshu; Zhang, Shungeng; Wang, Qingyang (October 2023, Proceedings of the 2023 ACM Symposium on Cloud Computing (SoCC '23))

Full Text Available
Coordinating Fast Concurrency Adapting With Autoscaling for SLO-Oriented Web Applications

https://doi.org/10.1109/TPDS.2022.3151512

Liu, Jianshu; Zhang, Shungeng; Wang, Qingyang; Wei, Jinpeng (December 2022, IEEE Transactions on Parallel and Distributed Systems)

Full Text Available
ShadowSync: latency long tail caused by hidden synchronization in real-time LSM-tree based stream processing systems

https://doi.org/10.1145/3528535.3565251

Zhang, Shungeng; Wang, Qingyang; Kanemasa, Yasuhiko; Michaelis, Julius; Liu, Jianshu; Pu, Calton (November 2022, Middleware '22: Proceedings of the 23rd ACM/IFIP International Middleware Conference)

Mission-critical, real-time, continuous stream processing applications that interact with the real world have stringent latency requirements. For example, e-commerce websites like Amazon improve their marketing strategy by performing real-time advertising based on customers' behavior, and latency long tail can cause significant revenue loss. Recent work [39] showed a positive correlation between latency long tail and variance in the execution time of synchronous invocation chains (critical paths) in microservices benchmarks. This paper shows that asynchronous, very short but intense resource demands (called millibottlenecks) outside of critical paths can also cause significant latency long tail. Using a traffic analysis stream processing application benchmark, we evaluated the impact of asynchronous workload bursts generated by a multi-layer data structure called LSM-tree (log-structured merge-tree) for continuous checkpointing. Outside of the critical path, LSM-tree relies on maintenance operations (e.g., flushing/compaction during a checkpoint) to reorganize LSM-tree in memory and on disk to keep data access latency short. Although asynchronous, such recurrent maintenance operations can cause frequent millibottlenecks, particularly when they overlap, a problem we call ShadowSync. For scheduling and statistical reasons, significant latency long tail can arise from ShadowSync caused by asynchronous recurrent operations. Our experimental results show that with typical settings of benchmark components such as RocksDB, ShadowSync can prolong request message latency by up to 2 seconds. We show that effective mitigation methods can alleviate both scheduled and statistical ShadowSync, reducing the latency long tail to less than 20% of the original at the 99.9th percentile.
Full Text Available
Decentralized Allocation of Geo-distributed Edge Resources using Smart Contracts

https://doi.org/10.1109/CCGrid54584.2022.00052

Xu, Jinlai; Palanisamy, Balaji; Wang, Qingyang; Ludwig, Heiko; Gopisetty, Sandeep (May 2022, 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid))

In the Internet of Things (IoT) era, edge computing is a promising paradigm to improve the quality of service for latency-sensitive applications by filling gaps between the IoT devices and the cloud infrastructure. Highly geo-distributed edge computing resources that are managed by independent and competing service providers pose new challenges in terms of resource allocation and effective resource sharing to achieve a globally efficient resource allocation. In this paper, we propose a novel blockchain-based model for allocating computing resources in an edge computing platform that allows service providers to establish resource sharing contracts with edge infrastructure providers a priori using smart contracts in Ethereum. The smart contract in the proposed model acts as the auctioneer and replaces the trusted third party to handle the auction. The blockchain-based auctioning protocol increases the transparency of the auction-based resource allocation for the participating edge service and infrastructure providers. The design of sealed bids and bid revealing methods in the proposed protocol makes it possible for the participating bidders to place their bids without revealing their true valuation of the goods. The truthful auction design and the utility-aware bidding strategies incorporated in the proposed model enable the edge service providers and edge infrastructure providers to maximize their utilities. We implement a prototype of the model on a real blockchain test bed, and our extensive experiments demonstrate the effectiveness, scalability and performance efficiency of the proposed approach.
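The sealed-bid and bid-revealing idea mentioned in this abstract is commonly realized as a commit-reveal scheme, sketched below under several assumptions. This is not the paper's protocol or contract code: the class and method names are hypothetical, SHA-256 stands in for whatever hash the on-chain contract uses, and the second-price (Vickrey) settlement rule is only one well-known truthful auction, chosen here for illustration.

```python
import hashlib
import secrets


def commit(bid: int, nonce: bytes) -> str:
    """Hash a bid together with a secret nonce so the bid stays hidden."""
    return hashlib.sha256(bid.to_bytes(16, "big") + nonce).hexdigest()


class SealedBidAuction:
    """Hypothetical off-chain model of a commit-reveal auction."""

    def __init__(self):
        self.commitments = {}  # bidder -> commitment digest
        self.revealed = {}     # bidder -> verified bid value

    def place_commitment(self, bidder: str, digest: str) -> None:
        # Phase 1: bidders publish only the digest of their bid.
        self.commitments[bidder] = digest

    def reveal(self, bidder: str, bid: int, nonce: bytes) -> bool:
        # Phase 2: a reveal counts only if it matches the earlier digest.
        ok = self.commitments.get(bidder) == commit(bid, nonce)
        if ok:
            self.revealed[bidder] = bid
        return ok

    def settle(self):
        # Assumed rule: highest bidder wins and pays the second-highest bid.
        ranked = sorted(self.revealed.items(), key=lambda kv: kv[1],
                        reverse=True)
        if not ranked:
            return None
        winner, top = ranked[0]
        price = ranked[1][1] if len(ranked) > 1 else top
        return winner, price


# Usage with invented bidders and bid values.
auction = SealedBidAuction()
secrets_by_bidder = {}
for name, bid in [("alice", 70), ("bob", 90), ("carol", 50)]:
    nonce = secrets.token_bytes(16)
    secrets_by_bidder[name] = (bid, nonce)
    auction.place_commitment(name, commit(bid, nonce))

# A reveal that does not match the commitment is rejected.
cheat_ok = auction.reveal("carol", 95, secrets_by_bidder["carol"][1])

for name, (bid, nonce) in secrets_by_bidder.items():
    auction.reveal(name, bid, nonce)
result = auction.settle()
```

The random nonce matters: without it, an observer could hash every plausible bid value and match the digests, which would defeat the sealing.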
Full Text Available
Blockumulus: A Scalable Framework for Smart Contracts on the Cloud

https://doi.org/10.1109/ICDCS51616.2021.00064

Ivanov, Nikolay; Yan, Qiben; Wang, Qingyang (July 2021, 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS))

Public blockchains have spurred the growing popularity of decentralized transactions and smart contracts, especially in the financial market. However, public blockchains exhibit limitations in transaction throughput, storage availability, and compute capacity. To avoid transaction gridlock, public blockchains impose large fees and per-block resource limits, making it difficult to accommodate the ever-growing transaction demand. Previous research endeavors to improve the scalability and performance of blockchain through various technologies, such as side-chaining, sharding, secured off-chain computation, communication network optimizations, and efficient consensus protocols. However, these approaches have not attained widespread adoption due to their inability to deliver cloud-like performance in terms of the scalability of transaction throughput, storage, and compute capacity. In this work, we determine that the major obstacle to public blockchain scalability is their underlying unstructured P2P networks. We further show that a centralized network can support the deployment of decentralized smart contracts. We propose a novel approach for achieving scalable decentralization: instead of trying to make blockchain scalable, we deliver decentralization to the already scalable cloud by using an Ethereum smart contract. We introduce Blockumulus, a framework that can deploy decentralized cloud smart contract environments using a novel technique called overlay consensus. Through experiments, we demonstrate that Blockumulus is scalable in all three dimensions: computation, data storage, and transaction throughput. Besides eliminating the current code execution and storage restrictions, Blockumulus delivers a transaction latency between 2 and 5 seconds under normal load. Moreover, the stress test of our prototype reveals the ability to execute 20,000 simultaneous transactions in under 26 seconds, which is on par with the average throughput of worldwide credit card transactions.
Full Text Available
DoubleFaceAD: A New Datastore Driver Architecture to Optimize Fanout Query Performance

https://doi.org/10.1145/3423211.3425684

Zhang, Shungeng; Wang, Qingyang; Kanemasa, Yasuhiko; Liu, Jianshu; Pu, Calton (December 2020, Proceedings of the 21st International Middleware Conference)
The broad adoption of fanout queries on distributed datastores has made asynchronous event-driven datastore drivers a natural choice due to reduced multithreading overhead. However, through extensive experiments using the latest datastore drivers (e.g., MongoDB, HBase, DynamoDB) and the YCSB benchmark, we show that an asynchronous datastore driver can cause unexpected performance degradation, especially in fanout-query scenarios. For example, the default MongoDB asynchronous driver adopts the latest Java asynchronous I/O library, which uses a hidden on-demand JVM-level thread pool to process fanout query responses, causing a surprising multithreading overhead when the query response size is large. A second instance is that the traditional wisdom of modular design, which separates an application server from its embedded asynchronous datastore driver, can cause an imbalanced workload between the two components due to lack of coordination, incurring frequent unnecessary system calls. To address the revealed problems, we introduce DoubleFaceAD, a new asynchronous datastore driver architecture that integrates the management of both upstream and downstream workload traffic through a few shared reactor threads, with fanout-query-aware priority-based scheduling to reduce the overall query waiting time. Our experimental results on two representative application scenarios (YCSB and DBLP) show that DoubleFaceAD outperforms all other types of datastore drivers by up to 34% on throughput and is up to 1.9× faster on 99th percentile response time.
Full Text Available
Mitigating Large Response Time Fluctuations through Fast Concurrency Adapting in Clouds

https://doi.org/10.1109/IPDPS47924.2020.00046

Liu, Jianshu; Zhang, Shungeng; Wang, Qingyang; Wei, Jinpeng (May 2020, 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS))
Dynamically reallocating computing resources to handle bursty workloads is a common practice for web applications (e.g., e-commerce) in clouds. However, our empirical analysis of a standard n-tier benchmark application (RUBBoS) shows that simply scaling an n-tier application by reallocating hardware resources without quickly adapting soft resources (e.g., server threads, connections) may lead to large response time fluctuations. This is because soft resources control the workload concurrency of component servers in the system: adding or removing hardware resources such as Virtual Machines (VMs) can implicitly change the workload concurrency of dependent servers, causing either under- or over-utilization of the critical hardware resource in the system. To quickly identify the optimal soft resource allocation of each server in the system and stabilize response time fluctuation, we propose a novel Scatter-Concurrency-Throughput (SCT) model based on the monitoring of each server's real-time concurrency and throughput. We then implement a Concurrency-aware system Scaling (ConScale) framework, which integrates the SCT model to quickly adapt the soft resource allocations of key servers during the system scaling process. Our experiments using six realistic bursty workload traces show that ConScale can effectively mitigate the response time fluctuations of the target web application compared to state-of-the-art cloud scaling strategies such as EC2-AutoScaling.
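The abstract does not spell out how the SCT model turns monitored concurrency and throughput into a soft resource setting, so the following is only a rough sketch of one plausible reading, not the ConScale algorithm. It assumes that samples are (concurrency, throughput) pairs, that throughput is averaged per concurrency level, and that the preferred level is the smallest one whose mean throughput is within a hypothetical `tolerance` of the observed peak. The function name and the sample values are invented.

```python
from collections import defaultdict
from statistics import mean


def estimate_optimal_concurrency(samples, tolerance=0.05):
    """Pick the lowest concurrency level that nearly reaches peak throughput.

    samples: iterable of (concurrency, throughput) observations.
    tolerance: assumed fraction below the peak that still counts as "near".
    """
    by_level = defaultdict(list)
    for concurrency, throughput in samples:
        by_level[concurrency].append(throughput)

    # Average the scattered observations at each concurrency level.
    means = {c: mean(ts) for c, ts in by_level.items()}
    peak = max(means.values())

    # The smallest qualifying level avoids over-allocating threads or
    # connections once throughput has stopped improving.
    return min(c for c, m in means.items() if m >= (1 - tolerance) * peak)


# Invented observations: throughput rises, plateaus, then degrades.
samples = [
    (1, 100), (1, 110),
    (2, 200), (2, 190),
    (4, 380), (4, 400),
    (8, 410), (8, 395),
    (16, 350), (16, 330),
]
best = estimate_optimal_concurrency(samples)
```

With these invented samples the per-level means are 105, 195, 390, 402.5 and 340, so level 4 is the smallest one within 5% of the peak of 402.5.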
Full Text Available