In current Infrastructure-as-a-Service (IaaS) cloud services, customers are charged for the usage of computing/storage resources only, but not for network resources. The difficulty lies in the fact that it is nontrivial to allocate network resources to individual customers effectively, especially for short-lived flows, in terms of both performance and cost, due to the highly dynamic environment created by the flows of all customers. To tackle this challenge, in this paper we propose an end-to-end Price-Aware Congestion Control Protocol (PACCP) for cloud services. PACCP is a network utility maximization (NUM) based optimal congestion control protocol. It supports three classes of services (CoSes), i.e., best effort (BE) service, differentiated service (DS), and minimum rate guaranteed (MRG) service. In PACCP, the desired CoS or rate allocation for a given flow is enabled by properly setting a pair of control parameters, i.e., a minimum guaranteed rate and a utility weight, which in turn determine the price paid by the user of the flow. Two pricing models are proposed: a coarse-grained VM-Based Pricing model (VBP) and a fine-grained Flow-Based Pricing model (FBP). The optimality of PACCP is verified by both large-scale simulation and a small testbed implementation. The price-performance consistency of PACCP is evaluated using real datacenter workloads. The results demonstrate that PACCP provides minimum rate guarantees, high bandwidth utilization and fair rate allocation, commensurate with the pricing models.
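The NUM-style allocation the abstract describes can be illustrated with a minimal sketch. The paper does not give its formula here, so the following assumes a standard weighted-proportional-fairness objective with per-flow minimum rates: maximize the sum of w_i * log(x_i) subject to the link capacity and x_i >= m_i, whose optimum has the water-filling form x_i = max(m_i, w_i / lam). All names (`num_allocate`, `flows`) are illustrative, not from the paper.

```python
def num_allocate(capacity, flows):
    """Sketch of a NUM rate allocation with minimum rate guarantees.

    flows: list of (min_rate, weight) pairs, one per flow, with
    sum of min_rates < capacity. Solves
        max sum_i w_i * log(x_i)  s.t.  sum_i x_i <= capacity, x_i >= m_i,
    whose optimum is x_i = max(m_i, w_i / lam) for the lam that makes
    the allocations sum to capacity; lam is found by bisection.
    """
    def total(lam):
        return sum(max(m, w / lam) for m, w in flows)

    lo, hi = 1e-9, 1e9  # total(lam) is decreasing in lam
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if total(mid) > capacity:
            lo = mid  # lam too small: allocations overshoot capacity
        else:
            hi = mid
    lam = (lo + hi) / 2.0
    return [max(m, w / lam) for m, w in flows]
```

For example, on a 10 Gbps link shared by an MRG flow with an 8 Gbps minimum rate and a BE flow of equal weight, the MRG flow is pinned at its guarantee and the BE flow receives the remaining 2 Gbps; with no minimum rates set, the same routine degenerates to weighted fair sharing.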
-
Federated computing, including federated learning and federated analytics, needs to meet certain task Service Level Objectives (SLOs) in terms of various performance metrics, e.g., mean task response time and task tail latency. The lack of control over, and access to, client activities requires a carefully crafted client selection process for each round of task processing to meet a designated task SLO. To achieve this, one must be able to predict task performance metrics for a given client selection per round of task execution. In this paper, we develop FedSLO, a general framework that allows task performance, in terms of a wide range of performance metrics of practical interest, to be predicted for synchronous federated computing systems, in line with the Google federated learning system architecture. Specifically, with each task performance metric expressed as a cost function of the task response time, a relationship between the task performance measure, i.e., the mean cost, and the task/subtask response time distributions is established, allowing unified task performance prediction algorithms to be developed. Practical issues concerning the computational complexity, measurement cost and implementation of FedSLO are also addressed. Finally, we propose preliminary ideas on how to apply FedSLO to the client selection process to enable task SLO guarantees.
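The cost-function view of performance metrics can be sketched with a small Monte Carlo estimate. This is an illustration of the idea, not the paper's algorithm: in a synchronous round the task finishes when its slowest selected client finishes, so the mean cost is the expectation of the cost function applied to the maximum of the subtask response times. The sampler shapes below are arbitrary assumptions.

```python
import random

def mean_task_cost(cost_fn, client_latency_samplers, rounds=10000, seed=0):
    """Estimate E[cost(T)] where T = max of per-client subtask times.

    A synchronous federated round is straggler-bound: the task response
    time is the maximum over the selected clients' response times.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        t = max(sampler(rng) for sampler in client_latency_samplers)
        total += cost_fn(t)
    return total / rounds

# A tail-latency metric as a cost function: cost 1 if the deadline is
# missed, 0 otherwise, so the mean cost is the SLO violation probability.
deadline = 2.0
miss_rate = mean_task_cost(
    lambda t: 1.0 if t > deadline else 0.0,
    [lambda r: r.uniform(0.5, 1.5),   # a fast, stable client (assumed)
     lambda r: r.expovariate(1.0)],   # a heavy-tailed client (assumed)
)
```

Swapping in the identity cost function `lambda t: t` yields the mean task response time instead, which is the sense in which one set of prediction machinery covers many metrics.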
-
Our extensive experiments reveal that existing key-value stores (KVSs) achieve high performance at the expense of a huge memory footprint that is often impractical or unacceptable. Even with the emerging ultra-fast byte-addressable persistent memory (PM), KVSs fall far short of delivering the high performance promised by PM's superior I/O bandwidth. To find the root causes and bridge the huge performance/memory-footprint gap, we revisit the architectural features of two representative indexing mechanisms (single-stage and multi-stage) and propose a three-stage KVS called FluidKV. FluidKV effectively consolidates these indexes by fast and seamlessly moving the incoming key-value request stream from the write-concurrent frontend stage, through an intermediate stage, to the memory-efficient backend stage. FluidKV also incorporates important enabling techniques, such as thread-exclusive logging, PM-friendly KV-block structures, and dual-grained indexes, to fully utilize both the parallel-processing and high-bandwidth capabilities of ultra-fast storage hardware while reducing the overhead. We implemented a FluidKV prototype and evaluated it under a variety of workloads. The results show that FluidKV outperforms state-of-the-art PM-aware KVSs, including ListDB and FlatStore with different indexes, by up to 9× and 3.9× in write and read throughput respectively, while cutting up to 90% of the DRAM footprint.
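The three-stage flow can be pictured with a toy in-memory model. This is a deliberately simplified sketch of the staged-index idea only: all class and method names are hypothetical, and the real system's persistent logs, KV-block layout, and dual-grained indexes are replaced by plain dictionaries.

```python
class ThreeStageKVS:
    """Toy model of a staged KVS: writes land in a write-optimized
    frontend, and a background step drains entries toward a compact
    backend index, mirroring FluidKV's frontend/intermediate/backend
    pipeline at the level of lookup semantics only."""

    def __init__(self):
        self.frontend = {}      # write-concurrent stage (newest data)
        self.intermediate = {}  # entries in flight between stages
        self.backend = {}       # memory-efficient final stage (oldest)

    def put(self, key, value):
        self.frontend[key] = value

    def get(self, key):
        # Newest stage wins: probe stages from front to back.
        for stage in (self.frontend, self.intermediate, self.backend):
            if key in stage:
                return stage[key]
        return None

    def migrate(self):
        # Background consolidation: intermediate entries are absorbed
        # into the backend (newer entries overwrite older ones), and
        # the frontend is handed off to become the new intermediate.
        self.backend.update(self.intermediate)
        self.intermediate = self.frontend
        self.frontend = {}
```

The point of the model is that reads stay correct no matter where a key currently resides in the pipeline, which is the property that lets consolidation run in the background without blocking the request stream.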
-
LDPC (Low-Density Parity-Check) codes have become a cornerstone of transforming a noise-filled physical channel into a reliable and high-performance data channel in communication and storage systems. FPGA (Field-Programmable Gate Array) based LDPC hardware, especially for decoding with high complexity, is essential to realizing high-bandwidth channel prototypes. HLS (High-Level Synthesis) is introduced to speed up the FPGA development of LDPC hardware by automatically compiling high-level abstract behavioral descriptions into RTL-level implementations, but often sub-optimally due to the lack of effective low-level descriptions. To overcome this problem, this paper proposes an HLS-friendly QC-LDPC FPGA decoder architecture, HF-LDPC, that employs HLS not only to precisely characterize high-level behaviors but also to effectively optimize the low-level RTL implementation, thus achieving both high throughput and flexibility. First, HF-LDPC designs a multi-unit framework with a balanced I/O-computing dataflow to adaptively match code parameters with FPGA configurations. Second, HF-LDPC presents a novel fine-grained task-level pipeline with interleaved updating to eliminate stalls due to data interdependence within each updating task. HF-LDPC also presents several HLS-enhanced approaches. We implement and evaluate HF-LDPC on a Xilinx U50, which demonstrates that HF-LDPC outperforms existing implementations by 4× to 84× with the same parameters and linearly scales up to 116 Gbps actual decoding throughput with high hardware efficiency.
-
Unexpected long query latency of a database system can cause domino effects on all the upstream services and severely degrade end users' experience with unpredicted long waits, resulting in an increasing number of users disengaging from the services and thus a high user disengagement ratio (UDR). A high UDR usually translates to reduced revenue for service providers. This paper proposes UTSLO, a UDR-oriented SLO guaranteed system, which enables a database system to support multi-tenant UDR targets in a cost-effective fashion through UDR-oriented capacity planning and dynamic UDR target enforcement. The former aims to estimate the feasibility of UDR targets while the latter dynamically tracks and regulates the per-connection query latency distribution needed for accurate UDR target guarantees. In UTSLO, the database service capacity can be fully exploited to efficiently accommodate tenants while minimizing the resources required for UDR target guarantees.
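The UDR notion can be made concrete with a minimal sketch. This assumes a deliberately simple disengagement model, not the paper's exact one: a user disengages whenever a query's latency exceeds their wait tolerance, so the UDR of a connection is the fraction of its queries exceeding that tolerance, and a tenant's target is feasible when the observed ratio stays at or below it. The function names are illustrative.

```python
def udr(latencies, tolerance):
    """User disengagement ratio under a threshold model (an assumed
    simplification): the fraction of queries whose latency exceeds
    the user's wait tolerance, each such wait disengaging the user."""
    misses = sum(1 for t in latencies if t > tolerance)
    return misses / len(latencies)

def meets_udr_target(latencies, tolerance, target):
    """Capacity-planning style feasibility check for one tenant:
    does the per-connection latency distribution keep the UDR
    at or below the tenant's target?"""
    return udr(latencies, tolerance) <= target
```

Note that, unlike a single tail-latency percentile, this quantity depends on the whole per-connection latency distribution, which is why the abstract emphasizes tracking and regulating that distribution rather than one quantile.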
-
As a complement to data deduplication, delta compression further reduces the data volume by compressing non-duplicate data chunks relative to their similar chunks (base chunks). However, existing post-deduplication delta compression approaches for backup storage either suffer from the low similarity between many detected chunks or miss some potential similar chunks, or suffer from low (backup and restore) throughput due to extra I/Os for reading base chunks, or add additional service-disruptive operations to backup systems. In this paper, we propose LoopDelta to address the above-mentioned problems by an enhanced delta compression scheme embedded in deduplication in a non-intrusive way. The enhanced delta compression scheme combines four key techniques: (1) dual-locality-based similarity tracking to detect potential similar chunks by exploiting both logical and physical locality, (2) locality-aware prefetching to prefetch base chunks to avoid extra I/Os for reading base chunks on the write path, (3) a cache-aware filter to avoid extra I/Os for base chunks on the read path, and (4) inversed delta compression to perform delta compression for data chunks that are otherwise forbidden to serve as base chunks by rewriting techniques designed to improve restore performance.
Experimental results indicate that LoopDelta increases the compression ratio by 1.24-10.97 times on top of deduplication, without notably affecting the backup throughput, and it improves the restore performance by 1.2-3.57 times.
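The core delta-compression operation LoopDelta builds on, encoding a chunk as copy/insert operations against a similar base chunk, can be sketched as follows. This is a simplified stand-in using Python's `difflib`, not the encoder the paper uses; function names are illustrative.

```python
import difflib

def delta_encode(base: bytes, chunk: bytes):
    """Encode `chunk` against a similar base chunk as a list of ops:
    ('copy', offset, length) reuses bytes already stored in the base;
    ('insert', literal) stores bytes unique to the new chunk."""
    sm = difflib.SequenceMatcher(a=base, b=chunk, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            ops.append(("insert", chunk[j1:j2]))
        # 'delete': base bytes absent from the chunk -> nothing to emit
    return ops

def delta_decode(base: bytes, ops) -> bytes:
    """Rebuild the chunk from the base chunk and the delta ops.
    This is why restore needs the base chunk on the read path, the
    extra I/O that LoopDelta's cache-aware filter targets."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, off, length = op
            out += base[off:off + length]
        else:
            out += op[1]
    return bytes(out)
```

Decoding requires reading the base chunk, which makes clear why base-chunk placement and caching dominate both backup and restore throughput in the schemes the abstract compares.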
-
A primary design objective for Data-intensive User-facing (DU) services for cloud and edge computing is to maximize query throughput while meeting query tail latency Service Level Objectives (SLOs) for individual queries. Unfortunately, the existing solutions fall short of achieving this design objective, which, we argue, is largely attributed to the fact that they fail to take the query fanout explicitly into account. In this paper, we propose TailGuard, based on a Tail-latency-SLO-and-Fanout-aware Earliest-Deadline-First Queuing policy (TF-EDFQ) for task queuing at the individual task servers the query tasks are fanned out to. With the task queuing deadline for each task being derived from both the query tail latency SLO and the query fanout, TailGuard takes an important first step towards achieving the design objective. TailGuard is evaluated by simulation against First-In-First-Out (FIFO) task queuing, task PRIority Queuing (PRIQ) and Tail-latency-SLO-aware EDFQ (T-EDFQ) policies, driven by three types of applications in the Tailbench benchmark suite. The results demonstrate that TailGuard can improve resource utilization by up to 80%, while meeting the targeted tail latency SLOs, as compared with the other three policies. TailGuard is also implemented and tested in a highly heterogeneous Sensing-as-a-Service (SaS) testbed for a data sensing service, with test results in line with the simulation results.
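Why fanout must tighten per-task deadlines can be shown with a short sketch. This is an illustrative model, not the paper's exact derivation: if a query succeeds only when all of its `fanout` tasks meet their deadlines, and tasks behave independently, then hitting a query-level tail percentile p requires each task to hit the stricter per-task percentile p^(1/fanout). The names below are hypothetical.

```python
import heapq

def task_deadline_quantile(query_percentile, fanout):
    """Per-task latency percentile needed so that `fanout` independent
    tasks jointly meet a query-level tail percentile (illustrative
    independence model): q**fanout = p  =>  q = p**(1/fanout)."""
    p = query_percentile / 100.0
    return (p ** (1.0 / fanout)) * 100.0

class EDFQueue:
    """Minimal Earliest-Deadline-First task queue at one task server:
    tasks are served in increasing deadline order, with FIFO order
    breaking ties among equal deadlines."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker preserving arrival order

    def push(self, deadline, task):
        heapq.heappush(self._heap, (deadline, self._seq, task))
        self._seq += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]
```

For a 99th-percentile query SLO with a fanout of 10, each task must individually hit roughly its 99.9th percentile, which is the kind of fanout-aware tightening that SLO-only policies such as T-EDFQ do not perform.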