skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Eiffel: Efficient and Flexible Software Packet Scheduling
Packet scheduling determines the ordering of packets in a queuing data structure with respect to some ranking function that is mandated by a scheduling policy. It is the core component in many recent innovations to optimize network performance and utilization. Our focus in this paper is on the design and deployment of packet scheduling in soft-ware. Software schedulers have several advantages over hardware including shorter development cycle and flexibility in functionality and deployment location. We substantially improve current software packet scheduling performance,while maintaining flexibility, by exploiting underlying features of packet ranking; namely, packet ranks are integers and, at any point in time, fall within a limited range of values.We introduce Eiffel, a novel programmable packet scheduling system. At the core of Eiffel is an integer priority queue based on the Find First Set (FFS) instruction and designed to support a wide range of policies and ranking functions efficiently. As an even more efficient alternative, we also pro-pose a new approximate priority queue that can outperform FFS-based queues for some scenarios. To support flexibility,Eiffel introduces novel programming abstractions to express scheduling policies that cannot be captured by current, state-of-the-art scheduler programming models. We evaluate Eiffel in a variety of settings and in both kernel and userspace deployments. We show that it outperforms state of the art systems by 3-40x in terms of either number of cores utilized for network processing or number of flows given fixed processing capacity  more » « less
Award ID(s):
1816331
PAR ID:
10095127
Author(s) / Creator(s):
; ; ; ; ; ;
Date Published:
Journal Name:
Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’19)
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. With the commercialization and deployment of 5G, efforts are beginning to explore the design of the next generation of cellular networks, called 6G. New and constantly evolving use cases continue to place performance demands, especially for low latency communications, as these are still challenges for the 3GPP-specified 5G design, and will have to be met by the 6G design. Therefore, it is helpful to re-examine several aspects of the current cellular network’s design and implementation.Based on our understanding of the 5G cellular network specifications, we explore different implementation options for a dis-aggregated 5G core and their performance implications. To improve the data plane performance, we consider advanced packet classification mechanisms to support fast packet processing in the User Plane Function (UPF), to improve the poor performance and scalability of the current design based on linked lists. Importantly, we implement the UPF function on a SmartNIC for forwarding and tunneling. The SmartNIC provides the fastpath for device traffic, while more complex functions of buffering and processing flows that suffer a miss on the SmartNIC P4 tables are processed by the host-based UPF. Compared to an efficient DPDK-based host UPF, the SmartNIC UPF increases the throughput for 64 Byte packets by almost 2×. Furthermore, we lower the packet forwarding latency by 3.75× by using the SmartNIC. In addition, we propose a novel context-level QoS mechanism that dynamically updates the Packet Detection Rule priority and resource allocation of a flow based on the user context. By combining our innovations, we can achieve low latency and high throughput that will help us evolve to the next generation 6G cellular networks. 
    more » « less
  2. Advanced high-speed network cards have made packet processing in host operating systems a major performance bottleneck. The kernel network stack gives rise to various sources of overheads that limit the throughput and lengthen the per-packet processing latency. The problem is further exacerbated for short-lived, latency-sensitive network flows such as control packets, online gaming, database requests, etc. — in a highly utilized system, especially in virtualized (containerized) cloud environments, short flows can experience excessively long in-kernel queuing delays. As a consequence, recent research works propose to bypass the kernel network stack to enable lightweight, custom userspace network stacks for improved performance, but at a heavy cost of compatibility and security. In this paper, we take a different approach: We first analyze various sources of inefficiencies in the kernel network stack and propose ways to mitigate them without compromising systems compatibility, security, or flexibility. Further, we propose PRISM, a novel mechanism in the kernel network stack to differentiate incoming packets based on their performance requirements and streamline the processing stages of multi-stage packet processing pipelines (e.g., in container overlay networks). Our evaluation demonstrates that PRISM can significantly improve the latency of high-priority flows in container overly networks in the presence of heavy low-priority background traffic. 
    more » « less
  3. The shortest-remaining-processing-time (SRPT) scheduling policy has been extensively studied, for more than 50 years, in single-server queues with infinitely patient jobs. Yet, much less is known about its performance in multiserver queues. In this paper, we present the first theoretical analysis of SRPT in multiserver queues with abandonment. In particular, we consider the M/GI/s+GI queue and demonstrate that, in the many-sever overloaded regime, performance in the SRPT queue is equivalent, asymptotically in steady state, to a preemptive two-class priority queue where customers with short service times (below a threshold) are served without wait, and customers with long service times (above a threshold) eventually abandon without service. We prove that the SRPT discipline maximizes, asymptotically, the system throughput, among all scheduling disciplines. We also compare the performance of the SRPT policy to blind policies and study the effects of the patience-time and service-time distributions. This paper was accepted by Baris Ata, stochastic models & simulation. 
    more » « less
  4. Preemptive scheduling policies, which allow pausing jobs mid-service, are ubiquitous because they allow important jobs to receive service ahead of unimportant jobs that would otherwise delay their completion. The canonical example is Shortest Remaining Processing Time (SRPT), which preemptively serves the job with least remaining work at every moment in time [9]. There is a robust literature analyzing response time (elapsed time between a job's arrival and completion) in the M/G/1 queue under many preemptive policies [6, 10, 11], shedding light on questions such as how preemption affects the mean and tail of response time, and whether preemption is unfair towards low-priority jobs. 
    more » « less
  5. null (Ed.)
    Core-Stateless Fair Queueing (CSFQ) is a scalable algorithm proposed more than two decades ago to achieve fair queueing without keeping per-flow state in the network. Unfortunately, CSFQ did not take off, in part because it required protocol changes (i.e., adding new fields to the packet header), and hardware support to process packets at line rate. In this paper, we argue that two emerging trends are making CSFQ relevant again: (1) cloud computing which makes it feasible to change the protocol within the same datacenter or across datacenters owned by the same provider, and (2) programmable switches which can implement sophisticated packet processing at line rate. To this end, we present the first realization of CSFQ using programmable switches. In addition, we generalize CSFQ to a multi-level hierarchy, which naturally captures the traffic in today's datacenters, e.g., tenants at the first level and flows of each tenant at the second level of the hierarchy. We call this scheduler Hierarchical Core-Stateless Fair Queueing (HCSFQ), and show that it is able to accurately approximate hierarchical fair queueing. HCSFQ is highly scalable: it uses just a single FIFO queue, does not perform per-packet scheduling, and only needs to maintain state for the interior nodes of the hierarchy. We present analytical results to prove the lower bounds of HCSFQ. Our testbed experiments and large-scale simulations show that CSFQ and HCSFQ can provide fair bandwidth allocation and ensure isolation. 
    more » « less