Traditional end-host network stacks are struggling to keep up with
rapidly increasing datacenter access link bandwidths due to their
unsustainable CPU overheads. Motivated by this, our community is
exploring a multitude of solutions for future network stacks: from
Linux kernel optimizations to partial hardware offload to clean-slate
userspace stacks to specialized host network hardware. The design
space explored by these solutions would benefit from a detailed
understanding of CPU inefficiencies in existing network stacks.
This paper presents measurements and insights for Linux kernel
network stack performance at 100Gbps access link bandwidths.
Our study reveals that such high bandwidth links, coupled with
relatively stagnant technology trends for other host resources (e.g.,
core speeds and count, cache sizes, NIC buffer sizes, etc.), mark a
fundamental shift in host network stack bottlenecks. For instance,
we "nd that a single core is no longer able to process packets at line
rate, with data copy from kernel to application bu$ers at the receiver
becoming the core performance bottleneck. In addition, increases in
bandwidth-delay products have outpaced increases in cache sizes,
resulting in an inefficient DMA pipeline between the NIC and the CPU.
Finally, we "nd that traditional loosely-coupled design of network
stack and CPU schedulers in existing operating systems becomes a
limiting factor in scaling network stack performance across cores.
Based on insights from our study, we discuss implications for the design
of future operating systems, network protocols, and host hardware.