skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Verifying the Verifier: eBPF Range Analysis Verification
This paper proposes an automated method to check the cor- rectness of range analysis used in the Linux kernel’s eBPF verifier. We provide the specification of soundness for range analysis performed by the eBPF verifier. We automatically generate verification conditions that encode the operation of the eBPF verifier directly from the Linux kernel’s C source code and check it against our specification. When we discover instances where the eBPF verifier is unsound, we propose a method to generate an eBPF program that demonstrates the mismatch between the abstract and the concrete semantics. Our prototype automatically checks the soundness of 16 versions of the eBPF verifier in the Linux kernel versions ranging from 4.14 to 5.19. In this process, we have discovered new bugs in older versions and proved the soundness of range analysis in the latest version of the Linux kernel.  more » « less
Award ID(s):
2019302
PAR ID:
10467089
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
Computer Aided Verification (CAV)
Date Published:
Format(s):
Medium: X
Location:
Paris, France
Sponsoring Org:
National Science Foundation
More Like this
  1. The Linux Kernel is a world-class operating system controlling most of our computing infrastructure: mobile devices, Internet routers and services, and most of the supercomputers. Linux is also an example of low-level software with no comprehensive regression test suite (for good reasons). The kernel’s tremendous societal importance imposes strict stability and correctness requirements. These properties make Linux a challenging and relevant target for static automated program repair (APR). Over the past decade, a significant progress has been made in dynamic APR. However, dynamic APR techniques do not translate naturally to systems without tests. We present a static APR technique addressing sequentiallocking API misusebugs in the Linux Kernel. We attack the key challenge of static APR, namely, the lack of detailed program specification, by combining static analysis with machine learning to complement the information presented by the static analyzer. In experiments on historical real-world bugs in the kernel, we were able to automatically re-produce or propose equivalent patches in 85% of the human-made patches, and automatically rank them among the top three candidates for 64% of the cases and among the top five for 74%. 
    more » « less
  2. The eBPF technology in the Linux kernel has been widely adopted for different applications, such as networking, tracing, and security, thanks to the programmability it provides. By allowing user-supplied eBPF programs to be executed directly in the kernel, it greatly increases the flexibility and efficiency of deploying customized logic. However, eBPF also introduces a new and wide attack surface: malicious eBPF programs may try to exploit the vulnerabilities in the eBPF subsystem in the kernel. Fuzzing is a promising technique to find such vulnerabilities. Unfortunately, our experiments with the stateof-the-art kernel fuzzer, Syzkaller, show that it cannot effectively fuzz the eBPF runtime, those components that are in charge of executing an eBPF program, for two reasons. First, the eBPF verifier (which is tasked with verifying the safety of eBPF programs) rejects many fuzzing inputs because (1) they do not comply with its required semantics or (2) they miss some dependencies, i.e., other syscalls that need to be issued before the program is loaded. Second, Syzkaller fails to attach and trigger the execution of eBPF programs most of the times. This paper introduces the BPF Runtime Fuzzer (BRF), a fuzzer that can satisfy the semantics and dependencies required by the verifier and the eBPF subsystem. Our experiments show, in 48-hour fuzzing sessions, BRF can successfully execute 8× more eBPF programs compared to Syzkaller (and 32× more programs compared to Buzzer, an eBPF fuzzer released recently from Google). Moreover, eBPF programs generated by BRF are much more expressive than Syzkaller’s. As a result, BRF achieves 101% higher code coverage. Finally, BRF has so far managed to find 6 vulnerabilities (2 of them have been assigned CVE numbers) in the eBPF runtime, proving its effectiveness. 
    more » « less
  3. With the emergence of microsecond-scale NVMe storage devices, the Linux kernel storage stack overhead has become significant, almost doubling access times. We present XRP, a framework that allows applications to execute user-defined storage functions, such as index lookups or aggregations, from an eBPF hook in the NVMe driver, safely bypassing most of the kernel’s storage stack. To preserve file system semantics, XRP propagates a small amount of kernel state to its NVMe driver hook where the user-registered eBPF functions are called. We show how two key-value stores, BPF-KV, a simple B+-tree key-value store, and WiredTiger, a popular log-structured merge tree storage engine, can leverage XRP to significantly improve throughput and latency. 
    more » « less
  4. Software APIs exhibit rich diversity and complexity which not only renders them a common source of programming errors but also hinders program analysis tools for checking them. Such tools either expect a precise API specification, which requires program analysis expertise, or presume that correct API usages follow simple idioms that can be automatically mined from code, which suffers from poor accuracy. We propose a new approach that allows regular programmers to find API misuses. Our approach interacts with the user to classify valid and invalid usages of each target API method. It minimizes user burden by employing an active learning algorithm that ranks API usages by their likelihood of being invalid. We implemented our approach in a tool called ARBITRAR for C/C++ programs, and applied it to check the uses of 18 API methods in 21 large real-world programs, including OpenSSL and Linux Kernel. Within just 3 rounds of user interaction on average per API method, ARBITRAR found 40 new bugs, with patches accepted for 18 of them. Moreover, ARBITRAR finds all known bugs reported by a state-of-the-art tool APISAN in a benchmark suite comprising 92 bugs with a false positive rate of only 51.5% compared to APISAN’s 87.9% 
    more » « less
  5. Abstract We present a novel attack for detecting the presence of an active TCP connection between a remote Linux server and an arbitrary client machine. The attack takes advantage of side-channels present in the Linux kernel’s handling of the values used to populate an IPv4 packet’s IPID field and applies to kernel versions of 4.0 and higher. We implement and test this attack and evaluate its real world effectiveness and performance when used on active connections to popular web servers. Our evaluation shows that the attack is capable of correctly detecting the IP-port 4-tuple representing an active TCP connection in 84% of our mock attacks. We also demonstrate how the attack can be used by the middle onion router in a Tor circuit to test whether a given client is connected to the guard entry node associated with a given circuit. In addition we discuss the potential issues an attacker would face when attempting to scale it to real world attacks, as well as possible mitigations against the attack. Our attack does not exhaust any global resource, and therefore challenges the notion that there is a direct one-to-one connection between shared, limited resources and non-trivial network side-channels. This means that simply enumerating global shared resources and considering the ways in which they can be exhausted will not suffice for certifying a kernel TCP/IP network stack to be free of privacy risk side-channels. 
    more » « less