Title: UPGRADVISOR: Early Adopting Dependency Updates Using Hybrid Program Analysis and Hardware Tracing
Applications often have fast-paced release schedules, but adoption of software dependency updates can lag by years, leaving applications susceptible to security risks and unexpected breakage. To address this problem, we present UPGRADVISOR, a system that reduces developer effort in evaluating dependency updates and can, in many cases, automatically determine which updates are backward-compatible versus API-breaking. UPGRADVISOR introduces a novel co-designed static analysis and dynamic tracing mechanism to gauge the scope and effect of dependency updates on an application. Static analysis prunes changes irrelevant to an application and clusters relevant ones into targets. Dynamic tracing needs to focus only on whether targets affect an application, making it fast and accurate. UPGRADVISOR handles dynamic interpreted languages and introduces call graph over-approximation to account for their lack of type information and selective hardware tracing to capture program execution while ignoring interpreter machinery. We have implemented UPGRADVISOR for Python and evaluated it on 172 dependency updates previously blocked from being adopted in widely-used open-source software, including Django, aws-cli, tfx, and Celery. UPGRADVISOR automatically determined that 56% of dependencies were safe to update and reduced by more than an order of magnitude the number of code changes that needed to be considered by dynamic tracing. Evaluating UPGRADVISOR’s tracer in a production-like environment incurred only 3% overhead on average, making it fast enough to deploy in practice. We submitted safe updates that were previously blocked as pull requests for nine projects, and their developers have already merged most of them.
Journal Name: USENIX Symposium on Operating Systems Design and Implementation
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Modern desktop applications involve many asynchronous, concurrent interactions that make performance issues difficult to diagnose. Although prior work has used causal tracing for debugging performance issues in distributed systems, we find that these techniques suffer from high inaccuracies for desktop applications. We present Argus, a fast, effective causal tracing tool for debugging performance anomalies in desktop applications. Argus introduces a novel notion of strong and weak edges to explicitly model and annotate trace graph ambiguities, a new beam-search-based diagnosis algorithm to select the most likely causal paths in the presence of ambiguities, and a new way to compare causal paths across normal and abnormal executions. We have implemented Argus across multiple versions of macOS and evaluated it on 12 infamous spinning pinwheel issues in popular macOS applications. Argus diagnosed the root causes for all issues, 10 of which were previously unknown, some of which have been open for several years. Argus incurs less than 5% CPU overhead when its system-wide tracing is enabled, making always-on tracing feasible.
  2. Amber is a system-on-chip (SoC) with a coarse-grained reconfigurable array (CGRA) for acceleration of dense linear algebra applications, such as machine learning (ML), image processing, and computer vision. It is designed using an agile accelerator-compiler co-design flow; the compiler updates automatically with hardware changes, enabling continuous application-level evaluation of the hardware-software system. To increase hardware utilization and minimize reconfigurability overhead, Amber features the following: 1) dynamic partial reconfiguration (DPR) of the CGRA for higher resource utilization by allowing fast switching between applications and partitioning resources between simultaneous applications; 2) streaming memory controllers supporting affine access patterns for efficient mapping of dense linear algebra; and 3) low-overhead transcendental and complex arithmetic operations. The physical design of Amber features a unique clock distribution method and timing methodology to efficiently lay out its hierarchical and tile-based design. Amber achieves a peak energy efficiency of 538 INT16 GOPS/W and 483 BFloat16 GFLOPS/W. Compared with a CPU, a GPU, and a field-programmable gate array (FPGA), Amber has up to 3902x, 152x, and 107x better energy-delay product (EDP), respectively.
  3. Static verification is used to ensure the correctness of programs. While useful in critical applications, the high overhead associated with writing specifications limits its general applicability. Similarly, the run-time costs introduced by dynamic verification limit its practicality. Gradual verification validates partially specified code statically where possible and dynamically where necessary. As a result, software developers gain granular control over the trade-offs between static and dynamic verification. This paper contains an end-to-end presentation of gradual verification in action, with a focus on applying it to C0 (a safe subset of C) and implementing the required dynamic verification.
  4. While distributed application-layer tracing is widely used for performance diagnosis in microservices, its coarse granularity at the service level limits its applicability to detecting more fine-grained system-level issues. To address this problem, cross-layer stitching of tracing information has been proposed. However, all existing cross-layer stitching approaches either require modification of the kernel or need updates in the application-layer tracing library to propagate stitching information, both of which add further complex modifications to existing tracing tools. This paper introduces Deepstitch, a deep-learning-based approach to stitch cross-layer tracing information without requiring any changes to existing application-layer tracing tools. Deepstitch leverages a global view of a distributed application composed of multiple services and learns the global system call sequences across all services involved. This knowledge is then used to stitch system call sequences with service-level traces obtained from a deployed application. Our proof-of-concept experiments show that the proposed approach successfully maps application-level interactions to system call sequences and can identify thread-level interactions.
  5. Data privacy has garnered significant attention recently due to diverse applications that store sensitive data in untrusted infrastructure. From a data management point of view, the focus has been on the privacy of stored data and the privacy of querying data at a large scale. However, databases are not solely query engines on static data; they must support updates on dynamically evolving datasets. In this paper, we lay out a vision for privacy-preserving dynamic data. In particular, we focus on dynamic data that might be stored remotely on untrusted providers. Updates arrive at a provider and are verified and incorporated into the database based on predefined constraints. Depending on the application, the content of the stored data, the content of the updates, and the constraints may be private or public. We then propose PReVer, a universal framework for managing regulated dynamic data in a privacy-preserving manner. We explore a set of research challenges that PReVer needs to address in order to guarantee the privacy of data, updates, and/or constraints and address the consistent and verifiable execution of updates. This opens the space of privacy-preserving data management from the narrow perspective of private queries on static datasets to the larger space of private management of dynamic data.