Using GUI-based workflows for data analysis is an iterative process. During each iteration, an analyst makes changes to the workflow to improve it, generating a new version each time. The results produced by executing these versions are materialized to help users refer to them in the future. In many cases, a new version of the workflow, when submitted for execution, produces a result equivalent to that of a previous one. Identifying such equivalence can save computational resources and time by reusing the materialized result. One way to optimize the performance of executing a new version is to compare the current version with a previous one and test if they produce the same results using a workflow version equivalence verifier. As the number of versions grows, this testing can become a computational bottleneck. In this paper, we present Raven, an optimization framework to accelerate the execution of a new version request by detecting and reusing the results of previous equivalent versions with the help of a version equivalence verifier. Raven ranks and prunes the set of prior versions to quickly identify those that may produce an equivalent result to the version execution request. Additionally, when the verifier performs computation to verify the equivalence of a version pair, there may be a significant overlap with previously tested version pairs. Raven identifies and avoids such repeated computations by extending the verifier to reuse previous knowledge of equivalence tests. We evaluated the effectiveness of Raven compared to baselines on real workflows and datasets.
more »
« less
CHEX: Multiversion Replay with Ordered Checkpoints.
In scientific computing and data science disciplines, it is often necessary to share application workflows and repeat results. Current tools containerize application workflows, and share the resulting container for repeating results. These tools, due to containerization, do improve sharing of results. However, they do not improve the efficiency of replay. In this paper, we present the multiversion replay problem, which arises when multiple versions of an application are containerized, and each version must be replayed to repeat results. To avoid executing each version separately, we develop CHEX, which checkpoints program state and determines when it is permissible to reuse program state across versions. It does so using system call-based execution lineage. Our capability to identify common computations across versions enables us to consider optimizing replay using an in-memory cache, based on a checkpoint-restore-switch system. We show the multiversion replay problem is NP-hard, and propose efficient heuristics for it. CHEX reduces overall replay time by sharing common computations but avoids storing a large number of checkpoints. We demonstrate that CHEX maintains lightweight package sharing, and improves the total time of multiversion replay by 50% on average.
more »
« less
- PAR ID:
- 10325811
- Editor(s):
- J. Freire and Xuemin Lin
- Date Published:
- Journal Name:
- Proceedings of the Very Large Databases
- Volume:
- 15
- Issue:
- 6
- Page Range / eLocation ID:
- 1297-1310
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
When a security vulnerability or other critical bug is not detected by the developers' test suite, and is discovered post-deployment, developers must quickly devise a new test that reproduces the buggy behavior. Then the developers need to test whether their candidate patch indeed fixes the bug, without breaking other functionality, while racing to deploy before attackers pounce on exposed user installations. This can be challenging when factors in a specific user environment triggered the bug. If enabled, however, record-replay technology faithfully replays the execution in the developer environment as if the program were executing in that user environment under the same conditions as the bug manifested. This includes intermediate program states dependent on system calls, memory layout, etc. as well as any externally-visible behavior. Many modern record-replay tools integrate interactive debuggers, to help locate the root cause, but don't help the developers test whether their patch indeed eliminates the bug under those same conditions. In particular, modern record-replay tools that reproduce intermediate program state cannot replay recordings made with one version of a program using a different version of the program where the differences affect program state. This work builds on record-replay and binary rewriting to automatically generate and run targeted tests for candidate patches significantly faster and more efficiently than traditional test suite generation techniques like symbolic execution. These tests reflect the arbitrary (ad hoc) user and system circumstances that uncovered the bug, enabling developers to check whether a patch indeed fixes that bug. The tests essentially replay recordings made with one version of a program using a different version of the program, even when the the differences impact program state, by manipulating both the binary executable and the recorded log to result in an execution consistent with what would have happened had the the patched version executed in the user environment under the same conditions where the bug manifested with the original version. Our approach also enables users to make new recordings of their own workloads with the original version of the program, and automatically generate and run the corresponding ad hoc tests on the patched version, to validate that the patch does not break functionality they rely on.more » « less
-
When a security vulnerability or other critical bug is not detected by the developers’ test suite, and is discovered post-deployment, developers must quickly devise a new test that reproduces the buggy behavior. Then the developers need to test whether their candidate patch indeed fixes the bug, without breaking other functionality, while racing to deploy before cyberattackers pounce on exposed user installations. This can be challenging when the bug discovery was due to factors that arose, perhaps transiently, in a specific user environment. If recording execution traces when the bad behavior occurred, record-replay technology faithfully replays the execution, in the developer environment, as if the program were executing in that user environment under the same conditions as the bug manifested. This includes intermediate program states dependent on system calls, memory layout, etc. as well as any externally-visible behavior. So the bug is reproduced, and many modern record-replay tools also integrate bug reproduction with interactive debuggers to help locate the root cause, but how do developers check whether their patch indeed eliminates the bug under those same conditions? State-of-the-art record-replay does not support replaying candidate patches that modify the program in ways that diverge program state from the original recording, but successful repairs necessarily diverge so the bug no longer manifests. This work builds on recordreplay, and binary rewriting, to automatically generate and run tests for candidate patches. These tests reflect the arbitrary (ad hoc) user and system circumstances that uncovered the vulnerability, to check whether a patch indeed closes the vulnerability but does not modify the corresponding segment of the program’s core semantics. Unlike conventional ad hoc testing, each test is reproducible and can be applied to as many prospective patches as needed until developers are satisfied. The proposed approach also enables users to make new recordings of her own workloads with the original version of the program, and automatically generate and run the corresponding ad hoc tests on the patched version, to validate that the patch does not introduce new problems before adopting.more » « less
-
Intermittent systems operate embedded devices without a source of constant reliable power, relying instead on an unreliable source such as an energy harvester. They overcome the limitation of intermittent power by retaining and restoring system state as checkpoints across periods of power loss. Previous works have addressed a multitude of problems created by the intermittent paradigm, but do not consider securing intermittent systems. In this paper, we address the security concerns created through the introduction of checkpoints to an embedded device. When the non-volatile memory that holds checkpoints can be tampered, the checkpoints can be replayed or duplicated. We propose secure application continuity as a defense against these attacks. Secure application continuity provides assurance that an application continues where it left off upon power loss. In our secure continuity solution, we define a protocol that adds integrity, authenticity, and freshness to checkpoints. We develop two solutions for our secure checkpointing design. The first solution uses a hardware accelerated implementation of AES, while the second one is based on a software implementation of a lightweight cryptographic algorithm, Chaskey. We analyze the feasibility and overhead of these designs in terms of energy consumption, execution time, and code size across several application configurations. Then, we compare this overhead to a non-secure checkpointing system. We conclude that securing application continuity does not come cheap and that it increases the overhead of checkpoint restoration from 3.79 μJ to 42.96 μJ with the hardware accelerated solution and 57.02 μJ with the software based solution. To our knowledge, no one has yet considered the cost to provide security guarantees for intermittent operations. Our work provides future developers with an empirical evaluation of this cost, and with a problem statement for future research in this area.more » « less
-
null (Ed.)In modern Machine Learning, model training is an iterative, experimental process that can consume enormous computation resources and developer time. To aid in that process, experienced model developers log and visualize program variables during training runs. Exhaustive logging of all variables is infeasible, so developers are left to choose between slowing down training via extensive conservative logging, or letting training run fast via minimalist optimistic logging that may omit key information. As a compromise, optimistic logging can be accompanied by program checkpoints; this allows developers to add log statements post-hoc, and "replay" desired log statements from checkpoint---a process we refer to as hindsight logging. Unfortunately, hindsight logging raises tricky problems in data management and software engineering. Done poorly, hindsight logging can waste resources and generate technical debt embodied in multiple variants of training code. In this paper, we present methodologies for efficient and effective logging practices for model training, with a focus on techniques for hindsight logging. Our goal is for experienced model developers to learn and adopt these practices. To make this easier, we provide an open-source suite of tools for Fast Low-Overhead Recovery (flor) that embodies our design across three tasks: (i) efficient background logging in Python, (ii) adaptive periodic checkpointing, and (iii) an instrumentation library that codifies hindsight logging for efficient and automatic record-replay of model-training. Model developers can use each flor tool separately as they see fit, or they can use flor in hands-free mode, entrusting it to instrument their code end-to-end for efficient record-replay. Our solutions leverage techniques from physiological transaction logs and recovery in database systems. Evaluations on modern ML benchmarks demonstrate that flor can produce fast checkpointing with small user-specifiable overheads (e.g. 7%), and still provide hindsight log replay times orders of magnitude faster than restarting training from scratch.more » « less