This content will become publicly available on November 17, 2025

Title: Reshaping High Energy Physics Applications for Near-Interactive Execution Using TaskVine
High energy physics experiments produce petabytes of data annually that must be reduced to gain insight into the laws of nature. Early-stage reduction executes long-running high-throughput workflows across thousands of nodes spanning multiple facilities to produce shared datasets. Later stages are typically written by individuals or small groups and must be refined and re-run many times for correctness. Reducing the iteration time of these later stages is key to accelerating discovery. We describe our experience reshaping late-stage analysis applications to run on thousands of nodes. It is not enough merely to increase scale: changes are needed throughout the stack, including storage systems, data management, task scheduling, and application design. We demonstrate these changes in two analysis applications built on open-source data analysis frameworks (Coffea, Dask, TaskVine) and evaluate their performance on opportunistic campus clusters, showing effective scaling up to 7,200 cores and significant speedup.
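The applications described above drive TaskVine through its Python API (part of the cctools suite). As a minimal sketch of that programming model, and not the paper's actual analysis code, the following submits independent tasks to a TaskVine manager; the port number, command strings, and chunk count are placeholders, and attribute names may differ slightly between cctools versions.

```python
# Minimal TaskVine manager loop: submit independent tasks and wait for
# results. In the paper's applications, Coffea/Dask task graphs are
# dispatched through TaskVine rather than a hand-written loop like this.
import ndcctools.taskvine as vine

manager = vine.Manager(port=9123)              # workers connect to this port
print(f"TaskVine manager listening on port {manager.port}")

# One task per (hypothetical) input chunk.
for i in range(10):
    task = vine.Task(f"python analyze_chunk.py --chunk {i}")
    manager.submit(task)

# Workers can join at any time, e.g. from an opportunistic campus cluster.
completed = 0
while not manager.empty():
    task = manager.wait(5)                     # poll with a 5-second timeout
    if task:
        completed += 1
        print(f"completed {completed} tasks, last exit code {task.exit_code}")
```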
Award ID(s): 1931348
PAR ID: 10567688
Publisher / Repository: IEEE
ISBN: 979-8-3503-5291-7
Page Range / eLocation ID: 1 to 13
Location: Atlanta, GA, USA
Sponsoring Org: National Science Foundation
More Like this
  1. Roth, A (Ed.)
    It is well understood that a system built from individually fair components may not itself be individually fair. In this work, we investigate individual fairness under pipeline composition. Pipelines differ from ordinary sequential or repeated composition in that individuals may drop out at any stage, and classification in subsequent stages may depend on the remaining “cohort” of individuals. As an example, a company might hire a team for a new project and at a later point promote the highest performer on the team. Unlike other repeated classification settings, where the degree of unfairness degrades gracefully over multiple fair steps, the degree of unfairness in pipelines can be arbitrary, even in a pipeline with just two stages. Guided by a panoply of real-world examples, we provide a rigorous framework for evaluating different types of fairness guarantees for pipelines. We show that naïve auditing is unable to uncover systematic unfairness and that, in order to ensure fairness, some form of dependence must exist between the design of algorithms at different stages in the pipeline. Finally, we provide constructions that permit flexibility at later stages, meaning that there is no need to lock in the entire pipeline at the time that the early stage is constructed. 
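    As a toy illustration of the cohort dependence described above (not the paper's formal framework), the sketch below runs the hire-then-promote pipeline on two hypothetical worlds that differ only in one candidate's score by 0.02. Two nearly identical candidates receive maximally different stage-two outcomes, and which of them is promoted flips because of a change to someone else, behavior that auditing either stage alone would not reveal. All names and numbers are hypothetical.

```python
# Toy two-stage pipeline: stage 1 hires everyone above a threshold; stage 2
# promotes the single best performer among those who remain. The stage-2
# decision depends on the surviving cohort, so bob's outcome can flip on a
# tiny change to alice's score even though bob himself is unchanged.
def pipeline(scores, hire_threshold=0.5):
    hired = {name: s for name, s in scores.items() if s >= hire_threshold}  # stage 1
    promoted = max(hired, key=hired.get) if hired else None                 # stage 2
    return sorted(hired), promoted

world_1 = {"alice": 0.90, "bob": 0.89, "carol": 0.40}
world_2 = {"alice": 0.88, "bob": 0.89, "carol": 0.40}  # only alice moved, by 0.02

for label, scores in [("world 1", world_1), ("world 2", world_2)]:
    hired, promoted = pipeline(scores)
    print(f"{label}: hired={hired}, promoted={promoted}")
# world 1: hired=['alice', 'bob'], promoted=alice
# world 2: hired=['alice', 'bob'], promoted=bob
```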
  2. An increasing number of distributed applications operate by dispatching function invocations across the nodes of a distributed system. To operate correctly, the code and data dependencies of each function must be distributed along with the invocations. When translating applications to run on large-scale distributed systems, managing these dependencies becomes challenging: delivery must scale to thousands of nodes, the dependencies must be consistent across the system, and the method must be usable by an unprivileged developer. As a solution, this paper presents PONCHO, a lightweight Python-based toolkit that allows users to discover, package, and deploy dependencies as an integral part of distributed applications. PONCHO encapsulates a set of commands to be executed within an environment, providing a lightweight way to create and manage environments that improves both the portability and the reproducibility of scientific applications. We evaluate PONCHO with real-world applications in physics, computational chemistry, and hyperparameter optimization, observing the challenges that arise when creating and distributing an environment and measuring the overheads that result.
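    The packaging step described above can be sketched as follows. This is a rough illustration, not PONCHO's documented interface: the JSON spec keys and the `poncho_package_create` command name are assumptions written from memory and may differ between cctools versions, and all dependency and file names are hypothetical.

```python
# Write a (hypothetical) environment spec, then ask PONCHO to package it as
# a relocatable tarball that can be shipped along with function invocations.
# Spec schema and command name are assumptions; check the cctools/PONCHO
# documentation for the exact form in your installed version.
import json
import subprocess

spec = {
    "conda": {
        "channels": ["conda-forge"],
        "dependencies": ["python=3.10", "numpy", "scipy"],  # hypothetical deps
    },
    "pip": ["uproot"],                                       # hypothetical dep
}

with open("environment.json", "w") as f:
    json.dump(spec, f, indent=2)

# Package the environment so that workers can unpack it next to each task.
subprocess.run(
    ["poncho_package_create", "environment.json", "environment.tar.gz"],
    check=True,
)
```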
  3. Growth control is essential to establish organism size, so organisms must have mechanisms to both sense and adjust growth. Studies of single cells have revealed that size homeostasis can be achieved using distinct control methods: Sizer, Timer, and Adder. In multicellular organisms, mechanisms that regulate body size must not only control single-cell growth but also integrate it across organs and tissues during development to generate adult size and shape. To investigate body size and growth control in metazoans, we can leverage the roundworm Caenorhabditis elegans as a scalable and tractable model. We collected precise growth measurements of thousands of individuals throughout larval development, measured feeding behavior to pinpoint larval-stage transitions, and quantified changes in animal size and shape during development with high accuracy. We find differences in the growth of animal length and width during larval transitions. Using a combination of quantitative measurements and mathematical modeling, we present two physical mechanisms by which C. elegans can control growth. First, constraints on cuticle stretch generate mechanical signals through which animals sense body size and initiate larval-stage transitions. Second, mechanical control of food intake drives growth rate within larval stages, while regulatory mechanisms influence growth between stages. These results suggest how physical constraints control developmental timing and growth rate in C. elegans.
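    The Sizer, Timer, and Adder rules named above are the standard single-cell size-control models: divide on reaching a fixed size, after a fixed time, or after adding a fixed size increment, respectively. The toy simulation below, with purely hypothetical parameters and no connection to the paper's C. elegans measurements, shows how each rule handles a birth-size perturbation.

```python
# Toy size-homeostasis simulation: a cell grows exponentially, divides in
# half, and the control rule decides when division happens. Parameters are
# hypothetical; this is not the paper's C. elegans growth model.
import math

RATE = 1.0          # exponential growth rate (arbitrary units)
TARGET = 2.0        # intended size at division (ideal birth size is 1.0)
GENERATIONS = 6

def next_birth_size(birth_size, rule):
    """Return the daughter's birth size under the given control rule."""
    if rule == "timer":      # grow for a fixed time, whatever the size
        fixed_time = math.log(2) / RATE          # tuned so the ideal cell doubles
        division_size = birth_size * math.exp(RATE * fixed_time)
    elif rule == "sizer":    # grow until a fixed division size is reached
        division_size = TARGET
    else:                    # "adder": grow until a fixed increment is added
        division_size = birth_size + TARGET / 2
    return division_size / 2

for rule in ("timer", "sizer", "adder"):
    size = 1.3               # perturbed birth size (ideal would be 1.0)
    trace = []
    for _ in range(GENERATIONS):
        size = next_birth_size(size, rule)
        trace.append(round(size, 3))
    print(f"{rule:>5}: birth sizes over generations -> {trace}")
# The timer keeps the perturbation, the adder halves it each generation,
# and the sizer removes it in a single generation.
```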
  4.
    Building information modeling (BIM) provides a novel way of managing information across all lifecycle phases of a building project, facilitating processes such as architectural design, structural analysis, and construction management. Industry Foundation Classes (IFC) is an open standard for information exchange between different BIM applications in the architecture, engineering, and construction (AEC) domain. It represents project information in an interoperable way, covering the geometric, material, and other physical and functional information needed to analyze and manage a project. Structural analysis simulates the structural performance of a building under different types of loads to make sure the structure is safe. The information needed for structural analysis mainly comprises geometric, material, and load information; it comes from the architectural design and the selected analysis scenarios, and it should be represented in an interoperable way to allow transfer between different phases and different stakeholders. Missing information is a critical problem in the interoperable use of BIM: it can cause misunderstandings between stakeholders, erroneous structural analysis results, and misleading information feeding into later construction processes. In this paper, the authors analyze the use of IFC at three stages of structural analysis, namely the intrinsic modeling stage, the extrinsic modeling stage, and the analysis stage. The authors compared IFC files at these three stages with the original BIM software text files in terms of information coverage and identified cases of missing information. This is the first systematic investigation of BIM interoperability at the detailed work stages of structural analysis, and it provides insight into how BIM usage should be improved in this domain.
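    The coverage comparison described above can be approximated programmatically. The sketch below is a minimal illustration: it assumes the open-source ifcopenshell library, which the paper does not name, and hypothetical file names, and simply counts entity types in the IFC exports from two work stages so that missing categories (loads, material assignments, and so on) stand out.

```python
# Compare information coverage of two IFC exports (e.g. the intrinsic design
# model vs. the model handed to structural analysis) by counting entity
# types in each. Assumes the ifcopenshell library and hypothetical files.
from collections import Counter
import ifcopenshell

def entity_counts(path):
    model = ifcopenshell.open(path)
    # by_type with a supertype returns instances of all its subtypes;
    # IfcRoot covers objects, relationships, and property definitions.
    return Counter(entity.is_a() for entity in model.by_type("IfcRoot"))

design = entity_counts("architectural_design.ifc")    # hypothetical file
analysis = entity_counts("structural_analysis.ifc")   # hypothetical file

# Entity types present in the design export but absent from the analysis
# export are candidates for information missing between stages.
for entity_type in sorted(design):
    if entity_type not in analysis:
        print(f"missing in analysis export: {entity_type} "
              f"({design[entity_type]} instances in design)")
```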
  5. Streaming computations often exhibit substantial data parallelism that makes them well-suited to SIMD architectures. However, many such computations also exhibit irregularity, in the form of data-dependent, dynamic data rates, that makes efficient SIMD execution challenging. One aspect of this challenge is the need to schedule execution of a computation realized as a pipeline of stages connected by finite queues. A scheduler must both ensure high SIMD occupancy by gathering queued items into vectors and minimize costs associated with switching execution between stages. In this work, we present the AFIE (Active Full, Inactive Empty) scheduling policy for irregular streaming applications on SIMD processors. AFIE provably groups inputs to each stage of a pipeline into a minimal number of SIMD vectors while incurring a bounded number of switches relative to the best possible policy. These results apply even though irregularity forbids a priori knowledge of how many outputs will be generated from each input to each stage. We have implemented AFIE as an extension to the MERCATOR system for building irregular streaming applications on NVIDIA GPUs. We describe how the AFIE scheduler simplifies MERCATOR’s runtime code and empirically measure the new scheduler’s improved performance on irregular streaming applications. 
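    The tension the abstract describes, filling SIMD vectors versus limiting stage switches, can be made concrete with a toy model. The sketch below is not the AFIE policy from the paper; it is a simplified greedy scheduler over per-stage queues, with hypothetical stages and data-dependent output counts, that reports vector occupancy and the number of stage switches so the trade-off is visible.

```python
# Toy model of scheduling an irregular streaming pipeline on a SIMD machine.
# Each step runs one stage on a vector of up to W queued items; each item
# emits a data-dependent number of outputs to the next stage's queue. The
# greedy policy (run the longest queue) is NOT the AFIE policy from the
# paper, just an illustration of the occupancy / switching trade-off.
import random
from collections import deque

W = 8                                           # SIMD vector width
random.seed(0)

def stage_fn(item):
    """Hypothetical irregular stage: each input yields 0-2 outputs."""
    return [item] * random.randint(0, 2)

queues = [deque(range(64)), deque(), deque()]   # stage 0 seeded with inputs
switches = vectors = occupied_lanes = 0
current = None

while any(queues):
    # Greedy choice: run the stage with the most queued items.
    stage = max(range(len(queues)), key=lambda s: len(queues[s]))
    if stage != current:
        switches += 1
        current = stage
    batch = [queues[stage].popleft() for _ in range(min(W, len(queues[stage])))]
    vectors += 1
    occupied_lanes += len(batch)
    if stage + 1 < len(queues):                 # the last stage discards outputs
        for item in batch:
            queues[stage + 1].extend(stage_fn(item))

print(f"vectors fired: {vectors}, stage switches: {switches}, "
      f"mean SIMD occupancy: {occupied_lanes / (vectors * W):.2%}")
```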