                    
                            
Dynamic Task Shaping for High Throughput Data Analysis Applications in High Energy Physics
                        
                    
    
Distributed data analysis frameworks are widely used for processing large datasets generated by instruments in scientific fields such as astronomy, genomics, and particle physics. Such frameworks partition petabyte-size datasets into chunks and execute many parallel tasks to search for common patterns, locate unusual signals, or compute aggregate properties. When well-configured, such frameworks make it easy to churn through large quantities of data on large clusters. However, configuring frameworks presents a challenge for end users, who must select a variety of parameters such as the blocking of the input data, the number of tasks, the resources allocated to each task, and the size of the nodes on which they run. If poorly configured, the result may perform many orders of magnitude worse than optimal, or the application may even fail to make progress at all. Even if a good configuration is found through painstaking observations, the performance may change drastically when the input data or analysis kernel changes. This paper considers the problem of automatically configuring a data analysis application for high energy physics (TopEFT) built upon standard frameworks for physics analysis (Coffea) and distributed tasking (Work Queue). We observe the inherent variability within the application, demonstrate the problems of poor configuration, and then develop several techniques for automatically sizing tasks to meet goals of resource consumption and overall application completion.
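As a concrete illustration of the task-sizing idea, the sketch below shows one way an adaptive controller might choose the number of events per task from the peak memory observed on completed tasks. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name, the roughly linear memory model, and the damping factor are illustrative choices.

```python
# Hypothetical sketch of adaptive task sizing: adjust events-per-task so that
# observed per-task peak memory stays under a target budget. Names and the
# update rule are illustrative, not the paper's actual implementation.

def next_chunk_size(current_chunk: int,
                    observed_peak_mb: float,
                    memory_budget_mb: float,
                    min_chunk: int = 1_000,
                    max_chunk: int = 500_000) -> int:
    """Scale the chunk size toward the remaining memory headroom."""
    if observed_peak_mb <= 0:
        return current_chunk  # no measurement yet; keep the current size
    # Assume peak memory grows roughly linearly with events per task.
    scale = memory_budget_mb / observed_peak_mb
    # Damp the adjustment to avoid oscillating between extremes.
    damped = current_chunk * (1.0 + 0.5 * (scale - 1.0))
    return max(min_chunk, min(max_chunk, int(damped)))


# Example: a task processing 100k events peaked at 6 GB against a 4 GB budget,
# so subsequent tasks are shrunk accordingly.
chunk = 100_000
chunk = next_chunk_size(chunk, observed_peak_mb=6144, memory_budget_mb=4096)
print(chunk)  # roughly 83k events
```

A production controller would likely also react to runtimes, retries, and worker-level limits; the memory-only rule above is just the simplest case.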
        
    
- Award ID(s): 1931348
- PAR ID: 10356916
- Date Published:
- Journal Name: IPDPS International Parallel and Distributed Processing Symposium
- Page Range / eLocation ID: 346 to 356
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Introduction: Reconstructing low-level particle tracks in neutrino physics can address some of the most fundamental questions about the universe. However, processing petabytes of raw data using deep learning techniques poses a challenging problem in the field of High Energy Physics (HEP). In the Exa.TrkX Project, an illustrative HEP application, preprocessed simulation data is fed into a state-of-the-art Graph Neural Network (GNN) model, accelerated by GPUs. However, limited GPU memory often leads to Out-of-Memory (OOM) exceptions during training, due to the large size of models and datasets. This problem is exacerbated when deploying models on High-Performance Computing (HPC) systems designed for large-scale applications. Methods: We observe a high workload imbalance during GNN model training caused by the irregular sizes of input graph samples in HEP datasets, contributing to OOM exceptions. We aim to scale GNNs on HPC systems by prioritizing workload balance in graph inputs while maintaining model accuracy. Our paper introduces diverse balancing strategies aimed at decreasing the maximum GPU memory footprint and avoiding OOM exceptions across various datasets. Results: Our experiments show memory reductions of up to 32.14% compared to the baseline, and demonstrate that the proposed strategies can avoid OOM in the application. Additionally, we create a distributed multi-GPU implementation using these samplers to demonstrate the scalability of these techniques on the HEP dataset. Discussion: By assessing the performance of these strategies as data-loading samplers across multiple datasets, we can gauge their effectiveness in both single-GPU and distributed environments. Our experiments, conducted on datasets of varying sizes and across multiple GPUs, broaden the applicability of our work to various GNN applications that handle input datasets with irregular graph sizes. (An illustrative sketch of size-aware batching in this spirit appears after this list.)
- Many scientific applications operate on large datasets that can be partitioned and operated on concurrently. The existing approaches for concurrent execution generally rely on statically partitioned data. This static partitioning can lock performance in a sub-optimal configuration, leading to higher execution time and an inability to respond to dynamic resources. We present the Continuously Divisible Job abstraction, which allows statically defined applications to have their component tasks dynamically sized in response to system behaviour. The Continuously Divisible Job abstraction defines a simple interface that dictates how work can be recursively divided, executed, and merged. Implementing this abstraction allows scientific applications to leverage dynamic job coordinators for execution. We also propose the Virtual File abstraction, which allows read-only subsets of large files to be treated as separate files. In exploring the Continuously Divisible Job abstraction, two applications were implemented using the Continuously Divisible Job interface: a bioinformatics application and a high-energy physics event analysis. These were tested using an abstract job interface and several job coordinators. Comparing these against a previous static partitioning implementation, we show comparable or better performance without having to make static decisions or implement complex dynamic application handling. (A rough sketch of a divide/execute/merge interface appears after this list.)
- High energy physics experiments produce petabytes of data annually that must be reduced to gain insight into the laws of nature. Early-stage reduction executes long-running high-throughput workflows across thousands of nodes spanning multiple facilities to produce shared datasets. Later stages are typically written by individuals or small groups and must be refined and re-run many times for correctness. Reducing iteration times of later stages is key to accelerating discovery. We demonstrate our experience reshaping late-stage analysis applications on thousands of nodes. It is not enough merely to increase scale: it is necessary to make changes throughout the stack, including storage systems, data management, task scheduling, and application design. We demonstrate these changes when applied to two analysis applications built on open source data analysis frameworks (Coffea, Dask, TaskVine). We evaluate the performance of the applications on opportunistic campus clusters, showing effective scaling up to 7200 cores, thus producing significant speedup. (A generic sketch of a task-graph formulation of such an analysis appears after this list.)
- In emergency response scenarios, autonomous small Unmanned Aerial Systems (sUAS) must be configured and deployed quickly and safely to perform mission-specific tasks. In this paper, we present \DR, a Software Product Line for rapidly configuring and deploying a multi-role, multi-sUAS mission whilst guaranteeing a set of safety properties related to the sequencing of tasks within the mission. Individual sUAS behavior is governed by an onboard state machine, combined with coordination handlers which are configured dynamically within seconds of launch and ultimately determine the sUAS' behaviors, transition decisions, and interactions with other sUAS, as well as human operators. The just-in-time manner in which missions are configured precludes robust upfront testing of all conceivable combinations of features -- both within individual sUAS and across cohorts of collaborating ones. To ensure the absence of common types of configuration failures and to promote safe deployments, we check vital properties of the dynamically generated sUAS specifications and coordination handlers before sUAS are assigned their missions. We evaluate our approach in two ways. First, we perform validation tests to show that the end-to-end configuration process results in correctly executed missions, and second, we apply fault-based mutation testing to show that our safety checks successfully detect incorrect task sequences. (A minimal sketch of a pre-flight state machine check appears after this list.)
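For the GNN workload-balancing abstract above, the following is an illustrative sketch of size-aware batching: graph samples are packed so that the total node count per batch stays under a cap, which bounds the memory of the largest batch. The function name, the greedy first-fit policy, and the node cap are assumptions for illustration; the paper's actual samplers may differ.

```python
# Illustrative size-aware batching: group graph samples so the total node
# count per batch stays under a cap, bounding peak memory of any one batch.

from typing import Iterable, List, Tuple

def pack_batches(graph_sizes: Iterable[Tuple[int, int]],
                 max_nodes_per_batch: int) -> List[List[int]]:
    """graph_sizes: (sample_index, num_nodes) pairs. Returns lists of indices."""
    # Sort largest-first so big graphs are placed before batches fill up.
    ordered = sorted(graph_sizes, key=lambda p: p[1], reverse=True)
    batches: List[List[int]] = []
    loads: List[int] = []
    for idx, nodes in ordered:
        # Greedy first-fit: put the graph in the first batch with room left.
        for b, load in enumerate(loads):
            if load + nodes <= max_nodes_per_batch:
                batches[b].append(idx)
                loads[b] += nodes
                break
        else:
            batches.append([idx])
            loads.append(nodes)
    return batches

# Example: irregular graphs with 5k-90k nodes packed under a 100k-node cap.
sizes = [(0, 90_000), (1, 60_000), (2, 30_000), (3, 5_000)]
print(pack_batches(sizes, max_nodes_per_batch=100_000))  # [[0, 3], [1, 2]]
```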
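For the Continuously Divisible Job abstract above, a rough sketch of a divide/execute/merge interface is shown below. The method names and signatures are guesses for illustration; the published interface may differ.

```python
# Rough sketch of a divide/execute/merge interface in the spirit of the
# Continuously Divisible Job abstraction; names are illustrative guesses.

from abc import ABC, abstractmethod
from typing import List

class DivisibleJob(ABC):
    @abstractmethod
    def divide(self, pieces: int) -> List["DivisibleJob"]:
        """Split this job into smaller jobs covering disjoint slices of the input."""

    @abstractmethod
    def execute(self) -> object:
        """Run the job on its current slice and return a partial result."""

    @abstractmethod
    def merge(self, results: List[object]) -> object:
        """Combine partial results from sub-jobs into a single result."""
```

A job coordinator built against such an interface can resize work at run time, dividing slow jobs further or handing larger pieces to fast workers, without touching the application code that implements the three methods.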
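For the late-stage analysis abstract above, the sketch below shows a generic way to express an analysis as a graph of many small tasks that a scheduler such as Dask can spread across a cluster. The file names and the process_file / merge_counts functions are hypothetical placeholders, not the applications described in that work.

```python
# Generic sketch: express a late-stage analysis as many small tasks plus a
# parallel tree-reduction of the partial results.

import dask
from dask import delayed

@delayed
def process_file(path: str) -> dict:
    # Placeholder kernel: a real kernel would fill histograms from the file.
    return {"events": 1, "source": path}

@delayed
def merge_counts(a: dict, b: dict) -> dict:
    return {"events": a["events"] + b["events"], "source": "merged"}

files = [f"sample_{i}.root" for i in range(8)]   # hypothetical input partitions
partials = [process_file(f) for f in files]

# Tree-reduce the partial results so merging also runs in parallel.
while len(partials) > 1:
    merged = [merge_counts(a, b) for a, b in zip(partials[::2], partials[1::2])]
    if len(partials) % 2:          # carry an unpaired partial result forward
        merged.append(partials[-1])
    partials = merged

(total,) = dask.compute(partials[0])
print(total["events"])  # 8
```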
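For the sUAS configuration-checking abstract above, the following is a minimal sketch of a pre-flight check on a dynamically generated state machine: it verifies that only declared states are referenced and that one required task ordering cannot be skipped. The state names and the specific property are hypothetical examples, not the paper's actual safety checks.

```python
# Minimal sketch of a pre-flight configuration check: flag undeclared states
# and any path from the initial state that reaches 'later' without 'earlier'.

from collections import deque
from typing import Dict, List, Tuple

def check_state_machine(states: List[str],
                        transitions: Dict[str, List[str]],
                        initial: str,
                        must_precede: Tuple[str, str]) -> List[str]:
    """Return a list of human-readable configuration errors (empty if none)."""
    errors: List[str] = []

    # 1. Every transition must reference declared states only.
    for src, targets in transitions.items():
        for name in [src, *targets]:
            if name not in states:
                errors.append(f"undeclared state referenced: {name}")

    # 2. Ordering property: 'later' must not be reachable on any path that
    #    avoids 'earlier'. (Trivially satisfied if the mission starts at 'earlier'.)
    earlier, later = must_precede
    if initial != earlier:
        seen, queue = {initial}, deque([initial])
        while queue:
            current = queue.popleft()
            if current == later:
                errors.append(f"{later} is reachable before {earlier}")
                break
            for nxt in transitions.get(current, []):
                if nxt != earlier and nxt not in seen:  # do not pass through 'earlier'
                    seen.add(nxt)
                    queue.append(nxt)
    return errors

# Example: SEARCH is reachable directly from ARM, skipping TAKEOFF, so the
# check reports a task-sequencing violation.
cfg = {"ARM": ["TAKEOFF", "SEARCH"], "TAKEOFF": ["SEARCH"], "SEARCH": ["RTL"]}
print(check_state_machine(["ARM", "TAKEOFF", "SEARCH", "RTL"], cfg, "ARM",
                          ("TAKEOFF", "SEARCH")))
```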