

Title: DAPT: A package enabling distributed automated parameter testing
Modern agent-based models (ABMs) and other simulation models require evaluation and testing of many different parameters. Managing that testing for large-scale parameter sweeps (grid searches), as well as storing simulation data, requires multiple, potentially customizable steps that may vary across simulations. Furthermore, parameter testing, processing, and analysis are slowed if simulation and processing jobs cannot be shared across teammates or computational resources. While high-performance computing (HPC) has become increasingly available, models can often be tested faster by combining multiple computers with HPC resources. To address these issues, we created the Distributed Automated Parameter Testing (DAPT) Python package. By hosting parameters in an online (and often free) “database”, multiple individuals can run parameter sets simultaneously in a distributed fashion, enabling ad hoc crowdsourcing of computational power. Combined with a flexible, scriptable tool set, this lets teams evaluate models and assess their underlying hypotheses quickly. Here, we describe DAPT and provide an example demonstrating its use.
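The distributed pattern the abstract describes can be sketched in a few lines of Python. The snippet below is an illustrative sketch only, not DAPT's actual API: the in-memory parameter table, the field names, and the run_model() stub are hypothetical stand-ins for the shared online database and the model-specific simulation command a real setup would use. Each teammate or compute node runs the same loop, claims the next unclaimed parameter set, runs the simulation, and writes the status back, so the sweep is distributed without a central scheduler.

```python
# Minimal sketch of distributed parameter testing; all names are hypothetical.

# Shared parameter "database": in practice a hosted sheet or table that every
# teammate and compute node can reach.
PARAMETER_TABLE = [
    {"id": 1, "growth_rate": 0.01, "status": "unclaimed"},
    {"id": 2, "growth_rate": 0.02, "status": "unclaimed"},
    {"id": 3, "growth_rate": 0.05, "status": "unclaimed"},
]

def claim_next():
    """Claim the next unclaimed parameter set, or return None when the sweep is done."""
    for row in PARAMETER_TABLE:
        if row["status"] == "unclaimed":
            row["status"] = "running"
            return row
    return None

def run_model(growth_rate):
    """Placeholder simulation; a real worker would launch the ABM executable here."""
    return {"final_population": 1000 * (1 + growth_rate) ** 100}

def worker_loop():
    """Each participating computer runs this loop independently; the shared
    table decides who works on which parameter set."""
    while True:
        row = claim_next()
        if row is None:
            break  # sweep finished
        result = run_model(row["growth_rate"])
        row.update(status="finished", output=result)

if __name__ == "__main__":
    worker_loop()
    for row in PARAMETER_TABLE:
        print(row["id"], row["status"], round(row["output"]["final_population"]))
```

Because claiming and reporting go through the shared table, adding another computer to the sweep amounts to starting the same script on it.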
Award ID(s): 1735095
NSF-PAR ID: 10290192
Author(s) / Creator(s):
Date Published:
Journal Name: Gigabyte
Volume: 2021
ISSN: 2709-4715
Page Range / eLocation ID: 1 to 10
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. High-throughput screening (HTS) can significantly accelerate the design of new materials, allowing for automatic testing of a large number of material compositions and process parameters. Using HTS in Integrated Computational Materials Engineering (ICME), the computational evaluation of multiple combinations can be performed before empirical testing, thus reducing the use of material and resources. Conducting computational HTS involves the application of high-throughput computing (HTC) and the development of suitable tools to handle such calculations. Among the ICME methods compatible with HTS and HTC, the calculation of phase diagrams, known as the CALPHAD method, has gained prominence. When thermodynamic modeling is combined with kinetic simulations, the entire history of precipitation behavior can be predicted. However, most reported CALPHAD-based HTS frameworks are restricted to thermodynamic modeling or are not accessible. The present work introduces CAROUSEL, an open-sourCe frAmewoRk fOr high-throUghput microStructurE simuLations. It is designed to explore various alloy compositions, processing parameters, and CALPHAD implementations. CAROUSEL offers a graphical interface for easy interaction, a scripting workflow for advanced simulations, a calculation distribution system, and simulation data management. Additionally, CAROUSEL incorporates visual tools for exploring the generated data and integrates through-process modeling, accounting for the interplay between solidification and solid-state precipitation. The application area is various metal manufacturing processes where precipitation behavior is crucial. The results of simulations can be used in upscale material models, thus covering different microstructural phenomena. The present work demonstrates how CAROUSEL can be used for additive manufacturing (AM), particularly for investigating different chemical compositions and heat treatment parameters (e.g., temperature, duration).
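A rough sense of the combinatorial sweep such a framework manages can be conveyed with a short Python sketch. The composition and heat-treatment grids below are invented for illustration and do not reflect CAROUSEL's actual input format.

```python
from itertools import product

# Hypothetical composition and heat-treatment grids for an Al-Cu alloy sweep.
cu_wt_pct = [3.0, 4.0, 5.0]      # solute content, wt.%
ageing_temp_c = [150, 175, 200]  # ageing temperature, °C
ageing_hours = [1, 4, 16]        # ageing duration, h

# Every combination becomes one thermodynamic/kinetic simulation to dispatch.
cases = [
    {"cu_wt_pct": cu, "temp_c": temp, "hours": hrs}
    for cu, temp, hrs in product(cu_wt_pct, ageing_temp_c, ageing_hours)
]

print(f"{len(cases)} precipitation simulations to dispatch")  # 27 combinations
```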
  2. As large-scale scientific simulations and big data analyses become more popular, it is increasingly expensive to store huge amounts of raw simulation results for post-analysis. To minimize the expensive data I/O, “in-situ” analysis is a promising approach, where data analysis applications analyze the simulation-generated data on the fly without storing it first. However, it is challenging to organize, transform, and transport data at scale between two semantically different ecosystems due to their distinct software and hardware differences. To tackle these challenges, we design and implement the X-Composer framework. X-Composer connects cross-ecosystem applications to form an “in-situ” scientific workflow, and provides a unified approach and recipe for supporting such hybrid in-situ workflows on distributed heterogeneous resources. X-Composer reorganizes simulation data as continuous data streams and feeds them seamlessly into Cloud-based stream processing services to minimize I/O overheads. For evaluation, we use X-Composer to set up and execute a cross-ecosystem workflow consisting of a parallel Computational Fluid Dynamics simulation running on HPC and a distributed Dynamic Mode Decomposition analysis application running on the Cloud. Our experimental results show that X-Composer can seamlessly couple HPC and Big Data jobs in their own native environments, achieve good scalability, and provide high-fidelity analytics for ongoing simulations in real time.
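The in-situ pattern this abstract describes, analyzing simulation output as a stream instead of storing it first, can be illustrated with a small self-contained sketch. The generator, the running-mean analysis, and all names below are illustrative stand-ins and are not X-Composer's interfaces.

```python
import numpy as np

def simulation_timesteps(n_steps=100, n_cells=10_000, seed=0):
    """Stand-in for a CFD solver: yield one field snapshot per timestep
    instead of writing every snapshot to disk first."""
    rng = np.random.default_rng(seed)
    for step in range(n_steps):
        yield step, rng.standard_normal(n_cells)

def in_situ_analysis(stream):
    """Consume the stream as it is produced (the "in-situ" part); a running
    mean of the field stands in for a real analysis such as DMD."""
    mean = None
    for step, field in stream:
        mean = field if step == 0 else mean + (field - mean) / (step + 1)
    return mean

print(in_situ_analysis(simulation_timesteps()).shape)  # (10000,)
```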
  3. High Performance Computing (HPC) stands at the forefront of engineering innovation. With affordable and advanced HPC resources more readily accessible than ever before, computational simulation of complex physical phenomena has become an increasingly attractive strategy for predicting the physical behavior of diverse engineered systems. Furthermore, novel applications of HPC in engineering are highly interdisciplinary, requiring advanced skills in mathematical modeling and algorithm development as well as programming skills for parallel, distributed, and concurrent architectures and environments. These and other possible reasons have created a shortage of qualified workforce to conduct the much-needed research and development in these areas. This paper describes our experience with mentoring a cohort of ten high-achieving undergraduate students in Summer 2019 to conduct engineering HPC research for ten weeks at X University. Our mentoring activity was informed and motivated by an initial informal study whose goal was to learn the roles and status of HPC in engineering research and what could be improved to make more effective use of it. Through a combination of email surveys, in-person interviews, and a manual analysis of faculty research profiles at X University, we learned several lessons. First, a large proportion of the engineering faculty conducts research that is highly mathematical and computational and driven by disciplinary sciences, where simulation and HPC are widely needed as solutions. Second, due to the lack of resources to provide the necessary training in software development to their students, the interviewed engineering groups are limited in their ability to fully leverage HPC capabilities in their research. Therefore, novel pathways for training and educating engineering researchers in HPC software development must be explored in order to further advance engineering research capability in HPC. With multi-year support from NSF, our summer research mentoring activities were able to accommodate ten high-achieving undergraduate students recruited from across the USA and their faculty mentors on the theme of HPC applications in engineering research. We describe the processes of student recruitment and selection, training and engagement, research mentoring, and professional development for the students. Best practices and lessons learned are identified and summarized based on our own observations and the evaluation conducted by an independent evaluator. In particular, improvements are being planned so as to deliver a more holistic and rigorous research experience for future cohorts.
  4. Regional extent and spatiotemporal dynamics of Arctic permafrost disturbances remain poorly quantified. High-spatial-resolution commercial satellite imagery enables transformational opportunities to observe, map, and document the micro-topographic transitions occurring in Arctic polygonal tundra at multiple spatial and temporal frequencies. The entire Arctic has been imaged at 0.5 m or finer resolution by commercial satellite sensors. This imagery is still largely underutilized, and value-added Arctic science products are rare. Knowledge discovery through artificial intelligence (AI), big imagery, and high-performance computing (HPC) resources is just starting to be realized in Arctic science. Large-scale deployment of petabyte-scale imagery resources requires sophisticated computational approaches to automated image interpretation coupled with efficient use of HPC resources. In addition to semantic complexities, a multitude of factors inherent to sub-meter-resolution satellite imagery, such as file size, dimensions, spectral channels, overlaps, spatial references, and imaging conditions, challenge the direct translation of AI-based approaches from computer vision applications. Memory limitations of graphics processing units necessitate partitioning an input satellite image into manageable sub-arrays, followed by parallel predictions and post-processing to reconstruct results corresponding to the input image dimensions and spatial reference. We have developed a novel high-performance image analysis framework, the Mapping Application for Arctic Permafrost Land Environment (MAPLE), that enables the integration of operational-scale GeoAI capabilities into Arctic science applications. We have designed the MAPLE workflow to be interoperable across HPC architectures while making optimal use of computing resources.
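The tile-predict-stitch step described in this abstract can be illustrated with a short NumPy sketch. The function name, tile size, and thresholding "model" below are hypothetical placeholders rather than MAPLE's actual implementation; a production pipeline would also handle tile overlaps and geospatial referencing.

```python
import numpy as np

def tile_predict_stitch(image, predict, tile=512):
    """Partition a large raster into GPU-sized tiles, run a per-tile model,
    and stitch the predictions back to the original image dimensions."""
    height, width = image.shape
    out = np.zeros((height, width), dtype=np.float32)
    for row in range(0, height, tile):
        for col in range(0, width, tile):
            sub = image[row:row + tile, col:col + tile]
            out[row:row + sub.shape[0], col:col + sub.shape[1]] = predict(sub)
    return out

# A thresholding lambda stands in for the per-tile CNN inference call.
scene = np.random.default_rng(1).random((2048, 2048)).astype(np.float32)
mask = tile_predict_stitch(scene, predict=lambda t: (t > 0.5).astype(np.float32))
print(mask.shape)  # (2048, 2048)
```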
  5. The National Ecological Observatory Network (NEON) is a continental-scale observatory with sites across the US collecting standardized ecological observations that will operate for multiple decades. To maximize the utility of NEON data, we envision edge computing systems that gather, calibrate, aggregate, and ingest measurements in an integrated fashion. Edge systems will employ machine learning methods to cross-calibrate, gap-fill, and provision data in near-real time to the NEON Data Portal and to High Performance Computing (HPC) systems running ensembles of Earth system models (ESMs) that assimilate the data. For the first time, gridded EC data products and response functions promise to offset pervasive observational biases through evaluating, benchmarking, optimizing parameters, and training new machine learning parameterizations within ESMs, all at the same model-grid scale. Leveraging open-source software for EC data analysis, we are already building software infrastructure for the integration of near-real-time data streams into the International Land Model Benchmarking (ILAMB) package for use by the wider research community. We will present a perspective on the design and integration of end-to-end infrastructure for data acquisition, edge computing, HPC simulation, analysis, and validation, where Artificial Intelligence (AI) approaches are used throughout the distributed workflow to improve accuracy and computational performance.
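As a toy illustration of the gap-filling idea mentioned above, the sketch below fills a short dropout in a synthetic half-hourly series. The series, the interpolation choice, and all names are invented for illustration and are not NEON's or ILAMB's methods.

```python
import numpy as np
import pandas as pd

# Synthetic half-hourly flux series standing in for a NEON data stream; simple
# time interpolation stands in for the machine-learning gap-filling step.
index = pd.date_range("2024-06-01", periods=48, freq="30min")
flux = pd.Series(np.sin(np.linspace(0.0, 2.0 * np.pi, 48)), index=index)
flux.iloc[10:14] = np.nan                 # a simulated sensor dropout

filled = flux.interpolate(method="time")  # near-real-time gap fill
print(int(flux.isna().sum()), "gaps before,", int(filled.isna().sum()), "after")
```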