

Title: Constraints on Future Analysis Metadata Systems in High Energy Physics
Abstract

In high energy physics (HEP), analysis metadata comes in many forms—from theoretical cross-sections, to calibration corrections, to details about file processing. Correctly applying metadata is a crucial and often time-consuming step in an analysis, but designing analysis metadata systems has historically received little direct attention. Among other considerations, an ideal metadata tool should be easy to use by new analysers, should scale to large data volumes and diverse processing paradigms, and should enable future analysis reinterpretation. This document, which is the product of community discussions organised by the HEP Software Foundation, categorises types of metadata by scope and format and gives examples of current metadata solutions. Important design considerations for metadata systems, including sociological factors, analysis preservation efforts, and technical factors, are discussed. A list of best practices and technical requirements for future analysis metadata systems is presented. These best practices could guide the development of a future cross-experimental effort for analysis metadata tools.
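To make the scope/format categorisation concrete, a minimal metadata registry might key each quantity by dataset and record how broadly it applies and how it is stored. This is an illustrative sketch only; all names, values, and the registry structure below are hypothetical and not drawn from any experiment's actual tooling.

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class MetadataRecord:
    scope: str    # how broadly the value applies: "dataset", "run-range", "event"
    fmt: str      # how the value is stored: "scalar", "histogram", "file"
    payload: Any  # the metadata itself

# Hypothetical registry keyed by (quantity, dataset); the cross-section
# and weight values are placeholders for illustration.
REGISTRY = {
    ("cross_section_pb", "ttbar_2018"): MetadataRecord("dataset", "scalar", 831.76),
    ("pileup_weights", "data_2018"): MetadataRecord("run-range", "histogram",
                                                    [0.92, 1.00, 1.07]),
}

def lookup(quantity: str, dataset: str) -> Any:
    """Return the payload for a metadata quantity; raises KeyError if absent."""
    return REGISTRY[(quantity, dataset)].payload
```

A design like this makes the scope of each quantity explicit to a new analyser, which is one of the ease-of-use considerations the document discusses.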

 
NSF-PAR ID: 10375325
Publisher / Repository: Springer Science + Business Media
Date Published:
Journal Name: Computing and Software for Big Science
Volume: 6
Issue: 1
ISSN: 2510-2036
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1.
    Although engineering graduates are well prepared in the technical aspects of engineering, it is widely acknowledged that they need a greater understanding of the socio-economic contexts in which they will practice their profession. The National Academy of Engineering (NAE) reinforces the critical role that engineers should play in solving the Grand Challenges by addressing problems and opportunities that are technical, social, economic, and political in nature. This paper provides an overview of a nascent effort to address this educational need. Through a National Science Foundation (NSF) funded program, a team of researchers at West Virginia University has launched a Holistic Engineering Project Experience (HEPE). This undergraduate course gives engineering students the opportunity to work with social science students from the fields of economics and strategic communication on complex, open-ended transportation engineering problems. Cross-disciplinary teams work under diverse real-world social constraints, such as economic impacts, public policy concerns, and public perception and outreach factors, in the context of future autonomous transportation systems. The goal of the HEPE platform is for engineering students to build non-technical but highly in-demand professional skills that promote collaboration with others involved in the socio-economic context of engineering matters. Conversely, the HEPE approach exposes non-engineering students to key concepts and practices in engineering.
This paper outlines the initial implementation of the HEPE program by placing the effort in the context of broader trends in education, outlining the overall purposes of the program, discussing the course design and structure, reviewing the learning experience and outcomes assessment process, and providing preliminary results of a baseline survey that gauges students' interest in and attitudes towards collaborative and interdisciplinary learning.
  2.
    One of the most costly factors in providing a global computing infrastructure such as the WLCG is the human effort in deployment, integration, and operation of the distributed services supporting collaborative computing, data sharing and delivery, and analysis of extreme scale datasets. Furthermore, the time required to roll out global software updates, introduce new service components, or prototype novel systems requiring coordinated deployments across multiple facilities is often increased by communication latencies, staff availability, and in many cases the expertise required for operating bespoke services. While the WLCG (and distributed systems implemented throughout HEP) is a global service platform, it lacks the capability and flexibility of a modern platform-as-a-service, including continuous integration/continuous delivery (CI/CD) methods, development-operations capabilities (DevOps, where developers assume a more direct role in the actual production infrastructure), and automation. Most importantly, tooling that reduces required training, bespoke service expertise, and operational effort throughout the infrastructure, most notably at the resource endpoints (sites), is entirely absent in the current model. In this paper, we explore ideas and questions around potential NoOps models in this context: What is realistic given organizational policies and constraints? How should operational responsibility be organized across teams and facilities? What are the technical gaps? What are the social and cybersecurity challenges? Conversely, what advantages does a NoOps model deliver for innovation and for accelerating the pace of delivery of new services needed for the HL-LHC era? We describe initial work along these lines in the context of providing a data delivery network supporting IRIS-HEP DOMA R&D. 
  3.

    We present a new compilation and analysis of broad-band ocean bottom seismometer noise properties from 15 yr of seismic deployments. We compile a comprehensive data set of representative four-component (seismometer and pressure gauge) noise spectra and cross-spectral properties (coherence, phase and admittance) for 551 unique stations spanning 18 U.S.-led experiments. This is matched with a comprehensive compilation of metadata parameters related to instrumentation and environmental properties for each station. We systematically investigate the similarity of noise spectra by grouping them according to these metadata parameters to determine which factors are the most important in determining noise characteristics. We find evidence that noise properties are more similar within groups defined by these parameters, with groupings by seismometer type and deployment water depth yielding the most significant and interpretable results. Instrument design (that is, the entire deployed package) also plays an important role, although it strongly covaries with seismometer type and water depth. We assess the presence of traditional sources of tilt, compliance, and microseismic noise to characterize their relative role across a variety of commonly used seismic frequency bands. We find that the presence of tilt noise depends primarily on the type of seismometer used (covariant with a particular subset of instrument designs), that compliance noise follows anticipated relationships with water depth, and that shallow, oceanic shelf environments have systematically different microseism noise properties (which are, in turn, different from instruments deployed in shallow lake environments). These observations have important implications for the viability of commonly used seismic analysis techniques. 
Finally, we compare spectra and coherences before and after vertical channel tilt and compliance noise removal to evaluate the efficacy and limitations of these now standard processing techniques. These findings may assist in future experiment planning and instrument development, and our newly compiled noise data set serves as a building block for more targeted future investigations by the marine seismology community.
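The cross-spectral quantities at the heart of this kind of compilation can be estimated by averaging windowed FFTs over data segments. The following is a bare-bones Welch-style sketch of a magnitude-squared coherence estimator, written for illustration; it is not the compilation's actual processing code, and the segment length is an arbitrary choice.

```python
import numpy as np

def ms_coherence(x, y, nperseg=256):
    """Magnitude-squared coherence of two equal-length records, estimated
    by averaging Hann-windowed FFTs over non-overlapping segments."""
    nseg = len(x) // nperseg
    win = np.hanning(nperseg)
    pxx = np.zeros(nperseg // 2 + 1)           # auto-spectrum of x
    pyy = np.zeros(nperseg // 2 + 1)           # auto-spectrum of y
    pxy = np.zeros(nperseg // 2 + 1, dtype=complex)  # cross-spectrum
    for k in range(nseg):
        seg = slice(k * nperseg, (k + 1) * nperseg)
        fx = np.fft.rfft(win * x[seg])
        fy = np.fft.rfft(win * y[seg])
        pxx += np.abs(fx) ** 2
        pyy += np.abs(fy) ** 2
        pxy += fx * np.conj(fy)
    return np.abs(pxy) ** 2 / (pxx * pyy)
```

In the standard tilt and compliance corrections mentioned above, the coherent part of the vertical channel is predicted from a reference channel via the transfer function (the cross-spectrum divided by the reference auto-spectrum) and subtracted, so high coherence in a band indicates noise that these techniques can remove.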

  4.
    The systemic challenges of the COVID-19 pandemic require cross-disciplinary collaboration in a global and timely fashion. Such collaboration needs open research practices and the sharing of research outputs, such as data and code, thereby facilitating research, reproducibility, and timely collaboration across borders. The Research Data Alliance COVID-19 Working Group recently published a set of recommendations and guidelines on data sharing and related best practices for COVID-19 research. These guidelines include recommendations for researchers, policymakers, funders, publishers and infrastructure providers from the perspective of different domains (Clinical Medicine, Omics, Epidemiology, Social Sciences, Community Participation, Indigenous Peoples, Research Software, Legal and Ethical Considerations). Several overarching themes emerge from this document, such as the need to balance the creation of data adherent to FAIR principles (findable, accessible, interoperable and reusable) with the need for quick data release; the use of trustworthy research data repositories; the use of well-annotated data with meaningful metadata; and practices of documenting methods and software. The resulting document marks an unprecedented cross-disciplinary, cross-sectoral, and cross-jurisdictional effort authored by over 160 experts from around the globe. This letter summarises key points of the Recommendations and Guidelines, highlights the relevant findings, shines a spotlight on the process, and suggests how these developments can be leveraged by the wider scientific community. 
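As an illustration of what "well-annotated data with meaningful metadata" can look like in practice, a minimal dataset description might carry fields covering each FAIR aspect. This is a hedged sketch: the field names and values are hypothetical placeholders, not taken from any specific standard in the Recommendations.

```python
# Hypothetical minimal metadata for a shared dataset; every value below
# is a placeholder, not a real identifier or URL.
dataset_metadata = {
    "identifier": "doi:10.1234/placeholder",              # findable: persistent ID
    "license": "CC-BY-4.0",                               # reusable: explicit terms
    "format": "text/csv",                                 # interoperable: open format
    "access_url": "https://repository.example/record/1",  # accessible: retrieval point
    "methods": "collection and processing steps documented alongside the data",
    "software": "analysis scripts archived with the dataset",
}

def is_fair_annotated(meta):
    """Check that the minimum fields for F/A/I/R annotation are all present."""
    required = {"identifier", "license", "format", "access_url", "methods"}
    return required <= set(meta)
```

A simple completeness check like this is the kind of lightweight validation a repository could run at deposit time to enforce the guidelines' annotation recommendations.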
  5. This paper discusses three aspects of nonlinear dynamic analysis (NDA) practices that are important for evaluating the seismic performance of geotechnical structures affected by liquefaction or cyclic softening: (1) selection and calibration of constitutive models, (2) comparison of NDA results using two or more constitutive models, and (3) documentation. The ability of the selected constitutive models and calibration protocols to approximate the loading responses important to the system being analyzed is one of several technical factors affecting the quality of results from an NDA. Comparisons of single element simulations against empirical data for a broad range of loading conditions are essential for evaluating this factor. Critical comparisons of NDAs using two or more constitutive models are valuable for evaluating modeling uncertainty for specific systems and for identifying modeling limitations that need improvement. The utility of an NDA study depends on the documentation being sufficiently thorough to facilitate effective reviews, advance best practices, and support future reexaminations of a system's seismic performance. 