Title: Recorder 2.0: Efficient Parallel I/O Tracing and Analysis
Recorder is a multi-level I/O tracing tool that captures HDF5, MPI-I/O, and POSIX I/O calls. In this paper, we present a new version of Recorder that adds support for most POSIX metadata calls, such as stat, link, and rename. We also introduce a compressed tracing format that reduces both the trace file size and the runtime overhead incurred while collecting trace data. Moreover, we add a set of post-mortem analysis and visualization routines to our new version of Recorder that manage the compressed trace data for users. Our experiments with four HPC applications show a file-size reduction of over 2× and a 20% reduction in post-processing time when using our new compressed trace file format.
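To make the interception approach concrete, below is a minimal, hypothetical sketch of how a tool in Recorder's style can capture a POSIX metadata call such as rename() through LD_PRELOAD symbol interposition. The wrapper is illustrative only, not Recorder's actual implementation; a real tracer would emit compact binary records rather than formatted text.

    /* tracer.c: illustrative sketch only; not Recorder's source code.
     * Build: gcc -shared -fPIC -o tracer.so tracer.c -ldl
     * Run:   LD_PRELOAD=./tracer.so ./your_app                        */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    /* Interpose rename(): log the arguments, then forward the call
     * to the real libc implementation located via RTLD_NEXT.          */
    int rename(const char *oldpath, const char *newpath)
    {
        static int (*real_rename)(const char *, const char *);
        if (!real_rename)
            real_rename = (int (*)(const char *, const char *))
                          dlsym(RTLD_NEXT, "rename");
        fprintf(stderr, "trace: rename(\"%s\", \"%s\")\n",
                oldpath, newpath);
        return real_rename(oldpath, newpath);
    }

The same pattern extends to stat and link, and analogous wrappers at the HDF5 and MPI-I/O layers provide the multi-level view described above.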
Award ID(s):
1763540
PAR ID:
10155948
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
The IEEE International Workshop on High-Performance Storage
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Modern applications use storage systems in complex and often surprising ways. Tracing system calls is a common approach to understanding application behavior, allowing offline analysis and enabling replay in other environments. But current system-call tracing tools have drawbacks: (1) they often omit information needed for full analysis, such as raw data buffers; (2) they have high overheads; (3) they often use non-portable trace formats; and (4) they may not offer useful and scalable analysis and replay tools. We have developed Re-Animator, a powerful system-call tracing tool that focuses on storage-related calls and collects maximal information, capturing complete data buffers and writing all traces in the standard DataSeries format. We also created a prototype replayer that focuses on calls related to file-system state. We evaluated our system on long-running server applications such as key-value stores and databases. Our tracer has an average overhead of 1.8-2.3×, but the overhead can be as low as 5% for I/O-bound applications. Our replayer verifies that its actions are correct and faithfully reproduces the logical file-system state generated by the original application.
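    A hedged illustration of the "maximal information" idea: the sketch below wraps write() and appends each complete user buffer to a side trace file before forwarding the call. The trace path and flat layout are hypothetical; Re-Animator itself writes structured DataSeries records rather than raw bytes.

        /* bufcap.c: hypothetical sketch of raw write()-buffer capture;
         * not Re-Animator's implementation.                            */
        #define _GNU_SOURCE
        #include <dlfcn.h>
        #include <fcntl.h>
        #include <unistd.h>

        static ssize_t (*real_write)(int, const void *, size_t);
        static int trace_fd = -1;

        ssize_t write(int fd, const void *buf, size_t count)
        {
            if (!real_write)
                real_write = (ssize_t (*)(int, const void *, size_t))
                             dlsym(RTLD_NEXT, "write");
            if (trace_fd < 0)
                trace_fd = open("/tmp/bufs.trace",
                                O_WRONLY | O_CREAT | O_APPEND, 0644);
            /* Copy out the full buffer (skipping our own trace writes)
             * so the call can later be replayed byte-for-byte.          */
            if (fd != trace_fd)
                real_write(trace_fd, buf, count);
            return real_write(fd, buf, count);
        }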
  2. Many scientific applications operate on data sets that span hundreds of gigabytes or even terabytes. Large data sets often use compression to reduce file size, yet parallel I/O libraries currently do not support reading and writing compressed files, necessitating either expensive sequential compression/decompression operations before and after the simulation, or forgoing advanced features of parallel I/O libraries, such as collective I/O operations. This paper introduces parallel I/O on compressed data files and discusses the key challenges, requirements, and solutions for supporting compressed data files in MPI I/O, as well as the limitations of some MPI I/O operations when using compressed data files. The paper details the handling of individual read and write operations on compressed data files and presents an extension of the two-phase collective I/O algorithm that supports data compression. The paper further presents and evaluates an implementation based on the Snappy compression library and the OMPIO parallel I/O framework. The performance evaluation using multiple data sets demonstrates significant performance benefits when using data compression on a parallel BeeGFS file system.
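    The central complication can be seen in a few lines of MPI: once each rank compresses its block, its file offset depends on every other rank's compressed size. The sketch below, using the Snappy C API and an exclusive prefix sum over compressed lengths, shows that coordination step. It is a simplified stand-in for the paper's two-phase collective algorithm, and write_compressed_block is a hypothetical helper, not part of MPI I/O or OMPIO.

        /* Sketch: coordinated write of per-rank Snappy-compressed blocks. */
        #include <mpi.h>
        #include <snappy-c.h>
        #include <stdlib.h>

        void write_compressed_block(MPI_File fh, const char *block, size_t n)
        {
            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* Compress the local block. */
            size_t clen = snappy_max_compressed_length(n);
            char *out = malloc(clen);
            snappy_compress(block, n, out, &clen);

            /* Exclusive prefix sum of compressed sizes gives this rank's
             * starting offset; it is unknown until every rank has
             * finished compressing.                                      */
            unsigned long long mine = clen, offset = 0;
            MPI_Exscan(&mine, &offset, 1, MPI_UNSIGNED_LONG_LONG, MPI_SUM,
                       MPI_COMM_WORLD);
            if (rank == 0)
                offset = 0;  /* MPI_Exscan leaves rank 0's output undefined */

            MPI_File_write_at_all(fh, (MPI_Offset)offset, out, (int)clen,
                                  MPI_BYTE, MPI_STATUS_IGNORE);
            free(out);
        }

    A real implementation must also persist an index of per-block compressed sizes so that later reads can locate and decompress the correct chunks.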
  3. We present SplitFS, a file system for persistent memory (PM) that significantly reduces software overhead compared to state-of-the-art PM file systems. SplitFS presents a novel split of responsibilities between a user-space library file system and an existing kernel PM file system. The user-space library file system handles data operations by intercepting POSIX calls, memory-mapping the underlying file, and serving reads and overwrites using processor loads and stores. Metadata operations are handled by the kernel PM file system (ext4 DAX). SplitFS introduces a new primitive, termed relink, to efficiently support file appends and atomic data operations. SplitFS provides three consistency modes, which different applications can choose among without interfering with one another. SplitFS reduces software overhead by up to 4× compared to the NOVA PM file system and by 17× compared to ext4 DAX. On a number of micro-benchmarks and applications, such as the LevelDB key-value store running the YCSB benchmark, SplitFS increases application performance by up to 2× compared to ext4 DAX and NOVA while providing similar consistency guarantees.
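    The heart of the split is the user-space data path. The minimal sketch below (with illustrative names; this is not SplitFS code) maps a file once and then serves pread-style reads with a plain memcpy, so steady-state data access never crosses into the kernel.

        /* Sketch of an mmap-based user-space read path (not SplitFS code). */
        #include <fcntl.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <unistd.h>

        static char  *base;     /* mapping of the whole file */
        static size_t fsize;

        int lib_open(const char *path)
        {
            int fd = open(path, O_RDWR);
            struct stat st;
            fstat(fd, &st);
            fsize = (size_t)st.st_size;
            /* On a DAX-mounted PM file system, loads and stores through
             * this mapping reach persistent memory without page-cache
             * copies.                                                    */
            base = mmap(NULL, fsize, PROT_READ | PROT_WRITE, MAP_SHARED,
                        fd, 0);
            return fd;
        }

        /* pread() replacement: a plain memcpy, no system call.           */
        ssize_t lib_pread(void *buf, size_t count, off_t off)
        {
            if ((size_t)off >= fsize)
                return 0;
            if (off + (off_t)count > (off_t)fsize)
                count = fsize - off;
            memcpy(buf, base + off, count);
            return (ssize_t)count;
        }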
  4. During the bird breeding seasons of 2016-2023, we collected 99,778 files of bioacoustic recordings in and near the Hubbard Brook Experimental Forest in New Hampshire. Here, we provide a manifest of the sound files. Most files are one-hour recordings collected at 32 kHz and saved in FLAC format (~25 MB per file, ~13 TB total). The typical recording schedule was 05:00-08:00 and 17:30-20:30 local time. The full sound files have been saved in three repositories: two copies at Dartmouth College (Ayres lab) and one copy at the Macaulay Library, Cornell Lab of Ornithology. The full sound files are available upon request. The file attributes within the manifest include date, start time, and recorder group, e.g., Main, 10ha, Oven, VW, AshBirch, and Ridge. Each recorder group had 5-20 recorders at plots separated by >100 m. Coordinates of each recorder are associated with plot names within the metadata. The bird species expected to occur in these recordings are those from Holmes et al. (2021). These data were gathered as part of the Hubbard Brook Ecosystem Study (HBES), a collaborative effort at the Hubbard Brook Experimental Forest, which is operated and maintained by the USDA Forest Service, Northern Research Station. Holmes, R., S. Sillett, and M. Hallworth. 2021. Bird species recorded within the Hubbard Brook Experimental Forest and vicinity (1963-2020; updated January 2021). ver 1. Environmental Data Initiative. https://doi.org/10.6073/pasta/da6cbb1ed8142d52a9d72762983742d8 (Accessed 2024-10-24).
  5. Mikolaj Bojanczyk; Emanuela Merelli; David P. Woodruff (Eds.)
    Two equal-length strings are a parameterized match (p-match) iff there exists a one-to-one function that renames the symbols in one string to those in the other. The Parameterized Suffix Tree (PST) [Baker, STOC '93] is a fundamental data structure that handles various string matching problems under this setting. The PST of a text T[1,n] over an alphabet Σ of size σ takes O(n log n) bits of space. It can report any entry in the (parameterized) (i) suffix array, (ii) inverse suffix array, and (iii) longest common prefix (LCP) array in O(1) time. Given any pattern P as a query, a position i in T is an occurrence iff T[i, i+|P|-1] and P are a p-match. The PST can count the number of occurrences of P in T in O(|P| log σ) time and then report each occurrence in time proportional to that of accessing a suffix array entry. An important question is: can we obtain a compressed version of the PST that takes space close to the text's size of n log σ bits and still supports all three functionalities mentioned earlier? In SODA '17, Ganguly et al. answered this question partially by presenting an O(n log σ)-bit index that supports (parameterized) suffix array and inverse suffix array operations in O(log n) time. However, the compression of the (parameterized) LCP array, and the possibility of faster suffix array and inverse suffix array queries in compact space, were left open. In this work, we obtain a compact representation of the (parameterized) LCP array. With this result, in conjunction with three new (parameterized) suffix array representations, we obtain the first set of PST representations in o(n log n) bits (when log σ = o(log n)), as follows. Here ε > 0 is an arbitrarily small constant.
    - Space O(n log σ) bits and query time O(log_σ^ε n);
    - Space O(n log σ · log log_σ n) bits and query time O(log log_σ n); and
    - Space O(n log σ · log_σ^ε n) bits and query time O(1).
    The first trade-off is an improvement over Ganguly et al.'s result, whereas our third trade-off matches the optimal time performance of Baker's PST while squeezing the space by a factor of roughly log_σ n. We highlight that our trade-offs match the space-and-time bounds of the best-known compressed text indexes for exact pattern matching, and further improvement is highly unlikely.
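    To make the p-match definition concrete: "abab" and "cdcd" p-match under the renaming a→c, b→d, whereas "abab" and "ccdd" do not. A standard way to test this is Baker's prev-encoding, which replaces each symbol by the distance to its previous occurrence; two strings p-match exactly when their encodings are equal. A small sketch (the helper names are ours):

        /* pmatch.c: sketch of the p-match test via prev-encoding.       */
        #include <stdbool.h>
        #include <stdio.h>
        #include <string.h>

        /* prev-encoding: p[i] = distance to the previous occurrence of
         * s[i], or 0 if s[i] has not appeared before (Baker, STOC '93). */
        static void prev_encode(const char *s, size_t n, size_t *p)
        {
            size_t last[256] = {0};              /* 0 = "not seen yet"   */
            for (size_t i = 0; i < n; i++) {
                unsigned char c = (unsigned char)s[i];
                p[i] = last[c] ? i + 1 - last[c] : 0;
                last[c] = i + 1;                 /* 1-based position     */
            }
        }

        /* Two equal-length strings p-match iff their encodings agree.   */
        static bool p_match(const char *s, const char *t)
        {
            size_t n = strlen(s);
            if (n != strlen(t) || n > 256)       /* fixed buffers: sketch */
                return false;
            size_t ps[256], pt[256];
            prev_encode(s, n, ps);
            prev_encode(t, n, pt);
            return memcmp(ps, pt, n * sizeof *ps) == 0;
        }

        int main(void)
        {
            printf("%d\n", p_match("abab", "cdcd"));   /* 1: a->c, b->d */
            printf("%d\n", p_match("abab", "ccdd"));   /* 0             */
            return 0;
        }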