AwkwardForth: accelerating Uproot with an internal DSL

Pivarski, Jim; Osborne, Ianna; Das, Pratyush; Lange, David; Elmer, Peter

doi:10.1051/epjconf/202125103002

File formats for generic data structures, such as ROOT, Avro, and Parquet, pose a problem for deserialization: it must be fast, but its code depends on the type of the data structure, not known at compile-time. Just-in-time compilation can satisfy both constraints, but we propose a more portable solution: specialized virtual machines. AwkwardForth is a Forth-driven virtual machine for deserializing data into Awkward Arrays. As a language, it is not intended for humans to write, but it loosens the coupling between Uproot and Awkward Array. AwkwardForth programs for deserializing record-oriented formats (ROOT and Avro) are about as fast as C++ ROOT and 10–80× faster than fastavro. Columnar formats (simple TTrees, RNTuple, and Parquet) only require specialization to interpret metadata and are therefore faster with precompiled code.

Award ID(s): 1836650

NSF-PAR ID: 10354369

Author(s) / Creator(s): Pivarski, Jim; Osborne, Ianna; Das, Pratyush; Lange, David; Elmer, Peter

Editor(s): Biscarat, C.; Campana, S.; Hegner, B.; Roiser, S.; Rovelli, C.I.; Stewart, G.A.

Date Published: 2021-01-01

Journal Name: EPJ Web of Conferences

Volume: 251

ISSN: 2100-014X

Page Range / eLocation ID: 03002

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1051/epjconf/202125103002
