Title: Recent developments in histogram libraries
Boost.Histogram, a header-only C++14 library that provides multidimensional histograms and profiles, became available in Boost 1.70. It is extensible, fast, and uses modern C++ features. Using template metaprogramming, the library automatically selects the most efficient code path for any given configuration. It includes key features designed for the particle physics community, such as optional under- and overflow bins, weighted increments, reductions, growing axes, thread-safe filling, and memory-efficient counters with high dynamic range. Python bindings for Boost.Histogram are being developed in the Scikit-HEP project to provide a fast, easy-to-install package that serves as a backend for other Python libraries and lets advanced users manipulate histograms. Versatile and efficient histogram filling, effective manipulation, multithreading support, and other features make this a powerful tool. This library has also driven package distribution efforts in Scikit-HEP, allowing binary packages hosted on PyPI to be available for a very wide variety of platforms. Two other libraries fill out the remainder of the Scikit-HEP Python histogramming effort. Aghast is a library designed to provide conversions between different forms of histograms, enabling interaction between histogram libraries, often without an extra copy in memory. This enables a user to make a histogram in one library and then save it in another form, such as saving a Boost.Histogram in ROOT. Hist is a library providing friendly, analyst-targeted syntax and shortcuts for quick manipulations and fast plotting using these two libraries.
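Two of the features named in the abstract, optional under- and overflow bins and weighted increments, can be illustrated with a small conceptual sketch. The code below is plain Python written for this illustration only; the class names `RegularAxis` and `WeightedHistogram` and their methods are hypothetical and are not the Boost.Histogram API or its Python bindings. It assumes a one-dimensional axis with equal-width bins and stores, per bin, the sum of weights and the sum of squared weights.

```python
import math
from dataclasses import dataclass


@dataclass
class RegularAxis:
    """Hypothetical axis: `bins` equal-width bins covering [start, stop)."""
    bins: int
    start: float
    stop: float

    def index(self, x: float) -> int:
        """Map a value to a storage slot.

        Slot 0 is the underflow bin, slots 1..bins are the regular bins,
        and slot bins + 1 is the overflow bin (NaN is sent there too,
        which is a choice made for this sketch).
        """
        if math.isnan(x) or x >= self.stop:
            return self.bins + 1
        if x < self.start:
            return 0
        frac = (x - self.start) / (self.stop - self.start)
        # Clamp to guard against floating-point rounding at the upper edge.
        return 1 + min(self.bins - 1, int(frac * self.bins))


class WeightedHistogram:
    """Hypothetical 1D histogram that accumulates weights per bin."""

    def __init__(self, axis: RegularAxis) -> None:
        self.axis = axis
        slots = axis.bins + 2  # regular bins plus underflow and overflow
        self.sumw = [0.0] * slots   # sum of weights per slot
        self.sumw2 = [0.0] * slots  # sum of squared weights per slot

    def fill(self, values, weights=None) -> None:
        """Add each value with its weight (weight 1.0 if none is given)."""
        if weights is None:
            weights = [1.0] * len(values)
        for x, w in zip(values, weights):
            i = self.axis.index(x)
            self.sumw[i] += w
            self.sumw2[i] += w * w

    def view(self, flow: bool = False):
        """Return bin contents, with the flow bins only if requested."""
        return list(self.sumw) if flow else self.sumw[1:-1]


# Four bins on [0, 1); two of the five entries fall outside the axis range.
h = WeightedHistogram(RegularAxis(bins=4, start=0.0, stop=1.0))
h.fill([-0.5, 0.1, 0.1, 0.6, 2.0], weights=[1.0, 2.0, 0.5, 1.0, 3.0])

print(h.view())           # regular bins only
print(h.view(flow=True))  # underflow, regular bins, overflow
```

Out-of-range entries are kept in the flow bins rather than dropped, so the total weight is preserved, and the per-bin sum of squared weights is what allows a variance estimate for weighted fills.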
Award ID(s):
1836650
PAR ID:
10256978
Author(s) / Creator(s):
; ;
Editor(s):
Doglioni, C.; Kim, D.; Stewart, G.A.; Silvestris, L.; Jackson, P.; Kamleh, W.
Date Published:
Journal Name:
EPJ Web of Conferences
Volume:
245
ISSN:
2100-014X
Page Range / eLocation ID:
05014
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Analysis on HEP data is an iterative process in which the results of one step often inform the next. In an exploratory analysis, it is common to perform one computation on a collection of events, then view the results (often with histograms) to decide what to try next. Awkward Array is a Scikit-HEP Python package that enables data analysis with array-at-a-time operations to implement cuts as slices, combinatorics as composable functions, etc. However, most C++ HEP libraries, such as FastJet, have an imperative, one-particle-at-a-time interface, which would be inefficient in Python and goes against the grain of the array-at-a-time logic of scientific Python. Therefore, we developed fastjet, a pip-installable Python package that provides FastJet C++ binaries, the classic (particle-at-a-time) Python interface, and the new array-oriented interface for use with Awkward Array. The new interface streamlines interoperability with scientific Python software beyond HEP, such as machine learning. In one case, adopting this library along with other array-oriented tools accelerated HEP analysis code by a factor of 20. It was designed to be easily integrated with libraries in the Scikit-HEP ecosystem, including Uproot (file I/O), hist (histogramming), Vector (Lorentz vectors), and Coffea (high-level glue). We discuss the design of the fastjet Python library, integrating the classic interface with the array-oriented interface and with the Vector library for Lorentz vector operations. The new interface was developed as open source.
  2. Vanschoren, J (Ed.)
    As data are generated more and more from multiple disparate sources, multiview data sets, where each sample has features in distinct views, have grown in recent years. However, no comprehensive package exists that enables non-specialists to use these methods easily. mvlearn is a Python library which implements the leading multiview machine learning methods. Its simple API closely follows that of scikit-learn for increased ease-of-use. The package can be installed from Python Package Index (PyPI) and the conda package manager and is released under the MIT open-source license. The documentation, detailed examples, and all releases are available at https://mvlearn.github.io/. 
  3. piqtree is an easy-to-use, open-source Python package that directly exposes IQ-TREE’s phylogenetic inference engine. It offers Python functions for performing many of IQ-TREE’s capabilities including phylogenetic reconstruction, ultrafast bootstrapping, branch length optimisation, ModelFinder, rapid neighbour-joining, and more. By exposing IQ-TREE’s algorithms within Python, piqtree greatly simplifies the development of new phylogenetic workflows through seamless interoperability with other Python libraries and tools mediated by the cogent3 package. It also enables users to perform interactive analyses with IQ-TREE through, for instance, Jupyter notebooks. We present the key features available in the piqtree library and a small case study that showcases its interoperability. The piqtree library can be installed with pip install piqtree, with the documentation available at https://piqtree.readthedocs.io and source at https://github.com/iqtree/piqtree.
  4. The Montage image mosaic engine has found wide applicability in astronomy research and in integration into processing environments, and it is an exemplar application for the development of advanced cyber-infrastructure. It is written in C to provide performance and portability. Linking C/C++ libraries to the Python kernel at run time as binary extensions allows them to run under Python at compiled speeds and enables users to take advantage of all the functionality in Python. We have built Python binary extensions of the 59 ANSI-C modules that make up version 5 of the Montage toolkit. This has involved turning the code into a C library, with driver code fully separated to reproduce the calling sequence of the command-line tools, and then adding Python and C linkage code with the Cython library, which acts as a bridge between general C libraries and the Python interface. We will demonstrate how to use these Python binary extensions to perform image processing, including reprojecting and resampling images, rectifying background emission to a common level, creating image mosaics that preserve the calibration and astrometric fidelity of the input images, creating visualizations with an adaptive stretch algorithm, processing HEALPix images, and analyzing and managing image metadata.
  5. We report the implementation of a hierarchical equations of motion (HEOM) module within the open‐source Libra software. It includes the standard and scaled HEOM algorithms for computing the dynamics of open quantum systems interacting with a harmonic bath. The module computes the evolution of the reduced density matrix, as well as spectral lineshapes. The truncation, filtering, and “update list” schemes, as well as OpenMP parallelization, allow for further computational savings. The package is written in a mix of C++ and Python, delivering the best compromise between user friendliness and efficiency. The Python layer of the package takes advantage of standard Python libraries, such as h5py, which allows efficient storage and retrieval of the generated results. The package can be used seamlessly within Jupyter notebooks; its careful design is intended to provide maximal convenience and intuitiveness to its users.