Search for: All records

Creators/Authors contains: "Ngadiuba, Jennifer"

Total Resources

11

Real-time semantic segmentation on FPGAs for autonomous vehicles with hls4ml

https://doi.org/10.1088/2632-2153/ac9cb5

Ghielmetti, Nicolò; Loncar, Vladimir; Pierini, Maurizio; Roed, Marcel; Summers, Sioni; Aarrestad, Thea; Petersson, Christoffer; Linander, Hampus; Ngadiuba, Jennifer; Lin, Kelvin; et al. (November 2022, Machine Learning: Science and Technology)

In this paper, we investigate how field-programmable gate arrays can serve as hardware accelerators for real-time semantic segmentation tasks relevant for autonomous driving. Considering compressed versions of the ENet convolutional neural network architecture, we demonstrate a fully on-chip deployment with a latency of 4.9 ms per image, using less than 30% of the available resources on a Xilinx ZCU102 evaluation board. The latency is reduced to 3 ms per image when increasing the batch size to ten, corresponding to the use case where the autonomous vehicle receives inputs from multiple cameras simultaneously. We show that, through aggressive filter reduction, heterogeneous quantization-aware training, and an optimized implementation of convolutional layers, the power consumption and resource utilization can be significantly reduced while maintaining accuracy on the Cityscapes dataset.
Full Text Available
A Reconfigurable Neural Network ASIC for Detector Front-End Data Compression at the HL-LHC

https://doi.org/10.1109/TNS.2021.3087100

Di Guglielmo, Giuseppe; Fahim, Farah; Herwig, Christian; Valentin, Manuel Blanco; Duarte, Javier; Gingu, Cristian; Harris, Philip; Hirschauer, James; Kwok, Martin; Loncar, Vladimir; et al. (August 2021, IEEE Transactions on Nuclear Science)
Full Text Available
Fast convolutional neural networks on FPGAs with hls4ml

https://doi.org/10.1088/2632-2153/ac0ea1

Aarrestad, Thea; Loncar, Vladimir; Ghielmetti, Nicolò; Pierini, Maurizio; Summers, Sioni; Ngadiuba, Jennifer; Petersson, Christoffer; Linander, Hampus; Iiyama, Yutaro; Di Guglielmo, Giuseppe; et al. (July 2021, Machine Learning: Science and Technology)
Full Text Available
Compressing deep neural networks on FPGAs to binary and ternary precision with hls4ml

https://doi.org/10.1088/2632-2153/aba042

Ngadiuba, Jennifer; Loncar, Vladimir; Pierini, Maurizio; Summers, Sioni; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Rankin, Dylan; Jindariani, Sergo; Liu, Mia; et al. (December 2020, Machine Learning: Science and Technology)
Accelerated Charged Particle Tracking with Graph Neural Networks on FPGAs

Heintz, Aneesh; Razavimaleki, Vesal; Duarte, Javier; DeZoort, Gage; Ojalvo, Isobel; Thais, Savannah; Atkinson, Markus; Neubauer, Mark; Gray, Lindsey; Jindariani, Sergo; et al. (November 2020, ArXivorg)
We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. The two complementary FPGA designs are based on OpenCL, a framework for writing programs that execute across heterogeneous platforms, and hls4ml, a high-level-synthesis-based compiler for neural network to firmware conversion. We evaluate and compare the resource usage, latency, and tracking performance of our implementations based on a benchmark dataset. We find a considerable speedup over CPU-based execution is possible, potentially enabling such algorithms to be used effectively in future computing workflows and the FPGA-based Level-1 trigger at the CERN Large Hadron Collider.
Full Text Available
The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider

https://doi.org/10.21468/SciPostPhys.12.1.043

Aarrestad, Thea; van Beekveld, Melissa; Bona, Marcella; Boveia, Antonio; Caron, Sascha; Davies, Joe; de Simone, Andrea; Doglioni, Caterina; Duarte, Javier; Farbin, Amir; et al. (January 2022, SciPost Physics)

We describe the outcome of a data challenge conducted as part of the Dark Machines (https://www.darkmachines.org) initiative and the Les Houches 2019 workshop on Physics at TeV colliders. The challenge aims to detect signals of new physics at the Large Hadron Collider (LHC) using unsupervised machine learning algorithms. First, we propose how an anomaly score could be implemented to define model-independent signal regions in LHC searches. We define and describe a large benchmark dataset, consisting of >1 billion simulated LHC events corresponding to $10\,\mathrm{fb}^{-1}$ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at https://www.phenoMLdata.org. Code to reproduce the analysis is provided at https://github.com/bostdiek/DarkMachines-UnsupervisedChallenge.
Full Text Available
Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics

https://doi.org/10.3389/fdata.2020.598927

Iiyama, Yutaro; Cerminara, Gianluca; Gupta, Abhijay; Kieseler, Jan; Loncar, Vladimir; Pierini, Maurizio; Qasim, Shah Rukh; Rieger, Marcel; Summers, Sioni; Van Onsem, Gerrit; et al. (January 2021, Frontiers in Big Data)
Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1 μs on an FPGA. To do so, we consider a representative task associated with particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the hls4ml library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage.
Full Text Available
Applications and Techniques for Fast Machine Learning in Science

https://doi.org/10.3389/fdata.2022.787421

Deiana, Allison McCarn; Tran, Nhan; Agar, Joshua; Blott, Michaela; Di Guglielmo, Giuseppe; Duarte, Javier; Harris, Philip; Hauck, Scott; Liu, Mia; Neubauer, Mark S.; et al. (April 2022, Frontiers in Big Data)

In this community review report, we discuss applications and techniques for fast machine learning (ML) in science—the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.
Full Text Available
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Fahim, Farah; Hawks, Benjamin; Herwig, Christian; Hirschauer, James; Jindariani, Sergo; Tran, Nhan; Carloni, Luca; Di Guglielmo, Giuseppe; Harris, Philip; Krupa, Jeffrey; et al. (April 2021, ArXivorg)
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends, including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery.
Full Text Available
FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing

https://doi.org/10.1007/s41781-019-0027-2

Duarte, Javier; Harris, Philip; Hauck, Scott; Holzman, Burt; Hsu, Shih-Chieh; Jindariani, Sergo; Khan, Suffian; Kreis, Benjamin; Lee, Brian; Liu, Mia; et al. (December 2019, Computing and Software for Big Science)

Full Text Available