Search for: All records

Award ID contains: 1763795

FPGA-Based Acceleration of Time Series Similarity Prediction: From Cloud to Edge

https://doi.org/10.1145/3555810

Kalantar, Amin; Zimmerman, Zachary; Brisk, Philip (July 2022, ACM Transactions on Reconfigurable Technology and Systems)

With the proliferation of low-cost sensors and the Internet of Things, the rate of producing data far exceeds the compute and storage capabilities of today’s infrastructure. Much of this data takes the form of time series, and in response, there has been increasing interest in the creation of time series archives in the last decade, along with the development and deployment of novel analysis methods to process the data. The general strategy has been to apply a plurality of similarity search mechanisms to various subsets and subsequences of time series data in order to identify repeated patterns and anomalies; however, the computational demands of these approaches render them incompatible with today’s power-constrained embedded CPUs. To address this challenge, we present FA-LAMP, an FPGA-accelerated implementation of the Learned Approximate Matrix Profile (LAMP) algorithm, which predicts the correlation between streaming data sampled in real time and a representative time series dataset used for training. FA-LAMP lends itself as a real-time solution for time series analysis problems such as classification. We present the implementation of FA-LAMP on both edge- and cloud-based prototypes. On the edge devices, FA-LAMP integrates accelerated computation as close as possible to IoT sensors, thereby eliminating the need to transmit and store data in the cloud for posterior analysis. On the cloud-based accelerators, FA-LAMP can execute multiple LAMP models on the same board, allowing simultaneous processing of incoming data from multiple data sources across a network. LAMP employs a Convolutional Neural Network (CNN) for prediction. This work investigates the challenges and limitations of deploying CNNs on FPGAs using the Xilinx Deep Learning Processor Unit (DPU) and the Vitis AI development environment. We expose several technical limitations of the DPU, while providing a mechanism to overcome them by attaching custom IP block accelerators to the architecture.
We evaluate FA-LAMP using a low-cost Xilinx Ultra96-V2 FPGA as well as a cloud-based Xilinx Alveo U280 accelerator card and measure their performance against a prototypical LAMP deployment running on a Raspberry Pi 3, an Edge TPU, a GPU, a desktop CPU, and a server-class CPU. In the edge scenario, the Ultra96-V2 FPGA improved performance and energy consumption compared to the Raspberry Pi; in the cloud scenario, the server CPU and GPU outperformed the Alveo U280 accelerator card, while the desktop CPU achieved comparable performance; however, the Alveo card offered an order of magnitude lower energy consumption compared to the other four platforms. Our implementation is publicly available at https://github.com/aminiok1/lamp-alveo.
Full Text Available
Matrix Profile Index Approximation for Streaming Time Series

https://doi.org/10.1109/BigData52589.2021.9671484

Shahcheraghi, Maryam; Cappon, Trevor; Oymak, Samet; Papalexakis, Evangelos; Keogh, Eamonn; Zimmerman, Zachary; Brisk, Philip (December 2021, IEEE International Conference on Big Data)

Full Text Available
FA-LAMP: FPGA-Accelerated Learned Approximate Matrix Profile for Time Series Similarity Prediction

https://doi.org/10.1109/FCCM51124.2021.00013

Kalantar, Amin; Zimmerman, Zachary; Brisk, Philip (May 2021, International Symposium on Field-Programmable Custom Computing Machines)
With the proliferation of low-cost sensors and the Internet-of-Things (IoT), the rate of producing data far exceeds the compute and storage capabilities of today’s infrastructure. Much of this data takes the form of time series, and in response, there has been increasing interest in the creation of time series archives in the last decade, along with the development and deployment of novel analysis methods to process the data. The general strategy has been to apply a plurality of similarity search mechanisms to various subsets and subsequences of time series data in order to identify repeated patterns and anomalies; however, the computational demands of these approaches render them incompatible with today’s power-constrained embedded CPUs. To address this challenge, we present FA-LAMP, an FPGA-accelerated implementation of the Learned Approximate Matrix Profile (LAMP) algorithm, which predicts the correlation between streaming data sampled in real time and a representative time series dataset used for training. FA-LAMP lends itself as a real-time solution for time series analysis problems such as classification and anomaly detection, among others. FA-LAMP provides a mechanism to integrate accelerated computation as close as possible to IoT sensors, thereby eliminating the need to transmit and store data in the cloud for posterior analysis. At their core, LAMP and FA-LAMP employ Convolutional Neural Networks (CNNs) to perform prediction. This work investigates the challenges and limitations of deploying CNNs on FPGAs when using state-of-the-art commercially supported frameworks built for this purpose, namely, the Xilinx Deep Learning Processor Unit (DPU) overlay and the Vitis AI development environment. This work exposes several technical limitations of the DPU, while providing a mechanism to overcome these limits by attaching our own hand-optimized IP block accelerators to the DPU overlay.
We evaluate FA-LAMP using a low-cost Xilinx Ultra96-V2 FPGA, demonstrating performance and energy improvements of more than an order of magnitude compared to a prototypical LAMP deployment running on a Raspberry Pi 3. Our implementation is publicly available at https://github.com/fccm2021sub/fccm-lamp.
Full Text Available
An FPGA-based Programmable Vector Engine for Fast Fully Homomorphic Encryption over the Torus

Gener, Serhan; Newton, Parker; Tan, Daniel; Richelson, Silas; Lemieux, Guy; Brisk, Philip (January 2021, SPSL: Secure and Private Systems for Machine Learning (ISCA Workshop))

This paper describes an FPGA-based vector engine to accelerate the bootstrapping procedure of Fast Fully Homomorphic Encryption over the Torus (TFHE), a popular and high-performance fully homomorphic encryption scheme. TFHE bootstrapping largely comprises matrix-vector operations that are implemented using Torus polynomials, which today's standard arithmetic hardware does not implement efficiently. Our implementation achieves linear performance scaling with up to 16 vector lanes. Future work will switch to an FFT-based polynomial multiplication scheme and to larger FPGA parts to accommodate more vector lanes.
Full Text Available
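
The Torus polynomial arithmetic mentioned in the abstract above can be illustrated with a small sketch. It is not the paper's vector engine. It assumes the common software convention of storing each torus coefficient as a 32-bit unsigned integer with wraparound, and it uses plain schoolbook multiplication modulo X^N + 1 (the negacyclic reduction used by TFHE). The function name `negacyclic_mul` and the constant `MASK32` are hypothetical names chosen for illustration.

```python
# Assumption: torus coefficients are stored as 32-bit unsigned integers,
# so all arithmetic wraps around modulo 2**32.
MASK32 = (1 << 32) - 1


def negacyclic_mul(int_poly, torus_poly):
    """Multiply an integer polynomial by a torus polynomial modulo X^N + 1.

    Both inputs are coefficient lists of the same length N, lowest degree
    first. Schoolbook O(N^2) method, shown for clarity rather than speed.
    """
    n = len(torus_poly)
    if len(int_poly) != n:
        raise ValueError("polynomials must have the same length")
    result = [0] * n
    for i, a in enumerate(int_poly):
        for j, b in enumerate(torus_poly):
            k = i + j
            if k < n:
                result[k] = (result[k] + a * b) & MASK32
            else:
                # X^N is congruent to -1, so wrapped terms are subtracted.
                result[k - n] = (result[k - n] - a * b) & MASK32
    return result
```

Each output coefficient is an independent sum of products, which is the kind of regular, data-parallel work that maps naturally onto vector lanes. An FFT-based scheme, as the abstract names for future work, would replace the O(N^2) double loop.
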
Matrix Profile Index Prediction for Streaming Time Series

Shahcheraghi, Maryam; Cappon, Trevor; Oymak, Samet; Papalexakis, Evangelos; Keogh, Eamonn; Zimmerman, Zachary; Brisk, Philip (January 2020, Workshop on ML for Systems at NeurIPS 2020)

Discovery and classification of motifs (repeated patterns) and discords (anomalies) in time series is fundamental to many scientific fields. These and related problems have effectively been solved for offline analysis of time series; however, these approaches are computationally intensive and do not lend themselves to streaming time series, such as those produced by IoT sensors, where the sampling rate imposes real-time constraints on computation and there is a strong desire to locate computation as close as possible to the sensor. One promising solution is to use low-cost machine learning models to provide approximate answers to these problems. For example, prior work has trained models to predict the similarity of the most recently sampled window of data points to the time series used for training. This work addresses a more challenging problem, which is to predict not only the “strength” of the match, but also the relative location in the representative time series where the strongest matching subsequences occur. We evaluate our approach on two different real-world datasets; we demonstrate speedups as high as about 30x compared to exact computations, with predictive accuracy as high as 87.95%, depending on the granularity of the prediction.
Full Text Available
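
For context on what the learned model in the abstract above approximates, the following is a minimal sketch of the exact computation: for one query window, find the strength (here, Pearson correlation) and the location of its best-matching subsequence in a reference series. This is a brute-force illustration under stated assumptions, not the authors' method or their exact baseline; the function names `correlation` and `best_match` are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption: a constant window is treated as "no match".
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_match(query, reference):
    """Return (strength, index) of the reference subsequence that is most
    correlated with the query window, scanning every offset."""
    m = len(query)
    best_corr, best_idx = float("-inf"), -1
    for i in range(len(reference) - m + 1):
        corr = correlation(query, reference[i:i + m])
        if corr > best_corr:
            best_corr, best_idx = corr, i
    return best_corr, best_idx
```

The scan costs time proportional to the reference length times the window length for every incoming window, which is the kind of cost a low-cost predictive model is meant to avoid on streaming data.
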
Matrix Profile XIV: Scaling Time Series Motif Discovery with GPUs to Break a Quintillion Pairwise Comparisons a Day and Beyond

https://doi.org/10.1145/3357223.3362721

Zimmerman, Zachary; Kamgar, Kaveh; Senobari, Nader Shakibay; Crites, Brian; Funning, Gareth; Brisk, Philip; Keogh, Eamonn (November 2019, ACM Symposium on Cloud Computing)

Full Text Available
Matrix Profile XVIII: Time Series Mining in the Face of Fast Moving Streams using a Learned Approximate Matrix Profile

https://doi.org/10.1109/ICDM.2019.00104

Zimmerman, Zachary; Shakibay Senobari, Nader; Funning, Gareth; Papalexakis, Evangelos; Oymak, Samet; Brisk, Philip; Keogh, Eamonn (November 2019, International Conference on Data Mining)

Full Text Available