Abstract—Hyperdimensional Computing (HDC) is a neurally-inspired computation model based on the observation that the human brain operates on high-dimensional representations of data, called hypervectors. Although HDC is powerful at reasoning over and associating abstract information, it is weak at extracting features from complex data such as images and video. As a result, most existing HDC solutions rely on expensive pre-processing algorithms for feature extraction. In this paper, we propose StocHD, a novel end-to-end hyperdimensional system that supports accurate, efficient, and robust learning over raw data. Unlike prior work that used HDC only for learning tasks, StocHD expands HDC functionality to the computing domain by mathematically defining stochastic arithmetic over HDC hypervectors. StocHD enables an entire learning application (including the feature extractor) to be processed using the HDC data representation, enabling uniform, efficient, robust, and highly parallel computation. We also propose a novel, fully digital and scalable Processing In-Memory (PIM) architecture that exploits the memory-centric nature of HDC to support extensively parallel computation. Our evaluation over a wide range of classification tasks shows that StocHD provides, on average, 3.3x faster computation and 6.4x higher energy efficiency (52.3x and 143.5x, respectively) compared to a state-of-the-art HDC algorithm running on PIM (an NVIDIA GPU), while providing 16x higher computational…
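The abstract does not spell out the stochastic arithmetic StocHD defines, so the toy sketch below only illustrates the classical stochastic-computing idea it builds on: a value in [0, 1] is encoded as the density of set bits in a long random bitstream, so multiplication reduces to a bitwise AND of independent streams and scaled addition to a random multiplexer. All names and parameters here are illustrative, not StocHD's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # hypervector dimensionality

def encode(p: float) -> np.ndarray:
    """Encode p in [0, 1] as a D-bit stream whose density of 1s is ~p."""
    return (rng.random(D) < p).astype(np.uint8)

def decode(hv: np.ndarray) -> float:
    """Recover the encoded value as the fraction of set bits."""
    return float(hv.mean())

a, b = encode(0.6), encode(0.3)

# Multiplication of independent unipolar streams is a bitwise AND:
# P(a_i AND b_i = 1) = 0.6 * 0.3 = 0.18.
prod = a & b
print(decode(prod))   # ~0.18

# Scaled addition (a + b) / 2 via a random multiplexer.
sel = rng.random(D) < 0.5
s = np.where(sel, a, b)
print(decode(s))      # ~0.45
```

Because every bit carries equal weight, flipping a single bit changes a decoded value by only 1/D, which is one intuition behind the robustness claims such representations support.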
Store-n-Learn: Classification and Clustering with Hyperdimensional Computing across Flash Hierarchy
Processing large amounts of data, especially in learning algorithms, poses a challenge for current embedded computing systems. Hyperdimensional (HD) computing (HDC) is a brain-inspired computing paradigm that works with high-dimensional vectors called hypervectors. HDC replaces several complex learning computations with bitwise and simpler arithmetic operations at the expense of an increased amount of data due to mapping the data into high-dimensional space. These hypervectors, more often than not, cannot be stored in memory, resulting in long data transfers from storage. In this article, we propose Store-n-Learn, an in-storage computing solution that performs HDC classification and clustering by implementing encoding, training, retraining, and inference across the flash hierarchy. To hide the latency of training and enable efficient computation, we introduce the concept of batching in HDC. We also present on-chip acceleration for HDC encoding in flash planes. This enables us to exploit the high parallelism provided by the flash hierarchy and encode multiple data points in parallel in both batched and non-batched fashion. Store-n-Learn also implements a single top-level FPGA accelerator with novel implementations for HDC classification training, retraining, inference, and clustering on the encoded data. Our evaluation over 10 popular datasets shows that Store-n-Learn is on average 222× (543×) …
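For orientation, here is a minimal sketch of the generic HDC classification flow the abstract refers to: encode inputs into hypervectors, train by bundling (summing) the hypervectors of each class, and infer by similarity search. This is plain NumPy under assumed names; Store-n-Learn's actual in-flash encoding, batching, and FPGA accelerator are not modeled here.

```python
import numpy as np

rng = np.random.default_rng(1)
D, F, C = 10_000, 64, 10   # hypervector dim, input features, classes

# Random-projection encoder: one fixed bipolar base vector per feature.
base = rng.choice([-1, 1], size=(F, D))

def encode(x: np.ndarray) -> np.ndarray:
    """Map a feature vector to a bipolar hypervector (sign of projection)."""
    return np.sign(x @ base)

def train(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Training bundles (sums) the encoded hypervectors of each class."""
    model = np.zeros((C, D))
    for x, label in zip(X, y):
        model[label] += encode(x)
    return model

def infer(model: np.ndarray, x: np.ndarray) -> int:
    """Return the class whose hypervector has the highest cosine
    similarity to the encoded query."""
    h = encode(x)
    sims = model @ h / (np.linalg.norm(model, axis=1) * np.linalg.norm(h) + 1e-9)
    return int(np.argmax(sims))

# Toy usage on random data, just to show the API shape.
X = rng.normal(size=(100, F))
y = rng.integers(0, C, size=100)
print(infer(train(X, y), X[0]))
```

The batching the paper introduces amounts, at this level of abstraction, to accumulating groups of encoded hypervectors before folding them into the class models, so encoding latency can be overlapped with training.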
- Publication Date:
- NSF-PAR ID: 10345531
- Journal Name: ACM Transactions on Embedded Computing Systems
- Volume: 21
- Issue: 3
- Page Range or eLocation-ID: 1 to 25
- ISSN: 1539-9087
- Sponsoring Org: National Science Foundation
More Like this
Hyperdimensional computing (HDC) has emerged as a new lightweight learning algorithm with lower computation and energy requirements than conventional techniques. In HDC, data points are represented by high-dimensional vectors (hypervectors), which map the data into a high-dimensional space (hyperspace). Typically, a large hypervector dimension (≥1000) is required to achieve accuracy comparable to conventional alternatives. However, unnecessarily large hypervectors increase hardware and energy costs, which can undermine their benefits. This paper presents a technique to minimize the hypervector dimension while maintaining the accuracy and improving the robustness of the classifier. To this end, we formulate hypervector design as a multi-objective optimization problem for the first time in the literature. The proposed approach decreases the hypervector dimension by more than 128× while maintaining or increasing the accuracy achieved by conventional HDC. Experiments on a commercial hardware platform show more than two orders of magnitude reduction in model size, inference time, and energy consumption. We also demonstrate the trade-off between accuracy and robustness to noise, and provide Pareto-front solutions as a design parameter in our hypervector design.
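The abstract does not give the optimizer itself; the sketch below only illustrates the multi-objective view it describes, computing a Pareto front over hypothetical (dimension, error, noise-sensitivity) design points where lower is better in every objective. The candidate numbers are made up.

```python
import numpy as np

# Hypothetical candidate designs: (dimension, error, noise_sensitivity).
candidates = np.array([
    [10000, 0.05, 0.02],
    [ 4096, 0.07, 0.05],   # dominated by the 2048-dim design below
    [ 2048, 0.06, 0.04],
    [  512, 0.06, 0.07],
    [  128, 0.09, 0.12],
    [   64, 0.15, 0.20],
])

def pareto_front(points: np.ndarray) -> np.ndarray:
    """Keep the non-dominated points: a point is dropped if some other
    point is <= in every objective and < in at least one."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            np.all(q <= p) and np.any(q < p)
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(i)
    return points[keep]

print(pareto_front(candidates))  # the 4096-dim candidate is filtered out
```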
Hyperdimensional (HD) computing is built upon a unique data type referred to as hypervectors, whose dimension is typically in the range of tens of thousands. Proposed to solve cognitive tasks, HD computing aims at calculating similarity among its data. Data transformation is realized by three operations: addition, multiplication, and permutation. The ultra-wide data representation introduces redundancy against noise; since information is evenly distributed over every bit of a hypervector, HD computing is inherently robust. Additionally, due to the nature of those three operations, HD computing offers fast learning, high energy efficiency, and acceptable accuracy in learning and classification tasks. This paper introduces the background of HD computing and reviews its data representation, data transformation, and similarity measurement. The orthogonality of random vectors in high dimensions presents opportunities for flexible computing. To balance the trade-off between accuracy and efficiency, strategies include, but are not limited to, encoding, retraining, binarization, and hardware acceleration. Evaluations indicate that HD computing shows great potential in addressing problems using data in the form of letters, signals, and images. HD computing shows particular promise as a lightweight replacement for machine-learning classifiers in the Internet of Things (IoT).
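The three transformation operations and the similarity measure read naturally as a few lines of NumPy. The sketch below uses bipolar (+1/-1) hypervectors, one common choice among the representations the review covers:

```python
import numpy as np

rng = np.random.default_rng(2)
D = 10_000

def random_hv() -> np.ndarray:
    """Random bipolar hypervector; two of these are near-orthogonal."""
    return rng.choice([-1, 1], size=D)

def bundle(*hvs) -> np.ndarray:
    """Addition: element-wise majority; stays similar to every input."""
    return np.sign(np.sum(hvs, axis=0))

def bind(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Multiplication: element-wise product; dissimilar to both inputs."""
    return a * b

def permute(a: np.ndarray, shift: int = 1) -> np.ndarray:
    """Permutation: cyclic shift, commonly used to encode sequence order."""
    return np.roll(a, shift)

def cos_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

x, y, z = random_hv(), random_hv(), random_hv()
print(cos_sim(x, y))               # ~0: random hypervectors are near-orthogonal
print(cos_sim(bundle(x, y, z), x)) # ~0.5: bundling preserves similarity
print(cos_sim(bind(x, y), x))      # ~0: binding destroys similarity
```

These three properties, similarity-preserving addition, similarity-destroying multiplication, and order-encoding permutation, are what the encoding and retraining strategies surveyed in the review build on.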
Obeid, I. (Ed.) The Neural Engineering Data Consortium (NEDC) is developing the Temple University Digital Pathology Corpus (TUDP), an open-source database of high-resolution images from scanned pathology samples [1], as part of its National Science Foundation-funded Major Research Instrumentation grant titled "MRI: High Performance Digital Pathology Using Big Data and Machine Learning" [2]. The long-term goal of this project is to release one million images. We have currently scanned over 100,000 images and are in the process of annotating breast tissue data for our first official corpus release, v1.0.0. This release contains 3,505 annotated images of breast tissue, including 74 patients with cancerous diagnoses (out of a total of 296 patients). In this poster, we will present an analysis of this corpus and discuss the challenges we have faced in efficiently producing high-quality annotations of breast tissue. It is well known that state-of-the-art algorithms in machine learning require vast amounts of data. Fields such as speech recognition [3], image recognition [4], and text processing [5] are able to deliver impressive performance with complex deep learning models because they have developed large corpora to support training of extremely high-dimensional models (e.g., billions of parameters). Other fields that do not…
The high computation and memory-storage demands of large deep neural network (DNN) models pose intensive challenges to the conventional Von Neumann architecture, incurring substantial data movement in the memory hierarchy. The memristor crossbar array has emerged as a promising solution to mitigate these challenges and enable low-power acceleration of DNNs. Memristor-based weight pruning and weight quantization have been investigated separately and proven effective in reducing area and power consumption compared to the original DNN model. However, there has been no systematic investigation of memristor-based neuromorphic computing (NC) systems considering both weight pruning and weight quantization. In this paper, we propose a unified and systematic memristor-based framework that considers both structured weight pruning and weight quantization by incorporating the alternating direction method of multipliers (ADMM) into DNN training. We account for hardware constraints such as crossbar-block pruning, conductance range, and mismatch between weight values and real devices to achieve high accuracy, low power, and a small area footprint. Our framework comprises three main steps: memristor-based ADMM-regularized optimization, masked mapping, and retraining. Experimental results show that our proposed framework achieves a 29.81× (20.88×) weight compression ratio, with 98.38% (96.96%) and 98.29% (97.47%) power and area reduction on the VGG-16 (ResNet-18) network…
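The abstract's three steps are not detailed here; the sketch below only illustrates the kind of projection operators an ADMM-regularized flow alternates with gradient updates: one projecting onto structured (block-)sparse weights, one snapping weights to a discrete set of conductance levels. The block size, keep ratio, level values, and the direct composition of the two projections are illustrative simplifications, not the paper's exact constraint handling.

```python
import numpy as np

def project_block_sparse(W: np.ndarray, block: int = 4, keep: float = 0.5) -> np.ndarray:
    """Euclidean projection onto structured sparsity: zero out the
    row-blocks with the smallest Frobenius norm."""
    rows = W.shape[0] // block
    B = W.reshape(rows, block, -1)
    norms = np.linalg.norm(B, axis=(1, 2))
    k = int(np.ceil(keep * rows))
    mask = np.zeros(rows, dtype=bool)
    mask[np.argsort(norms)[-k:]] = True
    return (B * mask[:, None, None]).reshape(W.shape)

def project_quantized(W: np.ndarray, levels) -> np.ndarray:
    """Euclidean projection onto discrete conductance levels: snap each
    weight to the nearest level."""
    levels = np.asarray(levels)
    idx = np.abs(W[..., None] - levels).argmin(axis=-1)
    return levels[idx]

# One ADMM-style step: the dense weights W follow the loss gradient
# (omitted here), Z is the projected copy, U the scaled dual variable.
rng = np.random.default_rng(3)
W = rng.normal(size=(8, 8))
U = np.zeros_like(W)
Z = project_quantized(project_block_sparse(W + U),
                      levels=[-0.5, -0.25, 0.0, 0.25, 0.5])
U = U + W - Z
```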