<?xml-model href='http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng' schematypens='http://relaxng.org/ns/structure/1.0'?><TEI xmlns="http://www.tei-c.org/ns/1.0">
	<teiHeader>
		<fileDesc>
			<titleStmt><title level='a'>uHD: Unary Processing for Lightweight and Dynamic Hyperdimensional Computing</title></titleStmt>
			<publicationStmt>
				<publisher>IEEE</publisher>
				<date>03/25/2024</date>
			</publicationStmt>
			<sourceDesc>
				<bibl> 
					<idno type="par_id">10557688</idno>
					<idno type="doi">10.23919/DATE58400.2024.10546545</idno>
					
					<author>Sercan Aygun</author><author>Mehran Shoushtari Moghadam</author><author>M Hassan Najafi</author>
				</bibl>
			</sourceDesc>
		</fileDesc>
		<profileDesc>
			<abstract><ab><![CDATA[Hyperdimensional computing (HDC) is a novel computational paradigm that operates on long-dimensional vectors known as hypervectors. The hypervectors are constructed as long bit-streams and form the basic building blocks of HDC systems. In HDC, hypervectors are generated from scalar values without considering bit significance. HDC is efficient and robust for various data processing applications, especially computer vision tasks. To construct HDC models for vision applications, the current state-of-the-art practice utilizes two parameters for data encoding: pixel intensity and pixel position. However, the intensity and position information embedded in high-dimensional vectors are generally not generated dynamically in the HDC models. Consequently, the optimal design of hypervectors with high model accuracy requires powerful computing platforms for training. A more efficient approach is to generate hypervectors dynamically during the training phase. To this aim, this work uses low-discrepancy sequences to generate intensity hypervectors, while avoiding position hypervectors. Doing so eliminates the multiplication step in vector encoding, resulting in a power-efficient HDC system. For the first time in the literature, our proposed approach employs lightweight vector generators utilizing unary bit-streams for efficient encoding of data instead of using conventional comparator-based generators.]]></ab></abstract>
		</profileDesc>
	</teiHeader>
	<text><body xmlns="http://www.tei-c.org/ns/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xlink="http://www.w3.org/1999/xlink">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>I. INTRODUCTION</head><p>Traditional computing systems based on positional binary radix encounter practical limitations in the efficient hardware design of today's big data applications. These systems suffer from extremely high power and memory consumption, particularly for cognitive tasks with iterative and complex learning procedures. Emerging computing technologies such as Hyperdimensional Computing (HDC), Stochastic Computing (SC), Unary Bit-stream Computing (UBC), Quantum Computing (QC), and Approximate Computing (AC) are shaping the next generation of computing systems. Among these, HDC has recently gained significant attention due to its lightweight, robust, and efficient solutions for various learning and cognitive tasks <ref type="bibr">[1]</ref>, <ref type="bibr">[2]</ref>, particularly for natural language processing [3] and image classification <ref type="bibr">[4]</ref>. HDC encodes information using holographic hyperdimensional vectors, known as hypervectors, consisting of randomly distributed binary values of -1 (logic-0) and +1 (logic-1). This unconventional representation enables fast, robust, efficient, and fully parallel processing of large sets of data <ref type="bibr">[5]</ref>.</p><p>For high-quality HDC, hypervectors are expected to be orthogonal, i.e., uncorrelated with each other. By generating pseudo-random vectors, prior works encode data to hypervectors that are only nearly orthogonal. This work introduces a novel hypervector encoding scheme that radically differs from the encoding methods currently used in HDC systems. We propose a simpler and more effective method to achieve orthogonality by drawing an analogy between HDC and SC <ref type="bibr">[6]</ref>. Instead of relying on pseudo-randomness, we leverage quasirandomness provided by low-discrepancy (LD) sequences <ref type="bibr">[7]</ref> to generate high-quality hypervectors. 
In addition, for the first time, to the best of our knowledge, we take advantage of UBC and its unary data representation <ref type="bibr">[8]</ref> for the lightweight design of HDC systems. In what follows, we summarize the primary contributions of this work. (1) Utilizing quantized LD sequences for hypervector encoding for the first time in the literature. (2) Eliminating position hypervectors in the HDC system, reducing the total memory consumption, vector generation load, and arithmetic operations. (3) Developing uHD, a hybrid HDC system integrating unary bit-streams and hypervector processing. (4) Developing a lightweight combinational logic to compare unary bit-streams for dynamic generation of hypervectors. (5) Proposing a new circuitry for the binarization operation needed in HDC systems. (6) Achieving a higher image classification accuracy compared to the baseline HDC with pseudo-random hypervectors.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>II. BACKGROUND AND MOTIVATION</head><p>HDC maps raw input data into a high-dimensional space with hypervectors of +1s and -1s <ref type="bibr">[9]</ref>. Each dimension in this space corresponds to a feature or attribute in the original data. HDC consists of two primary steps: hypervector generation and encoding, of which the latter creates another hypervector. While the encoding step has been extensively discussed in the literature <ref type="bibr">[2]</ref>, <ref type="bibr">[10]</ref>, vector generation is typically left to the performance of pseudo-randomness <ref type="bibr">[11]</ref>. When a scalar value X is to be represented using a hypervector, its numerical value can be used for vector generation. However, when X is symbolic data (e.g., a letter), a proper vector should be attributed to the symbol. The term proper emphasizes the importance of orthogonality, as each symbol without numerical information should be treated equally and embedded in hypervectors without any bias towards one symbol over another. In other words, each hypervector should have an equal number of +1s and -1s with an independent random distribution. This representation requires good randomness to ensure hypervectors remain uncorrelated with each other. An important target of this work is to produce hypervectors with ideal orthogonality. For the scalar case, X can be a grayscale pixel value <ref type="bibr">[4]</ref> (0 &#8804; X &#8804; 255 for 8-bit representation), the amplitude of a discrete signal <ref type="bibr">[2]</ref>, or a numerical feature of data <ref type="bibr">[9]</ref>. This work will follow the convention for image classification, so we assume X is a pixel value. Fig. <ref type="figure">1</ref>(a) shows a sample image pixel and its corresponding position (P) and level (L) hypervectors. Hypervectors are assigned with a dimension or size of D. Ps are obtained from symbolic data, and Ls from scalar values. 
Ps are generated by comparing some random (R) numbers (0 &#8804; R_1, ..., R_D &#8804; 1) with a threshold value (t = 0.5; no-bias point between 0 and 1). Ls are typically generated by bit flipping <ref type="bibr">[11]</ref>. Hence, closer numerical scalars have similar hypervectors, while different numerical scalars have more uncorrelated hypervectors. In generating both P and L, a +1 or -1 value is returned for any hypervector position. If R &gt; t, the corresponding position is set to -1; otherwise, it is set to +1 <ref type="bibr">[2]</ref>.</p></div>
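<div xmlns="http://www.tei-c.org/ns/1.0"><p>The generation of P and L hypervectors described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed seed, and in particular the flipping schedule (about D/2 positions flipped in equal steps between the lowest and highest level) are assumptions chosen for illustration.</p>
<p><![CDATA[
```python
import random


def position_hypervector(dim, rng, threshold=0.5):
    """Threshold `dim` uniform random numbers: R > t maps to -1, otherwise +1."""
    return [-1 if rng.random() > threshold else 1 for _ in range(dim)]


def level_hypervectors(dim, levels, rng):
    """Build one hypervector per intensity level by cumulative bit flipping.

    Assumed schedule: about dim/2 positions are flipped in total between the
    first and the last level, in equal steps, so neighbouring levels stay
    similar and the two extremes end up nearly orthogonal.
    """
    order = list(range(dim))
    rng.shuffle(order)                      # positions to flip, in flipping order
    step = dim // (2 * (levels - 1))        # number of flips added per level
    current = position_hypervector(dim, rng)
    result = [current[:]]
    for k in range(1, levels):
        for idx in order[(k - 1) * step:k * step]:
            current[idx] = -current[idx]
        result.append(current[:])
    return result


def similarity(a, b):
    """Normalized dot product of two +1/-1 hypervectors (1.0 means identical)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)                      # fixed seed only for repeatability
P = [position_hypervector(1024, rng) for _ in range(4)]
L = level_hypervectors(1024, 256, rng)
print(similarity(L[0], L[1]), similarity(L[0], L[255]))
```
]]></p>
<p>Under these assumptions, adjacent levels differ in only a few positions, while the first and last levels are close to orthogonal, which matches the qualitative behavior described for bit flipping.</p></div>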
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Traditional Hypervector Encoding with Positional Hypervectors</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Fig. <ref type="figure">1</ref>(b) illustrates the remaining encoding steps on the generated hypervectors. These steps compose the class hypervector (C), which holds the overall representation of a class (e.g., Fig. <ref type="figure">1</ref> shows an image from class-5 and its contribution to the corresponding class hypervector). All images in the training set contribute to building the class hypervectors by processing the hypervectors (P and L) of all their pixels. The generated hypervectors are first multiplied element-wise (via bit-wise XOR). This is known as binding. The multiplied hypervectors coming from each pixel (L &#8853; P) are accumulated by traversing positions. Hypervectors are added to each other by another element-wise processing (bit-wise popcount). This is known as bundling <ref type="bibr">[12]</ref>. Then, the final values are evaluated for class hypervectors after scanning all data samples of the same class. Finally, a binarization operation is performed via a sign function (thresholding with a comparator or a subtractor) [3]. For each class in training, the labeled data are processed to build their corresponding class hypervector. This operation is performed only once, unlike conventional learning systems with iterative forward passes throughout the batches and epochs.</p><p>When all class hypervectors are defined (C_1, ..., C_q for a q-class dataset), the inference step measures the accuracy on the testing dataset. The same encoding steps are followed for any testing data to obtain a testing hypervector (C_test). The final classification is performed using a similarity check between C_test and each of C_1, C_2, ..., C_q. In this work, we use cosine similarity. 
The highest similarity between C_test and one of the trained classes gives the classification decision [3].</p><p>Generating pseudo-random hypervectors with high orthogonality during training can be very time- and memory-consuming. To obtain a high classification accuracy, the best-performing P and L random hypervectors are assigned iteratively. Hypervectors with different distributions are generated iteratively to find those with the highest orthogonality. One of our goals in this work is to minimize the number of vector operations. The bit-wise XOR operations in the binding process involve both P and L hypervectors. We use an encoding for level hypervectors that does not need iteration and provides accurate encoding deterministically <ref type="bibr">[13]</ref>. Instead of pseudo-randomness, we provide high orthogonality via quasirandomness (contribution 1). Our approach eliminates the need for position encoding and the corresponding multiplications (contribution 2). Thus, single-iteration vector optimization is guaranteed thanks to the properties of LD sequences <ref type="bibr">[14]</ref>.</p><p>As the first work of its kind, we use unary bit-streams in HDC systems (contribution 3). UBC utilizes unary (aka thermometer) coding, representing data using bit-streams with logic-1s (or logic-0s) aligned to the beginning or end of the bit-stream. For instance, X1 &#8594; 0000011 and X2 &#8594; 0011111 are two unary bit-streams of size N=7 representing 2 and 5. UBC can be exploited for the lightweight design of HDC systems. Hypervector generation in current HDC systems requires conventional binary comparators, which are complex and consume significant power. We employ UBC to design a new lightweight comparator logic for dynamic hypervector generation (contribution 4).</p><p>In addition to optimizing vector generation and minimizing the operations in encoding, we improve the hardware design of the final stage with accumulation and binarization. 
We propose a binarization that runs concurrently with popcounting. Processing binary data allows using popcount to count only the number of logic-1s. Conventionally, the binary output is obtained after D cycles and is then compared with, or subtracted from, a threshold value, which requires a separate module for thresholding or subtraction. We simplify the binarization module to make the decision on the spot while performing popcount (contribution 5).</p></div>
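<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration of the baseline flow described in this section (binding, bundling, binarization, and a cosine-similarity check), the following Python sketch operates on +1/-1 lists, where element-wise multiplication plays the role of the bit-wise XOR. All function names are hypothetical, and mapping a zero sum to +1 in the sign function is an assumption, since the tie-breaking rule is not stated in the text.</p>
<p><![CDATA[
```python
import math


def bind(a, b):
    """Element-wise product of two +1/-1 hypervectors (XOR in the logic domain)."""
    return [x * y for x, y in zip(a, b)]


def bundle(hypervectors):
    """Element-wise accumulation of hypervectors into a non-binary vector."""
    return [sum(column) for column in zip(*hypervectors)]


def binarize(accumulated):
    """Sign function; a zero sum is mapped to +1 here (assumed tie-break)."""
    return [1 if value >= 0 else -1 for value in accumulated]


def encode_image(pixels, position_hvs, level_hvs):
    """Bind each pixel's level hypervector to its position hypervector, then bundle."""
    return bundle(bind(level_hvs[value], position_hvs[i])
                  for i, value in enumerate(pixels))


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(query, class_hvs):
    """Return the label whose class hypervector is most similar to the query."""
    return max(class_hvs, key=lambda label: cosine(query, class_hvs[label]))


# Toy example with D=4, two pixel positions, and two intensity levels.
position_hvs = [[1, -1, 1, -1], [1, 1, -1, -1]]
level_hvs = {0: [1, 1, 1, 1], 1: [-1, -1, 1, 1]}
image_hv = encode_image([0, 1], position_hvs, level_hvs)
class_hvs = {"a": [1, -1, 1, -1], "b": [-1, 1, -1, 1]}
print(image_hv, binarize(image_hv), classify(binarize(image_hv), class_hvs))
```
]]></p>
<p>In training, the bundled vectors of all images of a class would be accumulated before the single binarization step; the toy example encodes only one image to keep the arithmetic easy to follow.</p></div>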
<div xmlns="http://www.tei-c.org/ns/1.0"><head>III. EFFICIENT HYPERVECTOR ENCODING WITH UHD</head><p>We call the new unary HDC system uHD. uHD enjoys a lightweight architecture by taking advantage of unary processing. It also provides a higher accuracy by exploiting the uncorrelation and recurrence properties of LD sequences.</p><p>uHD radically alters the encoding approach in HDC systems. Conventional HDC systems are bounded by the spatial information of discrete data. LD sequences provide built-in indexes to be used for the positional information. Fig. <ref type="figure">2</ref> depicts the encoding using LD Sobol <ref type="bibr">[7]</ref> scalars and indexes. We eliminate Ps and only encode Ls by using Sobol scalars. As shown in Fig. <ref type="figure">2</ref>, for encoding image data, we compare LD Sobol sequences (S_i) with image intensity values. We do not encode positions; instead, we use the corresponding index of any Sobol sequence (S_i) ranging from S_1 to S_{row&#215;column}. Finally, the non-binary image hypervector formula turns into the accumulation of the level hypervectors alone, <formula notation="tex">\sum_{i=1}^{row \times column} L_i</formula>, where L_i is the level hypervector generated with S_i.</p><p>Any pixel intensity is encoded based on the pixel position corresponding to the Sobol index. The normalized intensity value (by D) is compared with each element in the corresponding Sobol sequence. If the normalized intensity is smaller than the Sobol number, the hypervector position gets -1; otherwise, it gets +1. After obtaining L, we perform the accumulation without the encoding's multiplication step. Thus, our novel approach achieves a multiplier-less vector encoding for HDC.</p><p>From efficient hypervector encoding to complete hardware design, we focus on extended design perspectives for efficient HDC system design. Most prior works present hardware design for the inference. However, training on edge devices is a more challenging task. For high accuracy, the baseline HDC requires iterative hypervector generation and processing. 
Hence, single-pass data processing can significantly reduce runtime and energy consumption. Our proposed encoding achieves this benefit with a deterministic and reliable single iteration.</p><p>uHD reads two sets of data from memory: (i) processing data such as image pixels or features and (ii) Sobol sequences. We quantize both input data and Sobol scalar values in the proposed design. Utilizing Sobol scalar and index encoding removes the need for Ps and the corresponding multiplications. We use unary bit-streams instead of the conventional binary radix encoding, bringing UBC into HDC systems for the first time. Now, let us take a look at the overall design. With M-bit quantization, only M-bit data is stored in memory. The input data size depends on the features or raw data size, such as the image's row&#215;column. Each Sobol sequence has a length of D (i.e., has D Sobol numbers), where D is typically in the range of 1K to 10K. Storing all Sobol data in registers may not be possible, as they may exceed the memory size of resource-constrained devices. Therefore, we use block RAM (BRAM) in a reconfigurable design platform to store the quantized Sobol data. The processing data are relatively lighter in size, so we keep them in registers.</p></div>
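<div xmlns="http://www.tei-c.org/ns/1.0"><p>The position-free encoding described in this section can be sketched in a few lines of Python. To keep the example dependency-free, the sketch swaps in Halton sequences (van der Corput sequences with a different prime base per pixel index) for the Sobol sequences used in the paper; this substitution, the function names, and the normalization by the maximum intensity of 255 are assumptions made for illustration only. Halton sequences with large prime bases are poorly distributed over short lengths, so a Sobol generator would be the appropriate choice in a real experiment.</p>
<p><![CDATA[
```python
def van_der_corput(n, base):
    """n-th element (n >= 1) of the base-`base` van der Corput sequence in (0, 1)."""
    q, denom = 0.0, 1.0
    while n:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom
    return q


def first_primes(count):
    """First `count` prime numbers by trial division (fine for small counts)."""
    primes, candidate = [], 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def ld_sequences(num_pixels, dim):
    """One length-`dim` low-discrepancy sequence per pixel index (Halton stand-in)."""
    return [[van_der_corput(k, base) for k in range(1, dim + 1)]
            for base in first_primes(num_pixels)]


def level_hv(intensity, sequence, max_value=255):
    """Intensity below the LD number gives -1, otherwise +1."""
    x = intensity / max_value
    return [1 if x >= s else -1 for s in sequence]


def encode_image_uhd(pixels, sequences):
    """Accumulate the level hypervectors only; no position hypervector, no binding."""
    accumulated = [0] * len(sequences[0])
    for value, sequence in zip(pixels, sequences):
        for d, bit in enumerate(level_hv(value, sequence)):
            accumulated[d] += bit
    return accumulated


sequences = ld_sequences(num_pixels=4, dim=16)
print(encode_image_uhd([0, 64, 128, 255], sequences))
```
]]></p>
<p>Because the pixel index selects the sequence, the positional information is carried implicitly and the encoding reduces to comparisons and additions.</p></div>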
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Data quantization does not affect the system's accuracy. Even though hypervector generation may experience some flipped bits (+1 instead of -1, or vice versa), the accumulated values yield large scalars (non-quantized class hypervector), and the sign of accumulation is not easily affected.</p><p>Let us now discuss how we convert the data to unary bit-streams. Unary bit-streams are conventionally generated by using a pair of M-bit binary counter and comparator <ref type="bibr">[16]</ref>, as shown in Fig. <ref type="figure">3(b)</ref>. This design is compact, especially for dynamic bit-stream generation with large sizes. However, our HDC design works only on N=16-bit sequences, and so all possible sequences can be pre-stored in memory. The data in memory are converted to unary bit-streams on the fly. Fig. <ref type="figure">3(c)</ref> shows how we fetch the pre-stored unary bit-streams from an associative memory. The binary scalar in REGs or BRAM points to the corresponding index of a Unary Stream Table (UST), and the target bit-stream is fetched. We place the first design checkpoint here and compare energy consumption. We synthesize uHD and the baseline design using Synopsys Design Compiler with a 45-nm cell library. We compare the energy consumption of the two approaches for generating one bit of the hypervector. We observed that uHD consumes 0.77 fJ energy while the baseline design consumes 0.167 pJ (both designed for D=1K).</p><p>We generate hypervectors by comparing the data and Sobol scalars. Instead of directly comparing quantized scalars via conventional comparators, we use a novel unary bit-stream comparator.</p><p>Fig. 5. Proposed unary bit-stream comparator.</p></div>
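<div xmlns="http://www.tei-c.org/ns/1.0"><p>The table-lookup conversion to unary bit-streams can be mimicked in software as follows. This is a sketch under stated assumptions: the table holds one entry per value from 0 to N, the logic-1s are aligned to the end of the stream as in the N=7 example given earlier, and the rounding-based quantizer and all names are hypothetical.</p>
<p><![CDATA[
```python
def build_ust(n):
    """Unary Stream Table: entry v is a length-n stream with v trailing logic-1s."""
    return [[0] * (n - v) + [1] * v for v in range(n + 1)]


def quantize(value, max_value, n):
    """Map a scalar in [0, max_value] to a level in 0..n (rounding is assumed)."""
    return round(value * n / max_value)


UST = build_ust(16)                         # pre-stored once, N=16


def to_unary(value, max_value=255, table=UST):
    """Fetch the pre-stored unary stream for a scalar by indexing the table."""
    return table[quantize(value, max_value, len(table) - 1)]


print(build_ust(7)[2], build_ust(7)[5])
print(to_unary(128))
```
]]></p>
<p>Since the table is small and fixed, the conversion is a single indexed read, with no counter or comparator involved at run time.</p></div>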
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Unary bit-streams (with the same length) are correlated, and ANDing them gives the minimum bit-stream. Fig. <ref type="figure">5</ref> illustrates the proposed unary comparator for comparing two N=7-bit unary inputs. The data and Sobol unary bit-streams are compared to generate one bit of the hypervector: if the 1st operand (here, data) is greater than or equal to the 2nd operand (here, Sobol value), then the corresponding output is logic-1; otherwise, it is logic-0. The bit-wise AND operation determines the minimum input. The minimum is then ORed bit-wise with the inverted unary Sobol bit-stream to check whether the minimum equals the Sobol input. The example in Fig. <ref type="figure">5</ref> compares two unary bit-streams corresponding to values 2 and 5. After finding the minimum (bronze color in Fig. <ref type="figure">5</ref>), the bit-wise OR with the inverted Sobol either gives all-1s or leaves at least one logic-0 at the output. If the Sobol number is the minimum, the OR operation returns all-1s, and ANDing the N output bits generates a logic-1 bit for the hypervector. If there is at least one logic-0 after OR, the final AND detects it and resets the bit of the hypervector. Since, in this example, the data (value of 2 in Fig. <ref type="figure">5</ref>) is less than the Sobol number (value of 5 in Fig. <ref type="figure">5</ref>), the output is not all-1s, and a logic-0 is generated. Here, we have the second design checkpoint to evaluate the energy consumption of hypervector generation when using the proposed unary comparator. The baseline HDC with conventional comparators consumes 2.49 pJ while uHD consumes 0.24 pJ (both designed for D=1K).</p><p>After encoding the level hypervectors (Ls) using the steps above, the next step is the accumulation and binarization. Fig. <ref type="figure">4</ref> illustrates the overall design of uHD. 
The main contributions are underscored in this figure. The red-colored L is traversed for accumulation. For each hypervector bit, a popcount operation returns a binary output, counting the number of logic-1s. The D-type flip-flops are used in our design for this purpose. We propose a new binarization method that operates concurrently with the popcount instead of using an extra subtractor or comparator for the Threshold of Binarization (TOB). The size of the incoming data (the total pixel or feature count), H, is the maximum value to count up. Hence, a &#8968;log_2 H&#8969;-bit counter is required. TOB = H/2 is the critical threshold reached by the popcount output for the decision on the logic-1s-in-majority. When the threshold is reached, the sign bit is set for the corresponding bit of the class hypervector; otherwise, it is 0. We propose to use a masking logic for capturing TOB, which is in binary: (B_{&#8968;log_2 H&#8969;} ... B_2 B_1)_2. As shown in Fig. <ref type="figure">4</ref>, the masking logic is hardwired, feeding &#8968;log_2 H&#8969; bits to an AND gate. When popcount reaches TOB, this hardwired threshold guarantees logic-1 at the output of AND; otherwise, it remains logic-0 <ref type="bibr">[17]</ref>. Here, we set another design checkpoint and compare the energy consumption of the baseline and uHD design for the accumulate-and-binarize operation. 
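We observe that uHD consumes 34.7 pJ energy per feature of the incoming image, while for the same data, the baseline design consumes 68.7 pJ energy (both designed for D=1K).</p><table><head>TABLE I: ENERGY CONSUMPTION AND Area &#215; Delay COMPARISON OF UHD &amp; THE BASELINE HDC FOR EACH HYPERVECTOR (HV) AND IMAGE</head>
<row role="label"><cell rows="2">Design Approach</cell><cell cols="3">Energy Consumption (pJ)</cell><cell cols="3">Area &#215; Delay (m^2 &#215; s)</cell></row>
<row role="label"><cell>D=1K</cell><cell>D=2K</cell><cell>D=8K</cell><cell>D=1K</cell><cell>D=2K</cell><cell>D=8K</cell></row>
<row><cell>uHD per HV</cell><cell>0.79</cell><cell>1.58</cell><cell>6.32</cell><cell>40.60&#215;10^-12</cell><cell>81.20&#215;10^-12</cell><cell>324.80&#215;10^-12</cell></row>
<row><cell>uHD per image (MNIST)</cell><cell>113.76</cell><cell>227.52</cell><cell>910.08</cell><cell>5.83&#215;10^-9</cell><cell>11.67&#215;10^-9</cell><cell>46.69&#215;10^-9</cell></row>
<row><cell>Baseline per HV</cell><cell>171.42</cell><cell>415.41</cell><cell>4023.82</cell><cell>11.79&#215;10^-9</cell><cell>25.55&#215;10^-9</cell><cell>230.33&#215;10^-9</cell></row>
<row><cell>Baseline per image (MNIST)</cell><cell>24.68&#215;10^3</cell><cell>59.80&#215;10^3</cell><cell>57.94&#215;10^4</cell><cell>1.70&#215;10^-6</cell><cell>3.70&#215;10^-6</cell><cell>33.17&#215;10^-6</cell></row>
</table></div>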
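<div xmlns="http://www.tei-c.org/ns/1.0"><p>The unary comparator and the concurrent accumulate-and-binarize step described in this section can be modeled behaviorally in Python. The sketch below is only a software analogue of the logic, not the synthesized circuit: function names are hypothetical, TOB is taken as the integer part of H/2, H is assumed to be at least 2, and the equality test on the running count stands in for the hardwired mask feeding the AND gate.</p>
<p><![CDATA[
```python
def unary(value, n):
    """Length-n unary stream with `value` trailing logic-1s."""
    return [0] * (n - value) + [1] * value


def unary_ge(data, sobol):
    """Return 1 if the data stream encodes a value >= the Sobol stream, else 0.

    Step 1: bit-wise AND gives the minimum of the two correlated streams.
    Step 2: OR with the inverted Sobol stream is all-1s only if the minimum
            equals the Sobol stream.
    Step 3: an N-input AND collapses the result to a single bit.
    """
    minimum = [d & s for d, s in zip(data, sobol)]
    checked = [m | (1 - s) for m, s in zip(minimum, sobol)]
    return int(all(checked))


def accumulate_and_binarize(bits):
    """Popcount with an on-the-spot sign decision (assumes len(bits) >= 2)."""
    tob = len(bits) // 2                    # threshold of binarization, H/2
    count, sign = 0, 0
    for bit in bits:
        count += bit
        if count == tob:                    # models the mask + AND gate firing
            sign = 1                        # latched for the rest of the stream
    return sign


# One hypervector position across H=4 features, with N=7-bit unary streams.
data_values = [2, 5, 6, 1]
sobol_values = [5, 3, 6, 4]
bits = [unary_ge(unary(d, 7), unary(s, 7))
        for d, s in zip(data_values, sobol_values)]
print(bits, accumulate_and_binarize(bits))
```
]]></p>
<p>For the values 2 and 5 used in Fig. 5, the AND stage yields 0000011, the OR stage yields 1100011, and the final AND produces logic-0, in agreement with the description above. No subtractor or separate threshold comparator is needed after the count completes.</p></div>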
<div xmlns="http://www.tei-c.org/ns/1.0"><head>IV. DESIGN EVALUATION AND RESULTS</head><p>We evaluate the performance, hardware cost, energy consumption, and area&#215;delay of uHD compared to the baseline architecture and the prior state-of-the-art (SOTA). We utilize the standard MNIST dataset for accuracy evaluations <ref type="bibr">[18]</ref>. We compare the hardware costs of the baseline and uHD architecture, specifically for the hypervector generation process. The baseline design follows the dynamic and independent training target. Linear-feedback shift registers (LFSRs) are used for hypervector generation in the baseline design. Table <ref type="table">I</ref> compares the energy consumption and area&#215;delay numbers to evaluate the hardware efficiency of the proposed design. Although the baseline requires an iterative design (e.g., i=100 different attempts to find the best-performing hypervectors), we give it the benefit of assuming that its hypervectors are already the best, so that a single run is sufficient for high accuracy. However, for realistic baseline training on an edge device, more iterations are needed for high accuracy, which accordingly increases the energy consumption of the baseline design. We estimate the energy consumption for generating each P &#215; L hypervector in the baseline HDC (Fig. <ref type="figure">1(b)</ref>) and for each L hypervector in the uHD design (Fig. <ref type="figure">2</ref>). As can be seen in the reported numbers, our proposed uHD is more hardware-efficient than the baseline HDC.</p><p>Table <ref type="table">II</ref> compares the SOTA HDC architectures implemented on a central processing unit or microprocessor. A thorough benchmarking is outlined in the HDC surveys by Hassan et al. <ref type="bibr">[19]</ref> and <ref type="bibr">Chang et al. [20]</ref>. 
This table ranks the top energy-efficient frameworks for overall architecture (including hypervector generation, binding, bundling, and binarization) from the aforementioned surveys and contrasts them with our proposed architecture, which benefits from a novel approach for hypervector generation. As can be seen, our proposed HDC architecture provides remarkable energy efficiency by exploiting UBC.</p><p>Table <ref type="table">III</ref> compares the accuracy performance of the baseline HDC and uHD. The baseline architecture is monitored at different iterations of generating hypervectors (P and L). At each random hypervector assignment in the training phase (i=1...100), the test accuracy is recorded. The table reports the average accuracy values at different checkpoints of i. uHD utilizes LD Sobol sequences and completes its deterministic hypervector (only L) assignment in a single iteration (i=1). The MNIST dataset is segmented to separate the training and testing images. For the sake of fair accuracy comparison between the two designs, there is no retraining, no neural network (NN) assistance, and no prior optimizations. 
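Some prior work heavily relies on these optimizations; however, these optimizations and the use of other machine learning (ML) techniques affect the cost of the training hardware.</p><table><head>TABLE II: ENERGY EFFICIENCY OVER BASELINE ARCHITECTURES <ref type="bibr">[19]</ref>, <ref type="bibr">[20]</ref></head>
<row role="label"><cell>HDC Framework</cell><cell>Platform</cell><cell>Energy Efficiency</cell></row>
<row><cell>Semi-HD <ref type="bibr">[21]</ref></cell><cell>Raspberry Pi</cell><cell>12.60&#215;</cell></row>
<row><cell>Voice-HD <ref type="bibr">[22]</ref></cell><cell>Central Processing Unit</cell><cell>11.90&#215;</cell></row>
<row><cell>tiny-HD <ref type="bibr">[23]</ref></cell><cell>Microprocessor</cell><cell>11.20&#215;</cell></row>
<row><cell>PULP-HD <ref type="bibr">[24]</ref></cell><cell>ARM Microprocessor</cell><cell>9.9&#215;</cell></row>
<row><cell>Hierarchical-MHD <ref type="bibr">[25]</ref></cell><cell>Central Processing Unit</cell><cell>6.60&#215;</cell></row>
<row><cell>AdaptHD <ref type="bibr">[26]</ref></cell><cell>Raspberry Pi</cell><cell>6.30&#215;</cell></row>
<row><cell>Laelaps <ref type="bibr">[27]</ref></cell><cell>Central Processing Unit</cell><cell>1.40&#215;</cell></row>
<row><cell>This work</cell><cell>ARM Microprocessor</cell><cell>31.83&#215;</cell></row>
</table><p>In the comprehensive surveys in <ref type="bibr">[19]</ref> and <ref type="bibr">[20]</ref>, several HDC frameworks were benchmarked based on their energy efficiency in comparison to reference baseline models. This table lists the top energy-efficient architectures from <ref type="bibr">[19]</ref> and <ref type="bibr">[20]</ref>. All frameworks, including ours, report the overall (including memory read/write, hypervector generation, binding, and bundling) system energy consumption.</p><table><head>TABLE III: MNIST ACCURACY PERFORMANCE (%) OF Baseline HDC AND UHD</head>
<row role="label"><cell rows="2">D</cell><cell cols="6">Baseline HDC (Average)</cell><cell>uHD</cell></row>
<row role="label"><cell>i=1</cell><cell>i=1..5</cell><cell>i=1..20</cell><cell>i=1..50</cell><cell>i=1..75</cell><cell>i=1..100</cell><cell>i=1</cell></row>
<row><cell>1K</cell><cell>82.93</cell><cell>83.60</cell><cell>83.49</cell><cell>82.70</cell><cell>82.88</cell><cell>82.63</cell><cell>84.44</cell></row>
<row><cell>2K</cell><cell>86.24</cell><cell>86.58</cell><cell>87.05</cell><cell>86.35</cell><cell>86.37</cell><cell>86.53</cell><cell>87.04</cell></row>
<row><cell>8K</cell><cell>88.30</cell><cell>88.55</cell><cell>88.25</cell><cell>88.13</cell><cell>88.14</cell><cell>88.13</cell><cell>88.41</cell></row>
</table><p>The impact of using random vectors in the baseline HDC is reported in Fig. <ref type="figure">6</ref>(a). The fluctuations in the testing accuracy underscore the importance of having an iterative process for selecting the best vectors. Fig. 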
<ref type="figure">6</ref>(b) reports the accuracy of prior SOTA HDC systems in performing MNIST classification (the ones without NN assistance, complex optimizations in training, or multi-models; only with (w/) or without (w/o) the retraining efforts). As can be seen, uHD with single-pass learning achieves better accuracy compared to the baseline and SOTA designs.</p><p>To extend our evaluations, we utilized various image-based datasets, including CIFAR-10, BloodMNIST, BreastMNIST, FashionMNIST, and SVHN <ref type="bibr">[30]</ref>. Table <ref type="table">IV</ref> presents the accuracy comparison between the baseline HDC and our proposed uHD. We note that these accuracy results were obtained without employing any optimization (e.g., retraining, NN assistance, or transfer learning). The findings demonstrate the effectiveness and versatility of uHD across different datasets, showcasing its potential for various machine vision applications. The iso-accuracy values presented show that uHD exhibits superior hardware efficiency compared to conventional learning frameworks (ML, deep neural networks), which often require resource-intensive hardware setups. As a result, uHD offers a more efficient and cost-effective solution to achieve the same accuracy level.</p><table><head>TABLE IV: ACCURACY (%) COMPARISON OF THE BASELINE HDC AND UHD FOR DIFFERENT IMAGE DATASETS</head>
<row role="label"><cell rows="2">Datasets</cell><cell cols="2">D=1K</cell><cell cols="2">D=2K</cell><cell cols="2">D=8K</cell></row>
<row role="label"><cell>Ours</cell><cell>Baseline</cell><cell>Ours</cell><cell>Baseline</cell><cell>Ours</cell><cell>Baseline</cell></row>
<row><cell>CIFAR-10</cell><cell>39.29</cell><cell>38.21</cell><cell>40.28</cell><cell>40.26</cell><cell>41.97</cell><cell>41.71</cell></row>
<row><cell>BloodMNIST</cell><cell>53.05</cell><cell>48.52</cell><cell>55.86</cell><cell>51.20</cell><cell>57.88</cell><cell>51.82</cell></row>
<row><cell>BreastMNIST</cell><cell>68.59</cell><cell>68.47</cell><cell>69.23</cell><cell>69.11</cell><cell>71.15</cell><cell>70.93</cell></row>
<row><cell>FashionMNIST</cell><cell>68.60</cell><cell>54.19</cell><cell>70.06</cell><cell>69.97</cell><cell>71.37</cell><cell>70.87</cell></row>
<row><cell>SVHN</cell><cell>60.29</cell><cell>60.06</cell><cell>61.73</cell><cell>61.24</cell><cell>62.87</cell><cell>62.82</cell></row>
</table><p>For the Baseline HDC, the P and L hypervectors were generated using conventional random sequence generation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>V. CONCLUSIONS</head><p>This study proposed a hybrid HDC system, uHD, by employing UBC in HDC for the first time. The new design simplifies hardware implementation, providing significant hardware cost savings compared to the baseline HDC. uHD utilizes LD sequences for deterministic and high-quality generation of hypervectors. It achieves higher accuracy compared to the baseline HDC while offering single-iteration training. We propose a novel hypervector generator by representing data in the unary domain and comparing data using a novel unary comparator. The impact of adding the proposed modules is studied by comparing the energy consumption of the proposed design with the baseline HDC at different design checkpoints. We hope this work opens new avenues for HDC by employing the complementary advantages of emerging computing technologies such as SC and UBC in HDC.</p></div>
		</body>
		</text>
</TEI>
