skip to main content


This content will become publicly available on June 4, 2025

Title: Advancing Hyperdimensional Computing Based on Trainable Encoding and Adaptive Training for Efficient and Accurate Learning

Hyperdimensional computing (HDC) is a computing paradigm inspired by the mechanisms of human memory, characterizing data through high-dimensional vector representations, known as hypervectors. Recent advancements in HDC have explored its potential as a learning model, leveraging its straightforward arithmetic and high efficiency. The traditional HDC frameworks are hampered by two primary static elements: randomly generated encoders and fixed learning rates. These static components significantly limit model adaptability and accuracy. The static, randomly generated encoders, while ensuring high-dimensional representation, fail to adapt to evolving data relationships, thereby constraining the model’s ability to accurately capture and learn from complex patterns. Similarly, the fixed nature of the learning rate does not account for the varying needs of the training process over time, hindering efficient convergence and optimal performance. This paper introduces\(\mathsf {TrainableHD} \), a novel HDC framework that enables dynamic training of the randomly generated encoder depending on the feedback of the learning data, thereby addressing the static nature of conventional HDC encoders.\(\mathsf {TrainableHD} \)also enhances the training performance by incorporating adaptive optimizer algorithms in learning the hypervectors. We further refine\(\mathsf {TrainableHD} \)with effective quantization to enhance efficiency, allowing the execution of the inference phase in low-precision accelerators. Our evaluations demonstrate that\(\mathsf {TrainableHD} \)significantly improves HDC accuracy by up to 27.99% (averaging 7.02%) without additional computational costs during inference, achieving a performance level comparable to state-of-the-art deep learning models. Furthermore,\(\mathsf {TrainableHD} \)is optimized for execution speed and energy efficiency. Compared to deep learning on a low-power GPU platform like NVIDIA Jetson Xavier,\(\mathsf {TrainableHD} \)is 56.4 times faster and 73 times more energy efficient. This efficiency is further augmented through the use of Encoder Interval Training (EIT) and adaptive optimizer algorithms, enhancing the training process without compromising the model’s accuracy.

 
more » « less
Award ID(s):
2312517 2321840 2319198 2127780
NSF-PAR ID:
10531073
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM Transactions on Design Automation of Electronic Systems
Date Published:
Journal Name:
ACM Transactions on Design Automation of Electronic Systems
ISSN:
1084-4309
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Vision Transformer (ViT) has demonstrated promising performance in various computer vision tasks, and recently attracted a lot of research attention. Many recent works have focused on proposing new architectures to improve ViT and deploying it into real-world applications. However, little effort has been made to analyze and understand ViT’s architecture design space and its implication of hardware-cost on different devices. In this work, by simply scaling ViT’s depth, width, input size, and other basic configurations, we show that a scaled vanilla ViT model without bells and whistles can achieve comparable or superior accuracy-efficiency trade-off than most of the latest ViT variants. Specifically, compared to DeiT-Tiny, our scaled model achieves a\(\uparrow 1.9\% \)higher ImageNet top-1 accuracy under the same FLOPs and a\(\uparrow 3.7\% \)better ImageNet top-1 accuracy under the same latency on an NVIDIA Edge GPU TX2. Motivated by this, we further investigate the extracted scaling strategies from the following two aspects: (1) “can these scaling strategies be transferred across different real hardware devices?”; and (2) “can these scaling strategies be transferred to different ViT variants and tasks?”. For (1), our exploration, based on various devices with different resource budgets, indicates that the transferability effectiveness depends on the underlying device together with its corresponding deployment tool; for (2), we validate the effective transferability of the aforementioned scaling strategies obtained from a vanilla ViT model on top of an image classification task to the PiT model, a strong ViT variant targeting efficiency, as well as object detection and video classification tasks. In particular, when transferred to PiT, our scaling strategies lead to a boosted ImageNet top-1 accuracy of from\(74.6\% \)to\(76.7\% \)(\(\uparrow 2.1\% \)) under the same 0.7G FLOPs; and when transferred to the COCO object detection task, the average precision is boosted by\(\uparrow 0.7\% \)under a similar throughput on a V100 GPU.

     
    more » « less
  2. In this article, we show how a single function,join, can be used to implement parallelbalanced binary search trees(BSTs) simply and efficiently. Based onjoin, our approach applies to multiple balanced tree data structures, and a variety of functions for ordered sets and maps. We describe our technique as an algorithmic framework calledjoin-based algorithms. We show that thejoinfunction fully captures what is needed for rebalancing trees for a variety of tree algorithms, as long as the balancing scheme satisfies certain properties, which we refer to asjoinabletrees. We discuss four balancing schemes that are joinable: AVL trees, red-black trees, weight-balanced trees, and treaps. We present a variety of tree algorithms that apply to joinable trees, includinginsert,delete,union,intersection,difference,split,range,filter, and so on, most of them also parallel. These algorithms are generic across balancing schemes. Many algorithms are optimal in the comparison model, and we provide a general proof to show the efficiency in work for joinable trees. The algorithms are highly parallel, all with polylogarithmic span (parallel dependence). Specifically, the set-set operationsunion,intersection, anddifferencehave work\( O(m\log (\frac{n}{m}+1)) \)and polylogarithmic span for input set sizes\( n \)and\( m\le n \).

    We implemented and tested our algorithms on the four balancing schemes. In general, all four schemes have quite similar performance, but the weight-balanced tree slightly outperforms the others. They have the same speedup characteristics, getting around 73\( \times \)speedup on 72 cores (144 hyperthreads). Experimental results also show that our implementation outperforms existing parallel implementations, and our sequential version achieves close or much better performance than the sequential merging algorithm in C++ Standard Template Library (STL) on various input sizes.

     
    more » « less
  3. Current computing device authentication often presents accessibility barriers for people withupper extremity impairments (UEI). In this article, we present a framework calledAccessible image-Association-based Authentication for Computing devices (A3C), a novel recognition-based graphical authentication framework specifically designed for people with UEI to authenticate to their computing devices. A3C requires users to provide a set of primary images the user knows that are recognizable to them and subsequently associate each primary image with a secondary image. To evaluate the efficacy of the A3C framework, we instantiated the framework by implementing a version of A3C calledA3C-FA, which uses images of faces of people the user knows as the primary image and animal images as the secondary image. We then performed three studies to evaluate A3C-FA: a shoulder-surfing attack study (N\(=\)319), a close-adversary attack study (N\(=\)268), and a usability study with people with UEI (N\(=\)14). We found that A3C was robust against both shoulder-surfing and close-adversary attacks. We also performed a detailed study to evaluate the accessibility of A3C-FA. Our participants reported that A3C-FA was more usable and more secure than the authentication approaches with which they were familiar. Based on these findings, we suggest four areas of future research to further improve the design of the A3C framework.

     
    more » « less
  4. In the field of multi-agent autonomous transportation, such as automated payload delivery or highway on-ramp merging, agents routinely exchange knowledge to optimize their shared objective and adapt to environmental novelties through Cooperative Multi-Agent Reinforcement Learning (CMARL) algorithms. This knowledge exchange between agents allows these systems to operate efficiently and adapt to dynamic environments. However, this cooperative learning process is susceptible to adversarial poisoning attacks, as highlighted by contemporary research. Particularly, the poisoning attacks where malicious agents inject deceptive information camouflaged within the differential noise, a pivotal element for differential privacy (DP)-based CMARL algorithms, pose formidable challenges to identify and overcome. The consequences of not addressing this issue are far-reaching, potentially jeopardizing safety-critical operations and the integrity of data privacy in these applications. Existing research has strived to develop anomaly detection-based defense models to counteract conventional poisoning methods. Nonetheless, the recurring necessity for model offloading and retraining with labeled anomalous data undermines their practicality, considering the inherently dynamic nature of the safety-critical autonomous transportation applications. Further, it is imperative to maintain data privacy, ensure high performance, and adapt to environmental changes. Motivated by these challenges, this paper introduces a novel defense mechanism against stealthy adversarial poisoning attacks in the autonomous transportation domain, termed Reinforcing Autonomous Multi-agent Protection through Adversarial Resistance in Transportation (RAMPART). Leveraging a GAN model at each local node, RAMPART effectively filters out malicious advice in an unsupervised manner, whilst generating synthetic samples for each state-action pair to accommodate environmental uncertainties and eliminate the need for labeled training data. Our extensive experimental analysis, conducted in a Private Payload Delivery Network (PPDN) —a common application in the autonomous multi-agent transportation domain—demonstrates thatRAMPART successfully defends against a DP-exploited poisoning attack with a\(30\% \)attack ratio, achieving an F1 score of 0.852 and accuracy of\(96.3\% \)in heavy-traffic environments.

     
    more » « less
  5. On hypergraphs withmhyperedges andnvertices, wherepdenotes the total size of the hyperedges, we provide the following results:

    We give an algorithm that runs in\(\widetilde{O}(mn^{2k-2})\)time for finding a minimumk-cut in hypergraphs of arbitrary rank. This algorithm betters the previous best running time for the minimumk-cut problem, fork> 2.

    We give an algorithm that runs in\(\widetilde{O}(n^{\max \lbrace r,2k-2\rbrace })\)time for finding a minimumk-cut in hypergraphs of constant rankr. This algorithm betters the previous best running times for both the minimum cut and minimumk-cut problems for dense hypergraphs.

    Both of our algorithms are Monte Carlo, i.e., they return a minimumk-cut (or minimum cut) with high probability. These algorithms are obtained as instantiations of a genericbranching randomized contractiontechnique on hypergraphs, which extends the celebrated work of Karger and Stein on recursive contractions in graphs. Our techniques and results also extend to the problems of minimum hedge-cut and minimum hedge-k-cut on hedgegraphs, which generalize hypergraphs.

     
    more » « less