skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: Hardware Implementation of Digital Memcomputing on Small-Size FPGAs
Memcomputing is a novel computing paradigm beyond the von–Neumann one. Its digital version is designed for the efficient solution of combinatorial optimization problems, which emerge in various fields of science and technology. Previously, the performance of digital memcomputing machines (DMMs) was demonstrated using software simulations of their ordinary differential equations. Here, we present the first hardware realization of a DMM algorithm on a low-cost FPGA board. In this demonstration, we have implemented a Boolean satisfiability problem solver. To optimize the use of hardware resources, the algorithm was partially parallelized. The scalability of the present implementation is explored and our FPGA-based results are compared to those obtained using a python code running on a traditional (von–Neumann) computer, showing one to two orders of magnitude speed-up in time to solution. This initial small-scale implementation is projected to state-of-the-art FPGA boards anticipating further advantages of the hardware realization of DMMs over their software emulation.  more » « less
Award ID(s):
2229880
PAR ID:
10505823
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
2023 IEEE 66th International Midwest Symposium on Circuits and Systems (MWSCAS)
ISBN:
979-8-3503-0210-3
Page Range / eLocation ID:
346 to 350
Subject(s) / Keyword(s):
Field programmable gate arrays, nonlinear dynamical systems, computing technology
Format(s):
Medium: X
Location:
Tempe, AZ, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. We present a fully parallel digital memcomputing solver implemented on a field-programmable gate array (FPGA) board. For this purpose, we have designed an FPGA code that solves the ordinary differential equations associated with digital memcomputing in parallel. A feature of the code is the use of only integer-type variables and integer constants to enhance optimization. Consequently, each integration step in our solver is executed in 96 ns. This method was utilized for difficult instances of the Boolean satisfiability (SAT) problem close to a phase transition, involving up to about 150 variables. Our results demonstrate that the parallel implementation reduces the scaling exponent by about 1 compared to a sequential C++ code on a standard computer. Additionally, compared to C++ code, we observed a time-to-solution advantage of about three orders of magnitude. Given the limitations of FPGA resources, the current implementation of digital memcomputing will be especially useful for solving compact but challenging problems. 
    more » « less
  2. Deep Neural Network (DNN) acceleration with digital Processing-in-Memory (PIM) platforms at the edge is an actively-explored domain with great potential to not only address memory-wall bottlenecks but to offer orders of performance improvement in comparison to the von-Neumann architecture. On the other side, FPGA-based edge computing has been followed as a potential solution to accelerate compute-intensive workloads. In this work, adopting low-bit-width neural networks, we perform a solid and comparative inference performance analysis of a recent processing-in-SRAM tape-out with a low-resource FPGA board and a high-performance GPU to provide a guideline for the research community. We explore and highlight the key architectural constraints of these edge candidates that impact their overall performance. Our experimental data demonstrate that the processing-in-SRAM can obtain up to ~160x speed-up and up to 228x higher efficiency (img/s/W) compared to the under-test FPGA on the CIFAR-10 dataset. 
    more » « less
  3. Wang, Huan (Ed.)
    Defect identification has been a significant task in various fields to prevent the potential problems caused by imperfection. There is great attention for developing technology to accurately extract defect information from the image using a computing system without human error. However, image analysis using conventional computing technology based on Von Neumann structure is facing bottlenecks to efficiently process the huge volume of input data at low power and high speed. Herein efficient defect identification is demonstrated via a morphological image process with minimal power consumption using an oxide transistor and a memristor‐based crossbar array that can be applied to neuromorphic computing. Using a hardware and software codesigned neuromorphic system combined with a dynamic Gaussian blur kernel operation, an enhanced defect detection performance is successfully demonstrated with about 104 times more power‐efficient computation compared to the conventional complementary metal‐oxide semiconductor (CMOS)‐based digital implementation. It is believed the back end of line (BEOL)‐compatible all‐oxide‐based memristive crossbar array provides the unique potential toward universal artificial intelligence of things (AIoT) applications where conventional hardware can hardly be used. 
    more » « less
  4. State machine replication (SMR) is a core mechanism for building highly available and consistent systems. In this paper, we propose Waverunner, a new approach to accelerate SMR using FPGA-based SmartNICs. Our approach does not implement the entire SMR system in hardware; instead, it is a hybrid software/hardware system. We make the observation that, despite the complexity of SMR, the most common routine—the data replication—is actually simple. The complex parts (leader election, failure recovery, etc.) are rarely used in modern datacenters where failures are only occasional. These complex routines are not performance critical; their software implementations are fast enough and do not need acceleration. Therefore, our system uses FPGA assistance to accelerate data replication, and leaves the rest to the traditional software implementation of SMR. Our Waverunner approach is beneficial in both the common and the rare case situations. In the common case, the system runs at the speed of the network, with a 99th percentile latency of 1.8 µs achieved without batching on minimum-size packets at network line rate (85.5 Gbps in our evaluation). In rare cases, to handle uncommon situations such as leader failure and failure recovery, the system uses traditional software to guarantee correctness, which is much easier to develop and maintain than hardware-based implementations. Overall, our experience confirms Waverunner as an effective and practical solution for hardware accelerated SMR—achieving most of the benefits of hardware acceleration with minimum added complexity and implementation effort. 
    more » « less
  5. Many currently deployed public-key cryptosystems are based on the difficulty of the discrete logarithm and integer factorization problems. However, given an adequately sized quantum computer, these problems can be solved in polynomial time as a function of the key size. Due to the future threat of quantum computing to current cryptographic standards, alternative algorithms that remain secure under quantum computing are being evaluated for future use. One such algorithm is CRYSTALS-Dilithium, a lattice-based digital signature scheme, which is a finalist in the NIST Post Quantum Cryptography (PQC) competition. As a part of this evaluation, high-performance implementations of these algorithms must be investigated. This work presents a high-performance implementation of CRYSTALS-Dilithium targeting FPGAs. In particular, we present a design that achieves the best latency for an FPGA implementation to date. We also compare our results with the most-relevant previous work on hardware implementations of NIST Round 3 post-quantum digital signature candidates. 
    more » « less