
Search for: All records

Creators/Authors contains: "Cauwenberghs, Gert"


  1. Realizing increasingly complex artificial intelligence (AI) functionalities directly on edge devices calls for unprecedented energy efficiency of edge hardware. Compute-in-memory (CIM) based on resistive random-access memory (RRAM) [1] promises to meet such demand by storing AI model weights in dense, analogue and non-volatile RRAM devices, and by performing AI computation directly within RRAM, thus eliminating power-hungry data movement between separate compute and memory [2–5]. Although recent studies have demonstrated in-memory matrix-vector multiplication on fully integrated RRAM-CIM hardware [6–17], it remains a goal for a RRAM-CIM chip to simultaneously deliver high energy efficiency, versatility to support diverse models and software-comparable accuracy. Although efficiency, versatility and accuracy are all indispensable for broad adoption of the technology, the inter-related trade-offs among them cannot be addressed by isolated improvements on any single abstraction level of the design. Here, by co-optimizing across all hierarchies of the design from algorithms and architecture to circuits and devices, we present NeuRRAM, a RRAM-based CIM chip that simultaneously delivers versatility in reconfiguring CIM cores for diverse model architectures, energy efficiency that is two times better than previous state-of-the-art RRAM-CIM chips across various computational bit-precisions, and inference accuracy comparable to software models quantized to four-bit weights across various AI tasks, including accuracy of 99.0 percent on MNIST [18] and 85.7 percent on CIFAR-10 [19] image classification, 84.7-percent accuracy on Google speech command recognition [20], and a 70-percent reduction in image-reconstruction error on a Bayesian image-recovery task.
    Free, publicly-accessible full text available August 18, 2023
  2. We present an efficient and scalable partitioning method for mapping large-scale neural network models with locally dense and globally sparse connectivity onto reconfigurable neuromorphic hardware. Scalability in computational efficiency, i.e., the amount of time spent in actual computation, remains a huge challenge in very large networks. Most partitioning algorithms also struggle to address the scalability in network workloads in finding a globally optimal partition and efficiently mapping onto hardware. As communication is regarded as the most energy- and time-consuming part of such distributed processing, the partitioning framework is optimized for compute-balanced, memory-efficient parallel processing targeting low-latency execution and dense synaptic storage, with minimal routing across various compute cores. We demonstrate highly scalable and efficient partitioning for connectivity-aware and hierarchical address-event routing resource-optimized mapping, significantly reducing the total communication volume recursively when compared to random balanced assignment. We showcase our results working on synthetic networks with varying degrees of sparsity factor and fan-out, small-world networks, feed-forward networks, and a hemibrain connectome reconstruction of the fruit-fly brain. The combination of our method and practical results suggests a promising path toward extending to very large-scale networks and scalable hardware-aware partitioning.
  3. With the rising need for on-body biometric sensing, the development of wearable electrophysiological sensors has been faster than ever. Surface electrodes placed on the skin need to be robust in order to measure biopotentials from the body reliably and comfortable for extended wearability. The electrical stability of nonpolarizable silver/silver chloride (Ag/AgCl) and its low-cost, commercial production have made these electrodes ubiquitous health sensors in the clinical environment, where wet gels and long wires are accommodated by patient immobility. However, smaller, dry electrodes with wireless acquisition are essential for truly wearable, continuous health sensing. Currently, techniques for the robust fabrication of custom Ag/AgCl electrodes are lacking. Here, we present three methods for the fabrication of Ag/AgCl electrodes: oxidizing Ag in a chlorine solution, electroplating Ag, and curing Ag/AgCl ink. Each of these methods is then used to create three different electrode shapes for wearable application. Bench-top and on-body evaluation of the electrode techniques was achieved by electrochemical impedance spectroscopy (EIS), calculation of variance in electrocardiogram (ECG) measurements, and analysis of auditory steady-state response (ASSR) measurement. Microstructures produced on the electrode by each fabrication technique were also investigated with scanning electron microscopy (SEM) and energy-dispersive X-ray spectroscopy (EDX). The custom Ag/AgCl electrodes were found to be efficient in comparison with standard, commercial Ag/AgCl wet electrodes across all three of our presented techniques, with Ag/AgCl ink shown to be the best of the three in bench-top and biometric recordings.
  4. Progress in computational neuroscience toward understanding brain function is challenged both by the complexity of molecular-scale electrochemical interactions at the level of individual neurons and synapses and the dimensionality of network dynamics across the brain covering a vast range of spatial and temporal scales. Our work abstracts an existing highly detailed, biophysically realistic 3D reaction-diffusion model of a chemical synapse to a compact internal state space representation that maps onto parallel neuromorphic hardware for efficient emulation at a very large scale and offers near-equivalence in input-output dynamics while preserving biologically interpretable tunable parameters.
  5. Real-time spike sorting with large data throughput is essential for studying neural dynamics and brain-machine interfaces. Neural recordings from high-density multi-electrode arrays that consist of hundreds of electrodes impose stringent demands on spike sorting hardware regarding data transmission bandwidth and computation complexity. That leads to an urgent need for specialized hardware with high throughput, low power, and low latency. Here, we present a real-time spike sorting processor that utilizes high-density BEOL-integrable CuOx resistive crossbars to perform in-memory spike segregation. We experimentally demonstrate, for the first time, efficient hardware implementation of spike sorting from in vivo extracellular recordings with high accuracy. Our neuromorphic interface promises substantial performance gains (∼1000× less area, ∼200× less power, 4.8 μs latency for sorting 100 channels) for in vivo real-time spike sorting.
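
The in-memory operation that records 1 and 5 rely on is a matrix-vector product computed inside a resistive crossbar. The abstracts do not give the algorithms or data formats, so the following is only a minimal software sketch of the idea, assuming spike sorting by template matching: each stored row stands in for a template held as quantized conductances, and the row with the largest dot product against the incoming waveform wins. The function names, the 16-level quantization, and the toy templates are all hypothetical.

```python
# Illustrative sketch only: a software stand-in for crossbar template matching.
# All names, the 16-level quantization, and the toy templates are assumptions;
# none of them come from the abstracts above.

def quantize(values, levels=16):
    """Snap values in [-1, 1] to `levels` evenly spaced steps (device-like weights)."""
    step = 2.0 / (levels - 1)
    out = []
    for v in values:
        clipped = max(-1.0, min(1.0, v))
        out.append(round((clipped + 1.0) / step) * step - 1.0)
    return out

def crossbar_mvm(weights, x):
    """Return one weighted sum per stored row, as a crossbar would accumulate current."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def sort_spike(templates, waveform):
    """Return the index of the template with the largest dot product."""
    scores = crossbar_mvm(templates, waveform)
    return max(range(len(scores)), key=scores.__getitem__)

# Two toy 4-sample spike templates, stored in quantized form.
TEMPLATES = [
    quantize([0.0, -1.0, 0.5, 0.0]),
    quantize([0.0, 0.5, -1.0, 0.0]),
]

if __name__ == "__main__":
    # A noisy copy of the first template should be assigned to unit 0.
    print(sort_spike(TEMPLATES, [0.05, -0.9, 0.4, 0.0]))
```

In hardware, the sum inside `crossbar_mvm` would be carried out in the analogue domain by the array itself rather than by a loop. The sketch only shows why storing the weights where the computation happens removes the separate memory fetch.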