

Search for: All records

Award ID contains: 2323819

Note: Clicking a Digital Object Identifier (DOI) number takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the embargo period.

  1. Smaller transistor feature sizes have made integrated circuits (ICs) more vulnerable to permanent faults, shortening lifetimes and increasing the risk of catastrophic errors. Fortunately, Artificial Neural Networks (ANNs) are error resilient: their accuracy can be maintained through, for example, fault-aware re-training. A drawback of previous work, however, is that it requires redesigning the individual neuron processing elements in order to deal with these faults efficiently. In this work, we propose a novel architecture, combined with a design flow, that performs fault-aware weight re-assignment to minimize the effect of permanent faults on the accuracy of ANNs mapped to AI accelerators, without time-consuming fault-aware re-training or redesign of the neuron processing elements. In particular, we target Tensor Processing Units (TPUs), although the proposed approach extends to other architectures. Experimental results show that our approach can be executed efficiently on a fast, dedicated hardware re-binding unit or in software. (A minimal sketch of the re-binding idea follows this entry.)
    Free, publicly-accessible full text available June 29, 2026
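
To make the re-binding idea concrete, here is a minimal Python sketch, not the authors' implementation. It assumes each permanent fault silences one whole column of the systolic array (stuck-at-zero) and uses the per-neuron L1 weight norm as an importance proxy; the function name, fault model, and importance score are all illustrative assumptions.

```python
import numpy as np

def fault_aware_rebinding(W, faulty_cols):
    """Permute the columns of a layer's weight matrix so that the
    least important output neurons land on the faulty array columns.

    Illustrative assumptions: a fault disables an entire systolic-array
    column (stuck-at-zero), and per-neuron L1 weight norm approximates
    importance. Both are stand-ins, not the paper's exact model.
    """
    n = W.shape[1]
    importance = np.abs(W).sum(axis=0)        # L1 norm per output neuron
    neurons = np.argsort(importance)          # least important first
    healthy = [c for c in range(n) if c not in set(faulty_cols)]
    assign = np.empty(n, dtype=int)           # assign[physical col] = neuron
    for col, neuron in zip(list(faulty_cols) + healthy, neurons):
        assign[col] = neuron
    W_mapped = W[:, assign]                   # weights as loaded on the array
    inverse = np.argsort(assign)              # restores logical output order
    return W_mapped, inverse

# Usage: the array produces outputs in physical-column order, so a small
# re-binding step applies `inverse` to recover the logical ordering.
W = np.random.randn(128, 64)
x = np.random.randn(128)
W_mapped, inv = fault_aware_rebinding(W, faulty_cols=[3, 17])
y_logical = (x @ W_mapped)[inv]
```

Because only an index permutation changes, the re-assignment can run either in a hardware re-binding unit on the output path or as a software pre-/post-processing step, matching the deployment options described above.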
  2. Approximate deep neural networks (AxDNNs) are promising for enhancing energy efficiency in real-world devices, and a key contributor to that efficiency is the use of approximate multipliers. Unfortunately, simulating approximate multipliers does not scale well on CPUs and GPUs, which slows down the overall simulation of AxDNNs aimed at identifying approximate multipliers that deliver high energy efficiency with minimal accuracy loss. To address this problem, we present XAI-Gen, a novel methodology that leverages an analytical model of an emerging hardware accelerator (e.g., Google TPU v4) and explainable artificial intelligence (XAI) to precisely identify the non-critical layers for approximation and quickly discover appropriate approximate multipliers for AxDNN layers. Our results show that XAI-Gen achieves up to 7× lower energy consumption with only 1-2% accuracy loss. We also showcase the effectiveness of XAI-Gen through a neural architecture search (XAI-NAS) case study: XAI-NAS achieves 40% higher energy efficiency with up to 5× less execution time than state-of-the-art NAS methods for generating AxDNNs. (A sketch of the XAI layer-ranking step follows this entry.)
    Free, publicly-accessible full text available April 23, 2026
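
As a hedged illustration of the XAI step, the following Python sketch ranks layers by a first-order Taylor saliency (mean |weight × gradient|), one common attribution proxy. The model, score, and selection policy are placeholders; the paper's actual attribution method and accelerator model may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def layer_criticality(model, inputs, labels):
    """Rank weight layers by mean |weight * grad| (first-order Taylor
    saliency) -- a stand-in for the XAI attribution step."""
    model.zero_grad()
    F.cross_entropy(model(inputs), labels).backward()
    scores = {name: (p.detach() * p.grad).abs().mean().item()
              for name, p in model.named_parameters()
              if p.grad is not None and p.dim() > 1}    # weight matrices only
    return sorted(scores.items(), key=lambda kv: kv[1])  # least critical first

# Layers at the head of the ranking are candidates for approximate
# multipliers; the most critical layers stay exact to protect accuracy.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                      nn.Linear(256, 10))
x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
for name, score in layer_criticality(model, x, y):
    print(f"{name}: {score:.3e}")
```

Candidate approximate multipliers would then be simulated only for the low-ranked layers, which is what keeps the search tractable compared to exhaustively simulating every layer-multiplier pair.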
  3. Deep neural networks are lucrative targets of adversarial attacks, and approximate deep neural networks (AxDNNs) are no exception. Manually searching for adversarially robust AxDNN architectures incurs prohibitive time and human effort. In this paper, we propose XAI-NAS, an explainable neural architecture search (NAS) method that leverages explainable artificial intelligence (XAI) to efficiently co-optimize the adversarial robustness and hardware efficiency of AxDNN architectures on systolic-array hardware accelerators. During the NAS process, AxDNN architectures are evolved layer-wise with heterogeneous approximate multipliers to deliver the best trade-offs between adversarial robustness, energy consumption, latency, and memory footprint. The most suitable approximate multipliers are automatically selected from the open-source EvoApprox8b library. Our extensive evaluations yield a set of Pareto-optimal, hardware-efficient, and adversarially robust solutions. For example, a Pareto-optimal AxDNN for the MNIST and CIFAR-10 datasets exhibits up to 1.5× higher adversarial robustness, 2.1× less energy consumption, 4.39× lower latency, and a 2.37× smaller memory footprint compared to state-of-the-art NAS approaches. (A sketch of the evolutionary Pareto loop follows this entry.)
    Free, publicly-accessible full text available January 1, 2026
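
The evolutionary loop can be illustrated with a minimal Python sketch. Everything here is a placeholder: the multiplier IDs stand in for EvoApprox8b entries, and evaluate() fabricates (robust accuracy, energy) objectives that a real run would obtain from adversarial-attack and hardware simulation; only the mutate-and-keep-the-Pareto-front structure reflects the approach described above.

```python
import random

# Placeholder multiplier IDs standing in for EvoApprox8b entries.
MULTIPLIERS = ["approx_a", "approx_b", "approx_c", "exact"]
# Hypothetical per-multiplier costs: accuracy loss vs. energy trade-off.
ACC_LOSS = {"approx_a": 0.05, "approx_b": 0.02, "approx_c": 0.01, "exact": 0.0}
ENERGY   = {"approx_a": 0.3,  "approx_b": 0.5,  "approx_c": 0.8,  "exact": 1.0}
N_LAYERS, POP_SIZE, GENERATIONS = 6, 20, 30

def evaluate(genome):
    """Stub objectives; real scores would come from simulation."""
    robust_acc = 0.95 - sum(ACC_LOSS[g] for g in genome)
    energy = sum(ENERGY[g] for g in genome)
    return robust_acc, energy

def dominates(a, b):
    # Maximize robust accuracy, minimize energy.
    return a[0] >= b[0] and a[1] <= b[1] and a != b

def pareto_front(population):
    scored = [(g, evaluate(g)) for g in population]
    return [g for g, s in scored
            if not any(dominates(t, s) for _, t in scored)]

population = [[random.choice(MULTIPLIERS) for _ in range(N_LAYERS)]
              for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    front = pareto_front(population)
    children = []
    while len(front) + len(children) < POP_SIZE:
        child = random.choice(front)[:]                 # clone a survivor
        child[random.randrange(N_LAYERS)] = random.choice(MULTIPLIERS)
        children.append(child)
    population = front + children                       # elitist replacement
print("Pareto-optimal multiplier assignments:", pareto_front(population))
```

A full XAI-NAS run would add latency and memory as further objectives and use the XAI scores to bias which layers get mutated, but the elitist keep-the-front loop above is representative of such multi-objective searches.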