Title: Dual Use Circuitry for Early Failure Warning and Test
Incorrect circuit timing often leads to errors in the field, including silent data corruption. Once a circuit violates timing, recovery can be difficult even when the violation is detected. Canary flip-flops have previously been proposed to identify when the desired slack of a path has first been violated in functional mode even before failure occurs. We show how such flip-flops can be combined in a MISR to also provide enhanced coverage during manufacturing test and scan-based field test.
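As an illustration of the two ideas named in the abstract, the following is a minimal behavioral sketch in Python, not the authors' circuit. It assumes a canary flip-flop can be modeled as a check that a path still meets timing but with less than a chosen guard band of slack, and that the per-path warning bits are compacted by a generic 4-bit MISR. All names (`canary_warning`, `warning_word`, `misr_step`, `compact`), the feedback taps, and the timing values are hypothetical choices made for this example.

```python
def canary_warning(arrival, period, setup, guard):
    """Return True when the path still meets timing but has less than
    `guard` units of slack left (hypothetical behavioral model)."""
    slack = period - setup - arrival
    return 0 <= slack < guard


def warning_word(arrivals, period, setup, guard):
    """Pack one warning bit per monitored path into an integer (bit i = path i)."""
    word = 0
    for i, arrival in enumerate(arrivals):
        if canary_warning(arrival, period, setup, guard):
            word |= 1 << i
    return word


def misr_step(state, inputs, taps, width):
    """One clock of a generic MISR: shift left, apply the feedback taps
    if the bit shifted out was 1, then XOR in the parallel inputs."""
    msb = (state >> (width - 1)) & 1
    state = (state << 1) & ((1 << width) - 1)
    if msb:
        state ^= taps
    return state ^ inputs


def compact(input_words, taps=0b0011, width=4):
    """Compact a sequence of parallel input words into one signature,
    starting from the all-zero state. The taps are an assumed example."""
    state = 0
    for word in input_words:
        state = misr_step(state, word, taps, width)
    return state


if __name__ == "__main__":
    # Assumed timing values in picoseconds, for illustration only.
    period, setup, guard = 1000, 50, 100
    cycles = [
        [800, 800, 800, 800],  # all paths have comfortable slack
        [800, 900, 800, 800],  # path 1 is inside the guard band
        [800, 800, 800, 800],
    ]
    words = [warning_word(c, period, setup, guard) for c in cycles]
    print(f"signature = {compact(words):04b}")
```

In this model a run with no warnings leaves the signature at zero, while a warning on any cycle perturbs it, so a single register read can summarize many cycles of monitoring. The same compaction step applies when the parallel inputs are captured test responses rather than warning bits, which is the sense in which the register can serve both purposes in this sketch.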
Award ID(s):
1812777
PAR ID:
10509333
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
IEEE
Date Published:
ISBN:
979-8-3503-0927-0
Page Range / eLocation ID:
1 to 8
Subject(s) / Keyword(s):
Canary Flip-flops, Design for Test, Fault Coverage, Silent Data Corruption (SDC)
Format(s):
Medium: X
Location:
San Francisco, CA, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Test sets that target standard fault models may not always be sufficient for detecting all defects. To evaluate test sets for the detection of unmodeled defects, n-detect test sets (which detect all modeled faults at least n times) have previously been proposed. Unfortunately, n-detect test sets are often prohibitively long. In this paper, we investigate the ability of shadow flip-flops connected into a MISR (Multiple Input Signature Register) to detect stuck-at faults fortuitously multiple times during scan shift. We explore which flip-flops should be shadowed to increase the value of n for the least detected stuck-at faults for each circuit studied. We then identify which circuit characteristics are most important for determining the cost of the MISR needed to achieve high values of n. For example, circuits that contain a few flip-flops with upstream fault cones that cover a large percentage of all faults in the circuit can often achieve high n-detect coverage fortuitously with a low-cost MISR. This allows a DFT engineer to predict the viability of this MISR-based approach early in the design cycle.
  2. The recycling of used integrated circuits (ICs) has raised serious problems in ensuring the integrity of today's globalized semiconductor supply chain. This poses a serious threat to critical infrastructure due to the potentially shorter lifetime, lower reliability, and poorer performance of these counterfeit "new" chips. Recently, we have proposed a highly effective approach for detecting such chips by exploiting the power-up state of on-chip SRAMs. Due to the symmetry of the memory array layout, an equal number of cells power up to the 0 and 1 logic states in a new, unused SRAM; this ratio becomes skewed over time due to uneven NBTI aging from normal usage in the field. Although this solution is very effective in detecting recycled ICs, its applicability is somewhat limited because a large number of older designs do not have large on-chip memories. In this paper, we propose an alternate approach based on the initial power-up state of scan flip-flops, which are present in virtually every digital circuit. Since the flip-flops, unlike SRAM cells, are generally not perfectly symmetrical in layout, an equal number of scan cells will not power up to the 0 and 1 logic states in most designs. Consequently, a stable time-zero reference of 50% logic 0s and 1s cannot be used for determining the subsequent usage of a chip. To overcome this key limitation, we propose a novel solution that reliably identifies used ICs from testing the part alone, without the need for any additional reference data or even the netlist of the circuit. Through scan testing of the IC, we first identify a significant number of asymmetrically stressed flip-flops in the design, divided into two groups. One group of flip-flops is selected such that it mostly experiences the 1 logic state during functional operation, while the other group mostly experiences the 0 state. The resulting differential stress during operation causes a growing disparity over time in the number of 0s (and 1s) observed in these two groups at power-up. When new and unaged, these two groups behave similarly, with a similar percentage of 1s (or 0s). However, over time the differential stress makes these counts diverge. We show that this changing count can be a measure of operational aging. Our simulation results show that it is possible to reliably detect used ICs after as little as three months of operation.
  3. Many Cyber-Physical Systems (CPS) have timing constraints that must be met by the cyber components (software and the network) to ensure safety. Checking whether a CPS meets its timing requirements is tedious, especially when the system is distributed and the software and/or the underlying computing platforms are complex. Furthermore, the system design is brittle, since a timing failure can still happen (e.g., due to a network failure or a soft-error bit flip). In this paper, we propose a new design methodology called Plan B, in which the timing constraints of the CPS are monitored at runtime and a proper backup routine is executed when a timing failure happens to ensure safety. We provide a model for expressing the desired timing behavior using a set of timing constructs in C/C++ code and for monitoring them efficiently at runtime. We showcase the effectiveness of our approach by conducting experiments on three case studies: 1) the full software stack for autonomous driving (Apollo), 2) a multi-agent system with 1/10th scale model robots, and 3) a quadrotor for a search and rescue application. We show that the system remains safe and stable even when intentional faults are injected to cause a timing failure. We also demonstrate that the system can achieve graceful degradation when a less extreme timing failure happens.
  4. This study examines the single-event response of Xilinx 16nm FinFET UltraScale+ FPGA and MPSoC device families. Heavy-ion single-event latch-up, single-event upsets in configuration SRAM, BlockRAM™ memories, and flip-flops, and neutron-induced single-event latch-up results are provided. 
  5. Excessive power during in-field testing can cause multiple issues, including invalidation of the test results, overheating, and damage to the circuit. In this paper, we evaluate the reduction of capture power when specific segments of a scan chain can be kept from capturing data subject to values stored in a control register. The proposed approach requires no changes to the Automatic Test Pattern Generation (ATPG), no redesign of the circuitry to match a particular test set, and no additional patterns to maintain fault coverage. We show that our approach can achieve very high capture power reduction, approaching 100% for multiple patterns.