Electromigration (EM) is still the most important reliability
concern for VLSI systems, especially in the nanometer regime. The EM
immortality check is an important step in full-chip EM signoff
analysis. In this paper, we propose a new EM immortality check
method for multi-segment interconnects that considers the impact of
the temperature gradient induced by Joule heating. Temperature
gradients from metal Joule heating drive atomic migration, an effect
called thermal migration, which can be a significant force for metal
atomic migration and becomes more significant as technology scales
down. Compared to existing methods, the new method is the first to
consider the spatial temperature gradient due to Joule heating in
multi-segment wires. We derive the analytic solution for the
resulting steady-state EM-thermal migration stress distribution
problem. We then develop a new temperature-aware voltage-based EM
immortality check method that accounts for multi-segment thermal
migration effects and retains all the benefits of the recently
proposed voltage-based EM immortality method for multi-segment
interconnects. Numerical results on an IBM power grid and
self-synthesized power delivery networks show that the proposed
temperature-aware EM immortality check method is much more accurate
than the recently proposed state-of-the-art EM immortality method.
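As a rough illustration of the voltage-based immortality check that this method extends, the sketch below implements the temperature-unaware baseline as it is commonly stated: in steady state the stress at node i is taken to be proportional to V_E - V_i, where V_E is the area-weighted average of the segment midpoint voltages, so a tree is treated as immortal when V_E - V_i stays below a critical voltage at every node. The temperature-aware correction derived in the paper is not reproduced here. `Segment`, `em_voltage`, `is_immortal`, and all numeric values are hypothetical names and inputs chosen for illustration, and uniform wire thickness is assumed.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    a: int         # node id at one end
    b: int         # node id at the other end
    length: float  # segment length (any consistent unit)
    width: float   # segment width (same unit; uniform thickness assumed)


def em_voltage(segments, v):
    """Area-weighted average of segment midpoint voltages (assumed form of V_E)."""
    total_area = sum(s.length * s.width for s in segments)
    weighted = sum(s.length * s.width * (v[s.a] + v[s.b]) / 2.0 for s in segments)
    return weighted / total_area


def is_immortal(segments, v, v_crit):
    """True if V_E - V_i < v_crit at every node, i.e. no void is expected to nucleate."""
    v_e = em_voltage(segments, v)
    return all(v_e - v_i < v_crit for v_i in v.values())


# Hypothetical three-node wire; voltages are measured relative to the cathode (node 0).
segments = [Segment(0, 1, 10.0, 1.0), Segment(1, 2, 10.0, 1.0)]
voltages = {0: 0.0, 1: 1.0e-3, 2: 2.0e-3}
print(em_voltage(segments, voltages))
print(is_immortal(segments, voltages, v_crit=3.0e-3))
```

The check needs only node voltages and segment geometry, which is why it scales to full-chip signoff; the critical voltage would in practice be derived from the critical stress and material constants, which are not specified here.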
EMGraph: Fast electromigration stress assessment for interconnect trees using graph convolution networks
Electromigration (EM) has become a major concern for VLSI circuits
as technology advances into the nanometer regime. EM assessment
based on Korhonen's equation remains challenging for VLSI circuits
because of increasing integration density. VLSI multisegment
interconnect trees can be naturally viewed as graphs. Based on this
observation, we propose a new graph convolution network (GCN) model,
called EMGraph, which considers both node and edge embedding
features, to estimate the transient EM stress of interconnect trees.
Compared with the recently proposed generative adversarial network
(GAN) based stress image-generation method, the EMGraph model can
learn more transferable knowledge to predict stress distributions on
new graphs without retraining via inductive learning. Trained on a
large dataset, the model shows less than 1.5% average error compared
to the ground truth results and is orders of magnitude faster than
both COMSOL and the state-of-the-art method. It also achieves a
smaller model size, 4X better accuracy, and a 14X speedup over the
GAN-based method.
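To make the idea of treating an interconnect tree as a graph concrete, the following is a minimal NumPy sketch of one generic message-passing layer that mixes node and edge features. It is not the EMGraph architecture, whose details are not given in the abstract; the function `graph_conv_layer`, the feature dimensions, and the random weights are all assumptions for illustration.

```python
import numpy as np


def graph_conv_layer(h, edges, e, w_self, w_nbr):
    """One generic message-passing step (illustrative, not EMGraph itself).

    h:      (num_nodes, d_node) node features
    edges:  list of (i, j) undirected edges, one per wire segment
    e:      (num_edges, d_edge) edge features
    w_self: (d_node, d_out) weights applied to a node's own features
    w_nbr:  (d_node + d_edge, d_out) weights applied to neighbor messages
    """
    agg = np.zeros((h.shape[0], w_nbr.shape[1]))
    for k, (i, j) in enumerate(edges):
        # Each segment sends a message in both directions.
        agg[i] += np.concatenate([h[j], e[k]]) @ w_nbr
        agg[j] += np.concatenate([h[i], e[k]]) @ w_nbr
    return np.maximum(h @ w_self + agg, 0.0)  # ReLU


rng = np.random.default_rng(0)
w_self = rng.normal(size=(2, 4))
w_nbr = rng.normal(size=(2 + 3, 4))

# Hypothetical 3-node tree; node and edge features are random placeholders.
h_small = rng.normal(size=(3, 2))
e_small = rng.normal(size=(2, 3))
out_small = graph_conv_layer(h_small, [(0, 1), (1, 2)], e_small, w_self, w_nbr)

# The same weights apply unchanged to a larger tree.
h_large = rng.normal(size=(5, 2))
e_large = rng.normal(size=(4, 3))
out_large = graph_conv_layer(
    h_large, [(0, 1), (1, 2), (1, 3), (3, 4)], e_large, w_self, w_nbr
)
print(out_small.shape, out_large.shape)
```

Because the weights are shared across all nodes and edges, the layer does not depend on graph size, which is the property that lets a trained model be applied to unseen trees without retraining.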
- Award ID(s): 2007135
- NSF-PAR ID: 10301043
- Date Published:
- Journal Name: Proc. IEEE/ACM Design Automation Conference (DAC’21)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- In this paper, we propose a new spatial-temperature-aware method for transient EM-induced stress analysis. The method makes two contributions. First, we propose, for the first time, a thermal migration (TM)-aware void saturation volume estimation method for fast immortality checks in the post-voiding phase. We derive an analytic formula to estimate the void saturation volume in the presence of spatial temperature gradients due to Joule heating. Second, we develop a fast numerical solution for EM-induced stress analysis of multi-segment interconnect trees considering the TM effect. The new method first transforms the coupled EM-TM partial differential equations into linear time-invariant ordinary differential equations (ODEs). Then an extended Krylov subspace-based reduction technique is employed to reduce the size of the original system matrices so that they can be efficiently simulated in the time domain. The proposed method can simulate both the void nucleation and void growth phases under time-varying input currents and position-dependent temperatures. The numerical results show that, compared to the recently proposed semi-analytic EM-TM method, the proposed method achieves about a 28x speedup on average for interconnects with up to 1000 branches in both the void nucleation and growth phases, with negligible errors.
- In this paper, we propose an image generative learning framework for electrostatic analysis in VLSI dielectric aging estimation. This work leverages the observation that a synthesized multilayer VLSI interconnect layout can be viewed as layered 2D images and that the analysis can be viewed as image generation. The efficient image-to-image translation property of generative learning is therefore used to obtain the potential distribution on the respective interconnect layers. Compared with the recent CNN-based electrostatic analysis method, the new method achieves a 1.54x inference speedup due to reduced neural network structures and parameters. We demonstrate the proposed method on time-dependent dielectric breakdown analysis and show a significant speedup compared to the traditional numerical method.
- Electromigration (EM) is a major failure effect for on-chip power grid networks of deep submicron VLSI circuits. EM degradation of metal grid lines can lead to excessive voltage drops (IR drops) before the target lifetime. In this paper, we propose a fast data-driven EM-induced IR drop analysis framework for power grid networks, named GridNet, based on conditional generative adversarial networks (CGANs). It aims to accelerate incremental full-chip EM-induced IR drop analysis, as well as IR drop violation fixing during power grid design and optimization. More importantly, GridNet can naturally leverage the differentiability of deep neural networks (DNNs) to obtain the sensitivity of node voltage with respect to wire resistance (or width) at marginal cost. GridNet treats continuous time and the given electrical features as input conditions, and the EM-induced time-varying voltages of power grid networks as the conditional outputs, which are represented as data series images. We show that GridNet is able to learn the temporal dynamics of the aging process in the continuous time domain. In addition, the sensitivity information provided by GridNet can be used to perform efficient localized IR drop violation fixing in late-stage design and optimization. Numerical results on 36000 synthesized power grid network samples demonstrate that the new method achieves a 10^5x speedup over the recently proposed full-chip coupled EM and IR drop analysis tool. We further show that localized IR drop violation fixing for the same set of power grid networks can be performed remarkably efficiently using the cheap sensitivity computation from GridNet.
- This work proposes a new dynamic thermal and reliability management framework that uses task mapping and migration to improve the thermal performance and reliability of commercial multi-core processors, considering workload-dependent thermal hot spot stress. The new method is motivated by the observation that different workloads activate different spatial power and thermal hot spots within each processor core. Existing run-time thermal management, which is based on information from on-chip thermal sensors at fixed locations, can lead to suboptimal management solutions because the temperatures reported by those sensors may not be those of the true hot spots. The new method, called Hot-Trim, uses a machine learning-based approach to characterize the power density hot spots across each core; a new task mapping/migration scheme is then developed based on the hot spot stresses. Compared to existing works, the new approach is the first to optimize VLSI reliability by exploring workload-dependent power hot spots. The advantages of the proposed method over the Linux baseline task mapping and the temperature-based mapping method are demonstrated and validated on real commercial chips. Experiments on a real Intel Core i7 quad-core processor executing the PARSEC-3.0 and SPLASH-2 benchmarks show that, compared to the existing Linux scheduler, core and hot spot temperatures can be lowered by 1.15 to 1.31 °C. In addition, Hot-Trim improves the chip's EM-, NBTI-, and HCI-related reliability by 30.2%, 7.0%, and 31.1%, respectively, compared to the Linux baseline without any performance degradation. Furthermore, it improves EM- and HCI-related reliability by 29.6% and 19.6%, respectively, while further reducing the temperature by half a degree compared to the conventional temperature-based mapping technique.