ELSEVIER

Contents lists available at ScienceDirect

# Journal of Computational Science

journal homepage: www.elsevier.com/locate/jocs





# A methodology for thermal simulation of interconnects enabled by model reduction with material property variation

Wangkun Jia, Ming-C. Cheng

Department of Electrical and Computer Engineering, Clarkson University, Potsdam, NY 13699-5720, USA

#### ARTICLE INFO

Keywords:
Interconnect thermal simulation
Model order reduction
Proper orthogonal decomposition
Data-driven
Method of snapshots

#### ABSTRACT

A thermal simulation methodology is developed for interconnects enabled by a data-driven learning algorithm accounting for variations of material properties, heat sources and boundary conditions (BCs). The methodology is based on the concepts of model order reduction and domain decomposition to construct a multi-block approach. A generic block model is built to represent a group of interconnect blocks that are used to wire standard cells in the integrated circuits (ICs). The blocks in this group possess identical geometry with various metal/via routings. The data-driven model reduction method is thus applied to *learn* material property variations induced by different metal/via routings in the blocks, in addition to the variations of heat sources and BCs. The approach is investigated in two very different settings. It is first applied to thermal simulation of a single interconnect block with similar BCs to those in the training of the generic block. It is then implemented in multi-block thermal simulation of a FinFET IC, where the interconnect structure is partitioned into several blocks each modeled by the generic block model. Accuracy of the generic block model is examined in terms of the metal/via routings, BCs and thermal discontinuities at the block interfaces.

# 1. Introduction

Temperature escalation in integrated circuits (ICs) due to Joule heating has emerged as one of the most critical issues in integrated circuit (IC) design for decades [1–11]. This results from the rapid increase in the device and interconnect densities in semiconductor chips. High thermal gradients and hot spots induced by Joule heating significantly degrade performance and reliability of ICs in various IC technologies [1–9]. In addition to self-heating effects in devices [12–16], high temperature in interconnects also imposes serious problems. For example, metal resistance and signal delays increase as temperature rises, which limits the operating speed of ICs [8], [9], [17]. Moreover, the failure rate of interconnects due to electromigration is accelerated by high temperature [17–19]. It is thus essential to predict accurate thermal distributions over the interconnects in IC design.

Simulations for predicting temperature distributions in ICs from the gate to the system level have usually been based on efficient lumped thermal element or compact thermal models [20–28]. These approaches rely on *RC* thermal elements and/or averaging concepts to minimize the computing time. They are however limited to regional average temperature and not able to offer accurate temperature gradients or hot

spots in semiconductor chips. To provide more detailed and accurate thermal profiles, direct numerical simulation (DNS) is needed but its intensive computational demand is prohibitive for thermal simulation of large-scale semiconductor chips.

In order to offer both efficiency and accuracy of the thermal prediction in ICs, alternatives based on the projection-based reduced-order models have been applied in recent years [29-36]. Many of these approaches utilize proper orthogonal decomposition (POD) [29-36]. The POD methods have been investigated in many different fields including fluid dynamics [37-40], micro-electro-mechanical systems (MEMS) [41, 42], heat transfer [29-36], etc. The POD generates orthogonal basis functions from solution data of the DNSs in a domain subjected to the parametric variations that for heat transfer problems usually include spatial/temporal power sources and boundary conditions (BCs). The decomposition optimizes the basis functions (or POD modes) specifically tailored to the geometry and parametric variations of the problem using a data-driven learning process. With the heat conduction equation projected to the POD modes, the approach is able to significantly reduce the numerical degree of freedom (DoF) needed to accurately predict the thermal profile in the domain.

The major drawback of the POD method is the time-consuming

E-mail address: mcheng@clarkson.edu (M.-C. Cheng).

<sup>\*</sup> Corresponding author.

process for data collection from DNSs, generation of POD modes and calculations of POD model parameters for a large domain with a fine resolution. One may also question the usefulness of the POD method for engineering applications since this "training" or "learning" process for the POD modes requires full-scale DNSs with enough variations of heat sources and BCs. It becomes prohibited in a very large domain with a fine spatial resolution, especially in a dynamic problem.

To resolve this issue, a multi-block POD methodology was proposed in our previous work [31], [33], which implements the concept of reduced basis elements [43,44] in POD, similar to the concept based on space vector clustering [45]. The multi-block methodology implements domain decomposition in the POD method, which partitions a large domain into smaller building blocks. Each of these blocks is projected onto a functional space described by its POD modes. This approach offers several advantages. First, computational resources needed in DNSs to collect thermal data to generate POD modes and parameters increase exponentially with the size of the domain. With smaller blocks, the POD modes and parameters can be generated more efficiently. To construct a large domain, the projected building blocks can be glued together with the interface continuity enforced by the interior penalty discontinuous Galerkin (DG) method [46,47]. Second, domain decomposition possesses a nature advantage of parallel computing and has been proven effective in parallel/distributed computing settings [48–51]. Finally, IC design at all levels always utilizes identical repeating functional circuit blocks, such as standard cells, caches, memory units, CPU/GPU cores, GPU streaming multiprocessors, etc. The POD modes and parameters of these generic building blocks can be generated and stored in a library for each design level, which can then offer cost-effective thermal simulation and thermal management of large IC structures.

Most POD applications to thermal simulations have been based on single-block approaches [29,30], [34–37], the multi-block POD approach with generic blocks have recently been successfully applied to device and IC structures [31], [33]. It was shown [31] that the multi-block POD model is able to predict accurate 3D dynamic thermal distributions and small-size hot spots in a FinFET IC structure subjected to random power pulses induced by digital input voltages. It was demonstrated that the multi-block POD simulation offers a reduction in DoF by 5–6 orders of magnitude with high accuracy compared to DNS.

In [31], the interconnects were not included in the multi-block POD simulation because it is difficult to find interconnect building blocks that are repeatedly used in different locations of ICs. Metal routings in ICs consist of a wide range of variations, which are selected based on the placement of the functional blocks. However, within a specific group of interconnects with similar features, it is still possible to select a generic block to develop a multi-block POD approach for interconnect thermal simulation if the material property variation (MPV) is considered in the POD-mode training. In the proposed multi-block MPV-POD methodology, the POD modes in each selected block are then trained to experience variation of thermal properties between the metal and dielectric at the locations where the metal lines and vias appear. To construct a multi-block MPV-POD thermal model, several small interconnect blocks with various metal/via routings modeled by the same generic-block POD model can then be glued together. To the best of our knowledge, this is the first study considering MPV in POD. It should be noted that, unlike the standard cells or many other functional circuit blocks, it is difficult to select building blocks that can be used in different groups of interconnects in ICs. Different types of generic blocks will be needed for different interconnect configurations. In this work, the developed MPV-POD methodology is applied to the interconnects that are used to wire standard cells. With the demonstration of a simple interconnect configuration to examine the concept and effectiveness of the MPV-POD approach, this study perhaps paves the way toward developing a more sophisticate multi-block MPV-POD methodology for thermal simulation of on-chip and off-chip interconnects.

After an overview of the conventional POD method in Section 2 for thermal problems, the MPV-POD methodology is presented in Section 3.

The MPV-POD approach for interconnect thermal simulation is illustrated in Section 4 step-by-step including the settings of the simulation domains for data collection and the generation of POD eigenvalues and modes. In Section 5, the MPV-POD method is demonstrated in a single-block interconnect structure and a small multi-block IC structure, compared against the DNSs. Discussions of the findings are given in Section 6, and conclusions are finally presented in Section 7.

## 2. Overview of POD thermal simulation methodology

## 2.1. POD for a single-block domain

POD generates a set of modes from spatial/temporal thermal data accounting for the parametric variation. This is done by seeking a POD mode  $\varphi(\vec{x})$  that maximizes its mean square inner product with an ensemble of thermal data  $T(\vec{x},t)$  in the domain  $\Omega$ ,

$$\left\langle \left( \int_{\Omega} T(\overrightarrow{x}, t) \varphi \, d\Omega \right)^2 \right\rangle / \int_{\Omega} \varphi^2 d\Omega,$$
 (1)

where  $\langle \cdot \rangle$  indicates an average over ensemble data sets observed in time in our study. This can also be done in static problems where the observations are carried out in response to a range of heat source strengths and BCs. This maximization process ensures that the component projected onto the POD mode contains the maximum least squares (LS) information of the thermal behavior described by the thermal data [52, 53]. In the space orthogonal to this mode, the process can be performed again to generate the second mode. Repetition of the process in this fashion results in an orthogonal set of POD modes.

Applying the variational calculus to Eq. (1), this problem can be reformulated to the Fredholm equation of the second kind,

$$\int_{\overrightarrow{x'}} R(\overrightarrow{x}, \overrightarrow{x'}) \varphi(\overrightarrow{x'}) d\overrightarrow{x'} = \lambda \varphi(\overrightarrow{x}), \tag{2}$$

where  $R(\overrightarrow{x}, \overrightarrow{x}')$  is an two-point correlation tensor given by

$$R(\overrightarrow{x}, \overrightarrow{x}') = \langle T(\overrightarrow{x}, t) \otimes T(\overrightarrow{x}', t) \rangle, \tag{3}$$

with  $\otimes$  as the tensor product, and  $\lambda$  is the POD eigenvalue of R and represents the mean squared temperature captured by the corresponding POD mode. This decomposition process leads to an eigenvalue problem represented by Eq. (2) for R. Once the POD modes are found, temperature  $T(\overrightarrow{x},t)$  can be represented by a linear combination of the POD modes,

$$T(\overrightarrow{x},t) = \sum_{i=1}^{M} a_i(t)\varphi_i(\overrightarrow{x}),\tag{4}$$

where M is the selected number of modes or DoF for the temperature solution and  $a_i$  is the time dependent coefficient for each mode. The dimension of the eigenvalue problem given in Eq. (2) for a large-scale multi-dimensional structure may be enormously large and numerically prohibitive. To generate the POD modes and eigenvalues more efficiently, the method of snapshots [29], [31], [54] is applied to convert the eigenvalue problem from a space domain to a sampling domain whose dimension is determined by the smaller number of samples/snapshots. A brief overview of the method of snapshots [54] is presented in Appendix.

To predict the temperature in Eq. (4), a set of equations for  $a_j$  is generated by projecting the heat conduction equation onto an eigenspace using the Galerkin projection method,

$$\int_{\Omega} \left( \varphi \frac{\partial \rho CT}{\partial t} + \nabla \varphi \cdot k \nabla T \right) d\Omega = \int_{\Omega} \varphi P_d d\Omega - \int_{\Gamma} \varphi (-k \nabla T \cdot \overrightarrow{n}) d\Gamma, \tag{5}$$

where k is the thermal conductivity,  $P_d$  the power density,  $\rho$  the density,

C the specific heat,  $\Gamma$  the boundary surface,  $\overrightarrow{n}$  the outward normal vector of the surface, and  $(-k\nabla T)$  the heat flux on the surface. With a selected number of modes M, the spatial integrals in Eq. (5) can be pre-evaluated to construct a set of M ordinary differential equation for  $a_i$ .

$$\sum_{i=1}^{M} c_{i,j} \frac{da_j}{dt} + \sum_{i=1}^{M} g_{i,j} a_j = P_{pod,i}, \quad i = 1 \text{ to } M,$$
 (6)

where  $c_{i,j}$  and  $g_{i,j}$  are elements of the thermal capacitance and conductance matrices in the POD space and given by

$$c_{i,j} = \int_{\Omega} \rho C \varphi_i \varphi_j d\Omega \text{ and } g_{i,j} = \int_{\Omega} k \nabla \varphi_i \nabla \varphi_j d\Omega$$
 (7)

and  $P_{pod,i}$  represents the projected power density in  $\Omega$  and heat flux across  $\Gamma$  along the ith POD mode,

$$P_{pod,i} = \int_{\Omega} \varphi_i P_d(\overrightarrow{x}, t) \, d\Omega - \int_{\Gamma} \varphi_i (-k \nabla T \cdot \overrightarrow{n}) \, d\Gamma. \tag{8}$$

Power density  $P_d$  in metal is induced by Joule heating,

$$P_d = \overrightarrow{J} \cdot \overrightarrow{E} = J^2 / \sigma. \tag{9}$$

where  $\overrightarrow{J}$  is the current density,  $\overrightarrow{E}$  the electric field, and  $\sigma$  the copper conductivity. To obtain realistic current density in data collections and demonstrations, layouts of the FinFET ICs (such as the FinFET circuit shown in Fig. 1) are constructed in a VLSI design tool [55] that is then used to generate the Spice circuits for the ICs including the interconnect routings with metal resistance. Simulations of FinFET ICs are performed in Spice with the FinFET subcircuit model adopted from [56]. Based on the current in each metal wire obtained from Spice simulation, together with the metal wire cross section and conductivity, J and  $P_d$  are calculated. With the determined  $a_i$ , the temperature can be predicted from Eq. (1). The off-diagonal terms of  $c_{i,j}$  are usually small and ignored in previous studies [29–31]. In this study, they are all included.

### 2.2. POD formulation for multi-block structure

When several blocks adjoin together, thermal continuity at the interface needs to be appropriately enforced in the last term of Eq. (5) or the second term of Eq. (8). The interior penalty DG method [46,47] is applied to the interface and Eq. (5) becomes



**Fig. 1.** Circuit and layout for a FinFET IC with horizontal M1 in blue, vertical M2 in yellow, poly in red and vias in black. The NAND-gate structures (labeled as A, B and C) are identical and given in Fig. 2.

$$\int_{\Omega} \left( \varphi \frac{\partial \rho CT}{\partial t} + \nabla \varphi \cdot k \nabla T \right) d\Omega - k \int_{\Gamma} ([[T]] \{ \nabla \varphi \} + \{ \nabla T \} [[\varphi]]) \cdot \overrightarrow{n} d\Gamma 
+ k \int_{\Gamma} \mu [[T]] [[\varphi]] d\Gamma 
= \int_{\Omega} \varphi P_d d\Omega,$$
(10)

where  $\mu$  is a penalty constant defined as  $N_{\mu}/dx$  (dx is the local element size and  $N_{\mu}$  as the non-unit penalty number), and  $\{\cdot\}$  and  $[\![\cdot]\!]$  are the average and difference across the interface, respectively (see [31]). A large positive value of  $\mu$  is usually needed to stabilize the numerical result. Numerical accuracy and stability influenced by  $N_{\mu}$  will be examined in Section 5.2.

For a domain consisting of N projected blocks, Eq. (10) reduces to an N-block POD model given as

$$\begin{bmatrix} C_{1} & 0 & \cdots & 0 \\ 0 & C_{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C_{N} \end{bmatrix} \frac{d}{dt} \begin{bmatrix} \overrightarrow{a}_{1} \\ \overrightarrow{a}_{2} \\ \vdots \\ \overrightarrow{a}_{N} \end{bmatrix}$$

$$+ \begin{bmatrix} G_{1} + \sum_{q=2}^{N} G_{1,B_{1,q}} & G_{1,2} & \cdots & G_{1,N} \\ G_{2,1} & G_{2} + \sum_{q=1,q\neq2}^{N} G_{2,B_{2,q}} & \cdots & G_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ G_{N,1} & G_{N,2} & \cdots & G_{N} + \sum_{q=1}^{N-1} G_{N,B_{N,q}} \end{bmatrix} \cdot \begin{bmatrix} \overrightarrow{a}_{1} \\ \overrightarrow{a}_{2} \\ \vdots \\ \overrightarrow{a}_{N} \end{bmatrix}$$

$$= \begin{bmatrix} \overrightarrow{P}_{1} \\ \overrightarrow{P}_{2} \\ \vdots \\ \overrightarrow{P}_{N} \end{bmatrix}$$

$$(11)$$

where the subscript  $B_{p,q}$  denotes the boundary surface in the pth block between the pth and qth blocks,  $\mathbf{C}_p$  and  $\mathbf{G}_p$  are the  $c_{i,j}$  and  $g_{i,j}$  matrices of the pth block given in Eq. (7),  $G_{p,B_{n,q}}$ . is given as

$$\mathbf{G}_{p,B_{p,q}} = \begin{bmatrix} g_{p,B_{p,q},1,1} & g_{p,B_{p,q},1,2} & \cdots & g_{p,B_{p,q},1,M_p} \\ g_{p,B_{p,q},2,1} & g_{p,B_{p,q},2,2} & \cdots & g_{p,B_{p,q},2,M_p} \\ \vdots & \vdots & \ddots & \vdots \\ g_{p,B_{p,q},M_p,1} & g_{p,B_{p,q},M_p,2} & \cdots & g_{p,B_{p,q},M_p,M_p} \end{bmatrix},$$

$$(12)$$

and  $G_{p,g}$  is given as

$$\mathbf{G}_{p,q} = \begin{bmatrix} g_{p,q;\ 1,1} & g_{p,q;\ 1,2} & \cdots & g_{p,q;\ 1,M_q} \\ g_{p,q;\ 2,1} & g_{p,q;\ 2,2} & \cdots & g_{p,q;\ 2,M_q} \\ \vdots & \vdots & \ddots & \vdots \\ g_{p,q;\ M_p,1} & g_{p,q;\ M_p,2} & \cdots & g_{p,q;\ M_p,M_q} \end{bmatrix}$$
(13)

If the pth and qth do not adjoin each other,  $G_{p,B_{p,q}}=0$ , and  $G_{p,g}=0$ . The thermal coupling conductance matrices in Eqs. (12) and (13) are given in the following equations respectively,

$$g_{p,B_{p,q},i,j} = -k \int_{\Gamma_{po}} \left( \frac{1}{2} \varphi_{p,j} \nabla \varphi_{p,i} + \frac{1}{2} \varphi_{p,i} \nabla \varphi_{p,j} - \mu \varphi_{p,i} \varphi_{p,j} \right) d\Gamma \tag{14}$$

$$g_{p,q:\;i,j} = -k \int_{\Gamma_{ov}} \left( -\frac{1}{2} \varphi_{q,j} \nabla \varphi_{p,i} + \frac{1}{2} \varphi_{p,i} \nabla \varphi_{q,j} + \mu \; \varphi_{p,i} \varphi_{q,j} \right) d\Gamma. \tag{15}$$

With a large number of blocks in a structure, the block G matrix in Eq. (11) becomes sparse. The nonzero blocks in a row are neighbored by other elements.

#### 3. POD with material property variation

In this study, we extend the POD thermal simulation methodology to account for the MPV between metal and dielectric (oxide). More specifically, for a selected interconnect block that may include one or more layers of metal and interlayer dielectric (ILD), the POD modes are trained to include the effects of the variation when a metal line or a via appears in the dielectric. Thermal solution data collected from DNSs to generate/train the POD modes must include the MPV in the selected block to offer influences of the MPV on the solution of Eq. (6). It is however difficult for POD modes to effectively capture the effects of the MPV if the metal lines and/or vias appear at arbitrary locations. For such a case, a large number of POD modes will be needed to predict accurate thermal solution; namely the DoF will not be significantly reduced. It is therefore more effective to train a finite set of POD modes for a specific group of interconnects that have similar metal/via routing features to reveal some generic blocks. Each generic block is designated for blocks with the same geometric shape and identical dimensions, consisting of one or more metal and ILD layers. Each generic block can be trained to generate POD modes that are able to account for effects of MPV induced by various metal/via routings.

There may be several different generic interconnect blocks (e.g., with different block dimensions or with wider metal lines or larger pitches) needed to cover different groups of interconnects. Using this approach, once each selected generic block is trained and projected onto its POD space, their modes and model parameters can be stored in a technology library. To build a large interconnect structure, several projected generic blocks can be glued together to construct a multi-block MPV-POD thermal model, as presented in Section 2.2. There may also be a handful of nonstandard interconnect blocks that do not belong to any of the generic blocks. The POD modes for these individual blocks can also be generated either with or without MPV and implemented in the multi-block structure.

A simple FinFET IC shown in Fig. 1 is used to elucidate the concept of the MPV-POD approach. The IC consists of 3 identical FinFET NAND2 gates with the NAND2 structure given in Fig. 2. For ICs using standard cells shown in Fig. 1, the interconnects reveal similar features that can be utilized to identify a generic block. As illustrated in Fig. 1, the interconnects are partitioned into 6 blocks, each with the same shape and identical dimensions. In this case, each block includes Metal-1 (M1), Metal-2 (M2), ILDs and the substrate, and one generic block shown in Fig. 3 can be used to represent a group of interconnect blocks with all possible metal/via routings specified in Fig. 3 based on the technology design rules. The generic block includes 6 possible vias and 14 possible metal segments with square metal pieces connecting the neighboring segments and vias. Each of the 6 interconnect blocks with a specific metal/via routing in Fig. 1 belongs to this generic block or its mirror symmetric block. These 6 blocks or their mirror symmetric blocks are redrawn in Fig. 4 with more clearly defined metal segments. For example, Block 2a consists of M1 Segments 5-8 and M2 Segments 10 and 13 connected to M1 Segments 3 and 4 by a via. Also, Block 4a, the mirror symmetric block of Block 4a' in Fig. 1, includes M2 Segment 12 connected to M1 Segment 5 by a via and M2 Segments 11 and 14 connected



Fig. 2. FinFET NAND2 structure with 4 fins in each FinFET.



Fig. 3. A generic interconnect block with all possible metal lines and vias whose sizes are labeled in nm. M1 and M2 thicknessess are all equal to 60 nm. Dielectric thickness between two metal layers is 100 nm. Each of the metal segment is labeled by a number.

## to M1 Segments 1-3 by a via.

The method of snapshots [29], [31], [54] can be applied in Eq. (2) to extract the POD modes from the collected data. In all previous POD studies [29-44], material properties in the structure remain unchanged during the data collection. Therefore, each projected POD block represents a specific structure. If a material change in any location of the structure, a different POD model will be needed. In the MPV-POD approach for interconnects, a group of blocks including different metal/via routings with appropriate values of  $\rho C$  and k are performed in DNSs to embed effects of MPV into the POD modes. However, even if the generated MPV-POD modes include the MPV information, the POD parameters given in Eqs. (7) and (8) still need to be evaluated numerically before each POD simulation because the material in the structure for each simulation may be different. This imposes a time-consuming process for thermal prediction of a large structure. This is however not a serious problem for interconnects. Using the example of the generic block in Fig. 3, one can pre-evaluate POD parameters from integrals in Eqs. (7) and (8) for this block without the regions where metal lines/vias may run. The parameters for this blank generic block with only the dielectric and the silicon substrate, can be stored in the library. Once the routings are determined in each block of this generic group, it will take little time to add integrals in Eqs. (7) and (8) of the unfilled regions by filling either metal or oxide in the vacant regions before POD thermal simulation.

## 4. Illustration of MPV-POD mode generation

The generic interconnect block given in Fig. 3 is selected in this work to illustrate thermal data collection, mode generation and demonstrations for the MPV-POD methodology. The dimensions and material properties of the FinFET IC structure used in this study are adopted from an earlier investigation in [31].

To arrive at robust POD modes, the collected data from the selected interconnect blocks need to experience realistic variations of heating sources, BCs and material properties. To account for metal/oxide variations effectively along the routing paths, a simple guideline is used; namely each metal/via segment must appear a few times with different connections to other metal/via segments. For this test case, the 6 interconnect blocks given in Fig. 4 (adopted from the circuit in Fig. 1) are included to collect thermal data in DNSs. In addition, 6 more selected blocks with different metal/via routings shown in Fig. 5 are incorporated in the data collection. This ensures that each metal/via segment in these 12 blocks appears at least twice with connections to different metal/via segments. One can always include many more different connections for each segment to enhance the quality of the collected data and thus the robustness of the POD modes. This however will produce more data and may become computationally intensive in data collection, eigenvalue and mode generation, and calculations of POD model parameters from Eqs. (7) and (8).

DNSs of each block in Figs. 4 and 5 need to perform to collect



Fig. 4. Six interconnect blocks in Fig. 1. Blocks 1a-3a are on the top of the layout in Fig. 1 while Blocks 4a-6a are the mirror symmetric blocks of those on the bottom of Fig. 1. An open square at the via location indicates no via existence.



Fig. 5. Six additional blocks used in DNSs for thermal data collection to generate POD modes for the generic interconnect block given in Fig. 3.

dynamic thermal data with spatial details, subjected to joule heating in the metal and BCs induced by the neighboring blocks. To account for heat fluxes across boundaries of each block appropriately, each of the 12 selected block is embedded in a larger simulation domain with an extended length of 200 nm beyond the selected block on each side, such as the diagram shown in Fig. 6 for Block 2a. Any metal line reaching a boundary of the selected block is extended to the boundary of the simulation domain. The extension allows us to apply a more realistic range of BCs on the selected block boundaries. For example, BCs on the

oxide boundary of the selected block (inside the simulation domain in Fig. 6) near metal lines are affected heat flow along these lines outside the selected block. The metal line extension shown in Fig. 6 offers realistic heat flow across the interface between blocks shown in Fig. 1. These BC variations offer information for the generated POD modes to learn enough variations to be able to accurately predict the heat flow in the metal wires even with different BCs. It should be pointed out that this simplified domain setting, although covering the major heat flux paths across the boundaries of the selected interconnect block (Block 2a), does



**Fig. 6.** Block 2a placed in a larger dielectric domain for DNSs for thermal data collection. The extended length of the domain in each direction is 200 nm.

not account for some BCs in the multi-block demonstration presented in Section 5.2. This is indeed partially responsible for the error observed in the multi-block demonstration. One can always include more simulations to accommodate more metal routings around the selected block to cover more variations of the BCs for the selected block that will enhance the quality of the collected data. This will again increase the computing resources needed for the POD mode training and model parameter calculations.

Power density induced by Joule heating in Eq. (9) along metal lines results from current density. To apply more realistic dynamic current density, circuit simulation of the FinFET IC in Fig. 1 is performed in Spice. As shown in the circuit of Fig. 1, a 4 GHz voltage clock is applied to one of the inputs of each gate while a random digital voltage sequence is applied to the second input of each gate ( $v_{i1}$  or  $v_{i2}$ ). Dynamic current along each metal line is extracted from the Spice simulation and the power density is then implemented in the DNS of the simulation domain in Fig. 6. Even though the interconnect routings may be different between Figs. 1 and 6, the variations of the current and power density cover a range of metal BCs and joule heating for the generate POD modes to learn. The heat flux applied to each metal boundary of the simulation domain is consistent with the power density applied to its metal line. Ambient temperature is applied to the bottom of the substrate and the top boundary is adiabatic. On other boundary surfaces (oxide boundaries sown in Fig. 6), dynamic heat fluxes across the boundaries between the interconnect blocks extracted from the DNSs of the FinFET circuit in Fig. 1, together with some random variations, are applied. This offers more realistic variations of the boundary fluxes affected by neighboring interconnect blocks.

DNS of each selected block was performed over 10 clock periods with 50 time steps in each period. Thermal data are collected at each time step. It should be noted that simulation settings for data collection (in turn for POD mode training) are not unique. Thermal data collected from DNSs of the selected blocks should as much as possible encompass a range of parametric variations which the generic blocks will experience in realistic operation, including variations of heat sources, BCs and metal/oxide along the interconnect routing paths.

Thermal data of the 12 interconnect blocks collected from the DNSs are combined together to generate eigenvalues and POD modes from Eq. (2) using the method of snapshots [31], [54]. The generic block in Fig. 3 is thus projected onto a POD space represented by these POD modes. The eigenvalue  $\lambda_i$  represents the mean squared temperature variations captured by the i-th mode  $\varphi_i$  and thus reveals the importance of the mode. The eigenvalue for the collected data shown in Fig. 7 decreases by more than 600 times from the first POD mode to the 6th mode and more than 4 orders of magnitude at 10th mode. The eigenvalue curve becomes nearly flat beyond the 102nd mode due to the machine precision.



Fig. 7. Eigenvalue of collected thermal data from the 12 selected blocks.

### 5. Demonstration of the MPV-POD methodology

### 5.1. Single-block demonstration

With the POD modes generated from the thermal data collected from the DSNs of the 12 selected blocks given in Figs. 4 and 5, the MPV-POD approach is first demonstrated below in a single-block domain shown in Fig. 8. To test how the modes via the training accommodate different metal patterns, two distinct metal/via paths are included in the test block. The left metal/via path in the test block of Fig. 8 is different from any of the 12 selected blocks, and the right path is the same as the right metal/via path in Block 9a. Different random pulses of current density and different BCs are applied to the test block except the top adiabatic boundary and the ambient bottom boundary. To have meaningful comparison with DNS, current density, BCs and metal/via routings are identical in the simulations between these two approaches.

Dynamic temperatures at Points a and b specified in Fig. 8 are illustrated in Fig. 9 with the MPV-POD approach compared against the DNS. Because Point b is in a metal/via path identical to the right metal/via path in Block 9a, only 3 modes in the POD model are needed to reach an excellent agreement with DNS, as shown in Fig. 9(c) and 9(d). On the other hand, errors with 3 modes shown in Fig. 9(a) and 9(b) are relatively large at Point a. With 6 modes in POD, both approaches are in a very good agreement, where the POD model reaches a maximum deviation of 2.47% from DNS near a peak temperature at 0.675 ns.

Spatial temperature profiles at 0.675 s in the test block are also shown in Fig. 10. Similar to Fig. 9, temperatures derived from the 3-mode MPV-POD model along Line B in M2 Segments 11 and 14 (Fig. 10(b)) and along Line A in M1 Segment 4 (beyond 4  $\mu m$  in Fig. 10 (a)) are in excellent agreement with that from the DNS. This is because the metal/via path from M1 Segment 4 to M2 Segments 11 and 14 (see Fig. 8) is identical to one of paths in Block 9a used in data collection.



**Fig. 8.** Interconnect test block for the single-block demonstration. Points a and b are in the center of M1 lines at distances of 0.24  $\mu$ m and 0.45  $\mu$ m from the left boundary, respectively. Lines A and C run through the ceters of M1 lines and Line B through the M2-line center for temperature plots given in Fig. 10.



**Fig. 9.** Dynamic temperatures in M1 (a) at Point *a* specified in Fig. 8 with a more detailed illustration in (b) from 0.52 ns to 1.025 ns, and (c) at Point *b* specified in Fig. 8 with a more detailed illustration in (d).

Along Lines A and C on the other metal/via path, the error derived from the MPV-POD model with 3 modes are however slightly large. With 6 modes in POD, a maximum error near 3% along Lines A and C is observed compared to the DNS.

The LS errors of the MPV-POD approach for different number of modes over the whole simulation domain and time are given in Table 1. For this single-block case, the error decreases as the number of modes increases and remains near 1.81% beyond 8 modes. With 6 or more modes, an LS error smaller than 2% can be achieved. The demonstration reveals that even for an interconnect routing not included in POD mode training, the MPV-POD model with a small number modes is still able to offer a very accurate prediction.

# 5.2. Multi-block demonstration in an integrated circuit

To verify the multi-block MPV-POD methodology under a more realistic operation, the multi-block approach is implemented in a FinFET IC structure shown in Fig. 11. The layout in Fig. 11 represents the same circuit in Fig. 1 with NAND2 Gates B and C swapped in the layout. The POD modes for the generic interconnect block in Fig. 3 and its mirror symmetric block are applied to all the six interconnect blocks in Fig. 11 including Blocks 1b-3b and 4b'-6b'. In addition to the 6 interconnect blocks represented by the generic MPV-POD block, the POD thermal simulation of the entire IC structure also includes 3 additional NAND2 blocks described by one set of POD modes (without MPV effects) developed in [31]. The POD simulation is thus performed in a 9-block domain that is projected onto a POD space described by only 2 sets of POD modes, one set for the generic interconnect block and the other for the NAND2 block. The same number of modes are used for each of the 9 projected blocks in the demonstration. In the demonstration, a 4 GHz voltage clock is applied to the 3-NAND2 circuit. Different random digital voltages are applied to  $v_{i1}$  and  $v_{i2}$  in Spice simulation to estimate the power densities at device junctions and along each metal lines. These power densities are then implemented in both the POD simulation and DNS. In the simulation domain, the bottom of the substrate is fixed at ambient and all other boundaries are adiabatic except for the metal boundaries where surface power densities are applied based on the

power evaluated from the Spice simulation.

Blocks 2b, 3b and 5b' in Fig. 11 are different from any trained block (or their symmetric blocks) given in Figs. 4 and 5 while Blocks 1b, 4b and 6b are identical to Blocks 1a, 4a and 5a in Fig. 4, respectively. It should be noted that, even with Blocks 1b, 4b and 6b identical to 3 of the blocks in the data collection, the simulation settings for the 6 interconnect blocks in Fig. 11 are very different from those in the training. Simulation domains for the selected 12 blocks in the training process are similar to the structure given in Fig. 6 with the BCs that only emulate the effects induced by neighboring interconnect blocks. Also, there was no adiabatic boundary in these trained blocks. On the other hand, in the demonstration there is at least one adiabatic boundary on the dielectric surface of each block, as shown in Fig. 11. Moreover, one of the boundaries of each block in the demonstration is neighbored by a NAND2 block instead of an interconnect block. The VDD/GND M1 line in the NAND2 circuit shown in Fig. 11 runs in parallel closely with one of the boundary surfaces of each interconnect blocks. These M1 lines impose entirely different BCs for the interconnect blocks from the trained blocks where no metal line runs in parallel near any boundary.

Our study of this multi-block domain has found that the multi-block POD approach becomes numerically unstable if the penalty number  $N_{\mu}$  is below  $N_{\mu,min}$  or above  $N_{\mu,max}$ , where  $N_{\mu,min}\approx 7$  and  $N_{\mu,max}\approx 40$ . In the demonstration, the value of  $N_{\mu}$  used in [31] ( $N_{\mu}=20$ ) is applied first. Other values of  $N_{\mu}$  between 7 and 40 are then used to analyze its influence on the interface thermal discontinuity and the accuracy. The POD thermal modeling approach without considering the MPV for the FinFET IC has been investigated in detail in a previous study [31]. In this study, we only present thermal solution derived from the MPV-POD models in the interconnect blocks compared against the DNS.

Dynamic temperatures at Points a, b, c and d in M1 labeled in Fig. 11 under the vias are illustrated in Fig. 12. Even though Points a, c and d are located in the blocks identical to some selected trained blocks, due to inadequate boundary settings in the training, accuracy of dynamic temperatures in Fig. 12 ( $N_{\mu}=20$ ) for the multi-block MPV-POD model with 6 modes at these locations is not as good as those at Points a and b in Fig. 9 for the single-block case. It needs 10 or 11 modes for the multi-block model to reach good accuracy. With 11 modes, maximum errors



**Fig. 10.** Temperature profiles at 0.675 ns along (a) Line A in Fig. 8 through M1, (b) Line B in M2 and (c) Line C through M1.

Table 1
LS error of single-block MPV-POD models.

| No. of modes | 1    | 3    | 6    | 8    | 10   | 11  | 12   |
|--------------|------|------|------|------|------|-----|------|
| LS error (%) | 6.82 | 3.14 | 1.97 | 1.82 | 1.81 | 1.8 | 1.81 |

near 2.5%– 4.4% around the peak temperatures at Points a-d are observed in Fig. 12. The LS error shown in Table 2 with  $N_\mu=20$  is near 3% with 11 modes and slightly reduces to 2.67% with 12 modes, that is still reasonably small considering that at least 2 BCs for each block are very different from the training settings, one with the adiabatic condition and the other induced by the VDD/GND M1 lines. Also, Point b is located in a block with very different metal/via routings from the training ones. With very different BCs and metal/via routings in the training, the MPV-POD modes are somehow able to predict temperature profiles with good accuracy in the multi-block structure.

With only 3 modes, it is interesting to observe very accurate solution at Point d but poor accuracy at other locations. Also, much better accuracy is observed for the 6-mode MPV-POD model at Points c and d than at Points a and b. It should be noted that the POD process in Eq. (1) only optimizes the LS error instead of local errors. The LS error



Fig. 11. Layout for the NAND circuit given in Fig. 1. This layout is however different from that in Fig. 1.

shown in Table 2 with  $N_{\mu}=20$  for the multi-block MPV-POD simulation reduces gradually, as more modes are used, and it reaches 2.67% with 12 POD modes. Even with more modes included in the multi-block case, its LS error only slightly reduces and is always greater than that of the single-block case. This is because accuracy of the multi-block case is limited by the data quality as a result of serious discrepancies of the BCs between the demonstration and the training. The other reason why more modes are needed for the multi-block MPV-POD approach to reach good accuracy is because of the boundary thermal discontinuities at block interfaces. Owing to the truncation of the solution given in Eq. (4), it is impossible to satisfy both the continuities of temperature and heat flux on any boundary. The interior penalty DG method [46], [47] is thus to enforce the flux continuity but allows a small temperature discontinuity (weak Dirichlet BCs) at the interface in an average sense between any 2 neighboring blocks.

The temperature distributions at 0.215 ns along Lines A and B through the center of M1 lines (see Fig. 11) are illustrated in Fig. 13. Overall the MPV-POD model with more modes offers more accurate thermal solution; however the 3-mode model actually leads to a better accuracy than the 11 mode model in Block 6b' along Line B, as shown in Fig. 13(b) and at Point d in Fig. 12. Along the metal lines in Line A, temperature discontinuities derived from the POD models with a small number of modes appear clearly across interfaces between neighboring blocks. With 3 modes, a discontinuity higher than 5% of the interface temperature is observed at each interface, as detailed in Fig. 14(a) and 14(b). Using 8 or 9 modes, the discontinuity is effectively suppressed to 2.2% or 1.5%, respectively, and it is successfully suppressed with 10 or more modes. Except for the one mode model, in general as the discontinuity is suppressed with more modes in the MPV-POD model, Table 2 shows that the LS error is reduced. Effects of the discontinuity is minimum along Line B because both interfaces are located in oxide.

To understand how  $N_\mu$  influences the interface thermal discontinuities, different penalty numbers are applied in the POD simulations of the 9-block structure. Temperature profiles near the same interfaces in Fig. 14 are illustrated in Fig. 15 derived from the 8-mode and 11-mode POD models with  $N_\mu$  slightly above  $N_{\mu,min}$ , slightly below  $N_{\mu,max}$  and  $N_\mu$  = 20. Fig. 15 shows that the temperature discontinuity is suppressed and the accuracy of the POD prediction is improved as  $N_\mu$  decreases from 37.2 to 7.3 for both 8-mode and 11-mode MPV-POD models. When  $N_\mu$  = 37.2, even with 11 modes in the MPV-POD model, discontinuities are still observed at bot interfaces in Fig. 15(a) and (b). As  $N_\mu$  decreases to 20 or 7.3, the discontinuities are successfully removed. In addition, use of smaller values of  $N_\mu$  leads to a better agreement with the DNS. As  $N_\mu$  decreases from 37.2, 20–7.3, the improvement in the LS error for



**Fig. 12.** Dynamic temperatures in M1 at (a) Point a, (b) Point b, (c) Point c and (d) Point d shown in Fig. 11 with  $N_{\mu} = 20$  in the multi-block POD models.

Table 2
LS error (%) of multi-block POD models.

| No. of modes     | 1     | 3    | 6    | 8    | 10   | 11   | 12   |
|------------------|-------|------|------|------|------|------|------|
| $N_{\mu}=7.3$    | 9.87  | 7.51 | 4.68 | 4.25 | 3.21 | 3.01 | 2.67 |
| $N_{\mu}=20$     | 10.40 | 7.87 | 4.95 | 4.28 | 3.23 | 3.02 | 2.67 |
| $N_{\mu} = 37.2$ | 11.55 | 8.47 | 6.93 | 6.76 | 5.9  | 5.43 | 5.19 |



Fig. 13. Temperature profiles at 0.215 ns in Fig. 12 along (a) Line A and (b) Line B shown in Fig. 11 through the center of M1 line. In the multi-block POD models,  $N_{\mu}=20$ .



**Fig. 14.** Close-up views of the discontinuities on (a) the left and (b) the right of the temperature profiles in Fig. 13(a). In the multi-block POD models,  $N_{\mu}=20$ .

different number of modes is clearly observed in Table 2. A reduction of 1.5%–2.7% in the LS error can be achieved as  $N_{\mu}$  changes from 37.2 to 7.3 with 6 or more modes. However, with 10 or more modes, use of  $N_{\mu}$  equal to or below 20 is not able to further reduce the LS error and the LS errors for both  $N_{\mu}=7.3$  and 20 reach 2.67% with 12 modes even though the local errors near the interfaces shown in Fig. 15(a) and 15(b) reduce



**Fig. 15.** Temperature profiles in (a) and (b) at the same interfaces as those in Fig. 14(a) and 14(b), respectively. In the multi-block POD models, the profiles with 8 and 11 modes are included for  $N_u = 7.3$ , 20 and 37.2.

with a smaller  $N_{\mu}$ . It is believed that the LS error in this case is limited by the quality of the collected thermal data in the training of the generic block due to inadequate BCs, as discussed above.

### 6. Discussions

It is found in this investigation on the single-block and multi-block interconnect structures that quality of the collected data, as a result of simulation settings in the training, is the key to determine the accuracy and robustness of the MPV-POD models. Appropriate settings of the simulation domain in the training of the POD modes are needed to cover enough variations of metal/via routings, power sources and BCs which the interconnect blocks will encounter in realistic operation.

In the single-block case, an MPV-POD model with only 3 modes offers excellent accuracy for dynamic temperature at Point b (between Segments 4 and 11 in Fig. 8) shown in Fig. 9(c) and (d) and for spatial temperature distributions shown in Fig. 10(a) and (b) along Segment 4 and Segments 11 and 14, respectively. This is because this metal/via routing is identical to a metal/via path in a selected trained block with perhaps slightly different BCs and power sources. Even with a metal/via routing in the test domain different from any selected trained block, the MPV-POD model with 6 modes was still able to offer a very good agreement with the DNS.

The simulation settings used to train the generic-block, although offering a good quality training for the single block case, does not provide good-quality data for the interconnect blocks embedded in the IC given in Fig. 11 due to serious inconsistent BCs between the demonstration and the training. Therefore, the POD modes of the generic block are not well-trained to adapt the BCs enforced by the IC operation in the multi-block case. As a result, the temporal/spatial temperature solution from the MPV-POD model in the multi-block case is not as good as that in the single-block demonstration. In addition, the multi-block approach suffers from inevitable thermal discontinuities across the interface

between neighboring blocks. To further minimize the LS error in the multi-block structure, more POD modes are needed. It is also found that the value of the penalty number  $N_{\mu}$  has a profound impact on the thermal discontinuities at block interfaces. An appropriate range for  $N_{\mu}$  (in our case  $7 < N_{\mu} < 40$ ) is needed to avoid numerical instability induced by the discontinuities, and a smaller  $N_{\mu}$  within this range offers a smaller LS error. In the multi-block test case, even with inadequate BC training and different metal/via routings in 50% of the interconnect blocks, the MPV-POD modes of the generic interconnect block are able to offer a good thermal prediction, compared to with the DNS, as displayed in Table 2, if a smaller  $N_{\mu}$  within the appropriate range is used.

Applications of the MPV-POD methodology to the single-block and multi-block interconnect structure reveal interesting and encouraging findings. This work demonstrates that, in addition to the heat sources and BCs, variations of material thermal properties between metal and oxide in interconnects can be captured effectively by the POD modes as long as enough variations of metal/via routings are implemented in the data collection (or the training). It has been shown that a small number of MPV-POD modes are able to offer a very accurate prediction of spatial/temporal thermal solution in interconnect blocks with routings that are different from the trained ones provided that the adequate BCs are included in the training. Even with inadequate BCs in the training, the MPV-POD model with 10 or more modes still offer a good description for the thermal solution in the multi-block interconnects. Apparently, the trained generic block was able to intelligently perform extrapolation to capture variations of metal/via routings and BCs to reach an accurate prediction for new routings or substantially different BCs as long as the block was trained by a wide range of routing and BC

In order to accurately predict the small-diameter hot spots at the device junctions of the 9-block IC structure given in Fig. 11, including 6 interconnect blocks and 3 NAND-gate blocks, a high spatial resolution is needed in DNS. In this case, the POD simulation of the IC structure with  $N_{\mu}=7.3$  and 10 modes in each block offers a reduction in the DoF by more than 4 orders of magnitude, compared to the DNS using ANSYS Mechanical APDL [57]. This amounts to a speedup over 3 orders of magnitude in computational time.

### 7. Conclusions

Use of building blocks has been one of the major practices for more effective engineering design and simulation in many different fields. In order to implement the building-block concept to improve effectiveness of thermal simulation of interconnects, MPV is proposed in POD to capture thermal effects of metal/via routings embedded in a dielectric structure. With this approach, some generic building blocks may be selected for a group of interconnect structures with similar metal/via routing features. Each selected generic block can be projected onto a POD space represented by a finite set of POD modes that are trained to capture variations of material properties, power sources and BCs. These generic blocks can then be glued together to construct a large interconnect structure. In this study, the interconnects used to wire the standard cells for a FinFET logic IC in Fig. 1 are partitioned into standard-size blocks modeled by a single generic block shown in Fig. 3. The selected generic block was trained to generate its POD modes and model parameters and then applied to demonstrate the accuracy and robustness of the MPV-POD approach in a single interconnect block and a multi-block IC structure. For the simulation of the 9-block IC structure, the POD simulation methodology offers a saving in computational time over 3 orders of magnitude, compared to the DNS.

The investigation reveals the importance of the quality of thermal data used to generate POD modes and parameters for the developed MPV-POD methodology. However, even with inadequate BCs accounted for in the training (that offers insufficient data quality), the MPV-POD model of the generic block is still able to offer a good prediction of the spatial/dynamic thermal simulation in the multi-block interconnect

structure if more modes are used. It is also shown that the accuracy of the multi-block MPV-POD model is partially limited by the thermal discontinuities at block interfaces unless more modes are used. To the best of our knowledge, this study presents the first POD modeling approach with the MPV. It has been shown that it is possible to effectively account for thermal effects of the MPV induced by different metal/via routings in the POD modes for thermal simulation of interconnects. However, to derive robust MPV-POD models, in addition to a wide range of metal/via routings needed in the training of POD modes, BCs implemented in the training need to be close to those in realistic operation. To further improve the multi-block MPV-POD models, perhaps alternative schemes for more effectively minimizing the interface

discontinuity need to be investigated.

### **Declaration of Competing Interest**

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

### Acknowledgments

This work was partially supported by the National Science Foundation, USA, under Grant No. ECCS-2003307.

# Appendix. : Brief overview of method of snapshots

The autocorrelation tensor in (3) can be rewritten in terms of the average of the sample data sets over a number of snapshots (observations),

$$R\left(\overrightarrow{x},\overrightarrow{x}'\right) = \frac{1}{N_t} \sum_{j=1}^{N_t} T\left(\overrightarrow{x}, t_j\right) T\left(\overrightarrow{x}', t_j\right),\tag{A1}$$

and thus (2) becomes

$$\frac{1}{N_t} \sum_{i=1}^{N_t} T(\overrightarrow{x}, t_j) \int_{\Omega} T(\overrightarrow{x}', t_j) \varphi(\overrightarrow{x}') d\overrightarrow{x}' = \lambda \varphi(\overrightarrow{x}), \tag{A2}$$

where  $N_t$  is the total number of observations in time. In a static problem, the observations are performed at different heat source strengths and/or different BC's

Let us now define the projection of the jth sample data set onto the POD space given by the integral in (A2) as

$$u_{j} = \int_{\Omega} T(\overrightarrow{x}', t_{j}) \varphi(\overrightarrow{x}') d\overrightarrow{x}', \tag{A3}$$

and (A2) can then rewritten as

$$\frac{1}{N_t} \sum_{j=1}^{N_t} T(\overrightarrow{x}, t_j) \ u_j = \lambda \varphi(\overrightarrow{x}). \tag{A4}$$

After multiplying both sides of (A4) by  $T(\vec{x}, t_i)$  and performing an integral on each side over the entire domain, the follow equation is obtained

$$\frac{1}{N_t} \sum_{i=1}^{N_t} u_j \int_{\Omega} T(\overrightarrow{x}, t_i) \ T(\overrightarrow{x}, t_i) d\overrightarrow{x} = \lambda \int_{\Omega} T(\overrightarrow{x}, t_i) \ \varphi(\overrightarrow{x}) d\overrightarrow{x}, \tag{A5}$$

which can be expressed as a matrix equation for a different eigenvalue problem,

$$\begin{bmatrix} A_{11} & \cdots & A_{1j} & \cdots & A_{1N_t} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ A_{i1} & \cdots & A_{ij} & \cdots & A_{iN_t} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ A_{N_t1} & \cdots & A_{N_tj} & \cdots & A_{N_tN_t} \end{bmatrix} \begin{bmatrix} u_1 \\ \vdots \\ u_j \\ \vdots \\ u_{N_t} \end{bmatrix} = \lambda \begin{bmatrix} u_1 \\ \vdots \\ u_j \\ \vdots \\ u_{N_t} \end{bmatrix},$$

$$(A6)$$

with  $u_j$  given in (A3) and

$$A_{ij} = \frac{1}{N_t} \int_{\Omega} T(\overrightarrow{x}, t_i) \ T(\overrightarrow{x}, t_j) d\overrightarrow{x} \tag{A7}$$

Once the eigenvectors in (A6) are determined, the POD modes at can be recovered using (A4) as a linear combination of the observations,

$$\varphi(\overrightarrow{x}) = \frac{1}{N_{l}\lambda} \sum_{i=1}^{N_{l}} T(\overrightarrow{x}, t_{j}) u_{j}. \tag{A8}$$

The eigenvalues derived from (A6) and the modes estimated (A8) have been shown in (A1)-(A8) to be identical to the first  $N_t$  eigenvalues and POD modes, respectively, given in (2). For a large-scale domain with fine resolution, (2) represents an eigenvalue problem with an unmanageably large size, compared to the relatively small matrix size offered by (A6).

#### References

- [1] S. Rzepka, K. Banerjee, E. Meusel, C. Hu, Characterization of self-heating in advanced VLSI interconnect lines based on thermal finite element simulation, IEEE Trans. Comp. Pack., Manuf. Technol. Part A v21 (3) (1998) 406–411.
- [2] A. Kim, B. Li, B. Linder, Transient self-heating modeling and simulations of back-end-of-line interconnects. In: Proceedings of the 2018 IEEE Int. Reliability Phys. Symp. (IRPS), Burlingame, CA, 2018, pp. P-MR.2–1-P-MR.2–5.
- [3] K.N. Tu, Yingxia Liu, Menglu Li, Effect of Joule heating and current crowding on electromigration in mobile technology, Appl. Phys. Rev. 4 (011101) (2017).
- [4] R. Kanapady, D. Moore, A. Raghupathy, W. Maltz, Influence of temperature gradient on electromigration failures in 3D packaging. In: Proc. Conf. Therm. Thermomech. Phenom. Electron. Syst. (ITherm), Las Vegas, NV, USA, 2016, pp. 70–76.
- [5] M. Li, D.W. Kim, S. Gu, D.Y. Parkinson, H. Barnard, K.N. Tu, Joule heating induced thermomigration failure in un-powered microbumps due to thermal crosstalk in 2.5D IC technology, J. Appl. Phys. 120 (075105) (2016).
- [6] G. Kunti, J. Dhar, A. Bhattacharya, S. Chakraborty, Alternating current electrothermal flow for energy efficient thermal management of microprocessor hot spots. In: Proceedings of the 25th Int. Workshop Therm.Investigations ICs & Systems (THERMINIC), Lecco, Italy, 2019, pp. 1–6.
- [7] X. Lu, T. Shi, Q. Xia, G. Liao, Thermal conduction analysis and characterization of solder bumps in flip chip package, Appl. Therm. Eng. 36 (2012) 181–187.
- [8] T.-Y. Chiang, K.C. Saraswat, Closed-form analytical thermal model for accurate temperature estimation of multilevel ULSI interconnects. In: Proceedings of the 2003 Symp. VLSI Circuits. Digest of Tech. Papers (IEEE Cat. No.03CH37408), Kyoto, Japan, 2003, pp. 275–278.
- [9] D. Chen, E. Li, E. Rosenbaum, S.-M. Kang, Interconnect thermal modeling for accurate simulation of circuit timing and reliability, IEEE Trans. CAD ICs Syst. 19 (2) (2000) 197–205.
- [10] L. Choobineh, A. Jain, Determination of temperature distribution in threedimensional integrated circuits (3D ICs) with unequally-sized die, Appl. Therm. Eng. 56 (2013) 176–184.
- [11] C. Wang, X.-J. Huang, K. Vafai, Analysis of hotspots and cooling strategy for multilayer three-dimensional integrated circuits, Appl. Therm. Eng. 186 (2021), 116336
- [12] C.-H. Chang, et al., A novel 14 nm extended body FinFET for reduced corner effect, self-heating effect, and increased drain current, World Acad. Sci. Eng. Technol. 7 (78) (2013) 332–335
- [13] S. Kolluri, K. Endo, E. Suzuki, K. Banerjee, Modeling and analysis of self-heating in FinFET devices for improved circuit and EOS/ESD performance. In: Proc. IEDM Int. Electron Devices Meeting (IEDM), Washington, DC, USA, Dec. 2007, pp. 177–180.
- [14] C. Xu, S.K. Kolluri, K. Endo, K. Banerjee, Analytical thermal model for self-heating in advanced FinFET devices with implications for design and reliability, IEEE Trans. CAD ICs, Syst. 32 (7) (2013) 1045–1058.
- [15] P. Paliwoda, et al., Self-heating assessment on bulk FinFET devices through characterization and predictive simulation, IEEE Trans. Device Mater. Reliab. 18 (2) (2018) 133–138.
- [16] M. Jin et al., Hot carrier reliability characterization in consideration of self-heating in FinFET technology. In: Proceedings of the 2016 IEEE Int. Reliability Phys. Symp. (IRPS), Pasadena, CA, pp. 2 A.2.1–2 A.2.5.
- [17] T.-Y. Chiang, B. Shieh, K.C. Saraswat, Impact of Joule heating on scaling of deep sub-micron Cu/low-k interconnects. In: Proceedings of the Symp. VLSI Technol. Dig. Tech. Papers, Honolulu, HI, 2002, pp. 38–39.
- [18] C. Ryu, et al., Microstructure and reliability of copper interconnects, IEEE Trans. Electron Devices 46 (6) (1999) 1113–1120.
- [19] K.-D. Lee et al., Effect of Joule Heating on electromigration in dual-damascene copper low-k interconnects. In: Proceedings of the 2017 IEEE Int. Reliability Physics Symp. (IRPS), Monterey, CA, 2017, pp. 6B-6.1–6B-6.5.
- [20] E.G. T. Bosch, M.N. Sabry, Thermal compact models for electronic systems. In: Proc. 18th Annu. IEEE Symp. Semicond. Therm. Meas. Manage., San Jose, CA, USA, Mar. 2002, pp. 21–29.
- [21] W. Huang, M.R. Stan, K. Skadron, Parameterized physical compact thermal modeling. In: IEEE Trans. Compon. Packag. Technol., vol. 28, no. 4, pp. 615–622, Dec. 2005.
- [22] M.-N. Sabry ,M. Dessouky, "A framework theory for dynamic compact thermal models. In: Proc. 28th Annu. IEEE Semicond. Therm. Meas. Manage. Symp., San Jose, CA, USA. Mar. 2012, pp. 189–194.
- [23] A. Bansal, M. Meterelliyoz, S. Singh, J.H. Choi, J. Murthy, K. Roy, Compact thermal models for estimation of temperature-dependent power/performance in FinFET technology. In: Proceedings of the Asia & South Pacific Conf. Design Automation, 2006, Yokohama, Japan, Jan. 2006, pp. 24–27.
- [24] W. Huang, K. Sankaranarayanan, R.J. Ribando, M.R. Stan, and K. Skadron, An improved block-based thermal model in HotSpot 4.0 with granularity considerations. WDDD'07, San Diego, CA, USA, Jun. 2007.
- [25] S.P. Gurrum, Y.K. Joshi, W.P. King, K. Ramakrishna, M. Gall, A compact approach to on-chip interconnect heat conduction modeling using the finite element method, ASME J. Electron. Packag 130 (2008), 031001.
- [26] K. Zhang, M.C. Cheng, Thermal Circuit for SOI Structure Accounting for Non-Isothermal effect, IEEE Trans. Electron Devices 57 (2010) 2838–2847.
- [27] F. Yu, M.C. Cheng, Electrothermal Simulation of SOI CMOS Analog Integrated Circuits, Solid-State Electron. 51 (2007) 691–702.
- [28] M.C. Cheng, J.A. Smith, W. Jia, Ryan Coleman, An effective thermal model for finfet structure, IEEE Trans. Electron Devices 61 (1) (2014) 202–206.

- [29] W. Jia, B.T. Helenbrook, M.-C. Cheng, Thermal modeling of multi-fin field effect transistor structure using proper orthogonal decomposition, IEEE Trans. Electron Devices 61 (8) (2014) 2752–2759.
- [30] W. Jia, B. Helenbrook, and M.-C. Cheng, Thermal modeling of multi-gate field effect transistors based on a reduced order model. In: Proc. IEEE Annu. Semicond. Therm. Meas. Manage. Symp., San Jose, CA, USA, Mar. 2014, pp. 230–235.
- [31] W. Jia, B.T. Helenbrook, M.-C. Cheng, Fast thermal simulation of FinFET circuits based on a multiblock reduced-order model, IEEE Trans. CAD ICs. Syst. 35 (7) (2016) 1114–1124.
- [32] L. Feng, Y. Yue, N. Banagaaya, et al., Parametric modeling and model order reduction for (electro-)thermal analysis of nanoelectronic structures, J. Math. Ind. 6 (2016) 10.
- [33] D.S. Meyer, B.T. Helenbrook, M.-C. Cheng, Proper orthogonal decomposition-based reduced basis element thermal modeling of integrated circuits, Int. J. Numer. Methods Eng. 112 (2017) 479–500.
- [34] R. Venters, B.T. Helenbrook, K. Zhang, M.-C. Cheng, Proper-orthogonal-decomposition based thermal modeling of semiconductor structures, IEEE Trans. Electron Devices 59 (11) (2012) 2924–2931.
- [35] B. Barabadi, S. Kumar, Y.K. Joshi, Transient heat conduction in on-chip interconnects using proper orthogonal decomposition method, ASME J. Heat. Transf. 139 (7) (2017), 072101.
- [36] A. Nokhosteen, M. Soltani, B. Barabadi, Reduced order modeling of transient heat transfer in microchip interconnects, ASME J. Electron. Packag. 141 (1) (2019), 011002.
- [37] D. Ahlman, F. Soderlund, J. Jackson, A. Kurdila, W. Shyy, Proper orthogonal decomposition for time-dependent lid-driven cavity flows, Numer. Heat. Transf., Part B 42 (4) (&;;2010) 285–306.
- [38] J.L. Lumley, The structure of inhomogeneous turbulent flows. Atmospheric Turbulence and Radio Wave Propagation, Nauka, Moscow, Russia, 1967, pp. 166–178.
- [39] G. Berkooz, P. Holmes, J.L. Lumley, The proper orthogonal decomposition in the analysis of turbulent flows, Annu. Rev. Fluid Mech. 25 (1993) 539–575.
- [40] P.A. LeGresley, J.J. Alonso, "Investigation of non-linear projection for POD based reduced order models for aerodynamics. In: Proc. 39th Aerosp. Ind. Assoc. Amer. Aerosp. Sci. Meeting Exhibit. (AIAA), Reno, NV, USA, Jan. 2001, pp. 1–15.
- [41] K. Lu, Y. Jin, Y. Chen, Y. Yang, L. Hou, Z. Zhang, Z. Li, C. Fu, Review for order reduction based on proper orthogonal decomposition and outlooks of applications in mechanical systems, Mech. Syst. Signal Process. 123 (2019) 264–297.
- [42] D. Binion, X. Chen, A Krylov enhanced proper orthogonal decomposition method for frequency domain model reduction, Eng. Comput. 34 (2) (2017) 285–306.
- [43] Y. Maday, E.M. Ronquist, A reduced-basis element method, C. R. Math. 335 (2) (2002) 195–200.
- [44] Y. Maday, E.M. Ronquist, The reduced basis element method: Application to a thermal fin problem, SIAM J. Sci. Comput. 26 (1) (2004) 240–258.
- [45] S. Sahyoun, S. Djouadi, Local proper orthogonal decomposition based on space vectors clustering. In: Proceedings of the 3rd International Conference on Systems and Control, 2013, pp. 665–670.
- [46] D.N. Arnold, F. Brezzi, B. Cockburn, D. Marini, Discontinuous Galerkin methods for Elliptic Problems (Lecture Notes in Computational Sci. & Eng.), Springer, Berlin, Germany, 2000, pp. 89–101.
- [47] D.N. Arnold, F. Brezzi, B. Cockburn, L. Donatella Marini, Unified analysis of discontinuous Galerkin methods for elliptic problems, SIAM J. Numer. Anal. 39 (5) (2002) 1749–1779.
- [48] A.M. Amini, D. Dureisseix, P. Cartraud, N. Buannic, A domain decomposition method for problems with structural heterogeneities on the interface: Application to a passenger ship, Comput. Methods Appl. Mech. Eng. 198 (2009) 3452–3463.
- [49] M. Papadrakakis, G. Stavroulakis, A. Karatarakis, A new era in scientific computing: domain decomposition methods in hybrid CPU-GPU architectures, Comp. Methods Appl. Mech. Eng. 200 (13–16) (2011) 1490–1508.
- [50] H. Bao, R. Chen, An efficient domain decomposition parallel scheme for Leapfrog ADI-FDTD method, IEEE Trans. Antennas Prop. 65 (3) (2017) 1490–1494.
- [51] P. Liu, V. Dinavahi, Matrix-free nodal domain decomposition with relaxation for massively parallel finite-element computation of EM apparatus, IEEE Tran. Magn. 54 (9) (2018) 1–7.
- [52] T. Hummel, A. Pacheco-Vega, Application of Karhunen-Lo`eve expansions for the dynamic analysis of a natural convection loop for known heat flux, J. Phys. Ser. 395 (1) (2012), 012121.
- [53] V. Algazi, D. Sakrison, On the optimality of the Karhunen-Lo`eve expansion (Corresp.), IEEE Trans. Inf. Theory 15 (2) (1969) 319–320.
- [54] L. Sirovich, Turbulence and the dynamics of coherent structures. I–Coherent structures. II–Symmetries and transformations. III–Dynamics and scaling, Quart. Appl. Math. 45 (1987) (pp. 561–571 and 573–590).
- [55] https://www.staticfreesoft.com/
- [56] https://ptm.asu.edu/
- [57] https://www.ansys.com/



Wangkun Jia was born in Nanjing, China. He received his B.S. degree from Southeast University, Nanjing, China, in 2007 and M.S. and Ph.D. degrees in electrical engineering from Clarkson University, Potsdam, NY, in 2011 and 2020, respectively. His research interests include thermal and electro-thermal modeling of semiconductor devices and ICs.



Ming-C. Cheng received his B.S. degree in Electrophysics from National Chiao-Tung University, Taiwan, and M.S. and Ph.D. degrees from Polytechnic University, Brooklyn, NY, in Systems Engineering and Electrical Engineering, respectively. He was with University of New Orleans, New Orleans, LA, from 1990 to 2000 and is currently a professor of Electrical and Computer Engineering at Clarkson University, Potsdam, NY. His research has covered many different areas including transport modeling of semiclassical and quantum devices, optimization of solid-state devices, electro-thermal simulation of solid-state devices and integrated circuits (ICs), and electromagnetic simulation for eddy current and hysteresis losses in magnetic materials.

Recently, he has devoted his effort to developing efficient and accurate methods for physics simulations based on data-driven learning algorithms for different areas of research, including dynamic thermal analysis of ICs and CPUs/GPUs, simulations of quantum eigenvalue problems for nanostructures and materials, electromagnetic simulations, etc.