Title: Thermal Estimation for 3D-ICs Through Generative Networks
Thermal limitations play a significant role in modern integrated chip (IC) design and performance. 3D integrated chips (3DICs) make the thermal problem even worse due to a high density of transistors and heat dissipation bottlenecks within the stack-up. These issues heighten the need for quick thermal solutions throughout the design flow. This paper presents a generative approach for modeling how power maps to heat dissipation in a 3DIC. The approach focuses on a single layer in a stack and shows that, given the power map, the model can generate the resultant heat for the bulk. Two approaches are presented: a straightforward one in which the model uses only the power map, and another in which it learns the additional parameters through random vectors. The first approach recovers the temperature maps to within 1.2 °C, or a root-mean-squared error (RMSE) of 0.31 over images with pixel values ranging from -1 to 1. The second approach performs better, with the RMSE decreasing to 0.082 in a 0 to 1 range. In all cases, model inference takes less than 100 milliseconds for any given power map. These results show that the generative approach has speed advantages over traditional solvers while enabling results with reasonable accuracy for 3DIC, opening the door for thermally aware floorplanning.
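The abstract reports RMSE on normalized images, so a minimal sketch of how such a figure can be computed, and rescaled to temperature units, may help in reading the numbers. It assumes the maps were linearly (min-max) normalized, which the abstract does not state; the function names and the example temperature span are hypothetical and are not taken from the paper.

```python
import numpy as np

def rmse(pred, ref):
    """Root-mean-squared error between two equally shaped maps."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

def rescale_rmse(rmse_norm, t_min, t_max, lo=0.0, hi=1.0):
    """Convert an RMSE measured on [lo, hi]-normalized pixels to degrees.

    Assumes a linear mapping from [t_min, t_max] to [lo, hi]; this is an
    illustrative assumption, not the paper's stated preprocessing.
    """
    return rmse_norm * (t_max - t_min) / (hi - lo)

# Toy example: the prediction is uniformly 0.1 above the reference map.
reference = np.zeros((4, 4))
predicted = reference + 0.1
err_norm = rmse(predicted, reference)

# Hypothetical temperature span of 40 to 80 degrees C for the normalized range.
err_deg = rescale_rmse(err_norm, t_min=40.0, t_max=80.0)
```

Because the scaling is linear, the same normalized RMSE corresponds to half as many degrees when the pixel range is -1 to 1 rather than 0 to 1, so errors quoted over different ranges are not directly comparable.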
Award ID(s):
1624770 2137288 2137283 2137259 2345055
PAR ID:
10435239
Author(s) / Creator(s):
; ; ; ; ; ;
Publisher / Repository:
IEEE
Date Published:
Journal Name:
2023 IEEE International 3D Systems Integration Conference (3DIC)
Issue:
2023
Page Range / eLocation ID:
1 to 4
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. In this paper, we propose a novel transient full-chip thermal map estimation method for multi-core commercial CPUs based on data-driven generative adversarial learning. We treat the thermal modeling problem as an image-generation problem using generative neural networks. Instead of using traditional functional unit powers as input, the new models are directly based on the measurable real-time high-level chip utilizations and thermal sensor information of commercial chips, without assuming any additional physical sensors. The resulting thermal map estimation method, called ThermGAN, can provide tool-accurate full-chip transient thermal maps from the given performance monitor traces of commercial off-the-shelf multi-core processors. In our work, both the generator and the discriminator are composed of simple convolutional layers with the Wasserstein distance as the loss function. ThermGAN can provide transient, real-time thermal maps without using any historical data for training and inference, in contrast with a recent RNN-based thermal map estimation method in which historical data is needed. Experimental results show the trained model is very accurate in thermal estimation, with an average RMSE of 0.47 °C, namely 0.63% of the full-scale error. Our data further show that the model takes less than 7.5 ms per inference, which is two orders of magnitude faster than traditional finite element based thermal analysis. Furthermore, the new method is about 4x more accurate than a recently proposed LSTM-based thermal map estimation method and has faster inference speed. It also achieves about 2x the accuracy with much less computational cost than a state-of-the-art pre-silicon based estimation method.
  2. This paper proposes a computational fluid dynamics (CFD) simulation methodology for the multi-design variable optimization of heat sinks for natural convection single-phase immersion cooling of high power-density Data Center server electronics. Immersion cooling provides the capability to cool higher power-densities than air cooling. Due to this, retrofitting Data Center servers initially designed for air-cooling for immersion cooling is of interest. A common area of improvement is in optimizing the air-cooled component heat sinks for the fluid and thermal properties of liquid cooling dielectric fluids. Current heat sink optimization methodologies for immersion cooling demonstrated within the literature rely on a server-level optimization approach. This paper proposes a server-agnostic approach to immersion cooling heat sink optimization by developing a heat sink-level CFD to generate a dataset of optimized heat sinks for a range of variable input parameters: inlet fluid temperature, power dissipation, fin thickness, and number of fins. The objective function of optimization is minimizing heat sink thermal resistance. This research demonstrates an effective modeling and optimization approach for heat sinks. The optimized heat sink designs exhibit improved cooling performance and reduced pressure drop compared to traditional heat sink designs. This study also shows the importance of considering multiple design variables in the heat sink optimization process and extends immersion heat sink optimization beyond server-dependent solutions. The proposed approach can also be extended to other cooling techniques and applications, where optimizing the design variables of heat sinks can improve cooling performance and reduce energy consumption.
  3. In this article, we address the problem of accurate full-chip power and thermal map estimation for commercial off-the-shelf multi-core processors. This remains a challenging problem for processors operating with heat sink cooling because direct measurement is difficult. We first propose an accurate full-chip steady-state power density map estimation method for commercial multi-core microprocessors. The new method consists of a few steps. First, a 2D spatial Laplace operation is performed on the thermal maps (images) measured without a heat sink to obtain the so-called "raw power maps". Then, a novel scheme is developed to generate the true power density maps from the raw power density maps. The new approach is based on thermal measurements of the processor with back-side cooling using an advanced infrared (IR) thermal imaging system. An FEM thermal model constructed in COMSOL Multiphysics is used to validate the estimated power density maps and thermal conductivity. Later, this work creates a high-fidelity FEM thermal model with a heat sink and reconstructs the full-chip thermal maps while the heat sink is on. Ensuring that power maps are similar under back cooling and heat sink cooling settings, the reconstructed thermal maps are verified by matching the on-chip thermal sensor readings against the corresponding elements of the thermal maps. Experiments on an Intel i7-8650U 4-core processor with back cooling show 96% similarity (2D correlation) between the measured thermal maps and the thermal maps reconstructed from the estimated power maps, with 1.3 °C average absolute error. Under heat sink cooling, the average absolute error is 2.2 °C over a 56 °C temperature range, with about 3.9% error between the computed and the real thermal maps at the sensor locations. Furthermore, the proposed power map estimation method achieves higher resolution and at least a 100× speedup over a recently proposed state-of-the-art Blind Power Identification method.
  4. In this work, we propose, for the first time, a novel machine-learning approach for the real-time estimation of chip-level spatial power maps for commercial Google Coral M.2 TPU chips. The new method can enable the development of more robust runtime power and thermal control schemes that take advantage of spatial power information, such as hot spots, that is otherwise not available. Unlike existing commercial multi-core processors, for which real-time performance-related utilization information is available, the TPU from Google does not expose such information. To mitigate this problem, we propose to use features related to the workloads of running different deep neural networks (DNNs), such as the hyperparameters of the DNN and TPU resource information generated by the TPU compiler. The new approach involves the offline acquisition of accurate spatial and temporal temperature maps captured from an external infrared thermal imaging camera under nominal working conditions of a chip. To build the dynamic power density map model, we apply generative adversarial networks (GANs) based on the workload-related features. Our study shows that the estimated total powers match the manufacturer's total power measurements extremely well. Experimental results further show that the predictions of power maps are quite accurate, with an RMSE of only 4.98 mW/mm², or 2.6% of the full-scale error. Deployed on an Intel Core i7-10710U, the proposed approach runs in as little as 6.9 ms, which is suitable for real-time estimation.
  5. This study presents a novel approach to optimal control utilizing a Koopman operator integrated with a linear quadratic regulator (LQR) to enhance the thermal management and power output efficiency of an open-cathode proton exchange membrane fuel cell (PEMFC) stack. First, a linear time-invariant dynamic model was derived through the Koopman operator to forecast the behavior of the PEMFC stack. Second, this Koopman-based model was directly integrated with LQR for optimizing temperature, temperature variations, and output power efficiency of the PEMFC stack by regulating fan speed, with a physics-based model serving as the plant model. Finally, the performance of the Koopman-based LQRs (KLQR) was compared to a baseline proportional-integral (PI) controller across various ambient temperatures and operating conditions, focusing on temperature, temperature variations, and net power output. The results demonstrate that the proposed Koopman-based approach can be seamlessly integrated with linear optimal control algorithms, effectively minimizing temperature, temperature variations across the PEMFC stack, and the net power outputs under different ambient temperatures and operating conditions.
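Item 3 in the list above describes obtaining "raw power maps" by applying a 2D spatial Laplace operation to measured thermal maps. A minimal sketch of that single step is given here, assuming steady-state conduction in a thin layer so that the power density q satisfies q = -k ∇²T, discretized with a five-point stencil. The function name, the conductivity `k`, and the grid spacing `dx` are hypothetical placeholders; the article's later step that turns raw maps into true power density maps is not reproduced.

```python
import numpy as np

def raw_power_map(temp_map, k=1.0, dx=1.0):
    """Estimate a raw power density map as -k * Laplacian(T).

    temp_map: 2D array of temperatures on a uniform grid.
    k:        effective thermal conductivity (hypothetical value).
    dx:       grid spacing (hypothetical value).
    Returns interior points only, so the result is 2 smaller per axis.
    """
    t = np.asarray(temp_map, dtype=float)
    # Five-point stencil: sum of the four neighbours minus 4x the centre.
    lap = (
        t[2:, 1:-1] + t[:-2, 1:-1] + t[1:-1, 2:] + t[1:-1, :-2]
        - 4.0 * t[1:-1, 1:-1]
    ) / dx**2
    return -k * lap

# Toy check: T = x^2 + y^2 has a constant Laplacian of 4,
# so the raw power map is -4 * k everywhere in the interior.
y, x = np.mgrid[0:6, 0:6]
toy_temp = (x**2 + y**2).astype(float)
toy_power = raw_power_map(toy_temp, k=1.0, dx=1.0)
```

With this sign convention a local temperature peak has a negative Laplacian and therefore maps to a positive power density, which matches the intuition that hot spots sit over heat sources.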