Abstract The increasing demand for high-performance computing (HPC) in applications such as the Internet of Things, deep learning, big data for crypto-mining, virtual reality, and healthcare research on genomic sequencing and cancer treatment has led to the growth of hyperscale data centers. To meet the cooling energy demands of HPC data centers, efficient cooling technologies must be adopted. Traditional air cooling, direct-to-chip liquid cooling, and immersion cooling are some of those methods. Among these, liquid cooling is superior to the various air-cooling methods in terms of energy consumption. Direct on-chip cooling using cold plate technology is one such method, used to remove heat from high-power electronic components such as CPUs and GPUs. Thermal design power (TDP) has increased rapidly over the years and will continue to increase, not only for CPUs and GPUs but also for associated electronic components such as DRAM, the Platform Controller Hub (PCH), and other I/O chipsets on a typical server board. Therefore, unlike air hybrid cooling, which uses liquid for the cold plates and air as the secondary medium for cooling the associated electronics, we foresee using immersion-based fluids to cool the rest of the electronics in the server. The broader focus of this research is to study the effects of adopting immersion cooling with integrated cold plates for high-performance systems. Although several other factors are involved in the study, this paper focuses on the optimization of cold plate microchannels for immersion-based fluids in an immersion-cooled environment. Because immersion fluids are dielectric and the fluids conventionally used in cold plates are conductive, a leak of cold plate coolant into the tank poses a major risk of short-circuiting the electronics. Therefore, we propose pumping the immersion fluid itself through the cold plate. However, this raises concerns about thermal performance and pumping power because of the differences in viscosity and other fluid properties. To address this, the objective is to optimize the cold plate microchannel fin parameters by evaluating thermal resistance and pressure drop across the cold plate. The detailed CFD model and the optimization of the cold plate were done using Ansys Icepak and Ansys optiSLang, respectively.
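The two figures of merit named in the abstract, thermal resistance and the pumping power implied by the pressure drop, can be illustrated with a minimal sketch. It assumes the commonly used definitions (case-to-inlet temperature rise divided by heat load, and pressure drop times volumetric flow rate); the abstract does not state which definitions the authors use, and the function names and all numeric inputs below are hypothetical.

```python
def thermal_resistance(t_case_c, t_inlet_c, heat_load_w):
    """Cold plate thermal resistance in K/W (assumed definition):
    (case temperature - coolant inlet temperature) / heat load."""
    return (t_case_c - t_inlet_c) / heat_load_w


def pumping_power(pressure_drop_pa, flow_rate_m3_s):
    """Ideal hydraulic pumping power in W (assumed definition):
    pressure drop across the cold plate times volumetric flow rate."""
    return pressure_drop_pa * flow_rate_m3_s


# Hypothetical operating point, for illustration only.
r_th = thermal_resistance(t_case_c=60.0, t_inlet_c=40.0, heat_load_w=400.0)
p_pump = pumping_power(pressure_drop_pa=5000.0, flow_rate_m3_s=2.0e-5)
print(f"R_th = {r_th:.3f} K/W, pumping power = {p_pump:.3f} W")
```

A more viscous dielectric fluid would typically raise the pressure drop at a given flow rate, which is why the two quantities are evaluated together when fin parameters are varied.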
Accelerated Degradation Of Copper Cold Plates In Direct-to-Chip Liquid Cooling in Data Centers
Increasing demands for cloud-based computing and storage, the Internet of Things, and machine learning-based applications have necessitated the use of more efficient cooling technologies. Direct-to-chip liquid cooling using cold plates has proven to be one of the most effective methods to dissipate the high heat fluxes of modern high-power CPUs and graphics processing units (GPUs). While the published literature has well-documented research on the thermal aspects of direct liquid cooling, a detailed account of reliability degradation is missing. The present investigation provides an in-depth experimental analysis of the accelerated degradation of copper cold plates used in high-power direct-to-chip liquid cooling in data centers.
- Award ID(s):
- 2209751
- PAR ID:
- 10537808
- Publisher / Repository:
- ASHRAE
- Date Published:
- Journal Name:
- ASHRAE journal
- ISSN:
- 0001-2491
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
More than ever before, data centers must deploy robust thermal solutions to adequately host the high-density, high-performance computing that is in high demand. The newer generation of central processing units (CPUs) and graphics processing units (GPUs) has substantially higher thermal power densities than previous generations. In recent years, more data centers have come to rely on liquid cooling for the high-heat processors inside the servers and air cooling for the remaining low-heat information technology equipment. This hybrid cooling approach creates a smaller and more efficient data center. The deployment of direct-to-chip cold plate liquid cooling is one of the mainstream approaches to providing concentrated cooling to targeted processors. In this study, a processor-level experimental setup was developed to evaluate the cooling performance of a novel computer numerical control (CNC) machined nickel-plated impinging cold plate on a 1 in. × 1 in. mock heater that represents a functional processing unit. The pressure drop and thermal resistance performance curves of the electroless nickel-plated cold plate are compared to those of a pure copper cold plate. A temperature uniformity analysis is done using computational fluid dynamics and compared to the actual test data. Finally, the CNC-machined pure copper cold plate is compared to other reported cold plates to demonstrate the superiority of the design with respect to cooling performance.
-
Abstract Data centers are witnessing an unprecedented increase in processing and data storage, resulting in an exponential increase in the servers’ power density and heat generation. Data center operators are looking for green, energy-efficient cooling technologies with low power consumption and high thermal performance. Typical air-cooled data centers must maintain safe operating temperatures while cooling high-power server components such as CPUs and GPUs. This makes air cooling inefficient with regard to heat transfer and energy consumption for applications such as high-performance computing, AI, cryptocurrency, and cloud computing, forcing data centers to switch to liquid cooling. Additionally, air cooling has a higher OPEX because of higher server fan power. Liquid Immersion Cooling (LIC) is an affordable and sustainable cooling technology that addresses many of the challenges that come with air cooling. LIC is becoming a viable and reliable cooling technology for many high-power applications, leading to reduced maintenance costs, lower water utilization, and lower power consumption. In terms of environmental effect, single-phase immersion cooling outperforms two-phase immersion cooling. There are two types of single-phase immersion cooling: forced convection and natural convection. Forced convection has a higher overall heat transfer coefficient, which makes it advantageous for cooling high-powered electronic devices. Natural convection makes it possible to simplify the cooling components, including eliminating the pump. There are, however, some advantages to forced convection, especially at low flow velocities where the pumping power is relatively negligible. This study compares a baseline forced convection single-phase immersion-cooled server, run at three different inlet temperatures, with four different natural convection configurations that use different server powers and cold plates. Since the buoyancy effect of the hot fluid is leveraged to generate a natural flow in natural convection, cold plates are designed to remove heat from the server. For performance comparison, a natural convection model with cold plates is designed in which water is the fluid flowing in the cold plate. A high-density server is modeled in Ansys Icepak, with a total server heat load of 3.76 kW. The server is made up of two CPUs and eight GPUs, with each chip having its own thermal design power (TDP). For both heat transfer conditions, the fluid used in the investigation is EC-110, operated at inlet temperatures of 30°C, 40°C, and 50°C. The coolant flow rate in forced convection is 5 GPM, whereas the flow rate in the natural convection cold plates is varied. CFD simulations are used to reduce chip case temperatures through the use of both forced and natural convection. Pressure drop and pumping power are also evaluated on the server for the given inlet temperature range, and the best operating parameters are established. The numerical study shows that forced convection systems can maintain much lower component temperatures than natural convection systems, even when the natural convection systems are modeled with enhanced cooling characteristics.
-
The rapid growth in data center workloads and the increasing complexity of modern applications have led to significant contradictions between computational performance and thermal management. Traditional air-cooling systems, while widely adopted, are reaching their limits in handling the rising thermal footprints and higher rack power densities of next-generation servers, often resulting in thermal throttling and decreased efficiency and emphasizing the need for more efficient cooling solutions. Direct-to-chip liquid cooling with cold plates has emerged as a promising solution, providing efficient heat dissipation for high-performance servers. However, challenges remain, such as ensuring system stability under varying thermal loads and optimizing integration with existing infrastructure. This study examines data center liquid cooling, providing a novel, comprehensive experimental investigation of the critical steps and tests necessary for commissioning coolant distribution units (CDUs) in direct-to-chip liquid-cooled data centers. It investigates the hydraulic, thermal, and energy aspects, establishing the groundwork for Liquid-to-Air (L2A) CDU data centers. A CDU’s performance was evaluated under different conditions. First, the CDU’s maximum cooling capacity was evaluated and found to be as high as 89.9 kW at an approach temperature difference (ATD) of 18.3 °C with a heat exchanger effectiveness of 0.83. Then, to assess the cooling performance and stability of the CDU, a low-power test and a transient thermohydraulic test were conducted. The results showed instability in the supply fluid temperature (SFT) caused by oscillation in fan speed at low thermal loads. Despite this, heat removal rates remained constant across varying supply air temperatures (SATs), and a partial power usage effectiveness (PPUE) of 1.042 was achieved at 100% heat load (86 kW) under different SATs. This research sets a foundation for improving L2A CDU performance and offers practical insights for overcoming current cooling limitations in data centers.
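To give a sense of scale for the reported PPUE, the following minimal sketch converts it into an implied cooling overhead. It assumes the usual definition of partial PUE, (IT power + cooling subsystem power) / IT power, and assumes that the 86 kW heat load is the IT power in the denominator; neither assumption is confirmed by the abstract, and the function name is hypothetical.

```python
def cooling_overhead_kw(ppue, it_load_kw):
    """Cooling power implied by a partial PUE, assuming
    PPUE = (IT power + cooling power) / IT power."""
    return (ppue - 1.0) * it_load_kw


# Values quoted in the abstract: PPUE of 1.042 at an 86 kW heat load.
overhead = cooling_overhead_kw(ppue=1.042, it_load_kw=86.0)
print(f"Implied cooling overhead: {overhead:.2f} kW")
```

Under these assumptions, the cooling subsystem would draw roughly 3.6 kW to remove 86 kW of heat.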
-
Recent commercial efforts have reestablished the benefits of cooling server modules using direct liquid cooling (DLC) technology. The primary drivers behind this technology are the increase in chip densities and the absolute need to reduce overall data center power usage. In DLC technology, a cold plate is situated on top of the chip with thermal interface material (TIM) between the chip and the cold plate. The low thermal resistance path facilitates the use of warm water, which helps data centers replace the chilled water system with a water-side economizer utilizing ambient temperature. This work describes the effort to leverage DLC by employing microchannel cold plates to cool multi-chip modules. The primary objective of this work is to build a sophisticated test rig to characterize the flow and thermal performance of a microchannel cold plate for cooling a two-die chip. This study highlights the challenges of building an experimental setup that simulates a two-die chip package and the approaches taken to overcome them. A parallel channel cold plate is used to benchmark the performance. Tests were conducted for a set of independent variables such as flow rate, input power to the dice, coolant temperature, flow direction, and TIM resistance. The results are presented as PQ curves, specific thermal resistance curves, and case temperature distributions reflecting the effect of changing the input variables.

