The rapid growth in data center workloads and the increasing complexity of modern applications have led to significant contradictions between computational performance and thermal management. Traditional air-cooling systems, while widely adopted, are reaching their limits in handling the rising thermal footprints and higher rack power densities of next-generation servers, often resulting in thermal throttling and decreased efficiency, emphasizing the need for more efficient cooling solutions. Direct-to-chip liquid cooling with cold plates has emerged as a promising solution, providing efficient heat dissipation for high-performance servers. However, challenges remain, such as ensuring system stability under varying thermal loads and optimizing integration with existing infrastructure. This comprehensive study digs into the area of data center liquid cooling, providing a novel, comprehensive experimental investigation of the critical steps and tests necessary for commissioning coolant distribution units (CDUs) in direct-to-chip liquid-cooled data centers. It carefully investigates the hydraulic, thermal, and energy aspects, establishing the groundwork for Liquid-to-Air (L2A) CDU data centers. A CDU’s performance was evaluated under different conditions. First, the CDU’s maximum cooling capacity was evaluated and found to be as high as 89.9 kW at an approach temperature difference (ATD) of 18.3 °C with a 0.83 heat exchanger effectiveness. Then, to assess the cooling performance and stability of the CDU, a low-power test and a transient thermohydraulic test were conducted. The results showed instability in the supply fluid temperature (SFT) caused by the oscillation in fan speed at low thermal loads. Despite this, heat removal rates remained constant across varying supply air temperatures (SATs), and a partial power usage effectiveness (PPUE) of 1.042 was achieved at 100% heat load (86 kW) under different SATs.
This research sets a foundation for improving L2A CDU performance and offers practical insights for overcoming current cooling limitations in data centers.
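The figures quoted in this abstract can be related to one another through two common textbook definitions. The short Python sketch below is illustrative only and is not taken from the study: it assumes PPUE is defined as (IT load + cooling power) / IT load and that heat exchanger effectiveness is the actual heat transfer divided by the maximum possible heat transfer. The function names are hypothetical, and the CDU power draw is not stated in the abstract, so it is back-calculated here from the reported PPUE.

```python
def ppue(it_load_kw: float, cooling_power_kw: float) -> float:
    """Partial PUE, assuming the definition (IT load + cooling power) / IT load."""
    return (it_load_kw + cooling_power_kw) / it_load_kw


def cooling_power_from_ppue(it_load_kw: float, ppue_value: float) -> float:
    """Invert the assumed PPUE definition to estimate the cooling power draw."""
    return it_load_kw * (ppue_value - 1.0)


def max_possible_heat_transfer(q_actual_kw: float, effectiveness: float) -> float:
    """Effectiveness = q_actual / q_max, so q_max = q_actual / effectiveness."""
    return q_actual_kw / effectiveness


# Values reported in the abstract: 86 kW heat load at a PPUE of 1.042.
cdu_power_kw = cooling_power_from_ppue(86.0, 1.042)  # roughly 3.6 kW (derived, not reported)

# Values reported in the abstract: 89.9 kW capacity at 0.83 effectiveness.
q_max_kw = max_possible_heat_transfer(89.9, 0.83)  # roughly 108 kW (derived, not reported)

print(f"Implied CDU power draw: {cdu_power_kw:.2f} kW")
print(f"Implied maximum possible heat transfer: {q_max_kw:.1f} kW")
```

Under these assumptions, a PPUE of 1.042 means that cooling adds about 4.2% on top of the heat load being served.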
Study on the Characterization of Filters for a Direct-to-Chip Liquid Cooling System
Data center cooling systems have undergone a major transformation in the persistent pursuit of better performance and lower energy use. Liquid cooling systems, particularly direct-to-chip systems, have emerged as a promising solution to address the increasing heat dissipation challenges. One critical component of such systems is the filtration mechanism, responsible for safeguarding the integrity and efficiency of the cooling process. These factors are pivotal in ensuring the reliable and sustainable operation of liquid cooling systems in high-demand applications, where electronic components continually push the boundaries of heat generation. This study undertakes a thorough examination of filters of different mesh size used in direct-to-chip liquid cooling systems. The research is multifaceted, encompassing the evaluation of filter performance, pressure drop characteristics, and long-term durability. The methodology employed in this research combines testing with a coolant distribution unit and rack setup to provide a holistic perspective on filter functionality. Findings from this study shed light on the key parameters that influence filter performance. Ultimately, the results of this research promise to contribute significantly to the advancement of direct-to-chip liquid cooling systems, facilitating the continued evolution of electronics in diverse fields, such as high-performance computing, data centers, and emerging technologies. With a focus on enhancing system reliability, efficiency, and sustainability, this study seeks to provide a valuable resource for engineers and researchers in the pursuit of effective cooling solutions for cutting-edge electronic applications.
- Award ID(s):
- 2209751
- PAR ID:
- 10537801
- Publisher / Repository:
- American Society of Mechanical Engineers
- Date Published:
- Journal Name:
- Proceedings
- ISSN:
- 2577-1000
- ISBN:
- 979-8-3503-8534-2
- Format(s):
- Medium: X
- Location:
- San Diego, California, USA
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Abstract The increasing demand for high-performance computing in applications such as the Internet of Things, deep learning, big data for crypto-mining, virtual reality, healthcare research on genomic sequencing, cancer treatment, etc. has led to the growth of hyperscale data centers. To meet the cooling energy demands of HPC data centers, efficient cooling technologies must be adopted. Traditional air cooling, direct-to-chip liquid cooling, and immersion are some of those methods. Among these, liquid cooling is superior to the various air-cooling methods in terms of energy consumption. Direct on-chip cooling using cold plate technology is one such method used in removing heat from high-power electronic components such as CPUs and GPUs in a broader sense. Thermal Design Power (TDP) has been increasing rapidly over the years and will continue to increase in the coming years for not only CPUs and GPUs but also associated electronic components like DRAMs, the Platform Controller Hub (PCH), and other I/O chipsets on a typical server board. Therefore, unlike air hybrid cooling, which uses liquid for cold plates and air as the secondary medium for cooling the associated electronics, we foresee using immersion-based fluids to cool the rest of the electronics in the server. The broader focus of this research is to study the effects of adopting immersion cooling with integrated cold plates for high-performance systems. Although there are several other factors involved in the study, the focus of this paper will be the optimization of cold plate microchannels for immersion-based fluids in an immersion-cooled environment. Since immersion fluids are dielectric and the fluids used in cold plates are conductive, there is a major risk of the cold plate fluid leaking into the tank and short-circuiting the electronics. Therefore, we propose pumping the immersion fluid itself through the cold plate.
However, this raises concerns about poorer thermal performance and higher pumping power due to the difference in viscosity and other fluid properties. To address the thermal and flow performance, the objective is to optimize the cold plate microchannel fin parameters by evaluating thermal resistance and pressure drop across the cold plate. The detailed CFD model and optimization of the cold plate were done using Ansys Icepak and Ansys optiSLang, respectively.
-
Increasing demands for cloud-based computing and storage, the Internet of Things and machine learning-based applications have necessitated the use of more efficient cooling technologies. Direct-to-chip liquid cooling using cold plates has proven to be one of the most effective methods to dissipate the high heat fluxes of modern high-power CPUs and graphics processing units (GPUs). While the published literature has well-documented research on the thermal aspects of direct liquid cooling, a detailed account of reliability degradation is missing. The present investigation provides an in-depth experimental analysis of the accelerated degradation of copper cold plates used in high-power direct-to-chip liquid cooling in data centers.
-
Abstract The data center’s server power density and heat generation have increased exponentially because of the recent, unparalleled rise in the processing and storing of massive amounts of data on a regular basis. One-third of the overall energy used in conventional air-cooled data centers is directed toward cooling information technology equipment (ITE). The traditional air-cooled data centers must have low air supply temperatures and high air flow rates to support high-performance servers, rendering air cooling inefficient and compelling data center operators to use alternative cooling technology. Due to the direct interaction of dielectric fluids with all the components in the server, single-phase liquid immersion cooling (Sp-LIC) addresses these problems by offering a significantly greater thermal mass and a high percentage of heat dissipation. Sp-LIC is a viable option for hyper-scale, edge, and modular data center applications because, unlike direct-to-chip liquid cooling, it does not call for a complex liquid distribution system configuration and the dielectric liquid can make direct contact with all server components. Immersion cooling is superior to conventional air-cooling technology in terms of thermal energy management; however, there have been very few studies on the reliability of such cooling technology. A detailed assessment of the material compatibility of different electronic packaging materials for immersion cooling was required to comprehend their failure modes and reliability. For the mechanical design of electronics, the modulus and thermal expansion are essential material characteristics. The substrate is a crucial element of an electronic package that has a significant impact on the reliability and failure mechanisms of electronics at both the package and the board level.
As per Open Compute Project (OCP) design guidelines for immersion-cooled IT equipment, the traditional material compatibility tests from standards like ASTM 3455 can be used with certain appropriate adjustments. The primary focus of this research is to address two challenges. The first is to understand the impact of thermal aging on the thermo-mechanical properties of the halogen-free substrate core in single-phase immersion cooling. The second is to comprehend how thermal aging affects the thermo-mechanical characteristics of the substrate core in air. In this research, the substrate core is aged in synthetic hydrocarbon fluid (EC100), Polyalphaolefin 6 (PAO 6), and ambient air for 720 hours each at two different temperatures, 85°C and 125°C, and the complex modulus before and after aging is calculated and compared.
-
Abstract Due to the increasing computational demand driven by artificial intelligence, machine learning, and the Internet of Things (IoT), there has been an unprecedented growth in transistor density for high-end CPUs and GPUs. This growth has resulted in high thermal dissipation power (TDP) and high heat flux, necessitating the adoption of advanced cooling technologies to minimize thermal resistance and optimize cooling efficiency. Among these technologies, direct-to-chip cold plate-based liquid cooling has emerged as a preferred choice in electronics cooling due to its efficiency and cost-effectiveness. In this context, different types of single-phase liquid coolants, such as propylene glycol (PG), ethylene glycol (EG), DI water, treated water, and nanofluids, have been utilized in the market. These coolants, manufactured by different companies, incorporate various inhibitors and chemicals to enhance long-term performance, prevent biogrowth, and provide corrosion resistance. However, the additives used in these coolants can impact their thermal performance, even when the base coolant is the same. This paper aims to compare these coolant types and evaluate the performance of the same coolant from different vendors. The selection of coolants in this study is based on their performance, compatibility with wetted materials, reliability during extended operation, and environmental impact, following the guidelines set by ASHRAE. To conduct the experiments, a single cold plate-based benchtop setup was constructed, utilizing a thermal test vehicle (TTV), pump, reservoir, flow sensor, pressure sensors, thermocouple, data acquisition units, and heat exchanger. Each coolant was tested using a dedicated cold plate, and thorough cleaning procedures were carried out before each experiment. 
The experiments were conducted under consistent boundary conditions, with a TTV power of 1000 watts and varying coolant flow rates (ranging from 0.5 lpm to 2 lpm) and supply coolant temperatures (17°C, 25°C, 35°C, and 45°C), simulating warm water cooling. The thermal resistance (Rth) versus flow rate and pressure drop (ΔP) versus flow rate graphs were obtained for each coolant, and the impact of different supply coolant temperatures on pressure drop was characterized. The data collected from this study will be utilized to calculate the Total Cost of Ownership (TCO) in future research, providing insights into the impact of coolant selection at the data center level. There is limited research available on the reliability of coolants used in direct-to-chip liquid cooling, and there is currently no standardized methodology for testing their reliability. This study aims to fill this gap by focusing on the reliability of coolants, specifically propylene glycols at concentrations of 25%. To analyze the effectiveness of corrosion inhibitors in these coolants, ASTM standard D1384 apparatus, typically used for testing engine coolant corrosion inhibitors on metal samples in controlled laboratory settings, was employed. The setup involved immersing samples of wetted materials (copper, solder-coated brass, brass, steel, cast iron, and cast aluminum) in separate jars containing inhibited propylene glycol solutions from different vendors. This test will determine the reliability difference between the same inhibited solutions from different vendors.
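The two quantities plotted against flow rate in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' data reduction: it assumes thermal resistance is defined as (case temperature − supply coolant temperature) / TTV power, and it adds the ideal hydraulic pumping power (ΔP × volumetric flow) as a related quantity. The function names and all numeric inputs except the 1000 W TTV power, the 25°C supply temperature, and the 2 lpm flow rate are hypothetical placeholders, not measured results.

```python
def thermal_resistance(t_case_c: float, t_supply_c: float, power_w: float) -> float:
    """Cold plate thermal resistance in degC/W, assuming
    R_th = (T_case - T_supply) / power."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (t_case_c - t_supply_c) / power_w


def hydraulic_power_w(pressure_drop_kpa: float, flow_lpm: float) -> float:
    """Ideal hydraulic pumping power in watts: dP [Pa] * Q [m^3/s]."""
    pressure_drop_pa = pressure_drop_kpa * 1e3
    flow_m3_per_s = flow_lpm / 1000.0 / 60.0
    return pressure_drop_pa * flow_m3_per_s


# Hypothetical operating point: 55 degC case temperature and 20 kPa pressure drop
# are made-up illustration values; 1000 W, 25 degC, and 2 lpm come from the abstract.
r_th = thermal_resistance(t_case_c=55.0, t_supply_c=25.0, power_w=1000.0)
p_hyd = hydraulic_power_w(pressure_drop_kpa=20.0, flow_lpm=2.0)

print(f"R_th = {r_th:.3f} degC/W")
print(f"Hydraulic power = {p_hyd:.3f} W")
```

Repeating the calculation over the tested flow rates and supply temperatures would give the Rth versus flow rate and ΔP versus flow rate curves described above.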

