Abstract This paper proposes a computational fluid dynamics (CFD) simulation methodology for the multi-design variable optimization of heat sinks for natural convection single-phase immersion cooling of high power-density Data Center server electronics. Immersion cooling provides the capability to cool higher power-densities than air cooling. Due to this, retrofitting Data Center servers initially designed for air-cooling for immersion cooling is of interest. A common area of improvement is in optimizing the air-cooled component heat sinks for the fluid and thermal properties of liquid cooling dielectric fluids. Current heat sink optimization methodologies for immersion cooling demonstrated within the literature rely on a server-level optimization approach. This paper proposes a server-agnostic approach to immersion cooling heat sink optimization by developing a heat sink-level CFD to generate a dataset of optimized heat sinks for a range of variable input parameters: inlet fluid temperature, power dissipation, fin thickness, and number of fins. The objective function of optimization is minimizing heat sink thermal resistance. This research demonstrates an effective modeling and optimization approach for heat sinks. The optimized heat sink designs exhibit improved cooling performance and reduced pressure drop compared to traditional heat sink designs. This study also shows the importance of considering multiple design variables in the heat sink optimization process and extends immersion heat sink optimization beyond server-dependent solutions. The proposed approach can also be extended to other cooling techniques and applications, where optimizing the design variables of heat sinks can improve cooling performance and reduce energy consumption.
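The objective named in this abstract, heat sink thermal resistance, is commonly defined as the temperature rise of the heat sink base above the inlet fluid divided by the dissipated power. The sketch below assumes that definition and ranks a few candidate designs by it; the function name, the candidate values, and the operating point are hypothetical placeholders for illustration, not data or methods from the paper.

```python
def thermal_resistance(t_base_c, t_inlet_c, power_w):
    """Thermal resistance in K/W: (base temperature - inlet temperature) / power.

    Assumed definition; the paper may reference a different temperature.
    """
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return (t_base_c - t_inlet_c) / power_w


# Hypothetical operating point (not taken from the paper).
T_INLET_C = 40.0
POWER_W = 150.0

# Hypothetical candidates: each pairs two design variables with a base
# temperature that a CFD run would supply. Values are made up.
candidates = [
    {"fin_thickness_mm": 0.5, "n_fins": 30, "t_base_c": 68.0},
    {"fin_thickness_mm": 1.0, "n_fins": 20, "t_base_c": 64.0},
    {"fin_thickness_mm": 1.5, "n_fins": 14, "t_base_c": 66.0},
]

# Pick the design with the lowest thermal resistance.
best = min(
    candidates,
    key=lambda c: thermal_resistance(c["t_base_c"], T_INLET_C, POWER_W),
)
print(best)
```

In a real study each `t_base_c` would come from a simulation at the given inlet temperature and power, and the search would cover the full range of fin thicknesses and fin counts.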
Dense Server Design for Immersion Cooling
The growing demands for computational power in cloud computing have led to a significant increase in the deployment of high-performance servers. The growing power consumption of servers and the heat they produce are on track to outpace the capacity of conventional air cooling systems, necessitating more efficient cooling solutions such as liquid immersion cooling. The superior heat exchange capabilities of immersion cooling both eliminate the need for bulky heat sinks, fans, and air flow channels and unlock the potential to go beyond conventional 2D blade servers to three-dimensional designs. In this work, we present a computational framework to explore designs of servers in three-dimensional space, specifically targeting the maximization of server density within immersion cooling tanks. Our tool is designed to handle a variety of physical and electrical server design constraints. We demonstrate that our optimized designs can reduce server volume by 25-52% compared to traditional flat server designs. This increased density reduces land usage as well as the amount of liquid used for immersion, with a significant reduction in the carbon emissions embodied in datacenter buildings. We further create physical prototypes to simulate dense server designs and perform real-world experiments in an immersion cooling tank, demonstrating that they operate at safe temperatures. This approach marks a critical step forward in sustainable and efficient datacenter management.
- Award ID(s): 2212049
- PAR ID: 10605929
- Publisher / Repository: Association for Computing Machinery (ACM)
- Date Published:
- Journal Name: ACM Transactions on Graphics
- Volume: 43
- Issue: 6
- ISSN: 0730-0301
- Format(s): Medium: X Size: p. 1-20
- Size(s): p. 1-20
- Sponsoring Org: National Science Foundation
More Like this
-
As web-based AI applications grow rapidly, server rooms face escalating computational demands, prompting enterprises to either upgrade their facilities or outsource to co-located sites. This growth strains conventional heating, ventilation, and air-conditioning (HVAC) systems, which struggle to handle the substantial thermal load, often resulting in hotspots. Liquid-to-air (L2A) coolant distribution units (CDUs) have emerged as a solution, cooling servers efficiently by circulating liquid coolant through cooling loops (CLs) mounted on each server board. In this study, the performance of a 24-kW L2A CDU is evaluated across various scenarios, with emphasis on cooling effect and stability. Experimental tests involve a rack with three thermal test vehicles (TTVs), with both the liquid coolant and air sides monitored for analysis. Tests are conducted in an environment with limited air conditioning, resembling upgraded server rooms with conventional AC systems. The study also assesses the impact of high-power-density cooling units on the server room environment, measuring noise, air velocity, and ambient temperature against ASHRAE standards for human comfort. Recommendations for best practices and potential system improvements are included, addressing the growing need for efficient cooling solutions amid escalating computational demands.
-
As online frameworks and services grow rapidly with the evolution of web-based Artificial Intelligence (AI) applications, server rooms are being upgraded in computational capacity and size to keep up with demand. Enterprise companies with limited-capacity server rooms struggle to keep up with these increasing computational demands. Hence, some of them outsource their servers to co-located facilities (Co-Lo), while others choose to upgrade their existing server rooms. The thermal load associated with such upgrades is typically tremendous. Approximately 40% of the power consumed by datacentres is dissipated as heat. Conventional HVAC systems fail to satisfy the requirements of such server capacities. Not only do they struggle to meet the cooling load, but their maldistribution of cool air into the server room is a major cause of hotspot formation. To tackle this issue, Liquid-to-Air (L2A) Coolant Distribution Units (CDUs) are being used as a liquid-based solution for rack-level cooling. This type of CDU provides efficient cooling for servers through liquid coolant that is distributed into cooling loops mounted on top of each server board. The generated heat is carried away by this liquid coolant back to the CDU, which then dissipates it into the surrounding air using dedicated pumps, fans, and a heat exchanger, hence the name Liquid-to-Air. In the present work, one of the most popular liquid cooling strategies is explored under various scenarios. The performance of a 24-kW Liquid-to-Air (L2A) CDU is judged based on cooling effect, stability, and reliability. The study is carried out experimentally, using a test rack with three thermal test vehicles (TTVs) to investigate various operating scenarios. Both the liquid coolant and air sides of this experimental setup are equipped with the instrumentation required to monitor and analyse the tests.
All test cases were run in a room with limited air conditioning to resemble the environment of upgraded server rooms with conventional AC systems. Moreover, the impact of using such a high-power-density cooling unit on a server room environment with a restricted HVAC system is also brought to light. Environmental and human comfort parameters such as noise, air velocity, and ambient temperature are measured under various operating conditions and benchmarked against their ranges for human comfort as listed in ASHRAE standards. At the end of this research, recommendations for best practice are provided along with areas of enhancement for the selected system.
-
Abstract To fulfill the increasing demands of data storage and data processing within modern data centers, a corresponding increase in server performance is necessary. This leads to an increase in power consumption and heat generation in the servers due to high-performance processing units. Currently, air cooling is the most widely used thermal management technique in data centers, but it has started to reach its limits in cooling high-power-density packaging. Therefore, industries utilizing data centers are looking to single-phase immersion cooling using various dielectric fluids to reduce operational and cooling costs by enhancing the thermal management of servers. In this study, heat sinks with TPMS lattice structures were designed for application in single-phase immersion cooling of data center servers. These designs are made possible by Electrochemical Additive Manufacturing (ECAM) technology due to their complex topologies. The ECAM process allows for generation of complex heat sink geometries never before possible using traditional manufacturing processes. Geometric complexities, including amorphous and porous structures with a high surface-area-to-volume ratio, enable ECAM heat sinks to have superior heat transfer properties. Our objective is to compare various heat sink geometries by minimizing chip junction temperature in a single-phase immersion cooling setup for natural convection flow regimes. Computational fluid dynamics in ANSYS Fluent is utilized to compare the ECAM heat sink designs. The additively manufactured heat sink designs are evaluated by comparing their thermal performance under natural convection conditions. This study presents a novel approach to heat sink design and bolsters the capability of ECAM-produced heat sinks.
-
Abstract Data centers are witnessing an unprecedented increase in processing and data storage, resulting in an exponential increase in the servers’ power density and heat generation. Data center operators are looking for green, energy-efficient cooling technologies with low power consumption and high thermal performance. Typical air-cooled data centers must maintain safe operating temperatures to accommodate cooling for high-power-consuming server components such as CPUs and GPUs. This makes air cooling inefficient with regard to heat transfer and energy consumption for applications such as high-performance computing, AI, cryptocurrency, and cloud computing, forcing data centers to switch to liquid cooling. Additionally, air cooling has a higher OPEX to account for higher server fan power. Liquid Immersion Cooling (LIC) is an affordable and sustainable cooling technology that addresses many of the challenges that come with air cooling technology. LIC is becoming a viable and reliable cooling technology for many high-power applications, leading to reduced maintenance costs, lower water utilization, and lower power consumption. In terms of environmental effect, single-phase immersion cooling outperforms two-phase immersion cooling. There are two types of single-phase immersion cooling, namely forced and natural convection. Forced convection has a higher overall heat transfer coefficient, which makes it advantageous for cooling high-powered electronic devices. With natural convection, it is possible to simplify the cooling system, including eliminating the pump. There are, however, some advantages to forced convection, especially at low flow velocities where the pumping power is relatively negligible.
This study provides a comparison between a baseline forced convection single-phase immersion-cooled server run at three different inlet temperatures and four different natural convection configurations that utilize different server powers and cold plates. Since the buoyancy effect of the hot fluid is leveraged to generate a natural flow in natural convection, cold plates are designed to remove heat from the server. For performance comparison, a natural convection model with cold plates is designed where water is the fluid flowing in the cold plate. A high-density server is modeled in Ansys Icepak, with a total server heat load of 3.76 kW. The server is made up of two CPUs and eight GPUs, with each chip having its own thermal design power (TDP). For both heat transfer conditions, the fluid used in the investigation is EC-110, operated at inlet temperatures of 30°C, 40°C, and 50°C. The coolant flow rate in forced convection is 5 GPM, whereas the flow rate in the natural convection cold plates is varied. CFD simulations are used to reduce chip case temperatures through the utilization of both forced and natural convection. Pressure drop and pumping power are also evaluated on the server for the given inlet temperature range, and the best operating parameters are established. The numerical study shows that forced convection systems can maintain much lower component temperatures than natural convection systems, even when the natural convection systems are modeled with enhanced cooling characteristics.
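The pressure drop and pumping power mentioned in this abstract are linked by the standard hydraulic relation: ideal pumping power equals pressure drop times volumetric flow rate. The sketch below assumes that relation and converts the abstract's 5 GPM (taken here as US gallons per minute) to SI units. The 10 kPa pressure drop is a hypothetical placeholder, not a result from the study, and pump efficiency is ignored.

```python
# 1 US gallon = 3.785411784e-3 m^3, so 1 GPM = that volume per 60 s.
GPM_TO_M3_PER_S = 3.785411784e-3 / 60.0


def pumping_power_w(pressure_drop_pa, flow_gpm):
    """Ideal hydraulic pumping power in watts: pressure drop [Pa] * flow [m^3/s]."""
    if pressure_drop_pa < 0 or flow_gpm < 0:
        raise ValueError("pressure drop and flow rate must be non-negative")
    return pressure_drop_pa * flow_gpm * GPM_TO_M3_PER_S


flow_gpm = 5.0               # forced convection flow rate stated in the abstract
pressure_drop_pa = 10_000.0  # hypothetical value, for illustration only

power = pumping_power_w(pressure_drop_pa, flow_gpm)
print(f"{flow_gpm} GPM = {flow_gpm * GPM_TO_M3_PER_S:.3e} m^3/s")
print(f"Ideal pumping power at {pressure_drop_pa:.0f} Pa: {power:.2f} W")
```

Dividing the result by a pump efficiency would give an estimate of the electrical power drawn, which is the figure that matters when comparing against the total server heat load.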
