Abstract Demand is growing for dense, high-performance IT computing capacity to support artificial intelligence, deep learning, machine learning, autonomous cars, the Internet of Things, and similar workloads. This has led to unprecedented growth in transistor density for high-end CPUs and GPUs, pushing thermal design power (TDP) above 700 watts for some existing NVIDIA GPUs. Cooling these high-TDP chips with air comes at the cost of a larger server form factor and server fan noise close to the permissible limit. Direct-to-chip cold plate-based liquid cooling is highly efficient and is becoming more reliable as the technology advances. Deploying it in liquid-cooled data centers requires several components, such as cooling loops, rack manifolds, CDUs, row manifolds, quick disconnects, and flow control valves. Row manifolds distribute secondary coolant to the rack manifolds. Characterizing these row manifolds to understand their pressure drops and flow distribution is important for better data center design and energy efficiency. In this paper, a methodology is developed to characterize row manifolds. A water-based coolant, 25% propylene glycol, was used for the experiments, which were conducted at a coolant supply temperature of 21 °C. P-Q curves were generated for two six-port row manifolds, and the supply pressure and flow rate were measured at each port. The experimental results were validated using flow network modeling (FNM), a technique that uses overall flow and thermal characteristics to represent the behavior of individual components.
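As a rough illustration of the flow network idea, the sketch below treats each manifold port as a lumped branch with a quadratic P-Q curve, dp = k * Q**2, and splits a total flow among parallel branches that share one pressure drop. The function name, the quadratic form, the neglect of header losses, and every coefficient are assumptions made for illustration; none of them come from the paper.

```python
import math


def split_parallel_flow(total_flow_lpm, k_coeffs):
    """Split a total flow among parallel branches that each obey dp = k * Q**2.

    All branches see the same pressure drop (header losses are ignored,
    which is a simplifying assumption), so Q_i = sqrt(dp / k_i) and the
    common dp follows from requiring the branch flows to sum to the total.
    """
    # 1 / sqrt(k) acts like a hydraulic conductance for a quadratic branch.
    conductances = [1.0 / math.sqrt(k) for k in k_coeffs]
    dp = (total_flow_lpm / sum(conductances)) ** 2
    flows = [c * math.sqrt(dp) for c in conductances]
    return dp, flows


if __name__ == "__main__":
    # Hypothetical resistance coefficients (kPa per lpm^2) for six ports.
    k_ports = [0.010, 0.010, 0.012, 0.012, 0.015, 0.015]
    dp, flows = split_parallel_flow(60.0, k_ports)
    print(f"common pressure drop: {dp:.3f} kPa")
    for port, q in enumerate(flows, start=1):
        print(f"port {port}: {q:.2f} lpm")
```

Ports with a larger resistance coefficient receive less flow, which is the kind of maldistribution that port-by-port measurements would expose.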
Design and Optimization of Control Strategy to Reduce Pumping Power in Dynamic Liquid Cooling
Abstract Data centers are large groups of networked servers used by organizations for computation and storage. In 2014, data centers consumed an estimated 70 billion kWh in the United States alone. Given the limited available power resources, it is incumbent on thermal engineers to develop efficient methods that minimize expenditure, at least on cooling. One key focus of electronics cooling research is nonuniform power distribution at the rack, server, and even package levels. Nonuniform heating at the chip level creates hotspots and temperature gradients across the chip, which significantly increases the cost of cooling because cooling cost is a function of the maximum junction temperature. This challenge has increased the use of temperature sensing mechanisms to help find ways to mitigate the gradients. A very effective way to conserve pumping power and address hotspots on single-chip or multichip modules is targeted delivery of liquid coolant. One way to enable such targeted delivery is to use dynamic cold plates coupled with a self-regulating flow control device that controls flow rate based on temperature. This novel technology can be implemented more effectively when coupled with a good control strategy. This paper addresses the development, testing, and optimization of such a control strategy, using minimal sensors and reduced latency.
- Award ID(s):
- 1738811
- PAR ID:
- 10276340
- Date Published:
- Journal Name:
- Journal of Electronic Packaging
- Volume:
- 143
- Issue:
- 3
- ISSN:
- 1043-7398
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract Until recently, transistor density followed Moore's law, doubling every generation and increasing power density. As Moore's law broke down, computational performance gains were achieved by using multicore processors, leading to nonuniform power distribution and localized high temperatures that make thermal management even more challenging. Cold plate-based liquid cooling has proven to be one of the most efficient technologies for overcoming these thermal management issues. Traditional liquid-cooled data center deployments provide a constant flow rate to servers irrespective of the workload, leading to excessive consumption of coolant pumping power. Liquid cooling in data centers can therefore be implemented more efficiently. The present investigation proposes dynamic cooling using an active flow control device to regulate coolant flow rates at the server level. This device can aid in pumping power savings by controlling the flow rates based on server utilization. The flow control device contains a V-cut ball valve connected to a micro servo motor that varies the valve angle. The servo motor moved the valve to predetermined rotational angles to change the flow rate through the valve. The device operation was characterized by quantifying the flow rate and pressure drop across the device at different valve positions, using both computational fluid dynamics and experiments. The proposed flow control device was able to vary the flow rate between 0.09 lpm and 4 lpm at different valve positions.
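To see why matching flow rate to server utilization can save pumping power, the sketch below computes ideal hydraulic power as pressure drop times volumetric flow. It assumes a loop whose pressure drop scales as k * Q**2 and a pump that delivers only the pressure the loop needs; the coefficient and function names are hypothetical, and only the 0.09 lpm to 4 lpm flow range is taken from the abstract.

```python
def hydraulic_power_w(flow_lpm, dp_kpa):
    """Ideal hydraulic power in watts: pressure drop (Pa) times flow (m^3/s)."""
    flow_m3s = flow_lpm / 60000.0  # 1 lpm = 1/60000 m^3/s
    return dp_kpa * 1000.0 * flow_m3s


def loop_power_w(flow_lpm, k_kpa_per_lpm2):
    """Hydraulic power for a loop assumed to obey dp = k * Q**2."""
    dp_kpa = k_kpa_per_lpm2 * flow_lpm ** 2
    return hydraulic_power_w(flow_lpm, dp_kpa)


if __name__ == "__main__":
    k = 2.0  # hypothetical loop coefficient, kPa per lpm^2
    for q in (4.0, 2.0, 1.0, 0.09):  # flow rates within the reported range
        print(f"{q:5.2f} lpm -> {loop_power_w(q, k):.6f} W")
```

Under this quadratic assumption the power scales with the cube of the flow rate, so halving the flow cuts the ideal pumping power by a factor of eight.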
-
Effective cooling is crucial for high-power liquid-cooled servers to ensure optimal performance and reliability of components. Thermal characterization is necessary to ensure that the cooling system functions as intended, is energy efficient, and minimizes downtime. In this study, a methodology for thermal characterization of a high-power liquid-cooled server/TTV [the terms server and thermal test vehicle (TTV) are used interchangeably] is presented. The server layout includes multiple thermal test vehicle setups equipped with direct-to-chip cold plates, with two or more connected in series to form a TTV cooling loop. These cooling loops are connected in parallel to the supply and return plenums of the cooling loop manifold, which includes a chassis-level flow distribution manifold. To obtain accurate measurements, two identical server/TTV prototypes are instrumented with sensors for coolant flow rate and temperature measurements for every TTV cooling loop. Four ultrasonic flow sensors are installed in the flow verification server/TTV to measure the coolant flow rate to each TTV cooling loop. In the thermal verification server, thermistors are installed at the outlet of each GPU heater of each TTV cooling loop to log temperature measurements. The amount of heat captured by the coolant in each TTV cooling loop is subsequently estimated based on the flow rates determined from the flow verification server. This methodology enables precise characterization of the thermal performance of high-power liquid-cooled servers, ensuring optimal functionality, energy efficiency, and minimized downtime.
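The heat-capture estimate described above is a sensible-heat energy balance: heat equals mass flow times specific heat times the coolant temperature rise. The sketch below shows that arithmetic. It assumes that an inlet temperature is available for each loop, and the coolant properties and readings are placeholder values rather than data from the study.

```python
def heat_captured_w(flow_lpm, t_in_c, t_out_c, density_kg_m3, cp_j_per_kg_k):
    """Sensible heat picked up by the coolant, in watts.

    Q = rho * V_dot * cp * (T_out - T_in), with V_dot converted from lpm
    to m^3/s. Coolant properties are passed in by the caller because they
    depend on the fluid and temperature; nothing here is a measured value.
    """
    mass_flow_kg_s = density_kg_m3 * flow_lpm / 60000.0
    return mass_flow_kg_s * cp_j_per_kg_k * (t_out_c - t_in_c)


if __name__ == "__main__":
    # Hypothetical readings for four TTV cooling loops:
    # (flow in lpm, inlet temperature in C, outlet temperature in C)
    loops = [(1.5, 30.0, 40.0), (1.4, 30.0, 40.5), (1.6, 30.0, 39.2), (1.5, 30.0, 39.8)]
    rho, cp = 1000.0, 4000.0  # placeholder coolant properties (kg/m^3, J/kg-K)
    for i, (q, t_in, t_out) in enumerate(loops, start=1):
        print(f"loop {i}: {heat_captured_w(q, t_in, t_out, rho, cp):.0f} W")
```

Comparing each loop's captured heat with the electrical power supplied to its heaters gives a heat-capture ratio for that loop.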
-
Increasing power densities in data centers, driven by the rise of artificial intelligence (AI), high-performance computing (HPC), and machine learning, compel engineers to develop new cooling strategies and designs for high-density data centers. Two-phase cooling is a promising technology that exploits the latent heat of the fluid. It removes high heat fluxes much more effectively than cooling that relies on the sensible heat of the fluid, and it requires lower coolant flow rates. Using latent heat also yields a more uniform temperature across a heated surface. Despite these benefits, the phase change adds complexity to a system when multiple evaporators (potentially exposed to different heat fluxes) are connected to one coolant distribution unit (CDU). In this paper, a commercial pumped two-phase cooling system is investigated at the rack level. Seventeen 2-rack unit (RU) servers of two distinct models are retrofitted and deployed in the rack. The flow rate and pressure distribution across the rack are studied at various filling ratios. The transient behavior of the cooling system in response to a step change in the information technology (IT) load is also investigated.
-
Recent commercial efforts have reestablished the benefits of cooling server modules using direct liquid cooling (DLC) technology. The primary drivers behind this technology are the increase in chip densities and the need to reduce overall data center power usage. In DLC technology, a cold plate sits on top of the chip with thermal interface material (TIM) between the chip and the cold plate. The low thermal resistance path permits the use of warm water, which helps data centers replace the chilled water system with a water-side economizer that uses ambient temperature. This work describes the effort to leverage DLC by employing microchannel cold plates to cool multi-chip modules. The primary objective of this work is to build a sophisticated test rig to characterize the flow and thermal performance of a microchannel cold plate for cooling a two-die chip. This study highlights the challenges of building an experimental setup that simulates a two-die chip package and the approaches taken to overcome them. A parallel channel cold plate is used to benchmark the performance. Tests were conducted for a set of independent variables: flow rate, input power to the dice, coolant temperature, flow direction, and TIM resistance. The results are presented as PQ curves, specific thermal resistance curves, and case temperature distributions reflecting the effect of changing the input variables.