Direct liquid cooling (DLC) has emerged as a promising technology for thermal management of high-performance computing servers, enabling efficient heat dissipation and reliable operation. Thermal performance is governed by several factors, including the physical properties of the coolant and flow parameters such as coolant inlet temperature and flow rate. The design of the manifold that distributes coolant to the information technology equipment (ITE) can significantly affect the overall performance of the computing system. This paper investigates the hydraulic characterization and design validation of a rack-level coolant distribution manifold, or rack manifold. A custom-built, high-power-density, liquid-cooled ITE rack was assembled, and various cooling loops were plugged into the rack manifold to validate its thermal performance. The rack manifold distributes coolant, pumped by a coolant distribution unit (CDU), to each of these cooling loops. Pressure drop characteristics of the rack manifold were obtained for flow rates that effectively dissipate the heat loads from the ITE. Pressure drop is a critical parameter in manifold design because it influences the flow rate and, ultimately, the thermal performance of the system. Measuring the pressure drop at various flow rates makes it possible to determine the optimum flow rate for efficient heat dissipation. Furthermore, 1D flow network and CFD models of the rack-level coolant loop, including the rack manifold, were developed and validated against experimental test data. The CFD models simulate fluid flow and heat transfer within the cooling system, and the validated models provide a useful tool for facility-level modeling of a liquid-cooled data center and for designing the coolant distribution manifold at facility level.
The results demonstrate the importance of coolant distribution manifold design to the thermal performance of a liquid-cooled data center, and highlight the usefulness of 1D flow network and CFD models for designing and validating liquid-cooled data center cooling systems. In conclusion, hydraulic characterization and design validation of a rack-level coolant distribution manifold are critical to achieving efficient thermal management of high-performance computing servers. This study presents a comprehensive approach to hydraulic characterization of the coolant distribution manifold, which can significantly affect the overall thermal performance and reliability of the system.
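To illustrate the kind of 1D flow network calculation the abstract refers to, the sketch below splits a total coolant flow among parallel cooling loops fed by a common manifold. It is not the paper's model: it assumes each loop follows a purely quadratic pressure-drop law, dp = k * q**2, and that losses in the manifold itself are negligible, so every loop sees the same pressure drop. The function name, the resistance coefficients, and the units are hypothetical.

```python
import math


def split_parallel_flow(total_flow, k_values):
    """Split total_flow among parallel branches that share one pressure drop.

    Assumption (not from the paper): each branch obeys dp = k * q**2.
    Then q_i = sqrt(dp / k_i), and summing over branches gives
    dp = (total_flow / sum(1 / sqrt(k_i))) ** 2.
    Returns (dp, list of branch flows) in whatever consistent units are used.
    """
    if total_flow < 0 or any(k <= 0 for k in k_values):
        raise ValueError("flow must be non-negative and resistances positive")
    conductances = [1.0 / math.sqrt(k) for k in k_values]
    dp = (total_flow / sum(conductances)) ** 2
    flows = [c * math.sqrt(dp) for c in conductances]
    return dp, flows


# Hypothetical example: two loops, the second with four times the resistance.
dp, flows = split_parallel_flow(3.0, [1.0, 4.0])
print(dp, flows)
```

With these made-up coefficients the higher-resistance loop receives half the flow of the other one, which is the type of imbalance a measured pressure-drop curve would help quantify.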
Characterization of 300 GHz Wireless Channels for Rack-to-Rack Communications in Data Centers
- Award ID(s): 1651273
- PAR ID: 10083904
- Date Published:
- Journal Name: IEEE International Symposium on Personal, Indoor and Mobile Radio Communications
- Page Range / eLocation ID: 1-5
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
This study presents an experimental and numerical characterization of pressure drop in a commercially available direct liquid-cooled (DLC) rack. Pressure drop determines the pumping power required by the DLC system, which in turn affects the energy efficiency of the data center. The main objective is to assess the flow rate and pressure distributions in a DLC system in order to enhance reliability and cooling system efficiency. Further objectives are to evaluate the accuracy of flow network modeling (FNM) in predicting the flow distribution in a DLC rack and to identify manufacturing limitations in a commercial system that could affect cooling system reliability. The main components of the investigated DLC system are the coolant distribution module (CDM), the supply/return manifold module, and the server module, which contains a cold plate. Extensive experimental measurements were performed to study the flow distribution and to determine the pressure characteristic curves of the server modules and the CDM. A methodology is also described for developing an experimentally validated, high-accuracy FNM of the DLC system. The measurements revealed flow maldistribution among the server modules, which is attributed to the manufacturing process of the micro-channel cold plate. The average errors in predicting the flow rates of the server module and the CDM using FNM are 2.5% and 3.8%, respectively. Its accuracy and short run time make FNM a good tool for design, analysis, and optimization of DLC systems. The pressure drop in the server module accounts for 56% of the total pressure drop in the DLC rack, and further analysis showed that 69% of the server module pressure drop is associated with the module's plumbing (corrugated hoses, disconnects, fittings). The server cooling modules are designed to provide secure connections and flexibility, which come at the cost of a high pressure drop.
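The two percentages quoted above can be combined to estimate what fraction of the total rack pressure drop is attributable to the server module plumbing. The short calculation below uses only the 56% and 69% figures from the abstract; the variable names are illustrative, and the split of the remaining server-module share to "everything other than plumbing" is simply the complement, not a figure reported by the authors.

```python
# Shares quoted in the abstract.
server_share_of_rack = 0.56      # server module share of total rack pressure drop
plumbing_share_of_server = 0.69  # plumbing share of the server module pressure drop

# Plumbing share of the total rack pressure drop.
plumbing_share_of_rack = server_share_of_rack * plumbing_share_of_server

# Complement within the server module (cold plate and anything not counted as plumbing).
non_plumbing_share_of_rack = server_share_of_rack * (1.0 - plumbing_share_of_server)

print(f"plumbing: {plumbing_share_of_rack:.1%} of rack pressure drop")
print(f"rest of server module: {non_plumbing_share_of_rack:.1%} of rack pressure drop")
```

Under this reading, the hoses, disconnects, and fittings alone account for roughly 39% of the total rack pressure drop.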
-
Low-latency online services have strict Service Level Objectives (SLOs) that require datacenter systems to support high throughput at microsecond-scale tail latency. Dataplane operating systems have been designed to scale up multi-core servers with minimal overhead for such SLOs. However, as application demands continue to increase, scaling up is not enough, and serving larger demands requires these systems to scale out to multiple servers in a rack. We present RackSched, the first rack-level microsecond-scale scheduler that provides the abstraction of a rack-scale computer (i.e., a huge server with hundreds to thousands of cores) to an external service with network-system co-design. The core of RackSched is a two-layer scheduling framework that integrates inter-server scheduling in the top-of-rack (ToR) switch with intra-server scheduling in each server. We use a combination of analytical results and simulations to show that it provides near-optimal performance, comparable to centralized scheduling policies, and is robust for both low-dispersion and high-dispersion workloads. We design a custom switch data plane for the inter-server scheduler, which realizes power-of-k-choices, ensures request affinity, and tracks server loads accurately and efficiently. We implement a RackSched prototype on a cluster of commodity servers connected by a Barefoot Tofino switch. End-to-end experiments on a twelve-server testbed show that RackSched improves the throughput by up to 1.44x, and scales out the throughput near linearly, while maintaining the same tail latency as one server until the system is saturated.
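The power-of-k-choices policy mentioned above can be sketched in a few lines: for each request, sample k servers at random and send the request to the least loaded of them. The code below is a plain software illustration of that general policy, not RackSched's switch data plane; the function names are hypothetical, and for simplicity requests never complete, so loads only grow.

```python
import random


def pick_server(loads, k, rng):
    """Sample k distinct servers and return the index of the least loaded one."""
    candidates = rng.sample(range(len(loads)), k)
    return min(candidates, key=lambda i: loads[i])


def simulate(num_servers, num_requests, k, seed=0):
    """Assign num_requests one at a time; return the per-server request counts."""
    rng = random.Random(seed)
    loads = [0] * num_servers
    for _ in range(num_requests):
        loads[pick_server(loads, k, rng)] += 1
    return loads


# Hypothetical comparison on twelve servers: k=1 is purely random placement,
# k=2 is the classic "power of two choices".
for k in (1, 2):
    loads = simulate(num_servers=12, num_requests=1200, k=k)
    print(k, max(loads) - min(loads))
```

Sampling only a few servers per request keeps the decision cheap while still avoiding the worst imbalances of purely random placement, which is the usual motivation for this family of policies.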