In typical data centers, servers and IT equipment are cooled by air, and almost half of total IT power is dedicated to cooling. Hybrid cooling combines air and water cooling: the main heat-generating components are cooled by water or water-based coolants, and the rest of the components are cooled by air supplied by a computer room air conditioner (CRAC) or computer room air handler (CRAH). Retrofitting air-cooled servers with cold plates and pumps offers an advantage in the thermal management of CPUs and other high-heat-generating components. In a typical 1U server, the CPUs were retrofitted with cold plates and the server was tested at raised coolant inlet conditions. The study showed that the server can operate at maximum utilization of the CPUs, DIMMs, and PCH for inlet coolant temperatures from 25–45 °C, following the ASHRAE guidelines. The server was also tested for pump and fan failure scenarios by reducing the number of operating fans and pumps. To reduce cooling power consumption at the facility level and increase air-side economizer hours, the hybrid cooled server can be operated at raised inlet air temperatures. The trade-off between the facility-level energy savings from raising the inlet air temperature and the possible increase in server fan power and component temperatures is investigated. A detailed CFD analysis with a minimum number of server fans can provide a way to find an operating range of inlet air temperature for a hybrid cooled server. The model of an individual server is modified in 6SigmaET and compared with experimental data for validation. The results of this study can help determine room-level operating set points for data centers housing hybrid cooled server racks.
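The fan-power side of the trade-off described in this abstract can be illustrated with the fan affinity law, under which fan power scales roughly with the cube of fan speed. The following is a minimal sketch and not the study's model: the function name `fan_power_w`, the rated fan power, and the speed fractions are all hypothetical values chosen for illustration.

```python
def fan_power_w(rated_power_w: float, speed_fraction: float) -> float:
    """Estimate fan power using the affinity law (power ~ speed cubed).

    rated_power_w: power drawn at full speed (hypothetical value below).
    speed_fraction: fan speed as a fraction of full speed (0.0 to 1.0+).
    """
    if speed_fraction < 0:
        raise ValueError("speed_fraction must be non-negative")
    return rated_power_w * speed_fraction ** 3


# Hypothetical rated power for one server fan; not taken from the study.
RATED_FAN_POWER_W = 12.0

for speed_fraction in (0.5, 0.75, 1.0):
    power = fan_power_w(RATED_FAN_POWER_W, speed_fraction)
    print(f"{speed_fraction:.0%} speed -> {power:.2f} W")
```

Because of the cubic relationship, a modest rise in fan speed at higher inlet air temperature can offset a noticeable part of the facility-level saving, which is why the operating range has to be found rather than assumed.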
Characterization of an Isolated Hybrid Cooled Server With Failure Scenarios Using Warm Water Cooling
Modern data centers operate at high power to support increased power density, maintenance, and cooling, and account for almost 2 percent (70 billion kilowatt-hours) of total electricity consumption in the US. IT components and the cooling system make up the major portion of this consumption. Although data centers are designed to perform efficiently, cooling high-density components is still a challenge, so alternative methods of improving cooling efficiency have become the main route to reducing cooling cost. Because liquids have a higher specific heat capacity, density, and thermal conductivity than air, liquid cooling is more efficient, and hybrid cooling brings the advantage of liquid cooling to the high-heat-generating components in traditional air-cooled servers. In this experiment, a 1U server is equipped with cold plates to cool the CPUs while the rest of the components are cooled by fans. Predictive fan and pump failure analyses are performed, which also help to explore options for redundancy and to reduce cooling cost by improving cooling efficiency. Redundancy requires knowledge of planned and unplanned system failures. Since the main heat-generating components are cooled by liquid, warm water cooling can be employed to observe the effects of raised inlet conditions in a hybrid cooled server under failure scenarios. ASHRAE liquid cooling class W4 is chosen for the experiment, with operation in a range from 25°C to 45°C. The experiments are conducted separately for the pump and fan failure scenarios. Computational loads of idle, 10%, 30%, 50%, 70%, and 98% are applied while powering only one pump, and the miniature dry cooler fans are controlled externally to maintain a constant coolant inlet temperature. Since the remaining components, such as the DIMMs and PCH, are cooled by air, maximum memory utilization is applied while the number of fans is reduced in each case of the fan failure scenario. The component temperatures and power consumption are recorded in each case for performance analysis.
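The abstract's point that liquid cooling benefits from higher specific heat capacity and density can be made concrete by comparing the volumetric heat capacity (density times specific heat) of water and air. The sketch below uses approximate textbook property values near 25 °C and atmospheric pressure, which are assumptions for illustration and not measurements from this experiment; the names `WATER`, `AIR`, and `volumetric_heat_capacity` are likewise illustrative.

```python
# Approximate fluid properties near 25 degC and 1 atm (textbook values,
# assumed here for illustration; not taken from the experiment).
WATER = {"density_kg_m3": 997.0, "cp_j_kg_k": 4180.0}
AIR = {"density_kg_m3": 1.184, "cp_j_kg_k": 1005.0}


def volumetric_heat_capacity(fluid: dict) -> float:
    """Return heat absorbed per unit volume per kelvin, in J/(m^3*K)."""
    return fluid["density_kg_m3"] * fluid["cp_j_kg_k"]


water_vhc = volumetric_heat_capacity(WATER)
air_vhc = volumetric_heat_capacity(AIR)
ratio = water_vhc / air_vhc

print(f"Water: {water_vhc:.3e} J/(m^3*K)")
print(f"Air:   {air_vhc:.3e} J/(m^3*K)")
print(f"Water carries roughly {ratio:.0f}x more heat per unit volume per kelvin")
```

With these values the ratio comes out in the low thousands, which is why a small coolant flow through a cold plate can replace a much larger airflow over a heat sink.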
- Award ID(s):
- 1738811
- PAR ID:
- 10058018
- Date Published:
- Journal Name:
- ASME 2017 International Technical Conference and Exhibition on Packaging and Integration of Electronic and Photonic Microsystems collocated with the ASME 2017 Conference on Information Storage and Processing Systems
- Page Range / eLocation ID:
- V001T02A002
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
Abstract: Data centers are witnessing an unprecedented increase in processing and data storage, resulting in an exponential increase in server power density and heat generation. Data center operators are looking for green, energy-efficient cooling technologies with low power consumption and high thermal performance. Typical air-cooled data centers must maintain safe operating temperatures while cooling high-power server components such as CPUs and GPUs. This makes air cooling inefficient in terms of heat transfer and energy consumption for applications such as high-performance computing, AI, cryptocurrency, and cloud computing, forcing data centers to switch to liquid cooling. Additionally, air cooling has a higher OPEX because of higher server fan power. Liquid Immersion Cooling (LIC) is an affordable and sustainable cooling technology that addresses many of the challenges that come with air cooling. LIC is becoming a viable and reliable cooling technology for many high-power applications, leading to reduced maintenance costs, lower water utilization, and lower power consumption. In terms of environmental effect, single-phase immersion cooling outperforms two-phase immersion cooling. There are two types of single-phase immersion cooling: forced convection and natural convection. Forced convection has a higher overall heat transfer coefficient, which makes it advantageous for cooling high-powered electronic devices. With natural convection, the cooling components can be simplified, including elimination of the pump. There are, however, some advantages to forced convection, especially at low flow velocities where the pumping power is relatively negligible.
This study compares a baseline forced convection single-phase immersion-cooled server, run at three different inlet temperatures, with four natural convection configurations that use different server powers and cold plates. Since natural convection relies on the buoyancy of the hot fluid to generate flow, cold plates are designed to remove heat from the server. For performance comparison, a natural convection model with cold plates is designed in which water is the fluid flowing through the cold plates. A high-density server is modeled in Ansys Icepak with a total server heat load of 3.76 kW. The server is made up of two CPUs and eight GPUs, each chip having its own thermal design power (TDP). For both heat transfer conditions, the fluid used in the investigation is EC-110, operated at inlet temperatures of 30°C, 40°C, and 50°C. The coolant flow rate in forced convection is 5 GPM, whereas the flow rate in the natural convection cold plates is varied. CFD simulations are used to reduce chip case temperatures with both forced and natural convection. Pressure drop and pumping power are also evaluated for the server over the given inlet temperature range, and the best operating parameters are established. The numerical study shows that forced convection systems can maintain much lower component temperatures than natural convection systems, even when the natural convection systems are modeled with enhanced cooling characteristics.
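The pumping power evaluated in this abstract is commonly estimated as the product of pressure drop and volumetric flow rate (ideal hydraulic power, ignoring pump efficiency). The sketch below applies that relation to the stated 5 GPM flow rate, assuming US gallons. The pressure drop value is hypothetical, since the abstract does not report one, and the function names are illustrative.

```python
# One US gallon is defined as exactly 3.785411784 litres.
US_GALLON_M3 = 3.785411784e-3


def gpm_to_m3_per_s(flow_gpm: float) -> float:
    """Convert US gallons per minute to cubic metres per second."""
    return flow_gpm * US_GALLON_M3 / 60.0


def hydraulic_pumping_power_w(pressure_drop_pa: float, flow_m3_s: float) -> float:
    """Ideal hydraulic power: pressure drop times volumetric flow rate."""
    return pressure_drop_pa * flow_m3_s


FLOW_GPM = 5.0               # forced convection flow rate stated in the abstract
PRESSURE_DROP_PA = 10_000.0  # hypothetical value, not reported in the abstract

flow_m3_s = gpm_to_m3_per_s(FLOW_GPM)
power_w = hydraulic_pumping_power_w(PRESSURE_DROP_PA, flow_m3_s)

print(f"Flow rate: {flow_m3_s:.4e} m^3/s")
print(f"Ideal pumping power: {power_w:.2f} W")
```

Dividing the result by an assumed pump efficiency would give the electrical power drawn, which is the figure that matters when comparing against the pump-free natural convection configurations.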
-
In the United States, data center facilities consume 2% of the total electricity produced, and up to 40% of that energy is used by the cooling infrastructure to cool the heat-generating components inside the facility. With recent technological advancement, power consumption has trended upward, and the resulting increase in carbon footprint is a growing concern in the industry. In air cooling, the high-heat-dissipating components inside a server must receive sufficient airflow for efficient cooling, and ducting is provided to direct the air toward these components. In this study, the duct in an air-cooled server is optimized, vanes are added to improve the airflow, and side vents are installed on the sides of the server chassis ahead of the duct to bypass some of the cool air entering from the front, where the hard drives are located. Experiments were conducted on a Cisco C220 air-cooled server with the new duct and bypass. Their effects are quantified by comparing the temperatures of components such as the central processing units (CPUs) and the platform controller hub (PCH), and the savings in total fan power consumption. A 7.5°C drop in temperature is observed, and savings of up to 30% in fan power consumption can be achieved with the improved design compared with the standard server.
-
Abstract: In recent years there has been phenomenal development in cloud computing, networking, virtualization, and storage, which has increased the demand for high-performance data centers. The demand for higher CPU (Central Processing Unit) performance and the industry trend toward increasing Thermal Design Power (TDP) call for advanced cooling methods that offer high heat transfer capabilities. Maintaining the CPU temperature within specified limits with air-cooled servers becomes a challenge beyond a certain TDP threshold. Among the equipment used in data centers, the cooling system consumes a significant share of energy, typically estimated at over 40% of the total energy consumed. Advancements in Dual In-line Memory Modules (DIMMs) and CPU compatibility have led to higher overall server power consumption. Recent trends show that DIMMs consume up to or above 20 W each and that each CPU can support up to 12 DIMM channels. Therefore, a data center in which high-power, dense compute systems are packed together demands efficient cooling of all server components. In single-phase immersion cooling, electronic components or servers are typically submerged in a thermally conductive dielectric fluid, which dissipates heat from all the electronics. The broader focus of this research is to investigate heat transfer and flow behavior in a 1U air-cooled spread core configuration server with heat sinks, compared with cold plates attached in series in an immersion environment. Cold plates have extremely low thermal resistance compared with standard air-cooled heat sinks. Generally, immersion fluids are dielectric, whereas the fluids used in cold plates are electrically conductive, which poses several problems. This study focuses only on the thermal and flow behavior, but the challenges associated with using a conductive coolant in an immersion environment remain important to address.
The coolant used for the cold plates is a 25% propylene glycol-water mixture, and the fluid used in the tank is EC-100, a commercially available synthetic dielectric fluid. A Computational Fluid Dynamics (CFD) model is built such that only the CPUs are cooled by cold plates, while the auxiliary electronic components are cooled by the immersion fluid. A baseline CFD model of an air-cooled server with heat sinks is compared with the immersion-cooled server with cold plates attached to the CPUs. The server model uses a compact cold plate model that represents thermal resistance and pressure drop. The results show the impact of various fluid inlet conditions on CPU temperatures and predict the cooling capability of the integrated cold plate in an immersion environment.
-
Abstract: Data centers have started to adopt immersion cooling for more than just mainframes and supercomputers. Because air cooling cannot adequately cool recent high-end servers with higher Thermal Design Power, the thermal requirements of machine learning, AI, blockchain, 5G, edge computing, and high-frequency trading have led to wider deployment of immersion cooling. Dielectric fluids are far more efficient at transferring heat than air. Immersion cooling promises to help address many of the challenges that come with air cooling systems, especially as computing densities increase. Immersion-cooled data centers are more expandable, quicker to install, more energy-efficient, able to cool almost all server components, less costly for enterprises, and more robust overall. By eliminating active cooling components such as fans, immersion cooling enables a significantly higher density of computing capabilities. When immersion cooling is used for server hardware that was designed to be air-cooled, heat sinks optimized specifically for immersion should be used. The heat sink is an important component for server cooling efficacy. This research optimizes heat sinks for immersion-cooled servers to achieve the minimum possible case temperature, using multi-objective, multi-design-variable optimization with pumping power as the constraint. A high-density server of 3.76 kW was modeled in Ansys Icepak, consisting of 2 CPUs and 8 GPUs with heat sink assemblies at their Thermal Design Power, along with 32 Dual In-line Memory Modules. The optimization is conducted for aluminum heat sinks with minimization of pressure drop and thermal resistance as the objective functions, and with fin count, fin thickness, and heat sink height as the design variables for all CPU and GPU heat sink assemblies.
Optimization of the CPU and GPU heat sinks was done separately, and the optimized heat sinks were then tested in a model of the actual server test setup in Ansys Icepak. The dielectric fluid for this numerical study is EC-110, and the cooling is carried out using forced convection. A Design of Experiments (DOE) is created from the input ranges of the design variables using a full-factorial approach to generate multiple design points. The effect of the design variables on the objective functions is analyzed to establish which parameters have the greatest impact on the performance of the optimized heat sink. The optimization study is done using Ansys OptiSLang, with AMOP (Adaptive Metamodel of Optimal Prognosis) as the sampling method for design exploration. The results show the total effect values of the heat sink geometric parameters, which are used with 2D and 3D response surface plots to choose the best design point for each heat sink assembly.
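A full-factorial Design of Experiments such as the one described in this abstract enumerates every combination of the chosen levels of each design variable. The sketch below generates such a set of design points for the three named variables (fin count, fin thickness, heat sink height). The level values are hypothetical, since the abstract does not give the input ranges, and the sketch does not reproduce the Ansys OptiSLang workflow.

```python
from itertools import product

# Hypothetical levels for each design variable; the abstract names the
# variables but does not report the ranges that were used.
design_levels = {
    "fin_count": [20, 30, 40],
    "fin_thickness_mm": [0.4, 0.6, 0.8],
    "heat_sink_height_mm": [20.0, 30.0, 40.0],
}

names = list(design_levels)

# Full factorial: the Cartesian product of all levels of all variables.
design_points = [
    dict(zip(names, combination))
    for combination in product(*(design_levels[name] for name in names))
]

print(f"Number of design points: {len(design_points)}")
for point in design_points[:3]:
    print(point)
```

The number of design points grows as the product of the level counts, so three variables at three levels each give 27 simulations per heat sink assembly; this growth is the usual reason for pairing a DOE with a metamodel for design exploration.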

