

Title: Measurement of the Thermal Performance of a Custom-Built Single-Phase Immersion Cooled Server at Various High and Low Temperatures for Prolonged Time
Abstract The next radical change in the thermal management of data centers is the shift from conventional cooling methods such as air cooling to direct liquid cooling, which offers high thermal mass and correspondingly superior cooling. Despite its simplicity and high heat-dissipation capacity, direct liquid cooling has seen only limited adoption in data centers over the past few years. Single-phase engineered-fluid immersion cooling has several additional benefits, including better server performance, an even temperature profile, higher rack densities, and the ability to cool all components in a server without the need for electrical isolation. The reliability aspects of this cooling technology have not been well addressed in the open literature. This paper presents the performance of a server fully immersed in a single-phase dielectric fluid over wide temperature ranges in an environmental chamber. The server was subjected to extreme temperatures ranging from −20 °C to 10 °C at 100% relative humidity and from 20 °C to 55 °C at a constant 50% relative humidity for extended durations. This work is a first attempt at measuring the performance of a completely liquid-submerged server and its components, including pump issues such as flow-rate drop, starting trouble, and other potential problems under extreme climatic conditions. Pumping power consumption is directly proportional to the operating cost of a data center. The experiment was carried out until the core temperature reached the maximum junction temperature. These results help determine the threshold capacity and robustness of the server for applications in extreme climatic conditions.
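The abstract's remark that pumping power consumption is directly proportional to operating cost can be illustrated with a back-of-the-envelope hydraulic-power calculation. This is a minimal sketch; the flow rate, pressure drop, pump efficiency, and electricity price below are hypothetical placeholders, not values from the paper.

```python
# Hydraulic pumping power: P = Q * dP / eta
# Q: volumetric flow rate [m^3/s], dP: loop pressure drop [Pa], eta: pump efficiency
def pumping_power_w(flow_m3_s: float, pressure_drop_pa: float, efficiency: float) -> float:
    return flow_m3_s * pressure_drop_pa / efficiency

def annual_energy_cost_usd(power_w: float, hours: float = 8760.0,
                           usd_per_kwh: float = 0.10) -> float:
    return power_w / 1000.0 * hours * usd_per_kwh

# Hypothetical immersion-tank loop: 0.5 L/s against 50 kPa at 60% pump efficiency
p = pumping_power_w(0.5e-3, 50e3, 0.60)   # ~41.7 W
cost = annual_energy_cost_usd(p)          # continuous operation at $0.10/kWh
print(f"pump power: {p:.1f} W, annual cost: ${cost:.2f}")
```

Doubling the required flow or pressure drop doubles this power and therefore the pumping portion of the operating cost, which is why a flow-rate drop in the submerged pump matters economically as well as thermally.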
Award ID(s):
1738811
NSF-PAR ID:
10166922
Author(s) / Creator(s):
Date Published:
Journal Name:
Journal of Electronic Packaging
Volume:
142
Issue:
1
ISSN:
1043-7398
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Modern data centers operate at high power, and their power delivery, maintenance, and cooling account for almost 2 percent (70 billion kilowatt-hours) of total energy consumption in the US. IT components and the cooling system make up the major portion of this consumption. Although data centers are designed to perform efficiently, cooling high-density components remains a challenge, so alternative methods to improve cooling efficiency have become the main lever for reducing cooling cost. Because liquids offer higher specific heat capacity, density, and thermal conductivity than air, hybrid cooling can bring the advantages of liquid cooling to the high-heat-generating components of traditional air-cooled servers. In this experiment, a 1U server is equipped with cold plates to cool the CPUs while the remaining components are cooled by fans. Predictive fan- and pump-failure analyses are performed, which also help explore options for redundancy and reduce cooling cost by improving cooling efficiency; redundancy requires knowledge of planned and unplanned system failures. Because the main heat-generating components are liquid cooled, warm-water cooling can be employed to observe the effects of raised inlet conditions in a hybrid-cooled server under failure scenarios. The ASHRAE guidance class W4 for liquid cooling is chosen for the experiment, covering a range from 25 °C to 45 °C. The experiments are conducted separately for the pump- and fan-failure scenarios. Computational loads of idle, 10%, 30%, 50%, 70%, and 98% are applied while powering only one pump, and the miniature dry-cooler fans are controlled externally to maintain a constant coolant inlet temperature. Because the remaining components, such as the DIMMs and PCH, are air cooled, maximum memory utilization is applied while the number of fans is reduced in each fan-failure case. Component temperatures and power consumption are recorded in each case for performance analysis.
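The liquid-over-air advantage cited above follows from the heat balance q = ṁ·cp·ΔT: for the same heat load and allowed temperature rise, a coolant with higher specific heat needs far less mass flow. The sketch below uses textbook property values for water and air and a hypothetical CPU heat load, not figures from the study.

```python
# Required coolant mass flow to remove heat load q with temperature rise dT:
#   m_dot = q / (cp * dT)
def mass_flow_kg_s(q_w: float, cp_j_per_kg_k: float, dt_k: float) -> float:
    return q_w / (cp_j_per_kg_k * dt_k)

Q_LOAD_W = 300.0   # hypothetical CPU heat load
DT_K = 10.0        # allowed coolant temperature rise

m_water = mass_flow_kg_s(Q_LOAD_W, 4186.0, DT_K)  # water cp ~ 4186 J/(kg*K)
m_air   = mass_flow_kg_s(Q_LOAD_W, 1005.0, DT_K)  # air   cp ~ 1005 J/(kg*K)
print(f"water: {m_water*1000:.1f} g/s, air: {m_air*1000:.1f} g/s "
      f"({m_air/m_water:.1f}x more air mass flow)")
```

The roughly fourfold difference in cp alone (before counting the enormous density gap) is why cold plates on the CPUs can carry the bulk of the load while small fans suffice for the low-power components.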
  2. Abstract The adoption of Single-phase Liquid Immersion Cooling (Sp-LIC) for information technology equipment provides an excellent cooling platform coupled with significant energy savings. There are, however, very limited studies on the reliability of this cooling technology. The Accelerated Thermal Cycling (ATC) test defined by JEDEC is relevant only to air cooling; no such standard exists for immersion cooling. Because air and dielectric fluids differ in heat capacity, and hence in the ramp rate achievable during thermal cycling, the ASTM D3455 benchmark, with appropriate adjustments, was adopted to test material compatibility. For this study, accelerated thermal degradation of printed circuit boards (PCBs), passive components, and fiber-optic cables submerged in air, white mineral oil, and synthetic fluid at an elevated temperature of 45 °C and 35% humidity was undertaken. This paper serves multiple purposes, including designing the experiments and testing and evaluating the material compatibility of PCBs, passive components, and optical fibers in different hydrocarbon oils for single-phase immersion cooling. Samples of the different materials were immersed in the hydrocarbon oils and in air and kept in an environmental chamber at 45 °C for a total of 288 hours. The samples were then evaluated for their mechanical and electrical properties using a Dynamic Mechanical Analyzer (DMA) and a multimeter, respectively. The cross-sections of some samples were also examined for structural integrity using SEM. The literature gathered on the subject and the quantifiable data gathered by the authors provide the primary basis for this research.
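Accelerated-aging exposures like the 288-hour soak at 45 °C described above are commonly interpreted through an Arrhenius acceleration factor, which converts hours at the stress temperature into equivalent hours at a use temperature. This is a standard-technique sketch, not the study's analysis; the activation energy and use temperature below are illustrative assumptions.

```python
import math

K_B = 8.617e-5  # Boltzmann constant [eV/K]

def arrhenius_af(ea_ev: float, t_use_c: float, t_stress_c: float) -> float:
    """Acceleration factor AF = exp(Ea/k * (1/T_use - 1/T_stress)), T in kelvin."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / K_B * (1.0 / t_use - 1.0 / t_stress))

# Hypothetical: Ea = 0.7 eV, normal use at 25 C, chamber stress at 45 C
af = arrhenius_af(0.7, 25.0, 45.0)
equivalent_hours = 288.0 * af
print(f"AF = {af:.2f}; 288 h at 45 C corresponds to ~{equivalent_hours:.0f} h at 25 C")
```

The result is very sensitive to the assumed activation energy, which is why material-specific degradation data (as gathered in the study) matter more than the generic model.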
  3. Abstract Airside economizers lower the operating cost of data centers by reducing or eliminating mechanical cooling. They do, however, increase the risk of reliability degradation of information technology (IT) equipment due to contaminants. IT equipment manufacturers have tested equipment performance and guarantee the reliability of their equipment in environments within ISA 71.04-2013 severity level G1 and the ASHRAE recommended temperature-relative humidity (RH) envelope, and they require data center operators to meet all the specified conditions consistently before honoring warranties on equipment failure. Determining the reliability of electronic hardware in higher-severity conditions requires field data obtained from real data centers. In this study, a corrosion-classification coupon experiment per ISA 71.04-2013 was performed to determine the severity level of a research data center (RDC) located in an industrial area of hot and humid Dallas. The temperature-RH excursions were analyzed using time-series and weather-data bin analysis of trend data over the duration of operation. After some period, a failure was recorded on two power distribution units (PDUs) located in the hot aisle. The damaged hardware and other hardware were evaluated, and a cumulative corrosion-damage study was carried out. A hypothetical estimate of component end of life is provided to determine the free-air-cooling hours available at the site. Not a single server operated with fresh-air cooling failed, which indicates that evaporative/free air cooling is not detrimental to IT equipment reliability. This study, however, must be repeated in other geographical locations to determine whether the contamination effect is location dependent.
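The weather-data bin analysis mentioned above amounts to counting how many trend-log samples fall into each temperature-RH bucket, which shows how long the site spends outside a recommended envelope. The sketch below is a minimal illustration; the bin edges and sample readings are hypothetical, not the study's data or the exact ASHRAE boundaries.

```python
from collections import Counter

def bin_excursions(readings, t_edges=(18, 27, 32), rh_edges=(20, 60, 80)):
    """Count (temperature, humidity) samples per 2-D bin.
    Edges are illustrative, loosely inspired by recommended/allowable boundaries."""
    def bucket(value, edges):
        for i, edge in enumerate(edges):
            if value < edge:
                return i
        return len(edges)
    counts = Counter()
    for t_c, rh in readings:
        counts[(bucket(t_c, t_edges), bucket(rh, rh_edges))] += 1
    return counts

# Hypothetical hourly trend data: (temperature in C, relative humidity in %)
samples = [(24, 50), (29, 65), (33, 85), (25, 55), (30, 70)]
print(bin_excursions(samples))
```

Multiplying each bin's count by the logging interval gives the dwell time per condition, from which excursion hours and free-cooling hours can be tallied.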
  4. As the demand for faster and more reliable data processing increases in our daily lives, the power consumption of electronics and, correspondingly, of Data Centers (DCs) also increases. It has been estimated that about 40% of DC power consumption goes to the cooling systems alone. A responsive and efficient cooling system not only saves energy and space but also protects electronic devices and helps enhance their performance. Although air cooling offers a simple and convenient solution for Electronic Thermal Management (ETM), it lacks the capacity to handle higher heat-flux rates. Liquid cooling techniques, on the other hand, have gained much attention for their potential to handle the higher thermal loads generated by small chip sizes. In the present work, one of the most commonly used liquid cooling techniques is investigated under various conditions. The performance of liquid-to-liquid heat exchange is studied under multi-level thermal loads. Coolant Supply Temperature (CST) stability and case-temperature uniformity on the Thermal Test Vehicles (TTVs) are the target indicators of system performance in this study. The study was carried out experimentally using a rack-mounted Coolant Distribution Unit (CDU) attached to the primary and secondary cooling loops in a multi-server rack, and the effect of various selected control settings on the aforementioned indicators is presented. Results show that the PID parameter with the greatest impact on fluctuation reduction is the integral (reset) coefficient (IC). It is also concluded that fluctuations with amplitudes lower than 1 °C converge into fluctuations of higher amplitude.
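The role of the integral (reset) coefficient highlighted above can be seen in a minimal discrete PID loop driving a toy coolant-temperature plant: the integral term accumulates residual error and removes the steady offset that a proportional-only controller would leave. The gains, ambient temperature, and first-order plant below are illustrative assumptions, not the CDU's actual controller or dynamics.

```python
def pid_step(err, state, kp=2.0, ki=0.5, kd=0.0, dt=1.0):
    """One discrete PID update; state carries (integral, previous error)."""
    integral, prev_err = state
    integral += err * dt
    output = kp * err + ki * integral + kd * (err - prev_err) / dt
    return output, (integral, err)

# Toy first-order plant: coolant relaxes toward (30 C ambient minus cooling effort u)
setpoint, temp, state = 25.0, 30.0, (0.0, 0.0)
for _ in range(200):
    u, state = pid_step(temp - setpoint, state)  # hotter than setpoint -> more cooling
    temp += 0.1 * ((30.0 - u) - temp)            # hypothetical plant response
print(f"coolant supply temperature settled near {temp:.2f} C")
```

With ki set to zero in this sketch, the loop settles above the setpoint because a constant cooling effort requires a persistent error; a nonzero integral coefficient drives that offset out, at the cost of oscillation if set too aggressively, which is the trade-off behind tuning the reset coefficient for CST stability.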
  5. Over the past few years, energy consumption by IT equipment in data centers has risen steadily, increasing the need to minimize the environmental impact of data centers by optimizing energy consumption and material use. The Open Compute Project, started in 2011, was aimed at sharing specifications and best practices with the community for highly energy-efficient and economical data centers. The first Open Compute server was the 'Freedom' server: a vanity-free, completely custom design built from a minimum number of components and deployed in a data center in Prineville, Oregon. Within the first few months of operation, considerable energy and cost savings were observed. Since then, successive generations of Open Compute servers have been introduced. Initially, the servers used for compute purposes mainly had a two-socket architecture. In 2015, the Yosemite Open Compute server was introduced, suited for higher compute capacity. Yosemite has a system-on-a-chip architecture with four CPUs per sled, providing a significant improvement in performance per watt over previous generations. This study focuses on airflow optimization in the Yosemite platform to improve its overall cooling performance. Commercially available CFD tools make it possible to model these servers thermally and predict their efficiency. A detailed server model is generated in a CFD tool and optimized to improve the airflow characteristics in the server. The thermal model of the improved design is compared with the existing design to show the impact of airflow optimization on flow rates and flow speeds, which in turn affect CPU die temperatures and cooling power consumption and thus the overall cooling performance of the Yosemite platform. Emphasis is placed on more effective utilization of the fans than in the original design and on improving airflow characteristics inside the server via improved ducting.