
Title: Impact of Static Pressure Differential Between Supply Air and Return Exhaust on Server Level Performance
Modern Information Technology (IT) servers are typically assumed to operate in quiescent conditions with almost zero static pressure differential between inlet and exhaust. However, when operating in a data center containment system, the IT equipment thermal status is a strong function of the non-homogeneous environment of the air space, IT utilization workloads and the overall facility cooling system design. To implement a dynamic and interfaced cooling solution, the interdependencies of variabilities between the chassis, rack and room level must be determined. In this paper, the effect of positive as well as negative static pressure differential between inlet and outlet of servers on thermal performance, fan control schemes, the direction of air flow through the servers, and fan energy consumption within a server is observed at the chassis level. In this study, a web server with internal air-flow paths segregated into two separate streams, each having a dedicated fan or group of fans within the chassis, is operated over a range of static pressure differentials across the server. Experiments were conducted to observe the steady-state temperatures of CPUs and fan power consumption. Furthermore, the server fan speed control scheme’s transient response to a typical peak in IT computational workload while operating at negative pressure differentials across the server is reported. The effects of the internal air flow paths within the chassis are studied through experimental testing and simulations for flow visualization. The results indicate that at higher positive differential pressures across the server, increasing server fan speeds has minimal impact on the cooling of the system. In contrast, at lower, negative differential pressures, server fan power becomes strongly dependent on the operating pressure differential.
More importantly, it is shown that an imbalance of flow impedances in internal airflow paths, combined with fan control logic, can trigger the onset of recirculation of exhaust air within the server. For accurate prediction of airflow in cases where a negative pressure differential exists, this study proposes that an extended fan performance curve, instead of a regular fan performance curve, be applied as the fan boundary condition for Computational Fluid Dynamics simulations.
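The idea of an extended fan performance curve can be illustrated with a minimal operating-point sketch. Everything in it is an assumption made for illustration and is not taken from the study: the quadratic fan curve, the quadratic system impedance, the parameter values, and the names `fan_pressure_rise`, `system_pressure_drop` and `operating_flow` are all hypothetical. A "regular" curve is approximated here by clamping the flow to the range between zero and free delivery, which is one simple way a solver might treat a curve that is only defined in that range.

```python
# Illustrative only: hypothetical fan and impedance models, not the paper's data.

def fan_pressure_rise(q, p_max=100.0, q_max=0.05):
    """Assumed fan curve: pressure rise (Pa) as a function of flow q (m^3/s).

    Written with q*|q| so the same expression extends to reverse flow (q < 0)
    and to flow beyond free delivery (q > q_max), where the rise is negative.
    """
    return p_max * (1.0 - q * abs(q) / q_max**2)


def system_pressure_drop(q, k=2.0e4):
    """Assumed chassis impedance: pressure drop (Pa) that opposes the flow."""
    return k * q * abs(q)


def operating_flow(dp_ext, extended=True, q_max=0.05, lo=-1.0, hi=1.0, tol=1e-10):
    """Find q where fan rise + external differential = system pressure drop.

    dp_ext is inlet static pressure minus outlet static pressure (Pa);
    a negative value means backpressure opposing the fans.
    """
    def residual(q):
        return fan_pressure_rise(q, q_max=q_max) + dp_ext - system_pressure_drop(q)

    # residual() decreases monotonically with q, so bisection is sufficient.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    q = 0.5 * (lo + hi)

    if not extended:
        # Regular curve (assumption): flow restricted to 0 <= q <= q_max.
        q = min(max(q, 0.0), q_max)
    return q


if __name__ == "__main__":
    for dp in (100.0, 0.0, -150.0):
        print(dp, operating_flow(dp), operating_flow(dp, extended=False))
```

With these assumed numbers, a sufficiently negative differential gives a negative flow (reverse flow through the fan path) with the extended curve, while the clamped regular curve can only report zero flow, so it cannot represent the recirculation case at all.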
Journal Name: 2018 17th IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm)
Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. There are various designs for segregating hot and cold air in data centers, such as cold aisle containment (CAC), hot aisle containment (HAC), and chimney exhaust racks. These containment systems have different characteristics and impose various conditions on the information technology equipment (ITE). One common issue in HAC systems is the pressure buildup inside the HAC (known as backpressure). Backpressure can also be present in CAC systems in case of airflow imbalances. Hot air recirculation, limited cooling airflow rate in servers, and reversed flow through ITE with weaker fan systems (e.g., network switches) are some known consequences of backpressure. Currently there is a lack of experimental data on the interdependency between the overall performance of ITE and its internal design when backpressure is imposed on the ITE. In this paper, three commercial 2-rack unit (RU) servers with different internal designs from various generations and performance levels are tested and analyzed under various environmental conditions. Smoke tests and thermal imaging are used to study the airflow patterns inside the tested equipment. In addition, the impact of hot air leakage into ITE on the fan speed and the power consumption of ITE is studied. Furthermore, the cause of the discrepancy between inlet temperatures measured by the internal intelligent platform management interface (IPMI) and by external sensors is investigated. It is found that the arrangement of fans, segregation of space upstream and downstream of fans, leakage paths, location of the baseboard management controller (BMC) sensors, and presence of backpressure can have a significant impact on ITE power and cooling efficiency.
  2. In recent years, there has been phenomenal growth in Artificial Intelligence and Machine Learning, which rely on collecting and mining data and on using data sets to train computers for tasks such as image analysis and speech recognition. Machine Learning tasks require a lot of computing power to carry out numerous calculations. Therefore, most machine learning servers are powered by Graphics Processing Units (GPUs) instead of traditional CPUs. GPUs provide more computational throughput per dollar spent than traditional CPUs. The Open Compute Servers forum recently introduced the state-of-the-art machine learning server “Big Sur”. A Big Sur unit consists of a 4OU (OpenU) chassis housing eight NVidia Tesla M40 GPUs and two CPUs along with SSD storage and hot-swappable fans at the rear. Management of the airflow is a critical requirement in the implementation of air cooling for rack mount servers to ensure that all components, especially critical devices such as CPUs and GPUs, receive adequate flow. In addition, component locations within the chassis play a vital role in the passage of airflow and affect the overall system resistance. In this paper, a sizeable improvement in chassis ducting is targeted to counteract the effects of air diffusion at the rear of the air flow duct in the “Big Sur” Open Compute machine learning server, in which GPUs are located directly downstream from CPUs. A CFD simulation of the detailed server model is performed with the objective of understanding the effect of air flow bypass on GPU die temperatures and fan power consumption. The cumulative effect was studied through simulations to quantify improvements in the server's fan power consumption. The reduction in acoustic noise levels caused by server fans is also discussed.
  3. In recent years, various airflow containment systems have been deployed in data centers to improve the cooling efficiency by minimizing the mixing of hot and cold air streams. The goal of this study is the experimental investigation of passive and active hot aisle containment (HAC) systems. The dynamic interaction between HAC and information technology equipment (ITE) is also investigated. In addition, various provisioning levels of HAC are studied. In this study, a chimney exhaust rack (CER) is considered as the HAC system. The rack is populated by 22 commercial 2-RU servers and one network switch. Four scenarios with and without the presence of cold and hot aisle containments are investigated and compared. The transient pressure build-up inside the rack, servers' fan speed, inlet air temperatures (IAT), IT power consumption, and CPU temperatures are monitored and the operating data recorded. In addition, the IAT of selected servers is measured using external temperature sensors and compared with data available via the Intelligent Platform Management Interface (IPMI). To the best of the authors' knowledge, this is the first experimental study in which a HAC system is analyzed using commercial ITE in a white space. It is observed that the presence of backpressure can lead to a false high IPMI IAT reading. Consequently, a cascade rise in servers' fan speed is observed, which increases the backpressure and worsens the situation. As a result, the thermal performance of ITE and the power consumption of the rack are affected. Furthermore, it is shown that backpressure can affect the accuracy of common data center efficiency metrics.
  4. In typical data centers, the servers and IT equipment are cooled by air, and almost half of total IT power is dedicated to cooling. Hybrid cooling is a combined cooling technology using both air and water, where the main heat-generating components are cooled by water or water-based coolants and the rest of the components are cooled by air supplied by a CRAC or CRAH. Retrofitting air-cooled servers with cold plates and pumps offers an advantage in the thermal management of CPUs and other high heat-generating components. In a typical 1U server, the CPUs were retrofitted with cold plates and the server was tested with raised coolant inlet conditions. The study showed that the server can operate at maximum utilization for CPUs, DIMMs, and PCH for inlet coolant temperatures from 25–45 °C, following the ASHRAE guidelines. The server was also tested for failure scenarios of the pumps and fans by reducing the numbers of fans and pumps. To reduce cooling power consumption at the facility level and increase air-side economizer hours, the hybrid cooled server can be operated at raised inlet air temperatures. The trade-off between energy savings at the facility level due to raising the inlet air temperatures and the possible increase in server fan power and component temperatures is investigated. A detailed CFD analysis with a minimum number of server fans can provide a way to find an operating range of inlet air temperature for a hybrid cooled server. Changes in the model are carried out in 6SigmaET for an individual server and compared to the experimental data to validate the model. The results from this study can be helpful in determining the room level operating set points for data centers housing hybrid cooled server racks.
  5. During the lifespan of a data center, power outages and blower cooling failures are common occurrences. Given that data centers have a vital role in modern life, it is especially important to understand these failures and their effects. A previous study [16] showed that cold aisle containment might have a negative impact on IT equipment uptime during a blower failure. This new study further analyzed the impact of containment on IT equipment uptime during a CRAH blower failure. It also compared the IT equipment performance both with and without a pressure relief mechanism implemented in the containment system. The results show that the effect of implementing pressure relief in a containment solution on IT equipment performance and response can vary depending on the servers' airflow and generation, and hence on the types of servers deployed in the cold aisle enclosure. The results also showed that, when compared to the discrete sensors, the IPMI inlet temperature sensors underestimate the Ride Through Time (RTT) by 32%. This means that RTT calculations based on the IPMI inlet sensors may be inaccurate due to variations in the sensor readings as they exist today in these servers, as discussed in a previous study [26]. Additionally, it was shown that all Dell PowerEdge 2950 servers have a similar IPMI inlet temperature reading, regardless of mounting location. As external system resistance increases during cooling failure, the servers exhibit internal recirculation through their weaker power supply fans, which is reflected in the high IPMI inlet temperature readings. For this server specifically, a pressure relief mechanism reduces the external resistance, thereby eliminating internal recirculation and resulting in lower IPMI inlet temperature readings. This in turn translates to a lower RTT. However, pressure relief showed conflicting results: the discrete sensors showed an increase in inlet temperature when pressure relief was introduced, thereby reducing the RTT. The CPU temperatures conformed with the discrete sensor data, indicating that containment helped increase the RTT of the servers during failure.