Liquid Cooling Technology Ushers in a New Era of AI Server Cooling


Power Consumption Surges, Air Cooling Soon to Be Outdated
In traditional general-purpose servers, the CPU is usually the main source of computing power and also the "power hog" of the system. In AI servers, however, accelerators such as GPUs and TPUs have become the primary source of computing power. Although this shift lowers the CPU's power consumption, power consumption per server rack has only increased.
For instance, NVIDIA released its new-generation AI superchip, the GB200, this year, along with the GB200 NVL72 single-rack solution built on it. The GB200 NVL72 is NVIDIA's first server solution to fully embrace liquid cooling, because its configuration of 36 CPUs and 72 GPUs draws more than 100 kW.
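To get a feel for what a heat load of roughly 100 kW means in practice, the sketch below estimates the coolant flow needed to carry it away, using the basic relation Q = m_dot * cp * dT. The 10 K temperature rise and the fluid properties are textbook-style assumptions chosen for illustration, not figures from NVIDIA or any rack vendor.

```python
# Rough estimate of the coolant flow needed to carry away a rack's heat load.
# Assumptions: all electrical power ends up as heat, and the fluid properties
# below are approximate room-temperature values, not vendor data.

def mass_flow_kg_s(heat_w: float, cp_j_per_kg_k: float, delta_t_k: float) -> float:
    """Mass flow required by Q = m_dot * cp * dT."""
    if cp_j_per_kg_k <= 0 or delta_t_k <= 0:
        raise ValueError("cp and delta_t must be positive")
    return heat_w / (cp_j_per_kg_k * delta_t_k)

RACK_HEAT_W = 100_000.0  # the ~100 kW rack figure quoted above
DELTA_T_K = 10.0         # assumed inlet-to-outlet temperature rise

WATER_CP, WATER_DENSITY = 4186.0, 997.0  # J/(kg*K), kg/m^3 (approximate)
AIR_CP, AIR_DENSITY = 1005.0, 1.2        # J/(kg*K), kg/m^3 (approximate)

water_kg_s = mass_flow_kg_s(RACK_HEAT_W, WATER_CP, DELTA_T_K)
air_kg_s = mass_flow_kg_s(RACK_HEAT_W, AIR_CP, DELTA_T_K)

# Convert mass flow to the volume flow a pump or fan would have to deliver.
water_l_min = water_kg_s / WATER_DENSITY * 1000 * 60
air_m3_s = air_kg_s / AIR_DENSITY

print(f"water: {water_l_min:.0f} L/min")
print(f"air:   {air_m3_s:.1f} m^3/s")
```

Under these assumptions, water needs a flow on the order of a hundred-odd litres per minute, while air needs several cubic metres per second through a single rack, which illustrates why air cooling becomes impractical at this power level.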
Traditional air cooling struggles to stay efficient at such high system power, which hurts the power usage effectiveness (PUE, the ratio of total facility energy to IT equipment energy) of the whole AI data center. With explicit government regulations requiring new data centers to bring PUE below 1.1, adopting liquid cooling in large-scale computing hubs above the megawatt level is imperative.
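PUE is total facility power divided by IT equipment power, so a target of 1.1 leaves only 10% of the IT load for cooling, power conversion, and everything else. The sketch below shows the arithmetic; the 1,000 kW IT load and 1,300 kW facility load are hypothetical numbers picked for illustration, not measurements from any data center.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw

def max_overhead_kw(it_equipment_kw: float, target_pue: float) -> float:
    """Largest non-IT load (cooling, power losses, lighting) that still meets target_pue."""
    return it_equipment_kw * (target_pue - 1.0)

# Hypothetical megawatt-scale hall: 1,000 kW of IT load, 1,300 kW drawn in total.
current = pue(1300.0, 1000.0)
budget = max_overhead_kw(1000.0, 1.1)

print(f"current PUE: {current:.2f}")
print(f"overhead allowed at PUE 1.1: {budget:.0f} kW")
```

In this made-up case the facility spends 300 kW on overhead, and meeting a PUE of 1.1 would require cutting that to about 100 kW, most of which has to come from the cooling system.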
Currently, the two most common liquid cooling solutions are cold plate cooling and immersion cooling. The cold plate approach is non-contact: the coolant never touches the electronics directly. It is cost-effective and requires minimal modification of existing equipment, which makes installation and maintenance easy. Immersion cooling is better suited to scenarios that demand higher cooling performance, especially ultra-high-power AI servers. However, it is costlier and requires specialized immersion racks and coolants. One reason many laser welding enterprises hesitate to take on liquid-cooled racks is precisely their higher processing requirements, especially for laser welding.

Unparalleled Welding Requirements for Liquid-Cooled Racks
Server racks are the fundamental structural components that house servers and other equipment in data center rooms. As NVIDIA's GB200 NVL72 shows, a single AI server rack now integrates more components, with higher power density and heavier loads. Although cold plate liquid cooling can significantly reduce the weight of the cooling system, racks above the 40U specification still carry loads exceeding 1,500 kg. Such weight places high demands on weld quality to prevent deformation or loosening under load.
In traditional server rack welding processes, the slag left after welding has to be removed. To improve processing efficiency, a seamless welding process compatible with welding robots is the better choice. It not only produces robust, high-strength welds but also leaves surfaces free of weld scars or marks, eliminating the need for polishing.
Because liquid cooling is so space-efficient, more rack space in liquid-cooled servers can be allocated to computing equipment, which makes the internal design more compact. Meanwhile, to ensure that server modules are compatible, that is, that equipment from different manufacturers fits into standard rack systems, the industry has established size standards such as 1U, 2U, 19 inches, and 21 inches, and organizations like the Open Compute Project (OCP) have proposed design standards for racks. Together, these factors raise the bar for bending and welding accuracy.
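To make the size standards concrete, the sketch below converts rack units and inch widths to millimetres, using the conventional 19-inch rack definition of 1U = 1.75 in, and checks a measured dimension against a tolerance. The 0.5 mm tolerance and the measured value are hypothetical, chosen only to show the check; real tolerances come from the applicable rack specification and the fabricator's drawings.

```python
MM_PER_INCH = 25.4
RACK_UNIT_IN = 1.75  # height of 1U in the conventional 19-inch rack standard

def rack_units_to_mm(units: float) -> float:
    """Height of a given number of rack units in millimetres."""
    return units * RACK_UNIT_IN * MM_PER_INCH

def inches_to_mm(inches: float) -> float:
    return inches * MM_PER_INCH

def within_tolerance(measured_mm: float, nominal_mm: float, tol_mm: float) -> bool:
    """True if a measured dimension deviates from nominal by at most tol_mm."""
    return abs(measured_mm - nominal_mm) <= tol_mm

height_40u_mm = rack_units_to_mm(40)  # nominal mounting height of a 40U rack
width_19in_mm = inches_to_mm(19)      # nominal 19-inch panel width

# Hypothetical inspection: a welded frame measured 0.8 mm over nominal height,
# checked against an assumed tolerance of +/- 0.5 mm.
frame_ok = within_tolerance(height_40u_mm + 0.8, height_40u_mm, tol_mm=0.5)

print(f"40U height: {height_40u_mm:.1f} mm")
print(f"19 in width: {width_19in_mm:.1f} mm")
print(f"frame within tolerance: {frame_ok}")
```

Because weld heat distorts sheet metal, deviations of well under a millimetre per joint can add up over a frame of this height, which is why bending and welding accuracy matter so much for standard-compliant racks.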
Moreover, the welds on liquid-cooled racks must be completely sealed to prevent coolant leakage, especially in immersion-type, direct-contact liquid cooling solutions. Weld points must therefore be free of pores and cracks so that coolant stays contained and can flow and dissipate heat unimpeded within the system.
Lastly, weld quality inspection cannot be overlooked. It includes pressure tests to verify weld strength and X-ray or ultrasonic testing of weld seams to confirm that they are sealed. Only by meeting these demanding welding requirements can liquid-cooled AI servers operate safely and reliably.

As AI technology continues to develop and data centers keep growing in scale, liquid cooling will become the preferred cooling solution for more and more data centers. Lori, with nearly a decade of focus on the cooling field, can customize liquid cooling solutions for your data centers and join you in welcoming the new era of cooling led by liquid cooling technology.
