Abstract

Through the application of cryogenic cooling via liquid nitrogen (LN2), the power consumption of a CPU was substantially reduced. Using a digitally controlled solenoid valve and an additively manufactured cold plate, the manual process of LN2 cooling was automated for precise control of the cold plate temperature. The power consumption and frequency relationship of the processor was established across three different thermal solutions to demonstrate the effect of temperature on this relationship. It was found that power consumption of the processor decreased at lower temperatures due to a reduction in current leakage and in the core voltage necessary for stable operation. This culminated in a reduction of up to 10.7% in processor power consumption for the automated solution and 21.5% for the manual LN2 solution when compared to the air-cooled baseline. Due to the binary nature of the solenoid valve used, the flow rate was tuned via an in-line needle valve to increase thermal stability. It was found that for lower flow rates of approximately 5.0 g/s, temperatures oscillated within a range of ±11.5 °C, while for higher flow rates of 10–12 g/s, oscillation amplitudes were as small as ±3.5 °C. Additionally, several tests measured the rate of LN2 consumption and found that the automated solution used 230%–280% more coolant than the manual thermal solution, implying there is room for improvement in the cold plate geometry, LN2 vapor exhaust design, and coolant delivery optimization.

1 Introduction and Literature Review

As server hardware has developed over the past few decades, the methods of cooling have evolved to meet the demands of modern data centers. While many data centers still rely on traditional air-cooled systems, some are moving toward water-cooled servers, which take advantage of water's high heat capacity to improve cooling and computational density [1]. In recent years, several companies have pushed this trend in cooling technologies further by building data centers using single-phase or two-phase immersion-cooled systems [2]. In all cases, the goal of these new technologies is to reduce the power consumption of the cooling system while improving power and computational density. Power usage effectiveness (PUE) is an often-used metric for assessing overall data center efficiency, defined as the ratio of total facility power consumption to information technology (IT) power consumption [3]. PUE has been applied to measuring the effectiveness of chassis cooling on a partial PUE (pPUE) scale [4–6], at the rack level [7–9], and at the building scale [10–13]. Topics that have been explored include air flow management in the data center [14,15], dynamic louver adjustment to address transient loading within the data center [16], and closer integration of the thermal management solution with the construction of the rack [17]. While leakage current impacts have been modeled [18], adjusting characteristics at the chip basic input/output system (BIOS) level, i.e., overclocking, is generally not available to researchers, as it is a function turned off by chip manufacturers [19]. Additionally, dynamic cryogenic temperature monitoring of chips is not available, as on-chip temperature measurement capabilities generally do not extend below freezing [20]. The potential of cooling in these systems to improve computational efficiency or speed on an individual socket scale is often neglected, as seen by the low adoption rate of subambient cooling in data centers. Therefore, the goal of the current study was to quantify the potential computational performance and efficiency gains that can be acquired through cryogenic cooling of an individual processor.
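
For reference, PUE as used above can be written as the ratio of total facility power to IT power; the partial (pPUE) form, as commonly applied at the chassis or rack level, evaluates the same ratio over a subsystem boundary. The notation below is ours for illustration and is not drawn verbatim from Ref. [3]:

$\mathrm{PUE} = \dfrac{P_{\text{total facility}}}{P_{\text{IT}}}, \qquad \mathrm{pPUE} = \dfrac{P_{\text{IT,boundary}} + P_{\text{infrastructure,boundary}}}{P_{\text{IT,boundary}}}$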

Most data center efficiency and performance metrics treat individual CPU power consumption as static. However, any given CPU can function across a wide spectrum of operating frequencies and power draw characteristics. This is often overlooked because the manufacturer of the CPU sets a target frequency and operating voltage aimed at maximizing efficiency and reliability. The current study implements LN2 cooling to create the thermal headroom needed to unlock performance beyond these supplier-mandated settings. Through the process of overclocking, a user can adjust these parameters, such as gate voltage and frequency, to improve a given performance metric, such as compute performance. However, power consumption, operating temperature, and stability are often impacted unless procedures, which are labor-intensive in the case of manual overclocking, are implemented to ensure these parameters are not catastrophically altered. This process has been used by hobbyists and computer enthusiasts in combination with LN2-based cooling systems to greatly improve compute performance at the cost of power consumption and reliability. For this reason, overclocking is largely ignored by the data center community and the commercial sector as a whole [21]. It is important to note that, at present, most processors and motherboards do not support overclocking. By using a consumer processor with few restrictions, this study examines the potential of cooling a system as if a manufacturer had lifted these restrictions or produced custom hardware, but it does not go so far as to speculate on new architectures or similar changes made specifically for cryogenic use.

In this study, an automated LN2 delivery system was developed in combination with an additively manufactured cold plate to cool a single overclocked CPU to cryogenic temperatures. The processor was then stress tested under different voltage and frequency settings to map out the optimum operating parameters needed to achieve low power consumption and a high operating frequency. The same processor underwent these tests while being cooled with a consumer grade air-cooled device to highlight the difference between cryogenic and more conventional cooling methods. Finally, overclocking with an enthusiast LN2 system, referred to as a LN2 pot, was done to compare the potential overclocking effects of cryogenic cooling to the automated system developed for this study.

Generally speaking, overclocking in combination with cryogenic cooling has rarely been studied in a reliable manner. Although hobbyists have been using LN2 to overclock since the late 1990s, it is a competitive hobby in which few participants focus on efficiency or reliability; most are simply trying to achieve the highest operating frequency possible for a given piece of hardware, even if only for several seconds. However, individual aspects of overclocking have been studied. In one case, by increasing the front side bus frequency, researchers were able to decrease the computation time for a genetic algorithm by 20% [22]. While this study did not mention power consumption or temperature, it highlights the ability of overclocking to yield increased computational power without purchasing a more expensive CPU. An approach much more broadly studied is the reduction of input voltage to the processor, which in turn leads to reduced power consumption. Statically setting a voltage input target in multicore Advanced RISC Machine (ARM) processors was found to reduce power consumption between 22.5% and 25.3% on average without reducing computational performance [23]. In more modern x86 processors, a dynamic undervolting governor was developed that takes into account CPU utilization and other input parameters to dynamically lower target voltages; it reduced power consumption by 42.68% and 34.37% in Intel Skylake and Haswell processors, respectively, when compared to the standard Intel voltage governor [24].

The relationship between a processor's temperature and its power consumption is a key aspect of overclocking. Power consumption can be divided into dynamic and static draw, with static power containing leakage current. This leakage power consumption has been shown to be up to 30% of total package power draw [25], and it increases at a rate of around 2X per 100 °C increase in package temperature [26]. Additionally, the voltage requirement of transistors can also be reduced when the operating temperature is lowered. Utilizing this approach to optimize transistor voltage, Intel's thermal velocity boost technology allows a processor to run 100 or 200 MHz faster if the junction temperature is maintained below 50 °C or 70 °C, depending on the processor [27].
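
As a rough illustration of that leakage trend, the sketch below scales a reference leakage power using the cited ~2X-per-100 °C rule of thumb; the exponential interpolation and the example numbers are assumptions made only for illustration, not the authors' model.

```python
def scale_leakage_power(p_leak_ref_w, t_ref_c, t_c, doubling_interval_c=100.0):
    """Scale a reference leakage power using the ~2X per 100 degC rule of thumb [26].

    The exponential interpolation between the cited data points is an assumption
    made for illustration only.
    """
    return p_leak_ref_w * 2.0 ** ((t_c - t_ref_c) / doubling_interval_c)


# Example (hypothetical numbers): 30 W of leakage at a 100 degC junction would fall
# to roughly 8 W if the junction could instead be held at -85 degC.
print(scale_leakage_power(30.0, 100.0, -85.0))  # ~8.3
```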

There are many metrics by which to evaluate the efficiency of a server or data center. Defining efficiency in terms of PUE is a rather holistic approach, as it accounts for all components within a rack and more, such as lighting. The effect of operating temperature on PUE has been more widely studied, with results such as Patterson's finding that decreasing operating temperature adds to the cost of system cooling faster than the reduced leakage can lower IT power consumption [28]. On a system level, performance per watt (PPW) can be used to compare computational power to system power consumption [29]. This metric is more often used with regard to workload or architecture efficiency [30], but in a fixed-architecture, fixed-workload experiment it highlights the efficiency of different overclocking profiles.
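
A minimal sketch of the PPW metric as it is used in this fixed-architecture, fixed-workload context is shown below; the throughput and power numbers are hypothetical and serve only to show how two overclock profiles would be compared.

```python
def performance_per_watt(throughput_gflops, package_power_w):
    """PPW: computational throughput divided by power consumption."""
    return throughput_gflops / package_power_w


# Hypothetical comparison of two overclock profiles running the same workload:
print(performance_per_watt(350.0, 150.0))  # ~2.33 GFLOPS/W
print(performance_per_watt(400.0, 210.0))  # ~1.90 GFLOPS/W - faster but less efficient
```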

With the reduction in node size for modern server hardware, cooling solutions that were once considered novel are becoming more conventional. A study by the Integral Group concluded that traditional air-cooling systems would not be able to support more than 10 kW per rack in data centers, driving the need for new technologies to address current rack power consumptions that are at least one order of magnitude higher [31]. Computational fluid dynamics simulations by Ohadi et al. compared air cooling, single-phase liquid cooling, and two-phase liquid cooling to determine the flow rates and inlet temperatures needed to maintain a chip temperature of 90 °C. The simulations demonstrated that water cooling allows for higher inlet temperatures while cooling with a lower thermal resistance. It was also shown that two-phase cooling allowed for even higher temperatures and a lower thermal resistance than water while requiring a fraction of the pumping power typical of single-phase solutions. These results have led to the proposal of, and experimentation with, high-density pool boiling systems using dielectric fluid, which can support power densities of 130 kW/m², over twice that of a typical air-cooled system (approximately 52 kW/m²) [2,31].

2 Experimental Facility

2.1 Liquid Nitrogen Delivery.

Figure 1 is a schematic of the LN2 coolant delivery system. The LN2 originates at a large Dewar that self-regulates to a low pressure, ranging from 35 to 50 pounds per square inch depending on the tank. A vacuum-jacketed hose three feet (0.9 m) in length connects the tank to a solenoid and needle valve. After the needle valve, an additional six feet (1.8 m) of vacuum-jacketed hose extends to the inlet port of the CPU cold plate. Vacuum-jacketed hoses are used to minimize heat ingress from the surrounding air, as flashing of the cryogenic LN2 will occur if the interior tube wall surfaces reach a temperature above −196 °C. Vapor generation prior to injection into the cold plate is considered wasted performance and can hamper consistent heat transfer. In addition to providing low temperatures, the cryogenic fluid is also meant to change phase within the cold plate, extracting tremendous heat through latent energy exchange. The vacuum-jacketed hose keeps as much of this coolant as possible in the liquid phase prior to injection into the cold plate. Excess vapor within the cold plate caused by upstream flashing will also cause instabilities in the ejection of the spent vapor around the periphery of the cold plate.

Fig. 1
Illustration of liquid nitrogen delivery system plumbing

Given the potential for wide variations in flow rate due to the reliance on a pressurized tank to generate flow, tests were run to quantify the flow rates that could be expected and how they may change. A steady power draw was applied to the processor, as it would be in future tests, and the solenoid was cycled open and closed to maintain a −85 °C temperature in the cold plate. Over the course of half an hour, the weight change of the nitrogen Dewar and the time the solenoid spent open were used to calculate a net coolant mass flow rate. This was done across two tanks, each tested near empty and near full. The needle valve was adjusted between tests to replicate the flow rates that may be used in future tests. As seen in Fig. 2, variation in tank pressure and fill level has little effect on the coolant flow rate, making it possible to run tests that target a specific flow rate.
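
The calibration reduces to a simple calculation, sketched below under the assumption that all mass lost from the Dewar passes through the valve while it is open; the example values are illustrative only.

```python
def net_injection_flow_rate_g_per_s(dewar_mass_start_kg, dewar_mass_end_kg,
                                    solenoid_open_time_s):
    """Time-averaged LN2 mass flow rate while the solenoid is open, from the change
    in Dewar weight over a test and the logged valve-open time."""
    mass_used_g = (dewar_mass_start_kg - dewar_mass_end_kg) * 1000.0
    return mass_used_g / solenoid_open_time_s


# Illustrative values: 4.5 kg of LN2 consumed with the valve open for a total of 600 s.
print(net_injection_flow_rate_g_per_s(180.0, 175.5, 600.0))  # 7.5 g/s
```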

Fig. 2
Flow rate calibration under steady power draw

2.2 Heat Exchangers.

The cold plate is made with additive manufacturing using a 60% stainless steel and 40% bronze matrix with a reported thermal conductivity of 22.6 W/m·K. The gas and LN2 mixture enters through a 6.35 mm fitting located 36 mm above the boiling area, as shown in Fig. 3(a). The circular pins are 12.7 mm tall, and each wall has eight exhaust holes 2.8 mm in diameter. The exhaust ports are angled at 45 deg, normal to the pyramidal exterior surface. The boiling surface area measures 57 mm × 57 mm (W × D) with a grid pattern of 3.2 mm diameter pins spaced 6.35 mm apart, as shown in Fig. 3(b). For temperature measurements, a 1.6 mm × 1.6 mm (W × H) slot was milled on the underside of the cold plate, allowing a type-T thermocouple 1.5 mm in diameter to be located directly above the integrated heat spreader of the processor. These measurements are summarized with the final product shown in Fig. 3(c). The thermocouple was calibrated in LN2, dry ice, and ice water for a three-point calibration that captured all anticipated cryogenic temperatures.

Fig. 3
From left to right, additive cold plate vertical cross section with measurements in mm (a), horizontal cross section with measurements in mm (b), and picture of the cold plate (c)

For a point of comparison, two other cooling devices were used. A Noctua NH-D15 CPU cooler was used for comparison against a more conventional thermal management approach. It relies on six heat pipes connected to aluminum fins and a single 140 mm diameter fan to provide cooling to the processor. While it is larger and more expensive than the heat sinks found in a server, it acts as a best-case solution that exchanges heat with the ambient air. For the manual option, LN2 was poured periodically into a Kingpin Cooling T-REX LN2 pot, a thermal solution made by the overclocker Kingpin based in Taiwan. This device features a large copper block with twenty 12.7 mm holes drilled to various depths, as shown in Fig. 4. A user then regulates temperature by pouring LN2 from a thermos into the pot. The operator controls the depth of the boiling pool to in turn control the temperature, as a pool approximately 1 cm deep maintains a steady thermal state for a processor drawing 150 watts. If the pot rises above the target temperature, the operator will increase the fill level to remove more heat than the processor produces, thus reducing the pot temperature. Alternatively, the pot can become too cold if the processor draws less power than expected or if the user pours too much LN2. To counteract this, the operator must introduce more heat with a blowtorch or other source.

Fig. 4
LN2 pot without mounting hardware installed on motherboard

2.3 System Enclosure.

The computer motherboard and cold plate are placed inside a clear polycarbonate box, approximately 400 mm × 355 mm × 100 mm (W × D × H). Three holes were drilled in the lid to allow for entry of power cables and the vacuum-jacketed hose attached to the cold plate, as well as the required I/O (USB, video cables, thermocouple wire, etc.). The enclosure is then filled to a depth of approximately 30 mm with PF-5060 dielectric fluid. Water is immiscible with PF-5060 and is less dense. Therefore, any ice/water formation near or on the cold plate is electrically isolated from sensitive electronic components on the motherboard. Any ice/water that detaches from the horizontally mounted cold plate is driven upward through buoyancy, which dynamically protects the components from harm. Condensation and ice formation are two of the main issues facing overclocking hobbyists in this space, and lessons learned from this community have inspired the integration of electrically insulating dielectric fluid into the experimental design. The dielectric fluid has not shown signs of freezing at measured bath temperatures as low as −58 °C. Preferential orientation of a motherboard to prevent failures from interaction with conductive coolants is not necessarily unique to this approach [32]. However, this is a design feature that must be intelligently incorporated and optimized into future iterations of the chassis-level packaging for LN2-based thermal management solutions.

2.4 Computer Equipment.

The processor used is an Intel i9-9900KS. It was chosen for its potentially high power consumption and power density, as it has a base thermal design power of 95 watts but can easily exceed 250 watts under certain workloads. Additionally, the thermal interface between the die and the integrated heat spreader is an indium solder alloy, which allows for lower thermal conduction resistance at cryogenic temperatures than the polymer pastes used in previous generations of processors. The motherboard is an EVGA Z390 Dark, which was chosen for its stable twelve-phase power delivery and the accessibility of hardware control not often found outside of custom printed circuit boards (PCBs). All tests were done on the version 0.92 XOC BIOS, as this greatly increases the current and voltage limits often reached while overclocking at cryogenic temperatures.

2.5 Procedures.

In all overclocking tests, Prime95 version 29.8 [33] was used with its smallest-FFTs setting, a predefined stress test intended to recreate a power draw similar to worst-case scenarios, to generate an artificial workload delivering near-constant power draw for stability testing. The target temperature of the cold plate was set to −85 °C unless otherwise stated. In the BIOS, the core voltage (VCore) was statically set, the stability test was run in Prime95 for 5 min and, upon completion, VCore was reduced by 10 mV. This was repeated until the minimum voltage necessary to achieve stability at the given frequency was found. Multiple frequencies were tested by adjusting the core multiplier for all eight cores. Each data point was tested five times to ensure that the voltage required and the resulting power draw were consistent. The standard deviation of each point, consisting of five tests, was taken to produce the error bars seen in Figs. 5 and 6. All points have an error bar, but many are obscured by the marker, as the error was often less than 1 watt.
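
The voltage-stepping procedure can be summarized in pseudocode form as sketched below; `run_stress_test` stands in for the manual BIOS adjustment and 5 min Prime95 run and is a placeholder, not part of any real tooling.

```python
VCORE_STEP_MV = 10     # VCore decrement per iteration, as in the procedure above
STRESS_MINUTES = 5     # length of each Prime95 smallest-FFTs run


def find_min_stable_vcore_mv(freq_ghz, starting_vcore_mv, run_stress_test):
    """Lower VCore in 10 mV steps until the stress test fails, then return the last
    stable setting (None if even the starting voltage is unstable).
    `run_stress_test(freq_ghz, vcore_mv, minutes)` is a placeholder that returns
    True when the run completes without errors or crashes."""
    vcore_mv = starting_vcore_mv
    last_stable_mv = None
    while run_stress_test(freq_ghz, vcore_mv, STRESS_MINUTES):
        last_stable_mv = vcore_mv
        vcore_mv -= VCORE_STEP_MV
    return last_stable_mv
```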

Fig. 5
Plot of processor frequency and power relationship. Automated LN2 distribution achieves better performance and extends the operational range of the stock chipset.
Fig. 6
Plot of performance per watt versus CPU frequency. Manual operation extends performance beyond the automated system but with extra operational overhead. The difference represents a margin of opportunity for improved system performance with future optimization of coolant distribution design both upstream of and within the cold plate.

In tests aimed at measuring LN2 consumption rates, a power draw of 160 watts was produced, and the weight of the LN2 supply tank was measured with a scale every five minutes over the course of twenty minutes. The change in weight over this time was used with the liquid density of the fluid to determine net coolant usage over time. For periodic instantaneous flow rate measurements, LabVIEW recorded the amount of time the solenoid was switched open. The mass change of coolant in the tank over that duration, along with the liquid density, was used to determine a time-averaged flow rate over that injection period. Similar to the overclocking procedures, the flow rate and consumption-based tests were repeated for five data sets, and the resulting standard deviation was used to create error bars.
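
A small helper of the kind described above is sketched below, converting the measured Dewar mass loss into a liquid-volume usage rate using the handbook density of saturated liquid nitrogen (≈0.807 g/mL at 1 atm); the density is a standard property value, not a measurement from this study, and the example numbers are illustrative.

```python
LN2_LIQUID_DENSITY_G_PER_ML = 0.807  # approximate saturated-liquid density at 1 atm


def coolant_usage_l_per_h(mass_loss_kg, elapsed_min):
    """Net coolant usage rate (liters of liquid per hour) from the Dewar mass loss
    recorded over the test interval."""
    volume_l = mass_loss_kg * 1000.0 / LN2_LIQUID_DENSITY_G_PER_ML / 1000.0
    return volume_l * 60.0 / elapsed_min


# Illustrative values: 1.2 kg lost over the 20 min test window.
print(coolant_usage_l_per_h(1.2, 20.0))  # ~4.5 L/h
```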

3 Experimental Results and Discussion

3.1 Overclocking Performance.

In this study, the relationship between minimum core voltage and core frequency was tested for this particular processor (Intel i9-9900KS), allowing a power-frequency curve to be established for each thermal solution tested. In testing, the processor was capable of running at a lower voltage when cooled cryogenically, which, in combination with the reduced current leakage resulting from the lowered operating temperatures, reduced power consumption compared to the air-cooled unit, as shown in Fig. 5 [26]. Within the range of 4.0–4.8 GHz, cooling from the LN2 pot allowed the processor to draw between 10.6% and 21.5% less power than the air-cooled solution, with the difference in power consumption growing as the frequency increases. In comparison, the automated liquid nitrogen solution performed worse, as it only reduced power consumption by 0%–10.6% compared to the air-cooled system, although the same trend of increased efficiency with increasing frequency is still apparent. This relative decrease in performance is likely due to a higher processor junction temperature under the automated solution, as the pot and the automated cold plate were both held at a plate temperature of −85 °C. At first glance, it may seem disappointing that the manual solution performed better than the automated solution. However, it must be noted that under the manual condition, there is an experienced overclocker constantly tending to the system, checking temperatures, managing settings, and pouring coolant as needed. This degree of human interaction is unsustainable at the data center scale, but future design work is targeted at programming in the conditions/settings/actions that an experienced user would implement in order to attain truly optimal, and in some cases record-setting, performance.

Schedule and cost were of utmost importance for this project, so the cold plate was manufactured with a powder that is known to print reliably for a wide variety of geometries, and the internal pin fin design was kept simple in order to prevent schedule-robbing metal printer failures during manufacture. Future work will design the cold plate out of much more thermally conductive copper, which is the same material used to fabricate the better-performing copper pot used in manual operation. Under these constraints, it is evident why the manual solution performs best, and the difference between the manual (circular markers in Fig. 5) and automated (diamond markers in Fig. 5) approaches introduces a target metric for improvement. Overclockers have been cooling energy-dense chips for decades using cryogenic fluids, but this is the first study that has sought to integrate this thermal management approach into a packaged solution for data centers. With a focus on packaging this approach into a manufacturable unit, software/architecture modifications were out of scope. However, close codesign efforts with software engineers are critical if this approach is to gain more traction in the industry. As an example, a 41% improvement in single-thread performance is possible with architectures optimized specifically for cryogenic cooling [34]. The study that optimized the architecture used an open pot for cryogenic cooling delivery, which corresponds to the current study's "manual" option. Therefore, it is clear that even further gains in performance can be had through efforts at improving the software for the current study's "automated" solution. The current study appears to be the first attempt at systematic analysis of a closed LN2 cold plate for high performance processors as, previously, hobbyists were simply pouring LN2 into an open-lid copper reservoir thermally attached to a processor, as was done in the architecture-focused study by Byun et al. [34]. Building this cold plate with additive manufacturing will also allow future iterations to fully exploit performance-enhancing features that are not possible with more conventional manufacturing techniques, specifically as it pertains to functional internal geometries that are hermetically sealed from the outside environment.

In addition to manual overclocked settings, each cooling solution was tested with stock settings, in which the processor determines its core voltage from a predetermined table based primarily on frequency. The air-cooled solution was not able to cool the processor sufficiently under this workload, resulting in thermal throttling: the processor reduced its frequency and voltage in order to lower power consumption until it was operating under the maximum junction temperature of 100 °C. Both cryogenic solutions stayed under 100 °C, allowing them to achieve the maximum boost frequency of 5.0 GHz. The pot-cooled processor drew 192 watts, and the automated system consumed 218 watts during stock-setting operation. These single-setting results are shown as individual data points in Fig. 5. As the manual and automated solutions are operating at the same frequency and voltage, the difference in power consumption is most likely a result of reduced current leakage in the better-cooled liquid nitrogen pot.

The performance of this processor can also be expressed as a PPW value, which compares data processed, in flops, to power consumption. This figure allows the efficiency of various overclock settings and cooling solutions to be compared, as the processor and workload are constant. As seen in Fig. 6, the PPW decreases as frequency increases because the growth in power consumption overwhelms the increase in operating frequency. This is driven by the superlinear increase in core voltage needed to maintain stability at higher frequencies, as found in testing. Given that dynamic power draw is a function of both the square of the voltage and the frequency, the power increase with frequency follows a superlinear trend, as described in Eq. (1) [35]. This is only a portion of the full power consumption, with the static component scaling linearly with voltage and the short-circuit power not increasing with voltage, as shown in Eqs. (2) and (3). This study does not use these equations in an attempt to predict power consumption or derive other constants, but merely to verify the trends seen:
$P_{\text{transition}} = \alpha \tfrac{1}{2} C V^{2} f$   (1)

$P_{\text{CPU}} = P_{\text{static}} + P_{\text{short-circuit}} + P_{\text{transition}}$   (2)

$P_{\text{CPU}} = (m\,V) + (\alpha\,E_{\text{short-circuit}}) + (\alpha \tfrac{1}{2} C V^{2} f)$   (3)

where $C$ is the effective switched capacitance and $m$ is the constant of proportionality for the static term.
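
To make the superlinear trend concrete, the sketch below evaluates the transition term of Eq. (1) at two hypothetical operating points; the activity factor, capacitance, and voltages are illustrative assumptions, not measured chip parameters.

```python
def transition_power_w(activity, capacitance_f, vcore_v, freq_hz):
    """Dynamic (transition) power from Eq. (1): P = alpha * (1/2) * C * V^2 * f."""
    return activity * 0.5 * capacitance_f * vcore_v ** 2 * freq_hz


# Hypothetical operating points: the 20% frequency increase requires a higher VCore,
# so the dynamic term grows by ~70%, i.e., much faster than frequency alone.
p_low = transition_power_w(0.2, 4.0e-8, 1.05, 4.0e9)   # ~17.6 W
p_high = transition_power_w(0.2, 4.0e-8, 1.25, 4.8e9)  # ~30.0 W
print(p_high / p_low)  # ~1.70
```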

A knock-on effect of the added voltage and power is that the operating temperature of the processor also increases with frequency, driving up the temperature-dependent leakage current portion of the static power draw as well. As seen in the power consumption plot of Fig. 5, the LN2 pot outperforms the air-cooled solution, while the automated system falls between the air-cooled and manual LN2 pot trends.

3.2 Thermal Performance.

Given that the temperature of the base of the cold plate is fixed, the factors governing the processor junction temperature are the conduction and interfacial thermal resistances between the die and the thermocouple, as well as the heat spreading capability of the cold plate. The solder thermal interface material between the die and the integrated heat spreader (IHS) is fixed, leaving the difference in performance between the cryogenic cooling solutions a result of the second layer of thermal interface material on top of the IHS and the heat spreading of the cold plate. The target temperature under cryogenic operation is −85 °C, and the processor temperature will vary based on the fixed, in-line thermal resistance of the cold plate/pot-to-chip interface and the heat dissipation. The junction-to-coldplate thermal resistance for the additive cold plate used in the automated liquid nitrogen system was 0.36 °C/W on average over all the frequencies, temperatures, and power draw scenarios examined, while that of the liquid nitrogen pot was 0.25 °C/W on average. This is most likely a result of the low thermal conductivity of the additively manufactured part, resulting in a larger thermal gradient across the cold plate and thus a more concentrated heat flux above the die. While the liquid nitrogen pot was used primarily as a comparison to current techniques for hobbyist manual cryogenic cooling, it was also used as an approximation of how a copper version of the additive cold plate would perform. This approximation can be made as both solutions are temperature controlled by a thermocouple at the base of the thermal solution, meaning fin geometry has little impact on performance. The only notable difference between the liquid nitrogen pot and a copper additive cold plate would be the interface surface, as the pot came nickel coated and lapped from the manufacturer. The difference in thermocouple location could also make a small difference in thermal performance, as the pot has an additional 1.5 mm of thickness between the base and the thermocouple. Accounting for this with Fourier's law of heat conduction yields a 0.004–0.022 °C/W resistance increase for the pot, depending on the assumed amount of heat spreading occurring in the IHS. The 40% increase in thermal resistance for the cold plate versus the high thermal mass copper pot results in higher operating temperatures during automatic operation. This leads to higher current leakage, a higher core voltage, and ultimately a higher power draw in the automated system, which decreases the system's PPW rating shown previously. Future design work and fabrication of a cold plate within the automated system that uses materials with better thermal performance, copper being one example, will bridge the gap between these two measured thermal resistances.
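
The two calculations referenced above reduce to the sketch below; the copper conductivity and spreading areas in the example are assumed values chosen only to show that the quoted 0.004–0.022 °C/W correction corresponds to spreading footprints between roughly die-sized and IHS-sized areas.

```python
def junction_to_plate_resistance_c_per_w(t_junction_c, t_plate_c, power_w):
    """Measured junction-to-coldplate thermal resistance (degC/W)."""
    return (t_junction_c - t_plate_c) / power_w


def fourier_resistance_c_per_w(thickness_m, conductivity_w_per_m_k, area_m2):
    """1D conduction resistance of the extra material between base and thermocouple."""
    return thickness_m / (conductivity_w_per_m_k * area_m2)


# Assumed copper conductivity (~390 W/m-K) and two bounding spreading areas:
print(fourier_resistance_c_per_w(0.0015, 390.0, 1.7e-4))  # ~0.023 degC/W (die-sized area)
print(fourier_resistance_c_per_w(0.0015, 390.0, 9.6e-4))  # ~0.004 degC/W (larger spread area)
```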

Temperature instabilities are issues that arise with steady two-phase systems [36,37] and pulsed single-phase systems alike [38,39]. This is compounded by the even greater junction-to-ambient temperature differences inherent to these types of coolants. The binary, solenoid-driven pulsed flow results in periodic temperature overshoots and undershoots that are a fundamental artifact of cooling a steady CPU power draw. However, with proper empirical/semi-empirical data input, a variable flow rate system could be calibrated to use the exact optimum flow rate for a given power draw. The ultimate aim in increasing thermal stability for this system is to reduce mechanical stress from repeated expansion and contraction. Overclocking requires that one operate within a given upper and lower temperature operating window. A large temperature swing can push the die temperature out of this window, resulting in a computer crash or "cold bug," both of which would result in unwanted downtime.
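
For reference, the binary control behavior described here amounts to a simple on/off decision about the setpoint, sketched below; the deadband value is an assumption for illustration, as the actual LabVIEW control logic is not detailed in this paper.

```python
def solenoid_command(plate_temp_c, valve_is_open, setpoint_c=-85.0, deadband_c=1.0):
    """On/off (bang-bang) valve decision about the cold plate setpoint.
    The deadband width is illustrative; the real control thresholds are not specified."""
    if plate_temp_c > setpoint_c + deadband_c:
        return True            # plate too warm: open the valve and inject LN2
    if plate_temp_c < setpoint_c - deadband_c:
        return False           # plate too cold: close the valve
    return valve_is_open       # inside the deadband: hold the current valve state
```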

The "cold bug" is a phenomenon in which electrical connections are disrupted at extremely low temperatures, severing data and power connections to the processor and thus causing the motherboard to not recognize it. The leading theory is that differential thermal contraction within the PCB, silicon die, and integrated heat spreader results in the PCB bowing, severing the solder connections to the die. To find an optimum flow rate under prescribed conditions for future empirical/semi-empirical optimization models, a series of tests was conducted at six flow rates with the same processor power consumption of 160 watts. The results are found in Fig. 7. Reducing the flow rate results in temperature overshoots of varying amplitude, as more time is required to develop flow to the heated surface and deposit the required amount of fluid in the cold plate. As the flow rate increases from the lowest tested, progressing from Figs. 7(a)–7(d), the amplitude of temperature oscillation increases. A large amount of fluid is able to build up before the thermocouple registers a temperature below the target. This likely results in pool boiling heat transfer exceeding power consumption until too little fluid is left in the cold plate to extract the requisite heat under steady power emission conditions. However, once the flow rate exceeds a certain point, shown in the 10.10 g/s data of Fig. 7(e), temperature stability is greatly increased. This is most likely a result of less vapor build-up in the cold plate and hoses, allowing small amounts of liquid nitrogen to be delivered more precisely and consistently. Higher fluid momentum will also break through any vapor build-up, a phenomenon seen in other two-phase cold plate applications [40]. While there is a reduction in the magnitude of the temperature swings under these higher flow rate conditions (Figs. 7(e) and 7(f)), this does result in a higher frequency of temperature variation. Future design work is needed to ensure that the increased cyclic nature of thermal expansion at these higher flow rates, albeit at lower maximum internal stress, is not enough to deleteriously affect long-term performance.

Fig. 7
Temperature stability versus flow rate plots (a)–(f). Low flow rates introduce large oscillations while these perturbations are mitigated with gentle increases in flow rate.

3.3 Thermal Stability and Coolant Consumption.

As both the liquid nitrogen pot and the automated system exhaust spent coolant to the ambient environment, the LN2 consumption rate is important for the economic viability of this approach. In testing, the automated system consumed significantly more LN2 than the pot used in the manual approach, as seen in Fig. 8. There are three possible causes for the extra consumption of the automated system with respect to the manual approach. First, and perhaps most significantly, in the automated system the cold plate removes all heat produced by ancillary components on the motherboard, as they are thermally connected to the cold plate via the dielectric fluid immersion. These ancillary components include the graphics card, memory modules, and voltage regulators. The power draw of these devices is not measured, as they are not included in the monitoring software for the processor, which was used to record applied power. Second, heat ingress from the ambient space, maintained at approximately 22 °C, is effectively driven into the cold plate by the large thermal potential difference between the ambient space and the internal cryogenic state maintained by the LN2 coolant. Currently, the cold plate is uninsulated. However, future work is directed at proper material selection and design of an insulating casing around the cold plate aimed at both preventing frost formation and thermally protecting the cryogenic coolant within the cold plate. Additionally, the vacuum-jacketed hoses and uninsulated fittings would also allow environmental heat ingress, thus increasing liquid nitrogen consumption. Third, the design of the cold plate allows liquid nitrogen to be expelled through the exhaust ports as new nitrogen vapor and liquid displaces it. In comparison, losses in the manual approach are almost negligible, as the high thermal mass of the large copper pot, weighing over 2.8 kg, makes it a very effective low temperature reservoir throughout the heat rejection process. The "theoretical" limit, indicated as a gray line in Fig. 8, is the predicted amount of coolant needed assuming that all of the energy transfer is via latent heat of vaporization. The nearly identical trends of this theoretical line and the copper pot data illustrate how much more effectively this idealized but impractically large reservoir manages the CPU heat than the cold plate used in the hands-off automated operation.
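
The gray theoretical line in Fig. 8 corresponds to a latent-heat-only energy balance; a minimal sketch is shown below using the handbook latent heat of vaporization of nitrogen (≈199 J/g at 1 atm), which is a standard property value rather than one reported in this study.

```python
LN2_LATENT_HEAT_J_PER_G = 199.0  # approximate latent heat of vaporization at 1 atm


def theoretical_ln2_rate_g_per_s(cpu_power_w):
    """Lower bound on coolant use: every joule of CPU heat absorbed as latent heat only,
    with no ambient ingress, sensible heating, or liquid carryover out the exhaust."""
    return cpu_power_w / LN2_LATENT_HEAT_J_PER_G


print(theoretical_ln2_rate_g_per_s(160.0))  # ~0.8 g/s at the 160 W test condition
```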

Fig. 8
Plot of nitrogen consumption versus CPU power consumption. Manual operation follows the theoretical curve of pure latent heat transfer, as the pot is maintained as a near-ideal thermal reservoir through consistent observation/maintenance.

As previously mentioned, an increased flow rate can increase temperature stability. Thus, tests were performed to determine whether higher flow rates resulted in increased consumption. This test addresses economic viability, as it is anticipated that a data center would either purchase LN2 coolant or manufacture it onsite. The mass flow rate was varied via adjustment of the upstream needle valve. The mass flow rate values shown on the abscissa of Fig. 9 indicate the flow rate being injected into the cold plate when the solenoid valve is open. Tests were conducted over several hours, and the total consumption over that time frame for each of these injection mass flow rate settings was collected. As shown in Fig. 9, there is no discernable trend in overall consumption with regard to injection mass flow rate. This comes from the fact that the flow is regulated to the temperature setpoint, so the valve is only open long enough to deliver the coolant required. For the automated approach, lower mass flow rate settings mandate that the solenoid valve be open longer, while higher mass flow rate settings drive faster open-closed cycling, delivering a similar total LN2 consumption over the duration of the test. If the temperature swings of a lower flow rate were deemed acceptable, then efficiency and system reliability gains could be realized through the reduced need for solenoid switching and the consequent reduction in performance-robbing upstream flashing that occurs each time the valve cycles. The expected life of the solenoid would also likely be increased, as it would cycle less frequently.
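
The duty-cycle argument above can be expressed directly; the required net consumption value in the example is hypothetical and serves only to show why total usage is roughly independent of the injection flow rate setting.

```python
def solenoid_duty_cycle(net_consumption_g_per_s, injection_rate_g_per_s):
    """Fraction of time the valve must be open so that the time-averaged delivery
    matches the net consumption demanded by the heat load."""
    return net_consumption_g_per_s / injection_rate_g_per_s


# Hypothetical net demand of 2 g/s: higher injection settings simply open the valve
# for a smaller fraction of the time (and cycle it more often).
print(solenoid_duty_cycle(2.0, 5.0))   # 0.40
print(solenoid_duty_cycle(2.0, 10.0))  # 0.20
```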

Fig. 9
Plot of nitrogen consumption versus flow rate. All flow rates, neglecting any harmful impacts from the temperature oscillations shown in Fig. 7, result in the same net coolant consumption over continuous operation.

4 Conclusion

Through the application of cryogenic cooling via liquid nitrogen, the authors were able to increase the electrical efficiency and frequency of a high performance processor. The implementation of solenoid-controlled flow in combination with an additively manufactured cold plate automated the cooling process traditionally done by hand. Despite the 40.8% higher thermal resistance of the cold plate compared to the liquid nitrogen pot used in manual operation, the automated system managed to reduce power consumption in the processor by up to 10.6% when compared to a commercially available air-cooled approach. The more labor-intensive manual approach increased this comparative power savings to 21.5%. For three separate cooling solutions (manual, automated, and air-cooled), the power-frequency relationship was established, allowing a user to evaluate the optimal overclock settings in regard to performance. The reduced power consumption of the cryogenic thermal solutions can be attributed to the reduced necessary core voltage and reduced current leakage. In order to keep the processor die within an acceptable temperature range, the flow rate must be optimized to reduce temperature swings. Temperature swings as high as ±11.5 °C were recorded at lower injection flow rates, but increasing the flow rate slightly was shown to reduce these oscillations to a much more manageable ±3.5 °C. Transient temperature recordings also show how vapor build-up in the cold plate can exacerbate the magnitude of these temperature swings. This shows that more work needs to be done on the injection method and spent coolant exhaust design to prevent thermally harmful agglomeration of relatively insulating vapor on the primary heated surface, as well as premature vapor formation through flashing and/or exposure to surfaces within the cold plate that are warm enough to drive boiling of these low-saturation-temperature cryogenic fluids. Testing then verified that instantaneous injection flow rate variation does not affect total time-averaged liquid nitrogen consumption over prolonged and steady CPU power draw. This suggests that a fixed computational density can be maintained in a data center regardless of the injection flow rate used, should the temperature oscillations inherent to lower injection flow rates be deemed acceptable. The limits of the prototype cold plate are demonstrated in the rate of liquid nitrogen consumption, as the automated system used 230%–280% more liquid nitrogen than the theoretical minimum amount based on the enthalpy of vaporization. However, future work can bridge this performance gap by programming into the flow control scheme mechanisms such as proportional-integral-derivative control or machine learning to replicate the behaviors of a human operator pouring LN2 manually. Ensuring that vapor does not harmfully agglomerate in the cold plate is also critical, so future designs must address the spent coolant exhaust mechanisms for optimal operation. The implementation of cryogenic cooling at the server level has the potential to decrease processor power consumption while increasing computational density, resulting in a more compact and efficient data center.

Nomenclature

     
    E = energy, J

    f = CPU core frequency, Hz

    P = power consumption, W

    V = CPU core voltage, V

Greek Symbols

    α = activity factor

Subscripts

    CPU = power consumption of all cores in a package

    short-circuit = energy consumption of a single short-circuit event, or power consumption derived from a series of short circuits

    static = power consumption component not related to switching

    transition = power component resulting from capacitors charging and discharging

References

1. Ellsworth, M. J., and Iyengar, M. K., 2009, "Energy Efficiency Analyses and Comparison of Air and Water Cooled High Performance Servers," ASME Paper No. InterPACK2009-89248. 10.1115/InterPACK2009-89248
2. Tuma, P. E., 2010, "The Merits of Open Bath Immersion Cooling of Datacom Equipment," 26th Annual IEEE Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), Santa Clara, CA, Feb. 21–25, pp. 123–131. 10.1109/STHERM.2010.5444305
3. Avelar, V., Azevedo, D., and French, A., The Green Grid, 2007, "The Green Grid Power Efficiency Metrics: PUE and DCiE," Beaverton, OR, accessed July 24, 2020, www.thegreengrid.org
4. Eiland, R., Fernandes, J., Vallejo, M., Agonafer, D., and Mulay, V., 2014, "Flow Rate and Inlet Temperature Considerations for Direct Immersion of a Single Server in Mineral Oil," Proceedings of the IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems, Orlando, FL, May 27–30, pp. 706–714. 10.1109/ITHERM.2014.6892350
5. Xu, H., Feng, C., and Li, B., 2013, "Temperature Aware Workload Management in Geo-Distributed Datacenters," Proceedings of the 10th International Conference on Autonomic Computing, San Jose, CA, June 26–28, pp. 303–314. https://www.usenix.org/system/files/conference/icac13/icac13_xu_1.pdf
6. Dou, H., and Qi, Y., 2017, "An Online Electricity Cost Budgeting Algorithm for Maximizing Green Energy Usage Across Data Centers," Front. Comput. Sci., 11(4), pp. 661–674. 10.1007/s11704-016-5420-y
7. Lei, N., and Masanet, E., 2020, "Statistical Analysis for Predicting Location-Specific Data Center PUE and Its Improvement Potential," Energy, 201, p. 117556. 10.1016/j.energy.2020.117556
8. Alkharabsheh, S., Fernandes, J., Gebrehiwot, B., Agonafer, D., Ghose, K., Ortega, A., Joshi, Y., and Sammakia, B., 2015, "A Brief Overview of Recent Developments in Thermal Management in Data Centers," ASME J. Electron. Packag., 137(4), p. 040801. 10.1115/1.4031326
9. Turkmen, I., Mercan, C. A., and Erden, H. S., 2020, "Experimental and Computational Investigations of the Thermal Environment in a Small Operational Data Center for Potential Energy Efficiency Improvements," ASME J. Electron. Packag., 142(3), p. 031116. 10.1115/1.4047845
10. Kasukurthy, R., Rachakonda, A., and Agonafer, D., 2021, "Design and Optimization of Control Strategy to Reduce Pumping Power in Dynamic Liquid Cooling," ASME J. Electron. Packag., 143(3), p. 031001. 10.1115/1.4049018
11. Demetriou, D. W., Kamath, V., and Mahaney, H., 2016, "A Holistic Evaluation of Data Center Water Cooling Total Cost of Ownership," ASME J. Electron. Packag., 138(1), p. 010912. 10.1115/1.4032494
12. Choo, K., Galante, R. M., and Ohadi, M. M., 2014, "Energy Consumption Analysis of a Medium-Size Primary Data Center in an Academic Campus," Energy Build., 76, pp. 414–421. 10.1016/j.enbuild.2014.02.042
13. Petrongolo, J., Nemati, K., and Fouladi, K., 2020, "Simulation-Based Assessment of Performance Indicator for Data Center Cooling Optimization," ASME J. Therm. Sci. Eng. Appl., 12(5), p. 051009. 10.1115/1.4045962
14. Jin, C., Bai, X., and Yang, C., 2019, "Effects of Airflow on the Thermal Environments and Energy Efficiency in Raised-Floor Data Centers: A Review," Sci. Total Environ., 695, p. 133801. 10.1016/j.scitotenv.2019.133801
15. Demetriou, D. W., and Khalifa, H. E., 2012, "Optimization of Enclosed Aisle Data Centers Using Bypass Recirculation," ASME J. Electron. Packag., 134(2), p. 020904. 10.1115/1.4005907
16. Song, Z., 2016, "Numerical Cooling Performance Evaluation of Fan-Assisted Perforations in a Raised-Floor Data Center," Int. J. Heat Mass Transfer, 95, pp. 833–842. 10.1016/j.ijheatmasstransfer.2015.12.060
17. Cataldo, F., Amalfi, R. L., Marcinichen, J. B., and Thome, J. R., 2020, "Implementation of Passive Two-Phase Cooling to an Entire Server Rack," Proceedings of the 19th Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), Orlando, FL, July 21–23, pp. 396–401. 10.1109/ITherm45881.2020.9190327
18. Huang, H., Quan, G., and Fan, J., 2010, "Leakage Temperature Dependency Modelling in System Level Analysis," Proceedings of the 11th International Symposium on Quality Electronic Design (ISQED), San Jose, CA, Mar. 22–24, pp. 447–452. 10.1109/ISQED.2010.5450539
19. Gough, C., Steiner, I., and Saunders, W., 2015, "CPU Power Management," Energy Efficient Servers, Apress, Berkeley, CA, pp. 21–70. 10.1007/978-1-4302-6638-9
20. Sadiqbatcha, S., Zhao, H., Amrouch, H., Henkel, J., and Tan, S. X.-D., 2019, "Hot Spot Identification and System Parameterized Thermal Modelling for Multi-Core Processors Through Infrared Thermal Imaging," Proceedings of the Design, Automation, and Test in Europe Conference and Exhibition, Florence, Italy, Mar. 25–29, pp. 48–53. 10.23919/DATE.2019.8714918
21. Jones, J., 2000, Contact Mechanics, Cambridge University Press, Cambridge, UK, Chap. 6.
22. Aung, W. W., 2014, "Speedup Thin Clients With Overclocking Method for Resource Intensive Tasks," IEEE International Conference on MOOC, Innovation and Technology in Education (MITE), Patiala, India, Dec. 19–20, pp. 42–46. 10.1109/MITE.2014.7020238
23. Gizopoulos, D., Papadimitriou, G., Chatzidimitriou, A., Reddi, V. J., Salami, B., Unsal, O. S., Kestelman, A. C., and Leng, J., 2019, "Modern Hardware Margins: CPUs, GPUs, FPGAs Recent System-Level Studies," IEEE 25th International Symposium on On-Line Testing and Robust System Design (IOLTS), Rhodes, Greece, July 1–3, pp. 129–134. 10.1109/IOLTS.2019.8854386
24. Koutsovasilis, P., Parasyris, K., Antonopoulos, C. D., Bellas, N., and Lalis, S., 2020, "Dynamic Undervolting to Improve Energy Efficiency on Multicore X86 CPUs," IEEE Trans. Parallel Distrib. Syst., 31(12), pp. 2851–2864. 10.1109/TPDS.2020.3004383
25. Kim, H. S., Vijaykrishnan, N., Kandemir, M., and Irwin, M. J., 2003, "Adapting Instruction Level Parallelism for Optimizing Leakage in VLIW Architectures," SIGPLAN Not., 38(7), pp. 275–283. 10.1145/780731.780770
26. Fallah, F., and Pedram, M., 2005, "Standby and Active Leakage Current Control and Minimization in CMOS VLSI Circuits," IEICE Trans. Electron., E88-C(4), pp. 509–519. 10.1093/ietele/e88-c.4.509
27. WikiChip, 2019, "Thermal Velocity Boost (TVB) – Intel," accessed July 26, 2021, https://en.wikichip.org/wiki/intel/thermal_velocity_boost
28. Patterson, M. K., 2008, "The Effect of Data Center Temperature on Energy Efficiency," 11th Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems, Orlando, FL, May 28–31, pp. 1167–1174. 10.1109/ITHERM.2008.4544393
29. Singh, N., and Dhir, V., 2019, "Competitive Analysis of Energy Aware VM Migration Algorithms in Cloud Computing," Int. J. Develop. Admin. Res., 2(2), pp. 36–49. https://so02.tcithaijo.org/index.php/ijdar/article/view/247223
30. Chun, B.-G., Iannaccone, G., Iannaccone, G., Katz, R., Lee, G., and Niccolini, L., 2010, "An Energy Case for Hybrid Datacenters," ACM SIGOPS Oper. Syst. Rev., 44(1), pp. 76–80. 10.1145/1740390.1740408
31. Integral Group, 2013, "Energy Efficiency Baselines for Data Centers," Statewide Customized New Construction and Customized Retrofit Incentive Programs, Integral Group, Oakland, CA.
32. Campbell, L., Chu, R., David, M., Ellsworth, M., Iyengar, M., and Simons, R., 2011, "Multi-Fluid, Two-Phase Immersion-Cooling of Electronic Component(s)," U.S. Patent No. US8619425B2.
33. Woltman, G., and Kurowski, S., 2008, "GIMPS, the Great Internet Mersenne Prime Search," accessed Aug. 24, 2020, https://www.mersenne.org/
34. Byun, I., Min, D., Lee, G.-h., Na, S., and Kim, J., 2020, "Cryocore: A Fast and Dense Processor Architecture for Cryogenic Computing," Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), Valencia, Spain, May 30–June 3, pp. 335–348. 10.1109/ISCA45697.2020.00037
35. Travers, M., CPU Power Consumption Experiments and Results Analysis of Intel i7-4820K, Newcastle University, Newcastle upon Tyne, UK.
36. Wang, S. L., Chen, C. A., Lin, Y. L., and Lin, T. F., 2012, "Transient Oscillatory Saturated Flow Boiling Heat Transfer and Associated Bubble Characteristics of FC-72 Over a Small Heated Plate Due to Heat Flux Oscillation," Int. J. Heat Mass Transfer, 55(4), pp. 864–873. 10.1016/j.ijheatmasstransfer.2011.10.022
37. Khalili, S., Rangarajan, S., Sammakia, B., and Gektin, V., 2020, "An Experimental Investigation on the Fluid Distribution in a Two-Phase Cooled Rack Under Steady and Transient Information Technology Loads," ASME J. Electron. Packag., 142(4), p. 041002. 10.1115/1.4048180
38. Mehta, B., and Khandekar, S., 2015, "Local Experimental Heat Transfer of Single-Phase Pulsating Laminar Flow in a Square Mini-Channel," Int. J. Therm. Sci., 91, pp. 157–166. 10.1016/j.ijthermalsci.2015.01.008
39. Chandratilleke, T. T., Narayanaswamy, R., and Jagannatha, D., 2011, "Thermal Performance Evaluation of a Synthetic Jet Heat Sink for Electronic Cooling," IEEE 13th Electronics Packaging Technology Conference, Singapore, Dec. 7–9, pp. 79–83. 10.1109/EPTC.2011.6184390
40. Ohadi, M. M., Dessiatoun, S., Choo, V., Pecht, K. M., and Lawler, J. V., 2012, "A Comparison Analysis of Air, Liquid, and Two-Phase Cooling of Data Centers," 28th Annual IEEE Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), San Jose, CA, Mar. 18–22, pp. 58–63. 10.1109/STHERM.2012.6188826