The lifeblood of any financial institution is data. How that data is used, and how fast it is put to use, are the keys to survival in an industry that measures performance in sub-milliseconds and even nanoseconds. Real-time, algorithm-driven trading, once a ‘flying car’ futuristic fantasy, is now standard. Artificial intelligence (AI), including machine learning (ML) and even deep learning (DL), is now being unleashed on problems ranging from commodity futures prediction to fraud detection. The key to all of it is the underlying technologies and their inherent complexities.
Finance organizations are increasingly dependent on accelerated compute infrastructure to extend first-mover advantages while balancing new sustainability requirements. It is essential for these firms to optimize across the entire stack – from server hardware and components such as overclocked processors, memory latency, and adapter performance, to hardware accelerators such as FPGAs and GPUs, and newer options such as DPUs (data processing units) that further offload processors and provide additional security checks.
The Heat Dilemma
Tech CEOs like to debate the status of “Moore’s law” and when it will “end,” but they are still working at full speed to keep it going. The goal has been, and for the foreseeable future will remain, to cram as much data-manipulating componentry into as small a space as possible. More components mean more electricity, which means more heat. Financial firms must contend with rising server and data center power requirements and temperatures, driven by dramatic increases in the power draw of nearly all system components – from processors to memory to storage to accelerators.
For example, in 2014, the average two-socket server drew 400 watts of power; today that average is pushing 900 watts: more than 2X in less than a decade. Accelerators, meanwhile, are now approaching 800 watts each. In a typical air-cooled, hot-aisle/cold-aisle data center, that means more energy spent chilling the air going into the system, more energy to drive the fans drawing that air through the system, and more hot discharge air to deal with on the back end. Fans are not data-manipulating componentry. Customers will therefore have to either compromise on performance or density (to accommodate bigger fans and heat sinks) or adopt some form of alternative cooling.
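The growth trend above can be sketched as simple arithmetic. The figures are those cited in the text (400 watts in 2014, roughly 900 watts today); the nine-year window is an assumption standing in for "less than a decade."

```python
# Illustrative sketch: compound annual growth of average two-socket
# server power draw, using the figures cited above.
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, as a fraction."""
    return (end / start) ** (1 / years) - 1

watts_2014 = 400.0   # average two-socket server, 2014 (per the text)
watts_now = 900.0    # average today (per the text)
years = 9            # assumption: roughly a decade

rate = cagr(watts_2014, watts_now, years)
print(f"Growth: {watts_now / watts_2014:.2f}x overall, "
      f"~{rate * 100:.1f}% per year")
```

At roughly 9% compounded annually, the kilowatt server discussed below is not a distant projection but a near-term arrival.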
Liquid Cooling Solutions
Alternative cooling means using liquid to remove heat from components. Because liquid can absorb far more heat than air per unit volume, it is much more efficient. There are basically four types of liquid cooling:
- Rack level – using a rear-door heat exchanger to absorb heat from the hot air discharged at the back of the rack.
- Liquid assisted – using liquid in heat sinks or exchangers inside an otherwise air-cooled system to absorb heat coming off specific components, mostly the CPUs or GPUs. This is considered “closed loop” in that no external plumbing is needed: heat is removed via internal radiators at the rear of the server, and coolant is circulated via redundant internal pumps.
- Direct – liquid is used in lieu of air to remove heat from all the system components. This is “open loop” in that the liquid is circulated to a CDU (coolant distribution unit) for in-row cooling, or to facility water loops for larger-scale cooling requirements.
- Immersive or immersion – literally submerging the system in a bath of dielectric (non-conductive) liquid.
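The efficiency claim behind all four approaches comes down to thermodynamics: water carries vastly more heat per unit volume than air. A back-of-the-envelope comparison, using standard textbook values for specific heat and density at room temperature, makes the gap concrete.

```python
# Why liquid? Volumetric heat capacity (J per cubic metre per kelvin)
# of air vs. water, using standard room-temperature textbook values.
AIR = {"specific_heat_j_per_kg_k": 1005.0, "density_kg_per_m3": 1.2}
WATER = {"specific_heat_j_per_kg_k": 4186.0, "density_kg_per_m3": 997.0}

def volumetric_heat_capacity(fluid: dict) -> float:
    """Heat absorbed per cubic metre per degree of temperature rise."""
    return fluid["specific_heat_j_per_kg_k"] * fluid["density_kg_per_m3"]

ratio = volumetric_heat_capacity(WATER) / volumetric_heat_capacity(AIR)
print(f"Water absorbs ~{ratio:,.0f}x more heat per unit volume than air")
```

A ratio in the thousands is why a thin coolant loop can replace the torrent of chilled air that fans would otherwise have to move through the chassis.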
There are advantages and disadvantages to each solution, but they all offer an alternative to straight air cooling. The fact that a standard two-socket server will likely hit a kilowatt of drawn power within the next couple of years should give financial institutions pause. Add in GPUs, now approaching 800 watts each, plus increased power requirements for memory and flash storage, and financial customers are going to make a liquid choice at some point soon rather than give up performance or density.
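To see how quickly these figures compound at rack scale, consider a hypothetical estimate built from the numbers above (a 1 kW server, 800 W per GPU). The rack configuration and the PUE (power usage effectiveness) values are illustrative assumptions, not a vendor specification; a PUE near 1.5 is commonly cited for air-cooled facilities, while direct liquid cooling can approach 1.1.

```python
# Hypothetical rack-power estimate using the figures cited above.
# Configuration and PUE values are illustrative assumptions.
SERVER_BASE_W = 1000   # projected two-socket server draw (per the text)
GPU_W = 800            # per accelerator (per the text)

def rack_power_kw(servers: int, gpus_per_server: int, pue: float) -> float:
    """Facility power in kW for one rack, including cooling overhead.

    PUE (power usage effectiveness) = total facility power / IT power.
    """
    it_load_w = servers * (SERVER_BASE_W + gpus_per_server * GPU_W)
    return it_load_w * pue / 1000

# 10 servers with 4 GPUs each: 42 kW of IT load before cooling overhead.
print(f"Air-cooled:    {rack_power_kw(10, 4, pue=1.5):.1f} kW")
print(f"Liquid-cooled: {rack_power_kw(10, 4, pue=1.1):.1f} kW")
```

The delta between the two lines is pure cooling overhead – energy that does no data manipulation at all, which is precisely the argument the section above is making.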
Acceleration will lead to new breakthroughs in areas like trading and fraud detection, and to more connected, globalized systems able to incorporate input from disparate sources on different continents – systems that will know how overnight rainfall in Colombia impacts coffee futures the next morning, or how current traffic on certain dark web sites correlates with greater credit card fraud in different countries, and when.
Today, CPU and GPU vendors recognize the value of liquid cooling, releasing some products that are designated as liquid-cooled only. As fintech professionals make career-defining calls on which technologies will yield the most compelling, actionable data, they will need to continue to balance system cost, performance, and density. Sustainability adds another major dimension: many corporations are requiring their CIOs to show how they can become more efficient and “greener.” This includes becoming carbon neutral, or even carbon negative, by putting the “waste” heat generated to other uses, using recycled materials in products and eco-friendly packaging, adopting carbon offsets, and supporting the circular economy across asset lifecycles – from design and deployment through secure retirement and eco-friendly recycling.
The one constant is change. As technology evolves and new ways to increase performance emerge, they can be balanced with new sustainability initiatives that offset the increased power and heat requirements – enabling smarter computing from the data center to the edge and in the cloud.