What Is A GPU Cluster? And What Is It Good For?

March 5, 2011

Computer clusters have been in use since the early 1980s, mainly for scientific tasks. Today, clusters are increasingly popular among businesses that need significant computing capacity to run their operations and R&D activities. With the growing importance of the Internet as a ubiquitous medium for service and product market presence, cluster usage is rapidly shifting from purely scientific and technological environments toward ensuring an uninterrupted online business presence characterized by high availability and fast content delivery.

The desire for more computing horsepower and greater reliability from a set of relatively low-cost computers has given rise to a variety of architectures and configurations in cluster computing. A cluster may be a simple two-node system interconnecting just two personal computers, or it may be a truly fast supercomputer.

Arguably, modern computer clusters fall into one of two large groups:

High-performance clusters (HPC) are used for computation-intensive purposes with a relatively low number of I/O-oriented operations. For instance, HPCs may be employed to support computational simulations in areas such as finance, weather and seismic activity. Oil and gas companies such as Schlumberger, Chevron, TOTAL, Petrobras and Repsol use HPC solutions to analyze geological data and locate areas with abundant energy resources.

Load-balancing clusters are configurations in which the cluster is designed to share the computational workload among different nodes to provide better overall performance. For example, to optimize the overall response time of a web server or database cluster, the system may assign different queries to different nodes so that the overall load is balanced.

However, approaches to load balancing may differ significantly according to application requirements. For example, an HPC solution used for scientific computations would balance load with different algorithms than a web-server cluster, which may use a simple round-robin method that assigns each new request to the next available node.
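The round-robin method mentioned above can be illustrated with a minimal Python sketch. The class name `RoundRobinBalancer`, its methods and the node names are hypothetical and chosen only for illustration; a production load balancer would also track health checks, weights and connection counts.

```python
class RoundRobinBalancer:
    """Assign each new request to the next available node in a fixed rotation.

    Illustrative sketch only: names and behaviour are assumptions,
    not the API of any real load balancer.
    """

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._down = set()   # nodes currently marked unavailable
        self._next = 0       # index of the node to try first

    def mark_down(self, node):
        self._down.add(node)

    def mark_up(self, node):
        self._down.discard(node)

    def assign(self):
        # Try each node at most once, starting where the last call stopped.
        for _ in range(len(self._nodes)):
            node = self._nodes[self._next]
            self._next = (self._next + 1) % len(self._nodes)
            if node not in self._down:
                return node
        raise RuntimeError("no node available")


balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
first_four = [balancer.assign() for _ in range(4)]
```

Here `first_four` cycles through the three nodes and wraps around to the first one. Marking a node as down makes the rotation skip it until it is marked up again.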


Recently introduced to the market, GPU-based clusters also provide significant computing horsepower for HPCs. Each node of such a cluster has one or more general-purpose GPUs on board, each consisting of several hundred computing cores that serve as the main computing devices. Today’s GPUs amount to much more than high-performance graphics engines. In fact, in cluster solutions they are not connected to computer displays at all and do not even have video-out capabilities. Instead, the cluster software can load general-purpose engineering, geological, scientific and graphical computing tasks onto those cores and run them in parallel. In some cases this approach can speed up computation several times over compared with CPU-based computing. Tasks offloaded onto GPUs include financial calculations, atmospheric modeling, fluid dynamics, statistical data analytics and large-scale high-definition graphics rendering.

Since general-purpose GPUs consist of many hundreds of cores (compared with several cores in CPUs), they provide extreme computing power for tasks that can be split into small chunks and executed in parallel. A single GPU can provide peak performance of 3000 gigaflops (billions of floating-point operations per second). In a cluster stack, this power combines with the computational resources of the general-purpose CPUs in each node: the main part of the application is executed by the CPUs, and the compute-intensive parts run in parallel on the GPUs. The total number of nodes in such clusters can reach several thousand.
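The division of labor described above (the main program prepares the work, the intensive part runs on many chunks at once, and the results are combined) can be sketched in Python. This is only a structural illustration: a thread pool stands in for the GPU cores, so it shows the pattern rather than any real speedup. The function names are hypothetical. On an actual GPU cluster the intensive part would be written as a GPU kernel (for example in CUDA or OpenCL).

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Cut the input into independent pieces that can be processed separately."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def intensive_part(chunk):
    # Stand-in for the compute-heavy kernel that would run on GPU cores.
    return sum(x * x for x in chunk)


def run(data, chunk_size=4, workers=4):
    # "CPU" side: prepare the work.
    chunks = split_into_chunks(data, chunk_size)
    # Parallel side: every chunk is processed independently of the others.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(intensive_part, chunks))
    # "CPU" side: combine the partial results.
    return sum(partial_results)


total = run(list(range(10)))
```

The pattern works only because each chunk can be processed without knowing anything about the others. Tasks with heavy dependencies between steps do not benefit in the same way.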

Modern computer clusters may be configured for different purposes, ranging from general-purpose business needs such as web-service support to computation-intensive scientific calculations. In either case, corporate-level clusters by and large employ a high-availability approach: nodes either execute the same tasks in parallel (hot standby), or stay connected to the cluster in a ready state so that tasks can be started on them should part of the cluster become unavailable (warm standby).
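The warm-standby idea can be illustrated with a small Python sketch. The class `WarmStandbyCluster`, its methods and the node names are hypothetical. Real high-availability software also has to detect failures, avoid split-brain situations and restore application state, none of which is modeled here.

```python
class WarmStandbyCluster:
    """Toy model of warm standby: idle nodes wait to take over the active node's tasks.

    Illustrative sketch only; not the behaviour of any specific cluster product.
    """

    def __init__(self, active, standbys):
        self.active = active
        self.standbys = list(standbys)   # connected and ready, but idle
        self.running = {}                # task name -> node running it

    def start(self, task):
        self.running[task] = self.active

    def fail(self, node):
        """Simulate the loss of a node and fail over if it was the active one."""
        if node in self.standbys:
            self.standbys.remove(node)
            return
        if node != self.active:
            return
        if not self.standbys:
            raise RuntimeError("no standby node left")
        # Promote the first standby and restart every task on it.
        self.active = self.standbys.pop(0)
        for task in self.running:
            self.running[task] = self.active


cluster = WarmStandbyCluster("node-1", ["node-2", "node-3"])
cluster.start("web-frontend")
cluster.fail("node-1")
```

After the simulated failure, `cluster.active` is `node-2` and the task has been moved to it. In a hot-standby setup the second node would already be executing the same task, so no restart step would be needed.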

Today, customers can choose from, among others, solutions from the leaders of the computer world:

Microsoft Windows Compute Cluster Server 2003, based on the Windows Server platform, provides features for high-performance computing such as the Job Scheduler, the MS-MPI library and extensive management tools.

The Linux world provides a variety of cluster software. For application clustering, there are Beowulf, distcc and MPICH. Linux Virtual Server and Linux-HA are director-based clusters that allow incoming requests for services to be distributed among multiple cluster nodes. OpenSSI, openMosix and Kerrighed are full-blown clusters integrated into the Linux kernel; they are based on single-system-image implementations and provide automatic process migration among homogeneous nodes.

Recent implementations of clustered DBMSs show nearly linear growth in performance as the number of nodes increases.


The Weta cluster, based in New Zealand and used to render the movie “Avatar” (as well as “Lord of the Rings”, “Fantastic Four”, “X-Men”, “I, Robot” and many others), consists of more than 4,000 HP BL2x220c blade servers amounting to 35,000 computing cores in total, has a 2-petabyte disk array, and runs Ubuntu Linux.

The SINOPEC Shanghai Offshore Petroleum Company uses an HPC solution from Bright Computing that comprises 42 IBM and Chinese-made Inspur servers containing 84 Intel CPUs and 16 NVIDIA GPUs.

Chevron Corp. has been using cluster computers since 2000. To date, even though it processes up to 1.5 terabytes of data daily, Chevron still manages its data on this setup without needing a mainframe.

Some relevant hardware: