by Matt Harvey
After a long wait, the greatly anticipated Maxwell GPU from NVIDIA has finally arrived in the form of the GeForce GTX980, to rave reviews from the gaming world, where it has been acclaimed as the new king of the performance hill.
At Acellera, we’re always tracking the cutting edge of technology to deliver the best systems for molecular dynamics simulation, so we’ve been hard at work putting these new cards through their paces. Before we see how they perform, let’s have a quick look at what’s changed.
What’s new? Maxwell GPUs: the new state of the art for molecular dynamics simulations
NVIDIA’s previous generation GPU, named Kepler, has been our workhorse for almost three years now, first in the form of the GK104 silicon, and then its big brother the GK110. Both devices share a similar design, built around a block of 192 processing elements called a “Streaming Multiprocessor”, or SM for short.
Manufactured on a 28nm process, the GK110 has 15 SMs, although it’s only with the very latest Tesla K40 and GeForce GTX780Ti that we have seen products with all of those SMs enabled. The GTX780 systems we have been shipping this past year have had only 12 SMs activated.
Maxwell GPU vs Kepler: Main differences
Maxwell’s SM is a refinement of Kepler’s, reducing the number of processing cores by a third, from 192 to 128, but incorporating additional design improvements. NVIDIA claims that, as a result, the real-world performance of each SM is reduced by only about 10% relative to Kepler. The new Maxwell processor, GM204, is still manufactured on the same 28nm process as GK110 rather than the anticipated 20nm process. Nevertheless, the smaller SM and the intervening refinements to the manufacturing process mean that GM204 can run at higher clock frequencies and contain more SMs than Kepler (16 versus 15) on a die about 40% smaller.
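To make the SM arithmetic concrete, here is a minimal sketch that multiplies out the SM counts and cores per SM quoted above. The dictionary layout and the function name are illustrative choices, not anything from NVIDIA or ACEMD.

```python
# Core counts implied by the SM figures quoted in the text.
# The data layout and names below are illustrative only.
GPUS = {
    "GK110 (full, 15 SMs)": {"sms": 15, "cores_per_sm": 192},
    "GK110 (GTX780, 12 SMs)": {"sms": 12, "cores_per_sm": 192},
    "GM204 (GTX980, 16 SMs)": {"sms": 16, "cores_per_sm": 128},
}


def total_cores(gpu):
    """Total processing cores = number of SMs x cores per SM."""
    return gpu["sms"] * gpu["cores_per_sm"]


def core_counts():
    """Return a mapping of GPU label to total core count."""
    return {name: total_cores(spec) for name, spec in GPUS.items()}


if __name__ == "__main__":
    for name, cores in core_counts().items():
        print(f"{name}: {cores} cores")
```

The GTX980 therefore has fewer cores in total than either Kepler part, so the speed-up reported below has to come from the per-SM design improvements and the higher clock frequencies rather than from raw core count.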
Maxwell GPU Performance in MD Simulation: Faster, stable and more efficient
So how does the Maxwell-based GTX980 fare when running ACEMD, our flagship molecular dynamics code?
Over the last year, we have been selling GTX780-based systems. On the popular dihydrofolate reductase benchmark system of 23,500 atoms, we saw single-GPU performance of around 210ns/day.
Running the same test on a GTX980, with no other performance optimisations, yields an impressive rate of 280ns/day, around 30% faster. On a Metrocubo equipped with 4 GTX980s, that’s over 1 microsecond per day of MD sampling. If you prefer maximum performance over throughput, a two-GPU run can achieve 380ns/day, a new performance record!
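For readers who want to check the numbers, the short sketch below reproduces the arithmetic behind these figures. It assumes that the four GPUs in a Metrocubo each run an independent simulation at the single-GPU rate (the throughput case), and the variable and function names are our own.

```python
# Benchmark figures quoted in the text, in ns/day.
GTX780_SINGLE = 210.0
GTX980_SINGLE = 280.0
GTX980_TWO_GPU = 380.0


def speedup(new_rate, old_rate):
    """Ratio of new to old simulation rate."""
    return new_rate / old_rate


def aggregate_throughput(single_rate, n_gpus):
    """Total sampling per day, assuming one independent run per GPU."""
    return single_rate * n_gpus


def parallel_efficiency(multi_rate, single_rate, n_gpus):
    """Fraction of ideal linear scaling achieved by one multi-GPU run."""
    return multi_rate / (single_rate * n_gpus)


if __name__ == "__main__":
    print(f"GTX980 vs GTX780: {speedup(GTX980_SINGLE, GTX780_SINGLE):.2f}x")
    print(f"4 x GTX980: {aggregate_throughput(GTX980_SINGLE, 4) / 1000:.2f} us/day")
    print(f"2-GPU efficiency: {parallel_efficiency(GTX980_TWO_GPU, GTX980_SINGLE, 2):.0%}")
```

The efficiency figure shows the trade-off mentioned above: a two-GPU run finishes a single trajectory sooner, but two independent single-GPU runs produce more total sampling per day.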
Benchmarking conditions: ECC off. X79 chipset. CUDA4.2 and ACEMD ver 2500 or greater. Periodic boundary conditions, 9 Å cutoff, PME long-range electrostatic grid size 1, hydrogen mass repartitioning, rigid bonds, Langevin thermostat, time step 4 fs. Note: other codes run benchmarks with a smaller cutoff or fewer atoms. Performance as of October 8th 2014. See the ACEMD page for the latest results, and to run a benchmark simulation with your system.
And that’s not all: compared to Kepler, the new GPU is much more power-efficient. Measured at the wall, a 4-GPU E3-based Metrocubo system running at full tilt draws almost 200W less, making it even quieter and cooler than before.
It’s quite remarkable that such an improvement has been made without moving to a newer manufacturing node, and makes the future 20nm parts even more tantalising.
GPU hardware for MD simulations available now with Maxwell GPUs
All in all, the new Maxwell has passed its tests with flying colours and we’re very pleased to announce that we are shipping Maxwell-based systems to customers now.
As usual, feel free to request a test drive. We will be more than happy to make some time available on one of our machines. Maxwell is already available for testing.
Also, feel free to request a quote.