
U.S. institution recaptures title with 17.59 petaflop showing

NVIDIA Corp. (NVDA) is bringing out the big guns in the GPU market, pushing special Kepler "accelerator" hardware -- the big brother to NVIDIA's market-leading consumer gaming hardware.  Targeted at large supercomputer deployments, the Tesla K20X accelerator offers 3.95 teraflops of peak single-precision floating point performance and 1.17 teraflops of peak double-precision performance.
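Those peak numbers follow from the usual cores-times-clock arithmetic; a minimal sketch, assuming NVIDIA's published K20X figures of 2,688 CUDA cores at roughly 732 MHz and the standard one fused multiply-add (2 flops) per core per cycle:

```python
# Rough peak-throughput estimate for a GPU: cores x clock x flops-per-cycle.
# The K20X figures below are NVIDIA's published specs; treat this as a
# sanity check, not an official derivation.

def peak_gflops(cores, clock_ghz, flops_per_cycle=2):
    """Theoretical peak GFLOPS, assuming one FMA (2 flops) per core per cycle."""
    return cores * clock_ghz * flops_per_cycle

# Tesla K20X: 2,688 CUDA cores at ~732 MHz
sp_peak = peak_gflops(2688, 0.732)
print(f"single-precision peak: {sp_peak / 1000:.2f} TFLOPS")  # ~3.94 TFLOPS
```

The result lands within rounding of the 3.95 teraflops quoted above; double precision runs at a fraction of that rate on Kepler, which is why the double-precision figure is so much lower.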

To put that in context, a top-of-the-line server CPU -- the Westmere-EX 12-core, 2.4 GHz Xeon E7-8870 -- delivers approximately 384 gigaflops of peak double-precision performance [source].  At an average of around 90 watts per core [source] under load, the Intel chip musters around 355 megaflops per watt.  By contrast, the NVIDIA card gets about 2,142.77 megaflops per watt. In other words, it's not only more powerful in terms of pure number crunching; it's also far more efficient.
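The efficiency arithmetic is easy to reproduce; a minimal sketch, taking the article's own figures (384 peak double-precision gigaflops and roughly 90 watts per core across 12 cores for the Xeon) at face value:

```python
# Flops-per-watt comparison, computed the way the article does.
def mflops_per_watt(peak_gflops, watts):
    return peak_gflops * 1000 / watts

# Xeon E7-8870 per the article: 384 GFLOPS peak double precision,
# ~90 W per core across 12 cores under load.
cpu_eff = mflops_per_watt(384, 12 * 90)
print(f"CPU: {cpu_eff:.0f} megaflops/watt")  # ~356, matching the ~355 quoted
```

Dividing the NVIDIA figure of about 2,142.77 megaflops per watt by this result gives the roughly six-fold efficiency gap the article is pointing at.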

Of course, that comparison is slightly misleading; there are significant differences between GPU-accelerated multi-threaded computing and CPU multi-core computing in terms of memory resources and data transfer.
The world's most powerful supercomputer is now driven by NVIDIA's Tesla K20X GPUs.

But the numbers do start to give you an idea of why so many data centers are jumping on the GPU train.  NVIDIA announced on Monday that the completed "Titan" supercomputer at Oak Ridge National Laboratory in Oak Ridge, Tenn. just earned a "number one" ranking in the Top500 list of the world's most powerful supercomputers.

Powered by 18,688 NVIDIA Tesla cards, the installation posted a LINPACK score of 17.59 petaflops.  

Titan shows NVIDIA's arch-nemesis, CPU/GPU-maker Advanced Micro Devices, Inc. (AMD), some love as well, utilizing its 16-core Opteron 6274 (Bulldozer) chips.  Paired with 710 terabytes of memory, the machine is capable of performing more than 17 quadrillion calculations per second while drawing 20 megawatts of electricity or less.
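Those figures can be cross-checked against each other. If the 2,142.77 megaflops-per-watt number quoted earlier describes the full system (an assumption on my part; it matches Titan's entry on the Green500 efficiency list), the implied power draw sits comfortably under the 20-megawatt ceiling:

```python
# Back out Titan's implied power draw from its LINPACK score and the
# quoted efficiency figure (assumed here to be system-wide).
linpack_petaflops = 17.59
efficiency_mflops_per_watt = 2142.77

total_megaflops = linpack_petaflops * 1e9      # 1 petaflop = 1e9 megaflops
watts = total_megaflops / efficiency_mflops_per_watt
print(f"implied power draw: {watts / 1e6:.1f} MW")  # ~8.2 MW
```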

Titan unseats reigning champion Sequoia, a more traditional CPU-driven design from IBM.

Dr. Thomas Schulthess, professor of computational physics at ETH Zurich and director of the Swiss National Supercomputing Center cheers the record-setter, remarking, "We are taking advantage of NVIDIA GPU architectures to significantly accelerate simulations in such diverse areas as climate and meteorology, seismology, astrophysics, fluid mechanics, materials science, and molecular biophysics.  The K20 family of accelerators represents a leap forward in computing compared to NVIDIA's prior Fermi architecture, enhancing productivity and enabling us potentially to achieve new insights that previously were impossible."

The previous record holder was Lawrence Livermore National Laboratory's Sequoia system, a Blue Gene supercomputer from International Business Machines, Inc. (IBM).  Sequoia is a more traditional CPU-based design, built on PowerPC A2 processor chips.


Comments

wonder why the 1 to 1
By kattanna on 11/13/2012 10:40:56 AM , Rating: 2
i see it has 18,688 CPUs and GPU cards. wonder why the one to one when you could have multiple GPU cards per CPU.

its damn sexy looking though

RE: wonder why the 1 to 1
By amanojaku on 11/13/2012 10:52:31 AM , Rating: 3
Technically, it is many to one. CPUs have as many as 16 cores these days, but GPUs can have 80 or more ROPs, assuming those do the calculation.

RE: wonder why the 1 to 1
By kattanna on 11/13/2012 11:02:24 AM , Rating: 2

By relying on its 299,008 CPU cores to guide simulations and allowing its Tesla K20 GPUs, which are based on NVIDIA's next-generation Kepler architecture to do the heavy lifting,

each box has a single 16-core CPU and one GPU card.. which is doing the "heavy lifting" so.. I wonder why not do 2 GPU cards per CPU blade. 16 cores should easily be able to "guide" 2 GPU cards

man.. would i love to poke around that building.

RE: wonder why the 1 to 1
By Shig on 11/13/2012 12:20:37 PM , Rating: 2
16 cores can easily feed 2 GPUs, the bottleneck is the memory subsystem. Memory isn't scaling nearly as fast as processing power / cores.

RE: wonder why the 1 to 1
By Shig on 11/13/2012 12:21:20 PM , Rating: 2
Plus heat issues.

RE: wonder why the 1 to 1
By FITCamaro on 11/13/2012 10:58:08 PM , Rating: 2
Here's my question on these computers with massive amounts of RAM.

Do they use traditional modules for memory or custom made, much larger pieces with far more memory on them? I mean that much memory even with 16GB modules is over 44,000 modules.

RE: wonder why the 1 to 1
By kattanna on 11/14/2012 10:01:08 AM , Rating: 3
just ran the numbers and it seems each blade is using 2 16GB DIMMs.. or 4 8GB modules, im guessing 4x8 due to price. so each CPU has 32 GB ram. to get their total they are also counting the 6GB ram on board the GPU, bringing each blade up to 38GB ram x 18,688 blades.. and you get their 710TB memory.
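The commenter's arithmetic checks out; a quick sketch using the blade count from the article and the per-blade memory split guessed above:

```python
# Per-blade memory: 32 GB system RAM (guessed above as 2x16 GB or 4x8 GB
# DIMMs) plus 6 GB on the K20X board, times 18,688 blades.
cpu_ram_gb = 32
gpu_ram_gb = 6
blades = 18688

total_gb = (cpu_ram_gb + gpu_ram_gb) * blades
print(f"total memory: {total_gb:,} GB ~= {total_gb / 1000:.0f} TB")  # ~710 TB
```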

RE: wonder why the 1 to 1
By scrapsma54 on 11/17/2012 6:19:35 PM , Rating: 2
what if the system isn't maxed out currently? What if the boards on them are only using a 1 to 1 ratio, because of current budgets?

"A lot of people pay zero for the cellphone ... That's what it's worth." -- Apple Chief Operating Officer Timothy Cook

Copyright 2016 DailyTech LLC.