

IBM researcher Shawn Hall inspects a new Blue Gene/P supercomputer (Source: IBM)
IBM's Blue Gene/P triples the performance of its previous supercomputer

IBM has announced Blue Gene/P, the second generation of the world's most powerful supercomputer. Blue Gene/P nearly triples the performance of its predecessor, Blue Gene/L – which also held the title of being the world's fastest computer.

The Blue Gene/P scales to operate continuously at speeds exceeding one petaFLOP – or one-quadrillion operations per second – and can be configured to reach speeds in excess of three petaflops.

The performance jump from Blue Gene/L to Blue Gene/P is due to several factors. In hardware, the Blue Gene/P supercomputer doubles the number of processors per chip, with each processor operating at a higher clock speed. More memory is added along with an SMP mode to support multi-threaded applications. This new SMP mode moves the Blue Gene/P system to a programming environment similar to that found in commercial clusters. The system’s software is also upgraded for Blue Gene/P with refinements to system management, programming environment and applications support.

"Blue Gene/P marks the evolution of the most powerful supercomputing platform the world has ever known," said Dave Turek, vice president of deep computing, IBM. "A new group of commercial users will be able to take advantage of its new, simplified programming environment and unrivaled energy efficiency. We see commercial interest in the Blue Gene supercomputer developing now in energy and finance, for example. This is on course with an adoption cycle – from government labs to leading enterprises – that we've seen before in the high-performance computing market."

Four IBM PowerPC 450 processors running at 850 MHz are integrated on a single Blue Gene/P chip, with each chip capable of 13.6 billion operations per second. A two-foot-by-two-foot board containing 32 of these chips churns out 435 billion operations every second, making it more powerful than a typical 40-node cluster based on two-core commodity processors. Thirty-two of these compact boards make up each 6-foot-high rack. Each rack runs at 13.9 trillion operations per second, 1,300 times faster than today's fastest home PC.
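The chip, board and rack figures follow from one another by simple multiplication. The short Python sketch below reproduces them. The value of four floating-point operations per cycle per processor is an assumption: the article does not state it, but it is the figure that makes the quoted 13.6 billion operations per second work out. All constant names are illustrative.

```python
# Peak-rate arithmetic for one Blue Gene/P chip, board and rack.
CORES_PER_CHIP = 4            # from the article
CLOCK_HZ = 850_000_000        # 850 MHz, from the article
FLOPS_PER_CYCLE = 4           # ASSUMPTION: not stated in the article
CHIPS_PER_BOARD = 32          # from the article
BOARDS_PER_RACK = 32          # from the article

chip_flops = CORES_PER_CHIP * CLOCK_HZ * FLOPS_PER_CYCLE
board_flops = chip_flops * CHIPS_PER_BOARD
rack_flops = board_flops * BOARDS_PER_RACK

print(f"chip:  {chip_flops / 1e9:.1f} billion operations/s")
print(f"board: {board_flops / 1e9:.1f} billion operations/s")
print(f"rack:  {rack_flops / 1e12:.2f} trillion operations/s")
```

Under that assumption the results are 13.6 billion, 435.2 billion and about 13.93 trillion operations per second, which match the article's rounded numbers.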

The one-petaFLOP Blue Gene/P supercomputer configuration is a 294,912-processor, 72-rack system harnessed to a high-speed optical network. The Blue Gene/P system can be scaled to an 884,736-processor, 216-rack cluster to achieve three-petaFLOP performance. In both cases, a standard Blue Gene/P configuration houses 4,096 processors per rack.
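The processor counts and headline speeds for the two configurations can be checked the same way. This is a rough sketch using only numbers quoted in the article; the per-rack rate is taken as 32 boards times 32 chips times 13.6 billion operations per second, and the function names are hypothetical. The results are theoretical peak rates, not measured performance.

```python
# Scaling the per-rack figures to the 72-rack and 216-rack systems.
PROCESSORS_PER_RACK = 4_096                       # from the article
RACK_OPS_PER_SEC = 32 * 32 * 13_600_000_000       # boards x chips x per-chip rate

def processors(racks: int) -> int:
    """Total processor count for a system with the given number of racks."""
    return racks * PROCESSORS_PER_RACK

def peak_ops_per_sec(racks: int) -> int:
    """Theoretical peak operations per second for the given number of racks."""
    return racks * RACK_OPS_PER_SEC

for racks in (72, 216):
    peta = peak_ops_per_sec(racks) / 1e15
    print(f"{racks} racks: {processors(racks):,} processors, "
          f"about {peta:.2f} quadrillion operations/s")
```

The 72-rack system comes out at 294,912 processors and roughly 1.00 quadrillion operations per second; the 216-rack system at 884,736 processors and roughly 3.01 quadrillion.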

Not only is the Blue Gene/P designed to be blazingly fast, it is also energy efficient. IBM says that the Blue Gene/P supercomputer is at least seven times more energy efficient than any other supercomputer today.

The power of the Blue Gene/P could be applied to the medical field, for example to model an entire human organ to determine drug interactions. Drug researchers could run simulated clinical trials on 27 million patients in one afternoon using just a sliver of the machine's full power.

Some of the world's leading research laboratories and universities have already placed orders for Blue Gene/P supercomputers. The U.S. Dept. of Energy's Argonne National Laboratory, Argonne, Ill., will deploy the first Blue Gene/P supercomputer in the U.S. beginning later this year.



Comments

just wondering
By Xajel on 6/28/2007 2:50:34 AM , Rating: 2
Why doesn't IBM use Cell CPUs in their supercomputers?

I know there's a big difference. For example, the CPUs used in the current BlueGene are clocked at less than 1GHz, while Cell is a 3.2GHz beast. That would need some very serious cooling, but it would beat the petaflop by 10 times or even more (I didn't calculate it, can someone do the math here? :D)




RE: just wondering
By lompocus on 6/28/2007 3:44:42 AM , Rating: 1
Amazingly cell is scalable up to 6GHz :D.

I am curious as to how they cool the things, I mean there's only so much air conditioning the building in which the supercomputer is housed can have, right?


RE: just wondering
By Hoser McMoose on 6/28/2007 4:51:26 PM , Rating: 2
The SPEs in the Cell only do single precision calculations, which is often pretty much useless for this sort of device. If you look at its double-precision performance the Cell is pretty ho-hum since it can only do that in the PowerPC unit. So its score in terms of Petaflops would be rather abysmal (worse than an equivalent number of Opteron or Xeon chips). IBM has done some work on an HPC design using Cell processors, but from what I've seen there isn't too much interest.

IBM could take steps to correct the single-precision-only issue as well as modifying the chip somewhat to make it more useful for HPC workstations, but in all likelihood they would end up with something that looks an awful lot like their current BlueGene design.


RE: just wondering
By Schmeh on 6/29/2007 1:15:17 AM , Rating: 2
This is not at all true. The SPEs are fully capable of doing both single precision and double precision floating point operations. The only caveat is that there is a 7 cycle latency for DP on the SPEs.

From RealWorldTech.com:

quote:
A quick glance at the microarchitecture of the CELL processor reveals that the SPE’s are capable of performing 4 (non IEEE754 compliant) SP floating point multiple-add (FMADD) operations per cycle or 2 (IEEE754) DP FMADD operations every 7 cycles. Consequently, the 8 SPE’s alone can achieve the 256 SP GFlops rating at 4 GHz without the aid of the PPE. Presumably, the (DD2) PPE can also produce 4 SP FMADD’s per cycle, and the (DD2) CELL processor should instead be rated as 288 Gflops at 4 GHz when the compute power of the PPE are taken into consideration. Similarly, the 2 DP floating point multiply-add operations every 7 cycles results in 18.3 DP GFlops per second for the 8 SPE’s at 4 GHz, and the PPE can sustain a peak throughput of 1 DP FMADD operation per cycle, producing 8 DP GFlops at 4 GHz. The total of 26.3 GFlops matches nicely with IBM’s claim of > 26 DP GFlops.


http://www.realworldtech.com/page.cfm?ArticleID=RW...


Copyright 2014 DailyTech LLC.