Rajeeb Hazra, General Manager of Intel's Technical Computing Group, holding “Knights Corner”
Over 1 TeraFLOPS on a single chip

GPGPU and cloud computing have been hot topics for the last several years. Intel has shown off several designs like Larrabee and the Single-chip Cloud Computer in the past. However, it is Knights Corner that will be the firm's first commercial product to use the Many Integrated Core (MIC) architecture. The co-processor will be offered as a PCIe add-in board.

The MIC concept is simple: use an architecture specifically designed to process highly parallel workloads, but ensure compatibility with existing x86 programming models and tools.

This would give MIC co-processors the ability to run existing applications without the need to port the code to a new programming environment, theoretically allowing existing x86-based applications to get maximum performance from the CPU and the co-processor simultaneously. It would also save much of the time, cost and resources that would otherwise be needed to rewrite those applications in alternative proprietary languages.

AMD and NVIDIA have been trying to do the same with their latest architectures by enabling support for languages like C++, but Intel wants to challenge them in this potentially lucrative market.

Knights Corner will be manufactured using Intel’s latest 3-D Tri-Gate P1270 22nm transistor process and will feature more than 50 cores. Intel demonstrated first silicon of Knights Corner at the SC11 conference yesterday. The co-processor wowed the crowd by delivering more than 1 TeraFLOPS of double precision floating point performance.

The firm also touted its "commitment to delivering the most efficient and programming-friendly platform for highly parallel applications", and showed off the benefits of the MIC architecture in weather modeling, tomography, protein folding, and advanced materials simulation at its booth.

There is no timeframe on when Knights Corner will enter production or be available to customers.

RE: holy hell..
By andre-bch on 11/16/2011 6:09:07 PM , Rating: 2
Lots of misinformation in the above posts.

The highest performing single GPU tesla is M2090 which can do 665 DP GFLOPs.

GeForce GTX 4xx and 5xx cards' DP performance is intentionally limited to 1/8 of their SP performance. Teslas' DP is 1/2 of their SP performance.

Meaning: a GTX 480 can do 1344 SP GFLOPs, but only 168 DP. A Tesla M2090 achieves 1331 SP GFLOPs, but 665 DP.
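
The arithmetic behind those numbers can be checked with a quick sketch. The SP figures and DP:SP ratios are the ones quoted in this comment; the dictionary and function names are only illustrative.

```python
# Peak single-precision figures quoted in the comment (GFLOPS).
SP_GFLOPS = {"GTX 480": 1344, "Tesla M2090": 1331}

# DP:SP ratios as stated in the comment (1/8 for the GeForce, 1/2 for the Tesla).
DP_RATIO = {"GTX 480": 1 / 8, "Tesla M2090": 1 / 2}


def dp_gflops(card):
    """Peak double-precision GFLOPS implied by the quoted SP figure and ratio."""
    return SP_GFLOPS[card] * DP_RATIO[card]


for card in SP_GFLOPS:
    print(f"{card}: {dp_gflops(card):.1f} DP GFLOPS")
```

This gives 168 for the GTX 480 and 665.5 for the M2090, which matches the 665 figure above once rounded down.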

All in all, the upcoming 28 nm Kepler will have at least twice the DP performance of Fermi and should be available in 1H 2012. Knights Corner will have a hard time competing with that.

RE: holy hell..
By andre-bch on 11/16/2011 6:22:46 PM , Rating: 2
Forgot that 460, 550 and 560 are also GTX. Why isn't there an EDIT button again?!

Anyways, among GeForces, GF100 and GF110 based cards have DP performance at 1/8 of their SP performance; for the rest (GF104, 106, 114, 108, etc.) it is 1/12.

RE: holy hell..
By someguy123 on 11/16/2011 8:09:13 PM , Rating: 2
I can see NVIDIA outpacing them in raw performance, especially since there's no release window, if it's ever released at all.

They're touting x86 compatibility, though. With GPGPU you're porting/building new CUDA/OpenCL software. Theoretically this chip would work with current software, which is very enticing for developers and for those of us using slow Adobe software.

RE: holy hell..
By Khato on 11/17/2011 12:08:59 AM , Rating: 2
As I stated above, it's quite important to differentiate between theoretical and actual performance. The Tesla M2090 has a peak theoretical throughput of 665 GFLOPS. From what I've found in NVIDIA's own presentations for their C2050, that theoretical throughput on the M2090 will likely go down to around 470 GFLOPS on the common DGEMM. The 1 TFLOPS demonstration on Knights Corner was on DGEMM; it's not a theoretical max.
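
To make the theoretical-versus-achieved distinction concrete, here is a minimal sketch of how a DGEMM result is usually turned into a GFLOPS figure and compared against peak. It assumes the conventional operation count of 2·m·n·k for a matrix multiply; the function names are made up for illustration, and the 470 and 665 figures are the ones from this comment, not new measurements.

```python
def dgemm_flops(m, n, k):
    """Conventional FLOP count for C = A*B with A (m x k) and B (k x n):
    one multiply and one add per inner-product term."""
    return 2 * m * n * k


def achieved_gflops(m, n, k, seconds):
    """Sustained GFLOPS for a DGEMM of the given shape that took `seconds`."""
    return dgemm_flops(m, n, k) / seconds / 1e9


def efficiency(achieved, peak):
    """Fraction of theoretical peak actually sustained."""
    return achieved / peak


# Figures from the comment: ~470 GFLOPS estimated on DGEMM vs. 665 GFLOPS peak.
m2090_eff = efficiency(470, 665)
print(f"M2090 estimated DGEMM efficiency: {m2090_eff:.0%}")
```

By that estimate the M2090 would sustain roughly 71% of its peak on DGEMM, which is why a measured DGEMM number and a spec-sheet peak shouldn't be compared directly.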

