

Rajeeb Hazra, General Manager of Intel's Technical Computing Group holding “Knights Corner”
Over 1 TeraFLOPS on a single chip

GPGPU and cloud computing have been hot topics for the last several years. Intel has shown off several designs like Larrabee and the Single-chip Cloud Computer in the past. However, it is Knights Corner that will be the firm's first commercial product to use the Many Integrated Core (MIC) architecture. The co-processor will be offered as a PCIe add-in board.

The MIC concept is simple: use an architecture specifically designed to process highly parallel workloads, but ensure compatibility with existing x86 programming models and tools.

This would give MIC co-processors the ability to run existing applications without porting the code to a new programming environment, theoretically allowing maximum CPU and co-processor performance simultaneously with existing x86-based applications. It would also save the time, cost and resources that would otherwise be needed to rewrite that software in alternative, proprietary languages.
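
To make the pitch concrete, the sketch below is an ordinary C/OpenMP kernel of the kind Intel is talking about: the parallelism is already expressed with a standard x86 toolchain, so in principle it would recompile for a MIC co-processor rather than being rewritten in a GPU-specific language. This is a minimal illustration assuming nothing beyond stock OpenMP; the article does not describe the actual MIC toolchain or any offload syntax.

    /* saxpy.c -- a plain x86/OpenMP kernel; code in this style is the kind of
       highly parallel workload Intel says MIC would run without a rewrite. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    static void saxpy(float a, const float *x, float *y, long n)
    {
        /* Every iteration is independent, so OpenMP spreads the loop across
           however many cores the target exposes -- 4, 8, or 50+. */
        #pragma omp parallel for
        for (long i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    int main(void)
    {
        long n = 1 << 24;
        float *x = malloc(n * sizeof *x);
        float *y = malloc(n * sizeof *y);
        for (long i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        double t0 = omp_get_wtime();
        saxpy(2.0f, x, y, n);
        double t1 = omp_get_wtime();

        printf("%ld elements on up to %d threads in %.3f s\n",
               n, omp_get_max_threads(), t1 - t0);
        free(x);
        free(y);
        return 0;
    }

Built with any OpenMP-capable compiler (e.g. gcc -fopenmp), the same source scales with whatever core count the hardware offers, which is exactly the property Intel is selling against the port-to-CUDA/OpenCL route.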

AMD and NVIDIA have been trying to do something similar with their latest architectures by enabling support for languages like C++, but Intel wants to challenge them in this potentially lucrative market.

Knights Corner will be manufactured using Intel’s latest 3-D Tri-Gate P1270 22nm transistor process and will feature more than 50 cores. Intel demonstrated first silicon of Knights Corner at the SC11 conference yesterday. The co-processor wowed the crowd by delivering more than 1 TeraFLOPS of double precision floating point performance.
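
For context on what a figure like "1 TeraFLOPS" means: throughput numbers of this kind are normally quoted from a dense matrix multiply, where the operation count is known exactly (2·n³ floating-point operations for an n×n double-precision multiply) and is divided by measured wall-clock time. The deliberately naive sketch below shows only that bookkeeping; it will not come anywhere near peak on any hardware.

    /* gflops.c -- how a matrix-multiply GFLOPS figure is computed:
       2*n^3 floating-point operations divided by elapsed seconds.
       The triple loop is naive on purpose; real measurements use a
       tuned BLAS, but the arithmetic of the metric is the same. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    int main(void)
    {
        int n = 512;
        double *a = calloc((size_t)n * n, sizeof *a);
        double *b = calloc((size_t)n * n, sizeof *b);
        double *c = calloc((size_t)n * n, sizeof *c);
        for (int i = 0; i < n * n; i++) { a[i] = 1.0; b[i] = 2.0; }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++)
                for (int j = 0; j < n; j++)
                    c[i * n + j] += a[i * n + k] * b[k * n + j];
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs  = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        double flops = 2.0 * n * n * n;   /* one multiply + one add per inner step */
        printf("%.2f GFLOPS double precision\n", flops / secs / 1e9);

        free(a); free(b); free(c);
        return 0;
    }

As a rough back-of-envelope based only on the stated figures, more than 1 TeraFLOPS across more than 50 cores works out to somewhere around 20 GFLOPS of double precision per core.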

The firm also touted its "commitment to delivering the most efficient and programming-friendly platform for highly parallel applications", and showed off the benefits of the MIC architecture in weather modeling, tomography, protein folding, and advanced materials simulation at its booth.

There is no timeframe on when Knights Corner will enter production or be available to customers.


Comments

holy hell..
By solarrocker on 11/16/2011 10:16:02 AM , Rating: 2
50 cores?! That's insane. Does AMD have anything to fight this?




RE: holy hell..
By kensiko on 11/16/2011 10:26:50 AM , Rating: 2
It's not an LGA2011 CPU, it's a completely different architecture.


RE: holy hell..
By StevoLincolnite on 11/16/2011 10:43:02 AM , Rating: 5
From the article:
quote:
It will be offered as a PCIe add-in board.


AMD has its GPUs to take this on. Should be interesting.


RE: holy hell..
By Chaca on 11/16/2011 11:47:18 AM , Rating: 2
Does anyone care for a Core i8-DX ;-)


RE: holy hell..
By swizeus on 11/16/2011 12:59:30 PM , Rating: 2
So basically TDP can go as high as 300 watts or even more (considering two 8-pin PCI-E power connectors) instead of the 130W limit of today's CPU socket. This is awesome.


RE: holy hell..
By Guspaz on 11/16/2011 6:20:35 PM , Rating: 2
It's similar to what we're seeing in the ARM world. AppliedMicro's X-Gene chips scale up to 128 ARM cores, using up about 2W per core (pretty close to that 300W figure Intel is touting for a 50-core part).

Calxeda is doing similar things in a more distributed manner with an eye on cloud computing; they're doing lots of low-power quad-core ARM chips in super high density servers. Four cores per CPU, four CPUs per board, eighteen boards per chassis, and four chassis per rack. Each CPU has its own RAM, acting as an independent server, meaning a single rack can host 288 quad-core servers. Each server is relatively low-power, but performance-per-watt should still beat out traditional servers on an aggregate level, and most cloud instances don't need super-powerful servers; they're either massively distributed, or simply low-usage. There's a reason why atom-based servers have become popular in the datacenter for dedicated servers.


RE: holy hell..
By kattanna on 11/16/2011 10:39:33 AM , Rating: 3
yes, their video cards.

That's what this is going up against: GPU computing, not your desktop CPU computing.

And I can't wait to see BOINC apps being able to use this thing; that'd be sexy.


RE: holy hell..
By ViroMan on 11/16/2011 11:36:07 AM , Rating: 2
GPU computing for BOINC-like apps usually offers many times more performance than a CPU. I'm willing to bet 3-5 GPUs will beat 50 CPUs on a board. Although I don't know what the wattage for the CPU card would be, 5 GPUs are sure as hell going to need an external PSU.


RE: holy hell..
By kattanna on 11/16/2011 11:58:28 AM , Rating: 2
you can already get teraflop computing within your desktop

http://www.nvidia.com/object/personal-supercomputi...

using a pair of these cards. If Intel can do the same using one add-on card, and if the price is right, then they could do very well in this area.


RE: holy hell..
By benny638 on 11/16/2011 12:42:05 PM , Rating: 2
515 gigaflops is what the Tesla is able to do in double precision. The Intel drop-in card can do 1000+ gigaflops in double precision. This makes the Intel card twice as fast. Impressive to say the least.


RE: holy hell..
By Khato on 11/16/2011 12:55:33 PM , Rating: 2
Actually, the Tesla C2050 can do 515 DP GFLOPS peak. According to this NVIDIA presentation (http://www.nvidia.com/content/GTC-2010/pdfs/2057_G...) it looks like they get around 360 GFLOPS in DGEMM. Hence the single Knights Corner chip is delivering around 3x the performance of the Tesla C2050. It'll be quite interesting to see what production performance looks like... not to mention power consumption.


RE: holy hell..
By Samus on 11/16/2011 2:20:22 PM , Rating: 2
Not true.

My GTX 480s in 2x SLI rack up 615 gigaflops in cudaprof, and this configuration is nearly two years old. GTX 580s should scale to around 750 gigaflops, and this is something available RIGHT NOW.

Tesla has been able to achieve a Teraflop in 3xSLI configuration for years.

Intel offering a chip that can do this is simply an entry point to the sector, not a groundbreaking achievement. However, knowing Intel, they will dominate the field in power consumption.


RE: holy hell..
By EJ257 on 11/16/2011 2:54:02 PM , Rating: 2
True, but Intel is doing this on a single chip vs. the SLI configuration, which still requires two or three GPUs. I'm sure Intel is also working on a way to make multiples of these chips work in parallel like SLI/X-Fire.


RE: holy hell..
By Belard on 11/17/2011 10:41:46 PM , Rating: 2
Really?! Where can you buy this SINGLE chip from Intel? Which *IS* not a drop-in CPU replacement; it requires a PCIe slot.

If this Intel chip makes it to market in 1-2 years, AMD and Nvidia will have similar GPU systems out before then.

This is why Intel canceled their 3D video card plans... they don't have the drivers or technology to compete against AMD & Nvidia. They can't match or surpass them.


RE: holy hell..
By someguy123 on 11/16/2011 3:30:32 PM , Rating: 2
You're comparing 3 cards to a single knights ferry....


RE: holy hell..
By Aloonatic on 11/16/2011 3:46:48 PM , Rating: 2
To be fair, he's comparing 3 cards that actually exist and can be bought in the shops to one specially selected piece of silicon that only exists in Intel's super secret research lair.

By the time Intel's processor reaches the market, it'll be interesting to see what one Nvidia card could do, or however many you could buy and put in a rig for the same money as the rig you'll need to get those figures out of a Knights processor.

Still, at least it's good to see that the Larrabee time and money might not be going to waste after all.


RE: holy hell..
By someguy123 on 11/16/2011 3:49:44 PM , Rating: 2
I'm not saying Intel will dominate, or that this card will even see the light of day in time to be relevant, but to say something like "we can already do this, but with 3 cards!" is silly. It's similar to someone dropping a petaflop chip and saying "we've done that with supercomputers for years!"


RE: holy hell..
By Argon18 on 11/16/2011 11:49:44 PM , Rating: 3
knights ferry??? lmao i don't know why but that's funny!


RE: holy hell..
By kattanna on 11/16/2011 5:30:03 PM , Rating: 2
quote:
GTX 580s should scale to around 750 gigaflops, and this is something available RIGHT NOW.


according to BOINC, my HIGH OC 560ti can get 901 Gflops peak


RE: holy hell..
By Da W on 11/16/2011 3:55:57 PM , Rating: 3
Tesla is 40nm, Intel's 50-core part is 22nm. Next-generation Tesla will get there.


RE: holy hell..
By andre-bch on 11/16/2011 6:09:07 PM , Rating: 2
Lots of misinformation in the above posts.

The highest-performing single-GPU Tesla is the M2090, which can do 665 DP GFLOPS.

GeForce GTX 4xx and 5xx cards' DP performance is intentionally limited to 1/8 of their SP performance. Teslas' DP is 1/2 of their SP performance.

Meaning: a GTX 480 can do 1344 SP GFLOPS, but only 168 DP. A Tesla M2090 achieves 1331 SP GFLOPS, but 665 DP.

All in all, the upcoming 28nm Kepler will have at least twice the DP performance of Fermi and should be available in 1H 2012. Knights Corner will have a hard time competing with that.

http://en.wikipedia.org/wiki/Nvidia_Tesla

http://techreport.com/articles.x/18682
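
The ratios quoted in the post above map straight onto the peak numbers: single-precision peak is CUDA cores × 2 FLOPs per clock (multiply-add) × shader clock, and double precision is that figure divided by the cap. A quick back-of-envelope sketch using the commonly published shader clocks for these parts (illustrative figures, not measurements):

    /* peak_flops.c -- back-of-envelope peaks from the ratios above:
       SP peak = cores * 2 (multiply-add per clock) * shader clock (GHz),
       DP peak = SP peak / cap (2 for Tesla, 8 for GF100/GF110 GeForce). */
    #include <stdio.h>

    static void peak(const char *name, int cores, double ghz, int dp_cap)
    {
        double sp = cores * 2.0 * ghz;   /* GFLOPS, single precision */
        printf("%-12s %8.1f SP GFLOPS %8.1f DP GFLOPS\n", name, sp, sp / dp_cap);
    }

    int main(void)
    {
        peak("GTX 480",     480, 1.401, 8);   /* ~1345 SP, ~168 DP */
        peak("Tesla M2090", 512, 1.300, 2);   /* ~1331 SP, ~665 DP */
        return 0;
    }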


RE: holy hell..
By andre-bch on 11/16/2011 6:22:46 PM , Rating: 2
Forgot that 460, 550 and 560 are also GTX. Why isn't there an EDIT button again?!

Anyways, among GeForces, GF100- and GF110-based cards have DP at 1/8 of their SP performance; for the rest (GF104, 106, 114, 108, etc.), it is 1/12.

http://www.anandtech.com/show/4221/nvidias-gtx-550...

http://www.anandtech.com/show/3973/nvidias-geforce...


RE: holy hell..
By someguy123 on 11/16/2011 8:09:13 PM , Rating: 2
I can see Nvidia outpacing them in raw performance, especially since there's no release window, if this is ever released.

They're touting x86 compatibility, though. With GPGPU you're porting or building new CUDA/OpenCL software. Theoretically this chip would work with current software, which is very enticing for developers and for those of us using slow Adobe software.


RE: holy hell..
By Khato on 11/17/2011 12:08:59 AM , Rating: 2
As I stated above, it's quite important to differentiate between theoretical and actual performance. The Tesla M2090 has a peak theoretical throughput of 665 GFLOPS. From what I've found in NVIDIA's own presentations for their C2050, that theoretical throughput on the M2090 will likely come down to around 470 GFLOPS on the common DGEMM. The 1 TFLOPS demonstration on Knights Corner was on DGEMM; it's not a theoretical max.


RE: holy hell..
By encia on 11/16/2011 9:26:49 PM , Rating: 2
AMD GCN already includes some AMD64/X86-64 IP


Do you know what this means for gaming?
By Tequilasunriser on 11/16/2011 1:24:42 PM , Rating: 2
I foresee this processor achieving many important goals, but what I'm most excited about is the prospect of more realistic physics in PC games!

The only thing that would hold this achievement back is developers who won't go out on a limb, because they have to cater to the dirty, troglodyte console gamers whose consoles cannot produce the processing power for more realistic physics.

Consoles <--- This is why we don't have nice things.




RE: Do you know what this means for gaming?
By LordanSS on 11/16/2011 3:28:36 PM , Rating: 2
We've been there before....

PCs had the Ageia PPU for extra physics processing, but most game developers never cared about it, and there were a couple of implementations that weren't all that good.

Then nVidia comes along, buys out Ageia, and locks PhysX to their own hardware. If you're one of the unlucky few who bought an original Ageia card and have an AMD video card plugged in... unless you use some modified third-party drivers, you lose your PhysX capabilities. Thanks, nVidia. Not.

It just kinda baffles me, though... the other big physics name, Havok, is owned by Intel, and as expected, Intel never cared to expand much on it. It would be interesting if they switched Havok over to OpenCL/DirectCompute, but there's simply no way Intel would give their competitors a free lunch like that. =/


By someguy123 on 11/16/2011 4:05:17 PM , Rating: 2
Portions of the PhysX library are used in tons of games. Most games just don't use the more complicated particle physics, to avoid being limited to the smaller PhysX user base. Technically, developers should be able to use something like OpenCL if they really wanted GPU-accelerated physics, similar to what id did with their GPU-based megatexture streaming.


By andre-bch on 11/16/2011 6:46:56 PM , Rating: 2
There are two major physics engines, Havok and PhysX.

There was a GPU-accelerated, multi-platform version of Havok called Havok FX which apparently got canceled. Just as you said, maybe Intel doesn't want to give the competition a free lunch.

Nvidia's PhysX can only be accelerated through their proprietary CUDA API, with no OpenCL or DirectCompute support in sight.

If things continue like this, we won't see GPU accelerated physics catching on any time soon.


Reminds me of this...
RE: Reminds me of this...
By Belard on 11/17/2011 10:31:27 PM , Rating: 2
LOL.... thanks...

Blow his friggin head off!!!

Ahhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh!


Intel WANTS this in your desktop
By werfu on 11/16/2011 12:37:44 PM , Rating: 2
Intel has been pushing ray-tracing for a while and could easily pass NVidia and AMD if it ever takes off. Current GPUs are designed around shaders and are really poor at ray-tracing. Even more, going x86 (without the legacy instructions) on a GPU could prove to be highly popular.




By smilingcrow on 11/16/2011 12:53:35 PM , Rating: 2
This will certainly start off being very expensive and aimed at the HPC community.
They don't have to match nVidia in performance but just be close enough, as the gains in software productivity are the real eye-opener here.


needs a correction
By dgingerich on 11/16/2011 3:39:15 PM , Rating: 2
The article title marks this as a "CPU". It decidedly is not. A "CPU" is a "Central Processing Unit". This thing is a co-processor. It can be programmed to do wide, fast (although I'm thinking a GTX 580 would do far wider and far faster) processing, but it cannot be used as the central processor in a system. It needs a real CPU to operate the system and send it commands. Therefore, this is definitely not a "CPU".




RE: needs a correction
By IntelUser2000 on 11/19/2011 1:43:06 AM , Rating: 2
Intel doesn't think so. From their SC11 presentation, various modes:

Native mode: KNF(Knights Ferry) is a fully networked Linux system.
Offload mode: KNF is an attached accelerator
Cluster mode: parallel application distributed across multiple KNF and hosts using MPI

For FLOPS, enlighten yourself by reading the other posts.
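
The "cluster mode" listed above is ordinary MPI, which is a large part of the appeal: host processors and co-processor cards all just appear as ranks, and the same program spans both. A minimal sketch follows, using plain MPI and nothing Knights-specific, since the presentation doesn't detail the actual launch mechanics.

    /* ranks.c -- cluster mode as described above is plain MPI: every host
       CPU and every co-processor card simply contributes more ranks. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size, len;
        char node[MPI_MAX_PROCESSOR_NAME];
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(node, &len);

        /* Whether "node" is a Xeon host or a MIC card is invisible here;
           the work-distribution logic stays the same either way. */
        printf("rank %d of %d running on %s\n", rank, size, node);

        MPI_Finalize();
        return 0;
    }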


Archi?
By boobo on 11/16/2011 4:01:19 PM , Rating: 2
What core designs does it use? 64-bit P54Cs?




RE: Archi?
By acompsys on 11/29/2011 10:59:15 AM , Rating: 2
Both 32- and 64-bit.


Latest Intel DX79SI board
By acompsys on 11/29/2011 11:03:08 AM , Rating: 2
Has anyone used this Intel board?




RE: Latest Intel DX79SI board
By acompsys on 11/29/2011 1:09:59 PM , Rating: 2













