Internal Intel documents reveal that SSE4 is on the way and will be here with Conroe

Several internal memos and roadmaps from Intel confirmed what many of us already suspected: Intel's Conroe (and Merom) will feature SSE4. The next-generation SSE instructions were listed among the "Significant Video Enhancements" built into the upcoming Intel platforms.

The other video enhancements listed included Clear Video Technology (CVT) -- Intel's attempted answer to ATI Avivo -- and UDI support. Both of these technologies seem dependent on the 965 core logic and not necessarily the CPU. Other Intel corporate roadmaps have defined CVT as:

Intel Clear Video Technology supporting advanced decoding, post-processing capabilities and enhanced 3D

The Conroe platform and most of its features have not been disclosed yet.  Intel is expected to announce the Conroe architecture name at the Intel Developer Forum on March 7th. 

Didn't AMD mention new instructions sometime ago?
By formulav8 on 2/14/2006 3:22:42 PM , Rating: 2
I thought I also remember AMD saying they were working on having new specific SIMD style instructions for their next gen cpu (Or next major refresh). Around the time Vista is to be released? Maybe something to do with specific Video style instructions like SSE4?

But really, how often does even SSE3 get used? SSE2 is just now becoming fairly common. But I guess you've got to get it started sometime.


By Questar on 2/14/2006 3:30:34 PM , Rating: 2
AMD tried their own SSE instructions some time ago. It was named 3Dnow!, and it bombed huge.

RE: Didn't AMD mention new instructions sometime ago?
By Cygni on 2/14/2006 4:12:21 PM , Rating: 3
Some notes here...

One is that 3Dnow! came out before SSE. 1998 for the K6-2 w/ 3Dnow!, 1999 for the Pentium III w/SSE.

Another thing is that both SSE and 3dnow are simply SIMD command sets. They aren't really anything that special. Also, 3dnow hardly "bombed huge"... its commands (as well as the 3dnow+ commands) are integrated into every graphics engine and graphics driver around today.

By Questar on 2/15/2006 10:31:09 AM , Rating: 2
Really? Please link to a current game or application that has support for 3Dnow.

I don't know what a "command set" is. SIMD extensions are cpu instructions. Yeah, I'm nit-picking.

Anyway, the reason why it bombed? It was very slow. Why was it slow? Because the instructions are 2 operand, which makes them half as fast as SSE, which is 4 operand.

The above was posted by the LAME developers on why they don't support 3Dnow. You can look it up on Hydrogenaudio if you wish.
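
For what it's worth, the "2 operand" versus "4 operand" wording most plausibly refers to packed width: a 3DNow! instruction works on two single-precision floats in a 64-bit MMX register, while an SSE instruction works on four in a 128-bit XMM register. A rough Python sketch of that arithmetic follows. It counts packed instructions only and ignores clock speed, latency and memory bandwidth; the helper name and the 1024-element example are made up for illustration.

```python
import math

# Single-precision floats held by one packed register.
LANES_3DNOW = 2   # 64-bit MMX register / 32-bit float
LANES_SSE = 4     # 128-bit XMM register / 32-bit float


def packed_ops_needed(num_floats: int, lanes: int) -> int:
    """Packed instructions needed to touch every float once (hypothetical helper)."""
    if num_floats < 0 or lanes <= 0:
        raise ValueError("num_floats must be >= 0 and lanes must be > 0")
    return math.ceil(num_floats / lanes)


if __name__ == "__main__":
    n = 1024  # arbitrary example size
    print("3DNow!:", packed_ops_needed(n, LANES_3DNOW))
    print("SSE:   ", packed_ops_needed(n, LANES_SSE))
```

Halving the instruction count is a ceiling on the gain, not a measurement. Real throughput depends on how each chip actually executes those instructions.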

By Marlin1975 on 2/14/2006 4:12:24 PM , Rating: 2
How did 3Dnow bomb? If I remember correctly, games that supported it would run MUCH faster when 3Dnow first came out. Intel came back with SSE and, with their bully ways, just forced everybody to adopt it, and little to no support for 3Dnow was left in its trail.

Of course now Intel was last to follow AMD with the AMD64 setup.

By finalfan on 2/14/2006 7:04:13 PM , Rating: 2
Nobody was forced to adopt SSE. It was stupid not to adopt SSE, which is available on 85% of the PCs. 3DNow? Even AMD does not mention that anymore.

By Shining Arcanine on 2/14/2006 3:38:45 PM , Rating: 2
Video Codecs, like XviD and DivX.

By smitty3268 on 2/14/2006 4:07:05 PM , Rating: 2
The real problem is that it didn't improve performance any. I think those codecs sped up by about 1% after they added SSE3. Hopefully SSE4 won't be as disappointing.

By BaronMatrix on 2/14/2006 4:10:18 PM , Rating: 2
Actually SSE3 gave Intel a large advantage in things like Photoshop and encoding. That's why AMD included it in the E revision of San Diego. This may give them a boost again, but AMD can just license it.

By smitty3268 on 2/14/2006 4:28:43 PM , Rating: 2
Actually, you could be right. The test I saw was for AMD chips with SSE3, and they barely improved at all. But Intel chips could have gotten a significant boost. The only test for them I've seen was comparing Prescott (SSE3) vs. Northwood (SSE2), and obviously that isn't fair for SSE3.

This was for video encoding, which actually only uses 1 of the SSE3 instructions (lddqu?)

RE: Didn't AMD mention new instructions sometime ago?
By MDme on 2/14/2006 6:37:49 PM , Rating: 2
IIRC, there was a certain issue over whether the Intel compilers allowed SSE2/3 code to be run on AMD CPUs. I remember something like if there was no GenuineIntel flag, it would disable SSE2/3 instructions. There was even a hack around it, and it showed AMD CPUs getting even faster than they currently are with the patch involved. I can't remember where I saw it though. Was in the forums a long time ago.
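
The difference between dispatching on the vendor string and dispatching on the feature bits can be sketched like this. It is a hypothetical model in Python, not Intel's actual dispatcher: the `Cpu` class and both functions are invented for illustration. Only the vendor strings `GenuineIntel` and `AuthenticAMD` are real values that CPUID returns.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cpu:
    """Hypothetical stand-in for what CPUID reports."""
    vendor: str                       # e.g. "GenuineIntel" or "AuthenticAMD"
    features: frozenset = field(default_factory=frozenset)


def pick_path_by_vendor(cpu: Cpu) -> str:
    """Vendor-gated dispatch: non-Intel chips get the fallback path."""
    if cpu.vendor == "GenuineIntel" and "sse2" in cpu.features:
        return "sse2"
    return "x87"


def pick_path_by_feature(cpu: Cpu) -> str:
    """Feature-gated dispatch: any chip reporting SSE2 gets the SSE2 path."""
    return "sse2" if "sse2" in cpu.features else "x87"


if __name__ == "__main__":
    athlon64 = Cpu("AuthenticAMD", frozenset({"sse", "sse2"}))
    print(pick_path_by_vendor(athlon64))   # takes the fallback path
    print(pick_path_by_feature(athlon64))  # takes the SSE2 path
```

Under the first scheme an AMD chip that reports SSE2 still ends up on the slow path, which is the behaviour described above. Under the second it does not.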

By Plasmoid on 2/14/2006 8:55:32 PM , Rating: 2
Yeah, it seems like programs compiled before the E revision AMD cores don't use SSE3 fully on them. On newer programs though, it seems both Intel and AMD chips get a decent speed boost in specialised areas.

3dNow was a great instruction set for Athlon XPs.
It took a long time to take off, but when the K6 started to come to the end of its life 3dNow took off in encoders and some games because it gave genuine speed boosts (not like MMX, and a lot more than SSE).

By Questar on 2/15/2006 2:17:35 PM , Rating: 2
You are mistaken that 3Dnow was faster than SSE.

See my post above.

RE: Didn't AMD mention new instructions sometime ago?
By mpeny on 2/14/2006 4:21:51 PM , Rating: 2
Actually Adobe Premiere Pro pretty much requires it. There is a considerable performance increase here, and this is true for a lot of prosumer software out there.

This is why Anandtech should stop using Premiere 6/7 for the benchmarks. Start using more recent versions of applications like After Effects and Premiere.

By mindless1 on 2/14/2006 9:45:33 PM , Rating: 2
Actually, no, that's a horrible idea. When people buy a system, they don't consider buying ALL NEW software to be a prerequisite. If that is the case, then taking the P4 for example, we can't say a $400 P4 has score X in a benchmark; we have to call it a $2560 P4 (or whatever applies per software), since it would be mandatory to buy the software to get the performance.

If anything, benchmarks are already using software that is too new. People just do not continually upgrade their software, particularly the expensive titles, and businesses do it even less often except in a production environment.

By Hulk on 2/14/2006 4:36:29 PM , Rating: 2
Then AMD followed with 3DNow!

By meson2000 on 2/14/2006 5:59:08 PM , Rating: 2
But MMX was just for integer instructions. 3DNow was first for floating point instructions.

By SexyK on 2/14/2006 9:53:26 PM , Rating: 2
lol, yeah, forgot to mention that AMD needed 3dnow! because the native floating point on the K6 chips was miserably slow. Any software that didn't support 3dnow! and relied on floating point ops was painfully slow on the AMD chips of the era.

By ohnnyj on 2/15/2006 1:49:49 AM , Rating: 2
Anyone know when we should see Merom?

RE: When?
By kelmon on 2/15/2006 5:09:36 AM , Rating: 2
Q3-4 2006 was the last I heard on that front. I'm waiting for a Merom-based MacBook so I'm still relatively hopeful that this will arrive before the end of the year, hopefully not long after summer when the savings account should be looking good.

Merom is delayed
By frankie1969 on 2/15/2006 11:08:08 AM , Rating: 2
Check Google News for Intel+Conroe Intel+Woodcrest Intel+Merom

Intel is pushing to get Conroe out the door first, as early in Q3 as possible.

The downside is that Merom won't arrive until Q4 or possibly even Q1 2007. Therefore, no MacBook 64 for you, kelmon.

By onewingedangel on 2/14/2006 4:14:50 PM , Rating: 2
No doubt the competition will adopt any SSE or SIMD instructions the other uses where it benefits the architecture. Even if it takes AMD or Intel a while to add support for the rival's standards, it has a way of getting done.

3Dnow! lacked support
By MrKaz on 2/15/2006 12:45:41 PM , Rating: 2
The 3dnow-enabled version of Quake2, if I am not mistaken, put a K6 300 at the same performance level as a P2 400 at the time. I think the achievement was done with:
- 3dnow "special" Voodoo2 driver.
- Quake2 patch

K6 FPU "sucked" vs P2, but the 3dnow achieved a very good performance boost.

I just remember 3 games with 3dnow that I played: Quake2, Drakan and Expendable.

There is also support for 3dnow in ATI, nVidia, ... drivers.

I think the best bet to boost performance is x64, because even without native 64 bit code, by just using new compilers they will make use of the extra registers adding performance, but with native x64 code, there will be significant boosts.

By dgingeri on 2/15/2006 3:47:24 PM , Rating: 2
3Dnow! came out first, by almost a year, and was faster than SSE, but just barely. 3Dnow! (now on its 3rd iteration) and SSE are both incorporated into the latest nVidia and ATI video drivers. 3Dnow! Professional, the latest version, includes all the SSE, SSE2, and most SSE3 instructions, and also includes several instructions not used by Intel. SSE3 includes a few instructions that were introduced by AMD with the original 3Dnow! set that improved MPEG2 decoding.

SSE2 was what gave Intel the bigger boost with the P4. The reason was that A. they would use up to 8 operands, but preferably 4, and B. they used a shorter summation routine. This reduces accuracy on the calculations, but only by about 10^-8, or .00000001, or less. IEEE requirements for X87 floating point calculations of trig functions were for summations of 20 steps, costing the designers 20 cycles of the processor to calculate. Intel reduced that to 12 in many cases when using SSE2. They cheated. AMD, when they implemented SSE2, kept the 20 step summation, making their implementation of SSE2 slightly slower. It's not necessarily bad, given how little precision is lost, but it does make scientific calculations less accurate. Intel's compiler doesn't know scientific calculations from game calculations, so it inserts its SSE2 instructions everywhere. This makes scientific software compiled with Intel's compiler less accurate. However, because even the IEEE standard calculations are inaccurate in many cases, most scientific software uses integer summations on the order of 100 steps instead of the X87 calculations. This allows AMD to take the lead in speed.

So, when you fall through the world in an MMO because of a non-visible 'crack' between surfaces, or miss a shot in an FPS game that should have been dead on, blame Intel: their SSE2 miscalculated, or rather, that can be your excuse. The likelihood of it actually affecting a game is very, very remote, but possible. I do know of one spot in EQ where P4 owners would actually fall through the world while AMD owners wouldn't, due to the slight separation of 2 surfaces. It took SOE 6 patches to fix that one.
