
Internal Intel documents reveal that SSE4 is on the way and will be here with Conroe

Several internal memos and roadmaps from Intel confirmed what many of us already suspected: Intel's Conroe (and Merom) will feature SSE4. The next-generation SSE instructions were listed as part of the "Significant Video Enhancements" built into the upcoming Intel platforms.

Other video enhancements listed include Clear Video Technology (CVT) -- Intel's attempted answer to ATI Avivo -- and UDI support. Both of these technologies seem dependent on the 965 core logic and not necessarily the CPU. Other Intel corporate roadmaps have defined CVT as:

Intel Clear Video Technology supporting advanced decoding, post-processing capabilities and enhanced 3D

The Conroe platform and most of its features have not been disclosed yet.  Intel is expected to announce the Conroe architecture name at the Intel Developer Forum on March 7th. 

Comments


By dgingeri on 2/15/2006 3:47:24 PM, Rating: 2
3Dnow! came out first, by almost a year, and was faster than SSE, but just barely. 3Dnow! (now on its 3rd iteration) and SSE are both incorporated into the latest nVidia and ATI video drivers. 3Dnow! Professional, the latest version, includes all the SSE, SSE2, and most SSE3 instructions, and also includes several instructions not used by Intel. SSE3 includes a few instructions that were introduced by AMD with the original 3Dnow! set that improved MPEG2 decoding.

SSE2 was what gave Intel the bigger boost with the P4. The reason was that (A) the instructions would use up to 8 operands, but preferably 4, and (B) they used a shorter summation routine. This reduces the accuracy of the calculations, but only by about 10^-8, or .00000001, or less. IEEE requirements for x87 floating point calculations of trig functions were for summations of 20 steps, costing 20 processor cycles to calculate. Intel reduced that to 12 in many cases when using SSE2. They cheated. AMD, when they implemented SSE2, kept the 20-step summation, making their implementation of SSE2 slightly slower. The small loss of precision is not necessarily bad, but it does make scientific calculations less accurate. Intel's compiler doesn't know scientific calculations from game calculations, so it inserts its SSE2 instructions everywhere. This makes scientific software compiled with Intel's compiler less accurate. However, because even the IEEE standard calculations are inaccurate in many cases, most scientific software uses integer summations on the order of 100 steps instead of the x87 calculations. This allows AMD to take the lead in speed.
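To illustrate the general trade-off described above (fewer summation steps means fewer cycles but a larger error), here is a minimal sketch that sums a Taylor series for sin(x) with a chosen number of terms. The Taylor series is an assumption picked purely for illustration; it is not a claim about how Intel or AMD hardware actually evaluates trig functions, and the names `sin_series` and `truncation_error` are hypothetical. The step counts 12 and 20 are simply the figures quoted in the comment.

```python
import math


def sin_series(x, terms):
    """Approximate sin(x) by summing the first `terms` terms of its Taylor series.

    Illustrative only: real hardware and math libraries use other methods.
    """
    total = 0.0
    term = x  # first term of the series: x^1 / 1!
    for k in range(terms):
        total += term
        # Derive the next term from the current one:
        # multiply by -x^2 / ((2k + 2) * (2k + 3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total


def truncation_error(x, terms):
    """Absolute difference between the truncated series and math.sin."""
    return abs(sin_series(x, terms) - math.sin(x))


if __name__ == "__main__":
    x = math.pi
    for steps in (5, 12, 20):
        print(steps, "steps -> error", truncation_error(x, steps))
```

Running it shows the error shrinking as more terms are summed. How large the difference between 12 and 20 steps turns out to be depends on the series and the input, so the numbers from this sketch should not be read as measurements of any real processor.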

So, when you fall through the world in an MMO because of a non-visible 'crack' between surfaces, or miss a shot in an FPS game that should have been dead on, blame Intel: their SSE2 miscalculated. Or rather, that can be your excuse. The likelihood of it actually affecting a game is very, very remote, but it is possible. I do know of one spot in EQ where P4 owners would actually fall through the world while AMD owners wouldn't, due to the slight separation of 2 surfaces. It took SOE 6 patches to fix that one.


Copyright 2016 DailyTech LLC.