With the launch of the Conroe-based Core 2 Duo in July 2006, Intel added several SSE optimizations. These optimizations, part of the Core 2 architecture, doubled the speed of SSE, SSE2 and SSE3 operations by allowing a 128-bit SSE, SSE2 or SSE3 instruction to execute in a single clock cycle. Intel’s previous NetBurst and original Core (Core Duo) processors required two clock cycles to execute the same instruction. These optimizations did not add new instructions; they improved the efficiency of existing ones.

Intel’s Pat Gelsinger announced today that Intel has published the white paper on its SSE4 instructions, which will appear in its next-generation 45nm products. SSE4 adds 50 new performance-enhancing instructions aimed at compiler vectorization, media, string and text processing, and application-targeted accelerators.

The Core architecture implemented on the Core 2 Duo processors added 32 supplemental streaming instructions to SSE3. These instructions, dubbed Supplemental Streaming SIMD Extensions 3 (SSSE3), are not SSE4 and should not be confused with it.

SSE4 instructions are expected to arrive incrementally, starting with Intel’s first 45nm products. These include Intel’s upcoming Nehalem, which will be Intel’s second-generation Core architecture, and Penryn, a 45nm shrink of Core 2 Duo. Penryn and other 45nm processors are expected to begin sampling in the second half of 2007 and to begin shipping in the first half of 2008. Full implementation of SSE4 is planned only for Nehalem at this time.

More details are available in Intel’s white paper on the subject.
quote: 3DNow includes a subset of the original SSE.
quote: yeah, 3DNow was first, but it still only contains a subset of SSE.
quote: Why not work with others in the industry (like AMD) and release instructions that software developers are asking for and will pledge support for?