
CPU ID screen shot of "Yorkfield" at 2.33 GHz  (Source: DailyTech)

Intel "Yorkfield" ScienceMark 2 L2 cache performance  (Source: DailyTech)
DailyTech managed to snag a quad-core "Yorkfield" for a few quick benchmarks

Benchmarks of Intel’s Penryn-based dual-core Wolfdale have appeared a couple of times in the past month. The early benchmarks tested engineering sample processors and showed Wolfdale performing, on average, 5 percent faster clock for clock than Conroe. However, public benchmarks of the quad-core Yorkfield are virtually non-existent.

Intel’s Yorkfield is not a native quad-core design. As with Kentsfield, Yorkfield features two dual-core dies fused together. The design results in each pair of cores having access to its own pool of shared L2 cache. Since Penryn has more cache, each pair of cores has access to 6MB of L2 for a total of 12MB – up from the 4MB per pair and 8MB total of Kentsfield.

In addition to the increased cache size, Penryn features a faster 24-way associative L2 cache, which saves a few clock cycles. Kentsfield has a 16-way associative L2 cache.

Penryn also features new SSE4 instructions catered towards multimedia tasks. SSE4 introduces 47 new instructions to improve performance of video accelerators, graphics building blocks and streaming load. Intel claims a 2x performance gain in video acceleration tasks. There are 14 new instructions for video accelerator performance enhancement. Intel improves compiler auto-vectorization performance with 32 new instructions.

Intel expects SSE4 optimizations to deliver performance improvements in video authoring, imaging, graphics, video search, off-chip accelerators, gaming and physics applications. Early benchmarks with an SSE4-optimized version of DivX 6.6 Alpha yielded a 116 percent performance improvement.
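To make the video acceleration claim more concrete, here is a minimal sketch of the kind of kernel such instructions target: the sum of absolute differences (SAD) used in motion estimation for video encoding. SSE4.1 includes an instruction, MPSADBW, that computes several such sums at once. The Python below only illustrates the arithmetic and does not reproduce that instruction's exact semantics; the function names and sample pixel values are illustrative assumptions, not taken from DivX or from Intel's documentation.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-length pixel runs."""
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same length")
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def best_match(reference, row):
    """Slide `reference` across `row`; return (offset, sad) of the closest match.

    Ties are resolved in favor of the smallest offset.
    """
    width = len(reference)
    if width == 0 or width > len(row):
        raise ValueError("reference must be non-empty and no longer than row")
    candidates = [
        (sad(reference, row[i:i + width]), i)
        for i in range(len(row) - width + 1)
    ]
    best_sad, best_offset = min(candidates)
    return best_offset, best_sad


# Hypothetical 8-bit pixel values, chosen only for illustration.
reference = [10, 20, 30, 40]
row = [0, 0, 10, 20, 30, 41, 0]
offset, score = best_match(reference, row)
print(offset, score)
```

An encoder runs this comparison for many candidate positions per block, which is why doing several sums per instruction can matter for encoding throughput.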

Also new to Penryn is the Super Shuffle Engine. Intel’s Super Shuffle Engine speeds up shuffle-type operations: unpacking, packing, aligning concatenated sources, wide shifts, insertion and extraction, and setup for horizontal arithmetic functions. Intel claims a “2x faster SSE shuffle instruction execution,” according to earlier briefing documents.

Although Yorkfield uses a 45nm fab process and consumes less power, Intel plans to stick to its existing 95 Watt and 130 Watt thermal design power ratings.

DailyTech presented quick and dirty benchmarks of AMD’s 1.6 GHz Barcelona processor in June. Today, DailyTech has a few quick and dirty benchmarks of Intel’s quad-core Yorkfield Core 2 processor, in an LGA775 package.

The testing configuration is as follows:
  • Intel Core 2 Extreme QX6700 @ 2.33 GHz, 1333 MHz front-side bus
  • Intel Yorkfield 2.33 GHz, 1333 MHz front-side bus
  • Intel P35 Express based motherboard
  • 2x1GB DDR3-1333 memory
  • AMD ATI Radeon HD 2600 XT
Since Intel does not have a 2.33 GHz Kentsfield processor, a Core 2 Extreme QX6700 is used. The Core 2 Extreme QX6700 has an unlocked multiplier, which allowed us to clock it at 2.33 GHz with a 1333 MHz front-side bus.
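
The clock arithmetic behind this configuration can be sketched as follows. The sketch assumes the usual relationship on this platform: the core clock is the multiplier times the base bus clock, and the quoted front-side bus figure is four times that base clock (quad-pumped). The multiplier of 7 is inferred from those figures and is not stated in the article; the function names are illustrative.

```python
def base_clock_mhz(fsb_rating_mhz: float, pumping: int = 4) -> float:
    """Base bus clock implied by a quoted front-side bus rating.

    Assumes a quad-pumped bus, so a "1333 MHz" FSB runs at about 333 MHz.
    """
    return fsb_rating_mhz / pumping


def core_clock_mhz(multiplier: float, fsb_rating_mhz: float) -> float:
    """Core clock = multiplier x base bus clock."""
    return multiplier * base_clock_mhz(fsb_rating_mhz)


# Test configuration from the article: 2.33 GHz on a 1333 MHz front-side bus.
# A multiplier of 7 is an inference, not a figure given in the article.
print(round(core_clock_mhz(7, 1333.33)))
```

An unlocked multiplier is what makes this possible: the multiplier can be lowered while the bus runs at the faster rating, so both processors end up at the same core clock and bus speed.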

[Charts: SiSoft Sandra XII CPU Arithmetic, CPU Multimedia and Memory Bandwidth; both processors at 2.33 GHz]

The synthetic benchmarks reveal little performance difference between Kentsfield and Yorkfield. However, SiSoft Sandra XII does not yet contain SSE4 optimizations.

Unlike AMD, Intel relies on an off-chip memory controller. Although AMD achieves low latencies with its integrated memory controller, Intel manages the same feat with a controller located in the northbridge, offsetting the latencies associated with an off-die memory controller with increased L2 cache. Yorkfield’s additional L2 cache and speedier 24-way associative L2 cache yield a memory bandwidth boost of approximately 7 percent.

[Charts: Cinebench 10 performance and DivX 6.6 encoding; both processors at 2.33 GHz]

Cinebench 10 shows an approximate 8 percent boost in both single-threaded and multithreaded rendering. Encoding a video file into DivX yields a similar 8 percent performance boost.

Overall, in our limited time with Yorkfield, the quad-core processor is roughly 8 percent faster clock for clock than Kentsfield. This is expected, as Yorkfield is essentially a 45nm die shrink of Kentsfield with a few tweaks here and there.
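
For readers who want to reproduce a clock-for-clock comparison from their own results, the percentage can be computed as in the minimal sketch below. The function name and the sample numbers are hypothetical placeholders, not DailyTech's measurements. Score-style results, where higher is better, and time-style results such as encode time, where lower is better, need the ratio taken in opposite directions.

```python
def percent_gain(baseline: float, candidate: float,
                 higher_is_better: bool = True) -> float:
    """Percent improvement of `candidate` over `baseline` at equal clocks.

    For scores (higher is better) the ratio is candidate / baseline.
    For elapsed times (lower is better) it is baseline / candidate.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("results must be positive")
    if higher_is_better:
        ratio = candidate / baseline
    else:
        ratio = baseline / candidate
    return (ratio - 1.0) * 100.0


# Placeholder values only: a render score and an encode time in seconds.
print(round(percent_gain(1000.0, 1080.0), 1))
print(round(percent_gain(108.0, 100.0, higher_is_better=False), 1))
```

Because both processors run at the same core clock and bus speed, any gain computed this way reflects the architectural changes and not a frequency advantage.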

Expect Intel to begin shipping Yorkfield in mass quantities in Q1 2008. Quad-core Xeon X5400 Harpertown processors, which are somewhat similar to Yorkfield, will ship in November.

Comments

RE: Not bad
By EarthsDM on 8/24/2007 12:11:54 AM , Rating: 2
We may reenter a time where Intel and AMD are better in different applications. Yorkfield doesn't seem much stronger than Conroe or Kentsfield in floating-point calculations, which is supposedly Barcelona's strong point. Also, AMD's chips seem to get a larger performance boost from running a 64-bit OS, as shown by Anand in his comparison of AMD64 and EM64T.

If this is true (and seeing as Barcelona has enhancements for 128-bit FP) AMD may beat out Intel in certain apps. We could see a time where AMD is better for scientific calculations.

RE: Not bad
By JumpingJack on 8/24/2007 1:43:54 AM , Rating: 5
Each of xbit's graphs has the raw number, go through bench by bench and calculate the % improvement, then average... you will find that they miscalculated AMD's 64-bit speed up...

Finally, if you throw out the skewed data (ScienceMark, which has been optimized for K8, and the Sandra SSE results, due to the wide SSE on Conroe), the actual performance average is even steven. Heck, here I will do it for you:

It takes about 10 minutes to type these numbers up in excel, run the calculation, and show that 64-bit is not an advantage for AMD.

RE: Not bad
By EarthsDM on 8/24/2007 10:03:27 AM , Rating: 2
Wow. Ok, you're right. I don't know what else to say except that I should do more of my own math...

RE: Not bad
By JumpingJack on 8/24/2007 8:40:15 PM , Rating: 2
Well, after rereading my post to you I should apologize, it came across as a bit curt ....

I would not have noticed it except that it generated some debate when Xbit posted the article. When I was looking at the data I had originally said 6%, not bad... but when eyeballing their chart it did not look like a 6% delta to me...

Truth is, if you dig around in some reviews you find these types of mistakes often, not routine but often.. some are honest mistakes (such as this one), others are subtly hidden for whatever reason.

In this particular case, I see this linked now and again providing a somewhat misleading conclusion...

RE: Not bad
By nineball9 on 8/25/2007 11:17:01 AM , Rating: 2
Good analysis and spreadsheet design. Labeling the last 2 columns "64 bit speedup" was a bit misleading (at least to me). I interpreted these columns to mean the percentage improvement of 64-bit operation over 32-bit operation when the numbers are actually just (64-bit value / 32-bit value) * 100. At first, seeing "64 bit speedup" values of around 100% led me to believe 64-bit operation was around twice as fast which didn't make much sense.
Nice work though!
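
The distinction drawn above between a ratio and a speedup, along with the bench-by-bench averaging described earlier in the thread, can be sketched as follows. All numbers are made-up placeholders, not Xbit's data, and the function names are illustrative.

```python
from statistics import mean


def ratio_percent(value_32bit: float, value_64bit: float) -> float:
    """64-bit result as a percentage of the 32-bit result (100 means no change)."""
    return value_64bit / value_32bit * 100.0


def speedup_percent(value_32bit: float, value_64bit: float) -> float:
    """Percentage improvement of 64-bit over 32-bit (0 means no change)."""
    return ratio_percent(value_32bit, value_64bit) - 100.0


# Hypothetical (32-bit score, 64-bit score) pairs, higher is better.
results = [(200.0, 212.0), (150.0, 147.0), (80.0, 84.0)]

per_bench = [speedup_percent(v32, v64) for v32, v64 in results]
print([round(s, 1) for s in per_bench])
print(round(mean(per_bench), 1))
```

A ratio of around 100 therefore means the two modes perform about the same, not that 64-bit is twice as fast.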
Related Articles
More "Penryn" Benchmarks Revealed
August 22, 2007, 2:25 PM
Intel Sets "Penryn" Launch Date
August 14, 2007, 6:18 PM
"Penryn" Benchmarks Hit The Web
August 7, 2007, 4:04 PM
Quick and Dirty AMD K10 Cinebench
June 6, 2007, 5:12 AM
