
AMD "Barcelona" die shot (Source: AMD)

SPECfp_rate2006 performance (Source: AMD)

SPECint_rate2006 performance (Source: AMD)
AMD showcases a 2.6 GHz simulated "Barcelona" -- and everyone has something to say about it

As more details of AMD Barcelona continue to trickle out, I had the opportunity to discuss some of the newest benchmarks with a few analysts.

Just a few weeks ago, AMD's website unveiled specific SPEC benchmark claims.  A footnote to the benchmarks states the following:
The comparison presented above is based on the best performing x86 Dual-CPU Dual-Core configurations with the Xeon 5160 and AMD Opteron processor Model 2222 SE, and Dual-CPU Quad-Core configurations with Xeon 5355.  Dual-CPU Quad-Core AMD Opteron processor estimates based on internal AMD simulations at 2.6GHz.
For reference, the Intel Xeon 5355 is clocked at 2.66 GHz and is priced around $1,600.  The Intel Xeon 5160 has a core frequency of 3.0 GHz and runs approximately $850.  AMD's 3.0 GHz Opteron 2222 SE runs just under $1,000 at retail.

AMD guidance puts the SPECint_rate performance of two quad-core 2.6 GHz Barcelona processors approximately 23% higher than the quad-core 2.66 GHz Xeon 5355: a score of approximately 104 versus 84.8.  SPECfp_rate performance puts Barcelona almost 58% higher than the Intel Xeon 5355: 92 versus 58.8.  Separate AMD documentation puts these figures at 21% and 50%, respectively.
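As a quick sanity check on those percentages, the relative advantage can be recomputed from the rounded scores quoted above. This is only a sketch: the function name is made up for illustration, and because the article's scores are approximate, the results need not match AMD's published percentages exactly. With these inputs the SPECfp_rate gap comes out nearer 56% than 58%.

```python
def uplift_percent(new_score: float, baseline_score: float) -> float:
    """Relative advantage of new_score over baseline_score, in percent."""
    return (new_score / baseline_score - 1.0) * 100.0


# Rounded scores as quoted in the article: (Barcelona estimate, Xeon 5355).
scores = {
    "SPECint_rate2006": (104.0, 84.8),
    "SPECfp_rate2006": (92.0, 58.8),
}

for name, (barcelona, xeon) in scores.items():
    print(f"{name}: {uplift_percent(barcelona, xeon):.1f}% higher")
```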

In January of this year, AMD corporate vice president for server and workstation products Randy Allen boldly stated, "We expect across a wide variety of workloads for Barcelona to outperform Clovertown by 40 percent."  That 40%, it would appear, is potentially in line with the figure suggested by the SPECfp_rate benchmark.

Industry analyst David Kanter tells DailyTech, "Historically high performance computing is the greatest strength for AMD."  He clarifies, "For HPC, AMD is going to wipe the floor with Intel."

Barcelona HPC improvements include a wider instruction set, L3 cache, new SIMD support and better branch prediction.  Kanter also claims, "Improvements that Barcelona is making are not necessarily as targeted for single threaded performance." Specifically, Kanter discounts the SSE improvements as a major performance driver, though for some applications they are certainly a big single-threaded help.  For example, they will not make web browsers or word processors faster, but they would certainly help single-threaded performance in games and numerical workloads.

AnandTech founder Anand Lal Shimpi disagrees with Kanter's dismissal of the new SSE instructions on Barcelona.  "Many of the major changes to Barcelona were driven by one significant change: what AMD is calling SSE128," he states.  Shimpi tells DailyTech, "The culmination of the SSE128 improvements is very similar to some of the changes made in the Yonah to Merom transition."

However, what the SPECint_rate and SPECfp_rate benchmarks don't show is the ability to handle process-to-process throughput rates.  Kanter highlights this to DailyTech, stating, "For stuff like web serving, application serving, I think Barcelona will kinda be a mixed bag, won't be a real home run."  He clarifies this by emphasizing that many of the K10 changes have possible drawbacks, including the split power-plane.

"The split power-plane, while saving power, has tendencies to make moving data between them a little awkward." Kanter continues, "It's a subtle thing, but in the end it will all depend on latency."

On the other hand, changes to the architecture are specifically geared toward improving socket-to-socket performance.  Four-socket systems will now utilize one 16-bit HyperTransport link to each socket on the system -- eight-socket systems will utilize one 8-bit HyperTransport link to each socket.  But, as Kanter stated earlier, this is largely an HPC change and will not affect desktop and dual-socket performance.

Ars Technica's Jon Stokes puts it most succinctly: "If I could sum up Barcelona's substantial changes to the K8 core in one phrase, it would be: Barcelona makes better use of system bandwidth."

While Stokes, Kanter and Shimpi all allude to stronger single-core performance from Intel's Core architecture, Shimpi doesn't rule out Barcelona on the desktop just yet. "Barcelona will be a success for AMD; the long awaited architectural update to K8 should yield significant performance improvements, especially in current areas of weakness for the K8 (e.g. video encoding)," he claims.

However, there is still a lot unsaid even this late in the game.  While AMD's multi-core SPEC guidance claims the simulation runs on a 2.6 GHz Barcelona processor, guidance from the company as late as last month states the fastest debut K10 cores will top out at 2.3 GHz.
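To get a feel for what that clock gap could mean, the simulated 2.6 GHz scores can be scaled down to 2.3 GHz. The sketch below assumes throughput scales linearly with core frequency, which is an assumption of this example and not something AMD has stated. Memory-bound workloads tend to lose less than the clock ratio, so treat the output as a rough and probably pessimistic estimate. The function name is hypothetical.

```python
def scale_linearly(score: float, from_ghz: float, to_ghz: float) -> float:
    """Rescale a benchmark score assuming it is proportional to clock speed.

    Linear scaling is an illustrative assumption, not a measured relationship.
    """
    return score * to_ghz / from_ghz


SIMULATED_GHZ = 2.6  # clock used in AMD's simulation, per the footnote
DEBUT_GHZ = 2.3      # fastest debut clock, per AMD guidance cited in the article

# Rounded scores from the article: (Barcelona at 2.6 GHz, Xeon 5355).
scores = {
    "SPECint_rate2006": (104.0, 84.8),
    "SPECfp_rate2006": (92.0, 58.8),
}

for name, (barcelona, xeon) in scores.items():
    scaled = scale_linearly(barcelona, SIMULATED_GHZ, DEBUT_GHZ)
    lead = (scaled / xeon - 1.0) * 100.0
    print(f"{name}: ~{scaled:.1f} at {DEBUT_GHZ} GHz, {lead:.1f}% ahead of Xeon 5355")
```

Under that assumption the SPECint_rate lead shrinks to single digits, while the SPECfp_rate lead stays close to the 40% figure Randy Allen quoted.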

Kanter closes, "Given the evidence I saw at ISSCC 2007, I'm pretty confident the chip can make it to 2.8 GHz."

Comments

By PrezWeezy on 5/31/2007 6:34:44 PM , Rating: 2
Penryn is also supposed (according to Intel) to give a 40% performance gain over current 65nm technology. I'm wondering if it will be in the same areas that Barcelona performs well in. If so, AMD is in a bit of a tough spot. Anyone more knowledgeable about CPUs care to comment on where the 40% from Intel will really take effect?

RE: 40%
By TomZ on 5/31/2007 7:33:26 PM , Rating: 2
AMD is in a tough spot in either case. Problem is, Intel has great sales momentum at the moment, and AMD will have to show serious performance benefits to gain share again, as it did in the past. Being the "same as" Intel isn't going to cut it, especially considering Intel's super-aggressive pricing these days (a result of the AMD-initiated price war, I might add).

What AMD needs are processors that are clearly superior so that it can maintain a decent profit margin instead of having to react to Intel price cuts, and so it can continue to increase its market share in the profitable server and HPC market segments.

RE: 40%
By cochy on 5/31/2007 7:41:56 PM , Rating: 3
Well I think when speaking strictly in the HPC segment AMD's memory controller approach and HyperTransport will continue to make Opteron a superior product to Xeon for the next while. So no real surprises here.

RE: 40%
By Phynaz on 5/31/2007 10:56:23 PM , Rating: 2
Problem is the HPC segment is minuscule, a few thousand chips per quarter.

That volume isn't going to save AMD's ass.

RE: 40%
By TomZ on 6/1/2007 12:06:19 AM , Rating: 2
The volume is in the server segment. The HPC applications mainly provide a nice marketing story.

RE: 40%
By KristopherKubicki on 6/1/2007 2:10:52 PM , Rating: 2
I'd be very curious to see how IBM's new POWER chip is doing in HPC against Barcelona myself.

RE: 40%
By Amiga500 on 6/1/2007 5:36:36 AM , Rating: 3
Correct me if I'm wrong, but wasn't the 40% jump on Penryn limited to SSE4 applications?

On others, was the jump not around 25% (IIRC)?

Anyway, for those bemoaning AMD not releasing the thing - it was the same story with the R600, yet when it was released, we had others moaning about the drivers not being optimised, it not being 65nm etc etc.

Damned if they do, damned if they don't.

RE: 40%
By Mojo the Monkey on 6/6/2007 12:33:41 PM , Rating: 2
no, they were several months late on the release date AND had problems as you describe.

RE: 40%
By Mitch101 on 6/1/2007 11:48:35 AM , Rating: 2
Only in applications using the new SSE4 instructions; otherwise it's 2-3% faster than the existing Intel chips.

However, 45nm might open the chip up to higher speeds, closing any advantage AMD might have if they ever get the chip out the door.

Either way, AMD has to do a big disclosure before the Intel quad chips drop their prices in about 3 weeks; otherwise I'm getting Intel quads regardless of whether AMD's chips eventually wind up superior.

RE: 40%
By Proteusza on 6/1/2007 11:52:23 AM , Rating: 2
Surely the improvements to SSE, such as the wider execution units etc, will speed up all SIMD instructions, of which SSE4 forms a part? Thus, shouldn't any SSE instruction get a speedup?

I'm buying a PC at the end of either June or July (yes, I know about the Intel price drop) and AMD had better offer something sweet to tempt me.

At the very least I'm thinking AM2+ motherboards.

You know, I'm disappointed in AMD lately. My last two CPUs have been AMD and my last two graphics cards have been ATI, and as much as I want a Phenom, I have to admit Intel has AMD by the proverbial balls, and I have to seriously consider that quad core offer.

RE: 40%
By Mitch101 on 6/1/2007 12:25:02 PM , Rating: 2
Oh yeah, on the AMD side all SSE applications will get a boost, but on the Intel side it's still too early to tell if it's just limited to the new SSE4 instructions.

I too am disappointed in AMD lately; however, it's not important until I'm ready to build my own and buy corporate machines.

I also take into consideration the cost of the CPU with mobo and RAM. Like right now I don't see any need for DDR3 other than to cost more money. The cost difference between DDR2 and DDR3 could lead me to purchase a faster CPU or GPU in the end or get me to choose one platform over the other.

Bang for buck comes down to the cost of CPU, Overclock, MOBO, and RAM not just the cpu. So Phenom might do well in the cost factor because it supports AM2 sockets.

RE: 40%
By Proteusza on 6/1/2007 1:14:40 PM , Rating: 2
True, but if the prices of AM2+ mobos are too high, that will hurt adoption of Barcelona because people usually want to pair the latest with the latest. It's a nice feature - AM2 backwards compatibility - and something I may even use, but I worry about how much performance I will lose due to the lower HT speed.

DDR3 just isn't worth it, no. 9-9-9-27 latencies? No way.

RE: 40%
By AnnihilatorX on 6/1/2007 2:42:35 PM , Rating: 2
People who want to pair latest with latest shouldn't worry too much about price.

There are DDR3 modules with CAS 7 latencies. They are just too expensive and performance gain is just 2-3%

RE: 40%
By Thorburn on 6/6/2007 12:41:50 PM , Rating: 2
The actual time delay with regards to latencies decreases as the clock speed rises.
1333MHz DDR3 modules running at 7-7-7-21 should actually have similar or lower latency than say DDR2-1066 5-5-5-18 modules.
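The point about cycle counts versus real time can be made concrete with a small calculation. This sketch assumes "1333MHz DDR3" means DDR3-1333 (1333 MT/s) and that DDR2-1066 runs at 1066 MT/s. It compares only the CAS figure (the first number in each timing set), and the function name is made up for illustration.

```python
def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """CAS latency in nanoseconds for a DDR module.

    DDR transfers data twice per clock, so the memory clock in MHz is half
    the data rate in MT/s. One cycle then lasts 1000 / clock_mhz nanoseconds.
    """
    clock_mhz = data_rate_mts / 2.0
    return cas_cycles / clock_mhz * 1000.0


# (label, assumed data rate in MT/s, CAS cycles)
modules = [
    ("DDR3-1333 CL7", 1333, 7),
    ("DDR2-1066 CL5", 1066, 5),
    ("DDR3-1333 CL9", 1333, 9),  # assumes the 9-9-9-27 parts run at 1333 MT/s
]

for label, rate, cas in modules:
    print(f"{label}: {cas_latency_ns(rate, cas):.1f} ns")
```

Under these assumptions the CL7 DDR3 part lands at roughly 10.5 ns against about 9.4 ns for the DDR2 part, so "similar" holds, although not "lower". A CL9 part at the same speed would be noticeably slower at about 13.5 ns.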

RE: 40%
By FITCamaro on 6/4/2007 1:18:10 PM , Rating: 1
One place the current C2D fails is on the high end. To get dual 16x lanes with SLI right now, you have to buy a 680i motherboard which is a minimum of $200. Average board price is like $250. To get a crossfire board is far cheaper.

Copyright 2016 DailyTech LLC.