AMD "Barcelona" die shot.  (Source: AMD)

AMD's Opteron and upcoming Phenom logos.  (Source: AMD)

AMD guidance shows the breakdown of its three thermal envelopes for Opteron.  (Source: AMD)

AMD's performance estimates compared to comparably priced Intel processors. (Fluent 6.4.3 is the actual version used; AMD made a typo.)  (Source: AMD)

AMD's explanation of the new Average CPU Power metric.  (Source: AMD)
AMD pulls the wraps off "Barcelona," partners are now receiving shipments

AMD is prepared to launch its next-generation Barcelona CPU architecture this Monday. Barcelona is the first K8-based product to feature substantial architectural changes since the original launch of AMD’s Opteron and Athlon 64 processors. Beyond those changes, Barcelona also brings evolutionary enhancements to the existing K8 design.

Barcelona is the company’s first quad-core CPU architecture and features a native quad-core design. Intel’s previously released Clovertown and Kentsfield quad-core processors, and its upcoming Harpertown and Yorkfield, place two Core-architecture dies on a single package. They are effectively quad-core, but not a native design like Barcelona.

AMD equips Barcelona with plenty of new tweaks and features to boost performance. New features of Barcelona include tweaked cache, memory controller, branch predictors, prefetch logic, power management and additional AMD-V extensions.

Barcelona’s cache configuration includes L3-cache, a feature AMD has not used since its K6-III+ and K6-2+ processors. All CPU cores on Barcelona-based processors share 2MB of L3-cache. L1 and L2-cache sizes remain unchanged, with 128KB of L1-cache per core and 512KB of L2-cache per core. Associativity is unchanged as well, with 2-way associative L1-cache and 16-way associative L2-cache. The shared L3-cache is 32-way associative.
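For readers who want to relate associativity to cache organization, the short sketch below applies the standard relation sets = capacity / (ways x line size) to the figures above. The 64-byte line size and the split of the 128KB L1 into two 64KB halves (instruction and data) are assumptions not stated in this article, and the helper name is purely illustrative.

```python
KB = 1024

def num_sets(cache_bytes, ways, line_bytes=64):
    """Number of sets in a set-associative cache.

    line_bytes=64 is an assumption; the article does not state a line size.
    """
    return cache_bytes // (ways * line_bytes)

# Assumption: the 128KB L1 per core is split into two 64KB halves.
l1_sets = num_sets(64 * KB, ways=2)
l2_sets = num_sets(512 * KB, ways=16)
l3_sets = num_sets(2 * 1024 * KB, ways=32)

print(l1_sets, l2_sets, l3_sets)  # 512 512 1024
```

Under these assumptions, higher associativity at the larger cache levels keeps the number of sets modest even as capacity grows.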

Barcelona-based processors feature a total of 4.5MB of on-die cache. In comparison, Intel’s Clovertown and Kentsfield quad-core architectures feature 64KB of L1-cache per core and 4MB of shared L2-cache per pair of cores, for a total of 8.25MB of on-die cache.
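The cache totals quoted above can be checked with a few lines of arithmetic. This is only a sketch: it reads Intel's 64KB L1 figure as a per-core number (the reading that matches the 8.25MB total), and the function name is illustrative.

```python
def total_cache_kb(cores, l1_per_core_kb, l2_per_core_kb=0, shared_kb=0):
    """Sum per-core L1 and L2 plus any shared cache, in KB."""
    return cores * (l1_per_core_kb + l2_per_core_kb) + shared_kb

# Barcelona: 4 cores, 128KB L1 + 512KB L2 per core, 2MB shared L3.
barcelona_kb = total_cache_kb(4, 128, l2_per_core_kb=512, shared_kb=2 * 1024)

# Clovertown/Kentsfield: 4 cores, 64KB L1 per core (assumed per-core),
# 4MB of L2 shared by each of the two core pairs.
intel_kb = total_cache_kb(4, 64, shared_kb=2 * 4 * 1024)

print(barcelona_kb / 1024)  # 4.5
print(intel_kb / 1024)      # 8.25
```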

AMD tweaked Barcelona’s memory controller for greater bandwidth efficiency and lower latency. This time around, AMD took a different approach for the memory controller. Instead of a single 128-bit wide memory controller, AMD split the memory controller into two 64-bit wide memory controllers. This allows the memory controllers to achieve greater efficiency by operating independently.

AMD designed the new memory controller with future memory technologies in mind. Barcelona will initially debut with support for DDR2 memory, but its first refresh, in the form of Shanghai, will support DDR3 memory.

New to the memory controller is a DRAM prefetcher, which fetches data it expects to be needed soon. The DRAM prefetcher does not store data in the L1, L2 or L3-caches, as it has its own buffer.

Barcelona features a new 512-entry indirect branch predictor, a feature Intel debuted on its Pentium M processor. The new indirect branch predictor reduces mispredicted branches for greater efficiency, which also translates into lower power consumption.
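To give a rough idea of what an indirect branch predictor does, here is a toy model: a 512-entry table that remembers the last target seen for each slot. AMD has not described its indexing or replacement scheme in this material, so the address-modulo index and the class and function names below are hypothetical and not a description of Barcelona's hardware.

```python
class IndirectBranchPredictor:
    """Toy direct-mapped target table: one remembered target per slot."""

    def __init__(self, entries=512):
        self.entries = entries
        self.targets = [None] * entries

    def _index(self, branch_addr):
        # Hypothetical indexing: low bits of the branch address.
        return branch_addr % self.entries

    def predict(self, branch_addr):
        return self.targets[self._index(branch_addr)]

    def update(self, branch_addr, actual_target):
        self.targets[self._index(branch_addr)] = actual_target


def count_mispredictions(predictor, trace):
    """Replay (branch address, actual target) pairs and count wrong guesses."""
    misses = 0
    for branch_addr, target in trace:
        if predictor.predict(branch_addr) != target:
            misses += 1
        predictor.update(branch_addr, target)
    return misses


# One indirect call site that jumps to the same handler ten times in a row.
steady_trace = [(0x4000, 0x9000)] * 10
print(count_mispredictions(IndirectBranchPredictor(), steady_trace))  # 1
```

In this model only the first, cold lookup misses. A call site that alternates between targets would miss every time, which is one reason real predictors typically use more context than the branch address alone.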

In addition to the new 512-entry indirect branch predictor, Barcelona has improved prefetcher logic too. The new prefetcher logic retains the same two prefetchers per core as the K8 architecture; however, AMD has tweaked it for greater performance. With the new improved prefetcher logic, Barcelona brings prefetched data directly into the L1-cache. AMD’s K8 architecture brought prefetched data into L2 cache.

AMD’s SSE implementation sees substantial upgrades as well. Barcelona increases the SSE execution width to 128 bits. K8 featured an SSE execution width of 64 bits and could execute two 64-bit SSE operations at the same time.

Although K8 could execute two 64-bit SSE operations in parallel, a 128-bit SSE instruction took extra time because K8 first had to divide it into two 64-bit operations. Barcelona’s 128-bit execution width removes that step, allowing it to execute SSE instructions more quickly than K8.
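A simple way to picture the difference is to count how many execution-unit operations a run of 128-bit SSE instructions turns into at each execution width. The sketch below is a counting model only. It ignores decode, scheduling and latency, and the function name is illustrative.

```python
import math

def sse_operations(num_128bit_instructions, execution_width_bits):
    """Operations needed when each 128-bit instruction is split to fit the unit width."""
    ops_per_instruction = math.ceil(128 / execution_width_bits)
    return num_128bit_instructions * ops_per_instruction

k8_ops = sse_operations(1000, 64)          # each instruction becomes two 64-bit halves
barcelona_ops = sse_operations(1000, 128)  # each instruction is a single operation

print(k8_ops, barcelona_ops)  # 2000 1000
```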

SSE instruction fetch bandwidth is also improved. Barcelona fetches 32 bytes of instructions per cycle, up from 16 bytes per cycle on K8. AMD also widened the internal interconnect between the memory controller and L2-cache to 128 bits per cycle, up from K8’s 64 bits per cycle.

AMD’s Barcelona has power management changes too. With Barcelona, the power planes are split, allowing the processor and memory controller to operate independently at different speeds and voltages. However, to take advantage of split power planes, a new motherboard is required, as current motherboards lack the required power circuitry.

Each processor core can dynamically adjust its clock speed depending on load too. The new power management features allow Barcelona quad-core processors to operate with the same thermal envelope as current dual-core Opteron processors.

Lastly, AMD has added new AMD-V instructions. The new instructions provide hardware acceleration of shadow paging – allowing guest operating systems to have independent memory management. AMD refers to the new feature as nested paging.
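Conceptually, nested paging means an address from a guest goes through two translations: the guest's own page table, then a second table owned by the hypervisor. The sketch below illustrates that idea with single-level dictionaries. Real page tables are multi-level and walked by hardware, and every name and mapping here is made up for illustration.

```python
PAGE_SIZE = 4096

def translate(page_table, address):
    """Map an address through a single-level table of page -> frame numbers."""
    page, offset = divmod(address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset

def nested_translate(guest_table, host_table, guest_virtual):
    """Guest virtual -> guest physical -> host physical."""
    guest_physical = translate(guest_table, guest_virtual)
    return translate(host_table, guest_physical)

guest_table = {0: 5, 1: 7}   # guest virtual page -> guest physical page
host_table = {5: 42, 7: 13}  # guest physical page -> host physical page

print(hex(nested_translate(guest_table, host_table, 0x1234)))  # 0xd234
```

The guest edits only its own table, and the hypervisor edits only the second one, which is the independence the paragraph above refers to.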

All the architectural improvements and the move to four cores bring Barcelona’s transistor count to 463 million. Intel’s Kentsfield features 582 million transistors, though it has nearly twice as much cache. Barcelona-based processors will be manufactured on a 65nm fabrication process.

Power consumption of quad-core Barcelona processors is identical to dual-core counterparts. AMD has three thermal bins for Barcelona, similar to dual-core models. Standard, HE low-power and SE high-performance thermal bins will be available. However, AMD will not launch SE models until Q4’07.

AMD’s new Average CPU Power (ACP) metric measures the entire CPU’s power draw, including cores, memory controller and HyperTransport links. The measurements are conducted using “commercially useful high utilization workloads,” according to AMD’s Barcelona presentation.

The workloads used to measure ACP include TPC-C, SPECcpu2006, SPECjbb2005 and STREAM. ACP ratings come out lower than TDP ratings, and the company claims they are more reflective of real-world use than the TDP rating system, which it says overestimates power draw.

AMD Opteron 2300 Series

Model      Core Frequency   TDP   Launch Price
2350       2.0 GHz          95W   $372
2347       1.9 GHz          95W   $312
2347 HE    1.9 GHz          68W   $372
2346 HE    1.8 GHz          68W   $251
2344 HE    1.7 GHz          68W   $206


AMD Opteron 8300 Series

Model      Core Frequency   TDP   Launch Price
8350       2.0 GHz          95W   $1,004
8347       1.9 GHz          95W   $774
8347 HE    1.9 GHz          68W   $861
8346 HE    1.8 GHz          68W   $688

AMD has nine Barcelona-based Opteron 2300 and 8300 series models set for launch, with clock speeds ranging from 1.7 GHz to 2.0 GHz. The company expects speeds to ramp up to 2.3 GHz and above in Q4’07 with SE-bin models, which typically have 120-Watt TDPs.

Expect AMD to debut Barcelona-based Opteron 2300 and 8300-series on September 10. Socket AM2 users looking for a quad-core processor will have to wait until later this year for Budapest-based single-socket Opteron or Agena-based Phenom X4 and FX processors.


Comments

the first wait is almost over
By nerdye on 9/7/2007 6:37:41 PM , Rating: 3
This article leaves me starving for some benchmarks, and seeing how inexpensive some of these chips truly are has me concerned. Whether or not these chips beat their Intel counterparts, we are all dying to see the advantages of this new architecture. I'm assuming that AMD will not be able to fight Intel for the heavyweight performance crown until it ramps the clock speed on Barcelona a bit. I hope 4th quarter this year (the second wait) is the time if it is not this upcoming Monday.




RE: the first wait is almost over
By zpdixon on 9/7/2007 8:46:33 PM , Rating: 5
Look at this comparison of the Opteron 2350 (2.0 GHz) against the Xeon 5345 (2.33 GHz): http://images.dailytech.com/nimage/5924_large_amd_...

Despite a 14% slower clock frequency, the Opteron is:
- faster : between 7% and 189% faster than the Xeon [1]
- cheaper : $372 vs. $455

I told you so ! [2]

Interestingly enough, it means that the only quad-core Xeon (2P) processors faster than the Opteron 2350 are the Xeon X5355 and X5365 at, respectively, $744 and ~$1100. IOW, since the high-end processors ($700+) represent only 10-20% of the 2P market, Barcelona seems the fastest/cheapest solution for 80-90% of the 2P market !

Can't wait for the 2.3 GHz Barcelona announced for Q4 :)

[1] AMD's benchmarks should give reasonably accurate numbers, at least as accurate as Intel's benchmarks for Core2 in Q1/Q2 2006.
[2] "K10 will assuredly regain the lead over the Core2 microarchitecture:"
http://www.dailytech.com/article.aspx?newsid=8200&...


RE: the first wait is almost over
By TomZ on 9/7/2007 8:57:48 PM , Rating: 2
Don't get so aroused yet - remember you're looking at AMD marketing benchmark numbers. Let's wait for real, third-party benchmarks from AnandTech and the like before we get all luvvy-duvvy.


RE: the first wait is almost over
By Spuke on 9/7/2007 9:47:22 PM , Rating: 3
Oh so it's ok to use Intel's benchmarks and get all giddy over those but AMD's benchmarks are crap? Right! Quit bogarting Tom. Puff puff pass...puff puff pass.


RE: the first wait is almost over
By TomZ on 9/7/2007 9:59:40 PM , Rating: 4
No, I distrust both AMD and Intel-supplied benchmarks equally. Obviously it is in each company's best interest to cherry-pick benchmarks to show the superiority of their product relative to the other. That happens all the time.

The only first-party benchmarks that I would trust even a little would be those where all the details are provided so they can be independently verified. Better yet is when a third-party can "witness" the benchmark running on real hardware. Even then you don't see a complete performance picture because the hardware vendor is guiding you towards only seeing results that flatter their product, again.

AMD's benchmark slide offers none of the above assurances of authenticity, hence I would take it with a grain of salt.


RE: the first wait is almost over
By vignyan on 9/12/2007 12:06:22 AM , Rating: 2
Hey... completely in agreement with you... The STREAM benchmarks are made by AMD themselves! I would like to see the day where a company posts benchmarks that had poor performance wrt competition.

Also, the SPECfp benchmark really projects the bandwidth of the processor... which AMD has advantage with its NUMA... IT DOES NOT GIVE THE ACTUAL FLOATING POINT PERFORMANCE... SPECfp is not even a realistic usage model... even in the MP segments!


RE: the first wait is almost over
By nerdye on 9/7/2007 9:00:21 PM , Rating: 2
I do also hope and foresee AMD's Barcelona being a great bang for the buck, but we honestly cannot judge its performance based on 8 benchmark tests. Obviously Barcelona will have advantages in some benchmarks over Core 2, but not until a complete set of tests run by AnandTech/DailyTech is completed can we fully judge the true performance of these new chips. The wait continues.


RE: the first wait is almost over
By shabby on 9/7/2007 9:03:51 PM , Rating: 1
You actually believe those "benchmarks"? You seem very gullible... you're the perfect AMD customer.


RE: the first wait is almost over
By zaki on 9/7/2007 10:53:56 PM , Rating: 2
I'm with shabby, until 3rd party people test out these new AMD CPUs, I won't be convinced of their performance.

Also, aren't there new (higher clocked) Intel CPUs coming out? Regardless, of course AMD won't mention the tests in which they probably still can't beat Intel. It's anybody's guess, despite this single benchmark pic.

BUT i really hope these benchmarks are correct, that would be awesome for the consumers.


RE: the first wait is almost over
By wetlegs6 on 9/8/07, Rating: 0


RE: the first wait is almost over
By Continuation on 9/8/07, Rating: 0


RE: the first wait is almost over
By omnicronx on 9/8/2007 6:30:45 PM , Rating: 2
And you sound like the sort of person that likes to compare apples to oranges.. Let's wait for the real benchmarks, people, before the flaming commences. Especially since Barcelona is the server model, and Phenom is barely on the horizon..

And let's face it people, there is not going to be a clear cut winner here, AMD is going to be much better at floating point, and Intel is going to be much better at integer calcs. There is no way in my mind that either of these CPUs is going to kick ass in every area with the discrepancies between the two designs.