AMD FX series chart
Eight-core Zambezi flagship to challenge Intel's Core i7

AMD has launched its much-anticipated FX series of desktop CPUs using the Bulldozer architecture. Codenamed Zambezi, the 32nm chips represent the company's top offerings for enthusiasts. Bulldozer is the first complete redesign of AMD’s processor architecture since the K7 Athlon was launched in 1999, and features significant improvements in manufacturing, design, and cost reduction.
The drive for efficiency and higher instructions per clock (IPC) was the original impetus for Bulldozer. Long gone are the days of simply increasing clock speed for easy performance gains. AMD and Intel have been increasing the number of CPU cores, but that takes up a lot of die space. Intel has been pushing HyperThreading as its way of maximizing efficiency, and the technique works well when a CPU stalls due to a cache miss, branch misprediction, or data dependency. However, AMD has decided to go a markedly different route.
Each Bulldozer module provides an independent, dedicated integer and scheduler unit for each core. A single floating point unit is shared between the two cores in a Bulldozer module, along with the fetch and decode units and a 2MB L2 cache. There is a 16KB L1 data cache per core, as well as a 64KB L1 instruction cache per module. This adds up to an impressive 128KB L1 data cache, 256KB L1 instruction cache, and 8MB L2 cache for an eight-core FX processor.
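The cache totals quoted above follow directly from the per-core and per-module figures. A minimal sketch of that arithmetic, assuming only the numbers given in the text (the variable names are illustrative):

```python
# Cache totals for an eight-core FX chip, derived from the per-core and
# per-module sizes quoted in the article. Names here are illustrative only.
cores = 8
cores_per_module = 2                  # two cores share one Bulldozer module
modules = cores // cores_per_module   # four modules in an eight-core chip

l1_data_kb = 16 * cores               # 16KB L1 data cache per core
l1_instr_kb = 64 * modules            # 64KB L1 instruction cache per module
l2_mb = 2 * modules                   # 2MB L2 cache per module

print(f"modules:        {modules}")
print(f"L1 data total:  {l1_data_kb}KB")
print(f"L1 instr total: {l1_instr_kb}KB")
print(f"L2 total:       {l2_mb}MB")
```

The L1 data cache scales with the core count, while the L1 instruction cache and L2 cache scale with the module count, because they are shared between the two cores in each module.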
Theoretically, this should provide much better performance than HyperThreading, which helps most when there are a lot of CPU stalls, because otherwise all threads must compete for the same execution resources. HyperThreading increases performance by approximately 30% at a cost of 5% extra die space, but the second integer core in Bulldozer could almost double integer performance at a die cost of only 12%.
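One rough way to compare the two approaches is performance gained per percentage point of extra die area. The sketch below uses only the figures quoted above and treats "almost double" as a full 100% gain, so the Bulldozer number is an upper bound rather than a measured result. The function name is hypothetical.

```python
def gain_per_die_point(perf_gain_pct, die_cost_pct):
    """Percentage points of performance gained per percentage point of die area."""
    return perf_gain_pct / die_cost_pct

# HyperThreading: roughly 30% more performance for 5% more die area.
hyperthreading = gain_per_die_point(30, 5)

# Bulldozer's second integer core: "almost double" integer performance for
# 12% more die area. Assuming a full 100% gain makes this an upper bound.
bulldozer_upper_bound = gain_per_die_point(100, 12)

print(f"HyperThreading:        {hyperthreading:.2f}")
print(f"Bulldozer (at most):   {bulldozer_upper_bound:.2f}")
```

On these theoretical figures the second integer core returns somewhat more per unit of die area, but only if real workloads come close to the claimed doubling.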
The Bulldozer architecture was originally supposed to debut in the first half of 2009, and would've enabled AMD to compete toe-to-toe with Intel on pure performance, rather than on pricing alone. However, financial difficulties and a major recession pushed back the schedule, and the divestment of AMD's manufacturing capacity into GlobalFoundries caused further technical delays. Now almost three years late, the design has been updated significantly to accommodate the latest technologies and manufacturing processes.
FX chips are built by GlobalFoundries on its 32nm silicon-on-insulator (SOI) process. The eight-core Zambezi chips have around two billion transistors and a die size of approximately 315mm². An integrated northbridge unit supplies an 8MB L3 cache, four 16-bit HyperTransport 3.0 links, and the integrated memory controller. Depending on the model, the northbridge runs at either 2.2GHz or 2.0GHz. The most significant update to the integrated dual-channel memory controller is native support for DDR3 memory at 1866MHz (DDR3-1866/PC3-14900). ECC memory is still supported, a welcome relief to those who are planning FX-based workstations, as Intel only supports ECC memory on its much more expensive Xeon workstations.
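The two names for the memory speed are related by simple arithmetic: the PC3 figure is the approximate peak bandwidth of one module in MB/s. A small sketch, assuming the standard 64-bit (8-byte) DDR3 data bus and treating 1866 as millions of transfers per second:

```python
# Peak theoretical bandwidth for DDR3-1866. The 64-bit module width is the
# standard DDR3 data bus width; it is not stated in the article.
mega_transfers_per_sec = 1866    # DDR3-1866
bytes_per_transfer = 8           # 64-bit data bus = 8 bytes per transfer
channels = 2                     # dual-channel memory controller

per_channel_mb_s = mega_transfers_per_sec * bytes_per_transfer
dual_channel_mb_s = per_channel_mb_s * channels

print(f"per channel:  {per_channel_mb_s} MB/s")   # rounded to 14900 in the PC3 name
print(f"dual channel: {dual_channel_mb_s} MB/s")
```

These are theoretical peaks. Sustained bandwidth in real workloads will be lower.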
There are instruction sets aplenty: SSE3, SSE4.1/4.2, AES, and AVX. AMD is also introducing support for FMA4 and XOP. FMA4 adds four-operand instructions designed to speed up fused multiply-add (FMA) operations. XOP is a revision of the SSE5 instruction set, redesigned to be more compatible with Intel's AVX.
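A fused multiply-add computes a*b + c with a single rounding at the end, instead of rounding once after the multiply and again after the add. The sketch below emulates that behavior in software with exact rational arithmetic. It does not use FMA4 hardware, and the function names are hypothetical; it only illustrates why the fused result can differ from the two-step one.

```python
from fractions import Fraction

def fused_multiply_add(a, b, c):
    """Emulate a*b + c with one final rounding, using exact rationals."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def separate_multiply_add(a, b, c):
    """Ordinary float arithmetic: the product is rounded before the add."""
    return a * b + c

# The exact product a*b is 1 - 2**-60, which does not fit in a double
# and rounds to 1.0 when the multiply is done on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

fused = fused_multiply_add(a, b, c)
separate = separate_multiply_add(a, b, c)

print(f"fused:    {fused!r}")
print(f"separate: {separate!r}")
```

Here the two-step version loses the small term entirely and returns zero, while the fused version keeps it. Hardware FMA instructions give that single-rounding result in one operation, which is also where the speed benefit comes from.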
The four new FX chips being launched today will require motherboards with socket AM3+, but the good news is that enthusiasts will be able to upgrade to a top-of-the-line FX-8150 for $245. The FX-8120 will be available for $205, while the six-core FX-6100 will be priced at $165. The four-core FX-4100 with a 95W TDP is available for only $115. All FX chips are unlocked, and AMD has already set the Guinness World Record for the “Highest Frequency of a Computer Processor” by overclocking a Zambezi chip to 8.429GHz.

Several speed bumps are already planned for Q1 and Q2 of 2012 as GlobalFoundries' 32nm process matures. However, Bulldozer won't move into the mainstream until the Piledriver refresh next year. Trinity cores featuring DirectX 11 Fusion technology will replace Llano chips, while the 10-core Komodo processor will supplant Zambezi as the FX flagship.

Comments

RE: Pentium 4
By someguy123 on 10/12/2011 4:13:23 PM , Rating: 2
Higher clocks do not mean more work done. That's the entire problem with the p4 and now bulldozer. 10ghz does not matter if a 4ghz chip manages to get more instructions per clock.

They're able to artificially boost the clock speeds by lengthening the pipeline, but clearly their IPC ended up lower than even their last generation of processors.

RE: Pentium 4
By nafhan on 10/12/2011 5:11:33 PM , Rating: 2
If you want to increase the number of instructions that a chip processes in a given amount of REAL time, there are two ways to do it:
1) More instructions per clock (higher IPC)
2) Increase the clock speed

I'm not sure what you mean by artificially increasing clock speeds. It's either increased or not. Anyway, IPC is pretty complex - especially on these new chips. On highly threaded, integer, and vector optimized workloads, BD does extremely well. It's not a great gaming chip, and it won't be all that good for most other consumer workloads, but it's definitely got some strengths. For instance, the low end BD chip appears to be the cheapest way to get into full disk encryption with hardware accelerated AES.

RE: Pentium 4
By someguy123 on 10/12/2011 6:50:41 PM , Rating: 2
By artificial I mean increased clocks at reduced IPC. thuban clearly has higher IPC.

And you're contradicting yourself here. First you say it's complex, but then you point to limited scenarios where BD theoretically excels. With real world software and rendering tests, BD is slightly worse than the x6 phenom, except with substantial overclocking. It's spec'd at 3.6ghz stock, but all of these tests are running at full turbo or even higher than turbo with larger aftermarket heatsinks.

Add in the massive power draw, these chips don't look good from any angle.

RE: Pentium 4
By nafhan on 10/13/2011 10:51:11 AM , Rating: 2
"you're contradicting yourself here."
I'd say you misread my post and don't understand the concept of IPC very well. First, IPC isn't a fixed value, it's going to depend on the type of "instruction" being processed. I listed a few areas where it looks like BD will have good throughput (i.e. areas where it will probably beat Thuban and occasionally even SB). Second, I specifically stated that it probably won't be good for most consumer applications (what you refer to as "real world" stuff, because, of course, enterprise and HPC tasks don't happen in the real world...).

To clarify, I don't think BD is the best chip ever or anything; I'm actually kind of disappointed. I was just pointing out some areas where it will probably do well.

Copyright 2016 DailyTech LLC.