
The first native x86 quad-core processor is finally taped out

Alongside the news of its DDR2 Opteron launch, AMD squeezed in one definitely newsworthy tidbit: quad-core Opterons have been taped out. AMD's Executive Vice President Henri Richard had previously dubbed this native quad-core design the K8L architecture.  Internally at AMD, this architecture is known as Greyhound.

The company's press release claims "AMD plans to deliver to customers in mid-2007 native Quad-Core AMD Opteron processors that incorporate four processor cores on a single die of silicon." For a little historical perspective, AMD's dual-core Opteron was taped out in June 2004, and then officially introduced in late April, 2005.
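For a rough sense of whether mid-2007 fits the dual-core precedent, the arithmetic can be sketched as follows. The August 2006 tape-out month is an assumption (the release only says the part has taped out), dates are rounded to whole months, and the helper names are made up for illustration.

```python
# Month-granularity estimate: apply the dual-core tape-out-to-launch
# lead time to the quad-core part. All dates are (year, month) tuples.

def months_between(start, end):
    """Number of whole months from start to end."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])

def add_months(date, delta):
    """Shift a (year, month) tuple forward by delta months."""
    index = date[0] * 12 + (date[1] - 1) + delta
    return index // 12, index % 12 + 1

DUAL_CORE_TAPE_OUT = (2004, 6)   # from the article
DUAL_CORE_LAUNCH = (2005, 4)     # from the article ("late April, 2005")
QUAD_CORE_TAPE_OUT = (2006, 8)   # ASSUMPTION: month of this announcement

lead_time = months_between(DUAL_CORE_TAPE_OUT, DUAL_CORE_LAUNCH)  # 10 months
projected_launch = add_months(QUAD_CORE_TAPE_OUT, lead_time)      # (2007, 6)

print(f"Dual-core lead time: {lead_time} months")
print(f"Projected quad-core launch: {projected_launch[0]}-{projected_launch[1]:02d}")
```

Under those assumptions the same lead time lands in June 2007, which is consistent with the "mid-2007" claim, though nothing guarantees the quad-core part will follow the dual-core schedule.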

The press release further adds that the quad-core Opteron will be compatible with the dual-core DDR2 Opteron motherboards.  The news of backwards compatibility with existing DDR2 Opteron motherboards is in line with AMD's previous announcements on its other platforms.  On roadmaps earlier this year the company also claimed that AM3 processors would be compatible with AM2 motherboards.

Intel has recently accelerated its quad-core plans; the company announced that quad-core desktop and server chips will be available this year.  Intel's initial quad-core designs are significantly different from AMD's approach.  The quad-core Intel Kentsfield processor is essentially two Conroe dice attached to the same package.  AMD's native quad-core, on the other hand, incorporates all four cores onto the same die.  AMD countered Intel's accelerated roadmap by claiming the new quad-core processors would be demonstrated this year.

However, absent from AMD's quad-core announcement is any news of non-native quad-core processors.  Non-native quad-core Opterons, previously dubbed Deerhound, existed on AMD's roadmap as late as May of this year.  These 65nm processors were essentially two 65nm dual-core Opterons on the same package, but AMD has made virtually no comment on any 65nm dual or single-core processors since the AMD Analyst Day on June 1 of this year.  AMD still plans to introduce 65nm dual-core processors for desktops this year.

Comments

To quote Ron Burgundy
By BaronMatrix on 8/15/2006 7:56:37 AM , Rating: 2

So.......
By qdemn7 on 8/15/2006 8:02:14 AM , Rating: 2
So what are the advantages / disadvantages of native vs non-native quad cores?

RE: So.......
By Griswold on 8/15/2006 8:09:54 AM , Rating: 1
Well, in this case, I would think the 4 cores can exchange data via the crossbar switch instead of doing it via RAM.

On Intel's side, there's shared cache for 2 cores alright, but if the 2 packages want to share data, it'll crawl over the FSB to RAM and back to the other package via FSB instead.

RE: So.......
By icarus4586 on 8/15/2006 8:34:15 AM , Rating: 2
With "non-native" (if that's what they're calling it now) multicore solutions, data does not have to go core->fsb->ram->fsb->other core. It doesn't even have to go to the northbridge. Just core->fsb->other core. The way you described would have absurdly high latency.

RE: So.......
By Griswold on 8/15/2006 10:27:50 AM , Rating: 2
Ok. But that would depend on how many FSB channels there are per socket. Is it just one for both packages or 2 channels, one for each package? If it's the latter, I fail to see how it would work without at least doing the NB hop.

RE: So.......
By Flunk on 8/15/2006 2:49:25 PM , Rating: 2
Yes, I'm reasonably certain they have to go CPU1->Northbridge->CPU2 and vice versa.

Except for AMD designs where, depending on the number of HyperTransport links, even non-native multicore processors can connect directly to each other, assuming there are enough HT links on the cores (for a non-native quad core composed of two native dual cores, they would each need 2 HT links: one to the northbridge and one to the other on-chip processor die).

RE: So.......
By ZachSaw on 8/16/06, Rating: 0
RE: So.......
By JeffDM on 8/17/2006 12:22:26 PM , Rating: 2
The Kentsfield will share an FSB internally, it's not like two separate busses within the processor socket. However, I don't know if the Intel CPUs would share data like that.

With AMD's arrangement, it looks like they will simply share cache, I don't think cores would directly communicate to each other with HyperTransport unless you have a two-socket system.

Missing is that the generation after Kentsfield will be a shared die, shared cache setup, last I heard, it was set for about mid-2007.

RE: So.......
By TomZ on 8/15/2006 5:25:20 PM , Rating: 2
Well, in this case, I would think the 4 cores can exchange data via the crossbar switch instead of doing it via RAM.

Why are you guys so interested in CPU-to-CPU data transfers? I would think that nearly all computing scenarios would not have much of that type of transfer. Actually, I have a hard time thinking of any such application. What would be an example of one?

RE: So.......
By ZachSaw on 8/16/2006 3:55:58 AM , Rating: 2
Inter-CPU cache syncs via MESI protocol.

This is a rather significant overhead for multi-CPU configs. Kentsfield will have this problem, but it won't be apparent (thanks to its low mem bandwidth usage). But if you have 2 Kentsfields running, it'll be a different story.
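As a rough illustration of the cache-sync traffic described here, below is a minimal sketch of a MESI-style state machine for a single cache line. The `Line` class and its message counting are hypothetical simplifications (one line, no data, every miss or invalidate counted as one bus message); it is not a model of any real Intel or AMD part.

```python
from enum import Enum

class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"

class Line:
    """One cache line tracked across several CPU caches (toy model)."""

    def __init__(self, n_caches):
        self.states = [State.INVALID] * n_caches
        self.bus_messages = 0  # coherency traffic on the shared interconnect

    def read(self, cpu):
        if self.states[cpu] is not State.INVALID:
            return  # cache hit: no coherency traffic
        self.bus_messages += 1  # read miss is broadcast to the other caches
        holders = [i for i, s in enumerate(self.states)
                   if i != cpu and s is not State.INVALID]
        for i in holders:
            # A Modified holder would write back here (not modelled);
            # any M/E/S holder ends up Shared.
            self.states[i] = State.SHARED
        self.states[cpu] = State.SHARED if holders else State.EXCLUSIVE

    def write(self, cpu):
        state = self.states[cpu]
        if state is State.MODIFIED:
            return  # already owned and dirty: silent
        if state is State.EXCLUSIVE:
            self.states[cpu] = State.MODIFIED  # silent upgrade, no traffic
            return
        # Shared or Invalid: must invalidate every other copy first.
        self.bus_messages += 1
        for i in range(len(self.states)):
            if i != cpu:
                self.states[i] = State.INVALID
        self.states[cpu] = State.MODIFIED

# Two CPUs taking turns writing the same line ("ping-pong").
shared = Line(2)
for _ in range(4):
    shared.write(0)
    shared.write(1)

# One CPU writing the same line repeatedly.
private = Line(2)
for _ in range(8):
    private.write(0)

print("ping-pong messages:", shared.bus_messages)
print("private messages:  ", private.bus_messages)
```

In this toy model the same eight writes cost one message when a single CPU owns the line and eight when two CPUs alternate, which is the kind of overhead that grows when the messages have to cross an FSB instead of staying on one die.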

RE: So.......
By TomZ on 8/16/2006 9:00:10 PM , Rating: 2
Inter-CPU cache syncs via MESI protocol.

What sorts of applications would require inter-CPU cache syncs?

RE: So.......
By ZachSaw on 8/17/2006 3:46:41 AM , Rating: 2
Anything that uses more than one processor... DUH!

RE: So.......
By TomZ on 8/17/06, Rating: 0
RE: So.......
By icarus4586 on 8/15/2006 8:31:32 AM , Rating: 2
Native is a silly thing to call it, in my opinion. My guess is that, just as Griswold said, this means that the cores communicate through the L2 cache crossbar instead of the FSB. So, practically speaking, you'll get a slightly bigger relative advantage going from 2 to 4 cores than you would with Intel. Probably a couple of percent.

RE: So.......
By KristopherKubicki on 8/15/2006 8:43:29 AM , Rating: 2
I am fairly certain I read somewhere that Kentsfield allows the two cores to communicate directly on the chip package and not over the FSB. I will try to dig that up.

The "real" difference between a native and a non-native design is that a native design is a single die while a non-native design is multiple dies. Whether or not this actually makes a difference to performance is something else entirely.

RE: So.......
By The Boston Dangler on 8/15/2006 9:03:28 AM , Rating: 3
Shouldn't prices be lower on non-native parts? The use of 2 dice increases yields, for a small manufacturing hit.

RE: So.......
By Furen on 8/15/2006 9:45:26 AM , Rating: 2
They should but they likely won't. I'm guessing it'll just have higher profit margins for Intel.

RE: So.......
By TomZ on 8/15/2006 5:27:01 PM , Rating: 2
Yes, prices are determined based on market demand and value of the product, not just on manufacturing cost.

RE: So.......
By ElFenix on 8/16/2006 12:27:21 AM , Rating: 2
intel had a dual dice package that was tremendously expensive. it was the pentium pro. if just one die was broken (either the core or the SRAM) both had to be tossed out. that is the reason intel moved to putting processors on daughtercards with a backside cache on the daughtercard. eventually manufacturing caught up to design and both could be integrated into a single die.

of course, two dice like that might be less expensive than trying to make one massive die. intel must have thought so to begin with.

RE: So.......
By Viditor on 8/15/2006 10:36:01 AM , Rating: 2
I am fairly certain I read somewhere that Kentsfield allows the two cores to communicate directly on the chip package and not over the FSB. I will try to dig that up

Interesting...please do! I wonder what protocol they would use and how the signals will be sent/received?

RE: So.......
By KristopherKubicki on 8/19/2006 10:27:21 AM , Rating: 2
Check out ZachSaw's post a little bit above this one.

RE: So.......
By defter on 8/15/2006 1:00:28 PM , Rating: 2
this means that the cores communicate through the L2 cache crossbar instead of the FSB

However, AMD's "native quad-core design" won't have shared L2 cache...

The term "native" shouldn't be overused or overhyped, simply because there are so many ways to define it. For example, is a native quad/dual-core design one which has:
- shared L1 cache (AFAIK none of these kind of designs exists)?
- separate L1 but shared L2 cache (Yonah/Merom)?
- separate L1 and L2 caches but shared L3 cache (Tulsa/new Itanium/K8L)?
- separate caches but shared interface to the memory/IO (K8)?
- separate caches and interfaces to the memory/IO but a single die (Smithfield)?

As you can see, "native" quad/dual-core can mean many things.

RE: So.......
By psychobriggsy on 8/15/2006 1:44:22 PM , Rating: 2
Smithfield was two dies, in a single package.

Otherwise, when restricting the definition to single-die chips, it's merely a matter of choosing where to put the arbiter that interfaces the multiple cores + their individual caches to the shared resources (lower level caches and interface to FSB/integrated northbridge).

"Native" in AMD's case means 'single die that has multiple cores, but externally appears to have the same I/O as a single die should'.

As an aside: Indeed you could even stretch that to include multiple dies, as long as the CPU package appears to the system as a single processor, not multiple bus loads on a FSB, i.e., to any system these are used as replacements they will work as expected, losing nothing. That's why I wondered why AMD didn't do multi-die consumer processors using HyperTransport on the package to communicate between dies. I guess that Multi-Chip-Modules aren't AMD's forté.

RE: So.......
By defter on 8/15/2006 3:32:09 PM , Rating: 2
Smithfield was two dies, in a single package.

No, Smithfield is a single die design:

RE: So.......
By PT2006 on 8/15/2006 6:20:33 PM , Rating: 2
Smithfield is a single die. Presler is basically two Cedar Mills slapped together though.

RE: So.......
By Viditor on 8/15/2006 8:09:54 PM , Rating: 2
Smithfield is a single die. Presler is basically two Cedar Mills slapped together though

No, both Smithfield and Presler are 2 dies glued together (known as MCMs), and they require the FSB to negotiate cache coherency.

RE: So.......
By coldpower27 on 8/16/2006 12:57:33 AM , Rating: 2

Smithfield is a single die, unlike Presler which is 2 dice.

However, Smithfield is something like 2 Prescotts side by side, since it is a single die.

Presler has the advantage of being any 2 dice on a wafer, and since they don't necessarily come from side by side, they are separate dice.

Hence why the die size is 2x81mm2 for Presler but 206mm2 for Smithfield.
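To put rough numbers on why two small dice can be cheaper than one large die, here is a minimal sketch using the simple Poisson yield model, Y = exp(-D * A). The 81 mm2 and 206 mm2 areas are the ones quoted above; the defect density of 0.5 defects/cm2 is purely an assumed value, and applying the same density to both parts ignores that real chips on different processes would not share one.

```python
import math

def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of defect-free dice under the Poisson model Y = exp(-D * A)."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-defects_per_cm2 * area_cm2)

def silicon_per_good_part(die_area_mm2, dice_per_part, defects_per_cm2):
    """Average wafer area (mm^2) consumed per working part.

    Assumes any good die can be paired with any other good die,
    which is the multi-die packaging advantage described above.
    """
    good_fraction = poisson_yield(die_area_mm2, defects_per_cm2)
    return dice_per_part * die_area_mm2 / good_fraction

ASSUMED_DEFECT_DENSITY = 0.5  # defects per cm^2 -- hypothetical, not a real fab figure

two_small_dice = silicon_per_good_part(81, 2, ASSUMED_DEFECT_DENSITY)
one_large_die = silicon_per_good_part(206, 1, ASSUMED_DEFECT_DENSITY)

print(f"2 x 81 mm2 dice: {two_small_dice:.0f} mm2 of wafer per good part")
print(f"1 x 206 mm2 die: {one_large_die:.0f} mm2 of wafer per good part")
```

The model leaves out packaging cost, test cost, and edge-of-wafer losses, so it only shows the direction of the effect, not actual prices.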

RE: So.......
By Viditor on 8/15/2006 8:13:48 PM , Rating: 2
That's why I wondered why AMD didn't do multi-die consumer processors using HyperTransport on the package to communicate between dies

Speed...HT isn't nearly as fast as the crossbar used in both the native dual core or the upcoming native quad core (and neither is Intel's MCMs).

RE: So.......
By raddude9 on 8/16/2006 5:07:37 AM , Rating: 2
Shared cache is one of the advantages.
One of the advantages that Conroe has over the current AMD chips is that more cache can be given to the CPU that needs it more, unlike the dual-core AMDs where the cache is exclusive to the core. The dual-core AMDs get around this somewhat by having a HT link between the cores, unlike the nasty old Pentium Ds where the two cores had to talk off-chip, as it were.

RE: To quote Ron Burgundy
By Comdrpopnfresh on 8/23/2006 9:35:04 PM , Rating: 2
I see great things for this chip. Perhaps something like RHT (even though it is supposedly a myth...). With it coming out at the time of Vista on the market, four cores, and HT interconnects between the cores and from the chip to the system... With the native x64 support, and an x64 OS to support it, I think this chip's biggest asset, especially because of it being an Opteron (maybe with any Athlon-like quads too), is that it will be able to handle up to terabytes of memory. Maybe we won't start there, but imagine 16 gigs of, say, DDR3 or XDR? Imagine the possibilities! Unless you had to save a document, everything would be contained in the RAM and within the L-cache. This type of technology cannot be matched by Intel, because in order to have communications between the cores and the RAM too, the FSB would be split way too thinly, but with an onboard memory controller like AMD has, they can go right to native. Conroe as it is (w/o a mem controller) will perform horribly if put in a native quad, and even the dual-dual they are coming out with. There simply isn't enough bandwidth to share between cores and the memory.... This is also the reason why, for any quad-core laptop chips, the lowest power crown will go to AMD: more buses, less speed per bus. After all, isn't that the selling point on why dual-core is superior to single?

I wanna see when there will no longer be an L2, and the CPU can simply use a partition of the ram just as quickly- talk about performance and power-saving there! (not to mention heat reduction!)

What exactly is Tape Out
By mendocinosummit on 8/15/2006 11:23:51 AM , Rating: 3
Can someone explain tape out to me.

RE: What exactly is Tape Out
By OrSin on 8/15/2006 11:34:43 AM , Rating: 2
Actual working chips, instead of simulations.

RE: What exactly is Tape Out
By mendocinosummit on 8/15/2006 11:39:52 AM , Rating: 2
So pretty much Rev A

RE: What exactly is Tape Out
By tofer on 8/15/2006 12:23:29 PM , Rating: 2
Nope. Tape-out means that the physical design is final and can now be used to create the mask or blue print.

RE: What exactly is Tape Out
By Master Kenobi on 8/15/2006 1:20:03 PM , Rating: 2
Yea basically a pre-production blueprint that they can create a litho-print or whatever you wanna call it so they can actually create a chip with it. Still close to a year away from production since they need to finish retooling a few manufacturing lines (I believe this chip is supposed to be 65nm). Then do a few sample runs to test yields and make any tweaks to the process to improve yields or go as is and improve later. I'd say probably late Q1'07 or Early Q2'07. Maybe an April/May release window.

Intel's roadmap is more logical than anything else. They will throw 2 Conroe cores on a die and call it their first Quad core, then 6 months later they are supposed to roll out a "Native Quad Core" and subsequently shrink it down to 45nm. So from Intel, look for 45nm by Q2'07, provided nothing holds them up, and I doubt anything would. The only thing they seem to do consistently right and on schedule is process improvements and die shrinks. They have almost completed retooling a few lines for 45nm and are in the process of tooling a few over to 32nm for Q1'08.

Should be interesting.

RE: What exactly is Tape Out
By mendocinosummit on 8/15/2006 1:37:08 PM , Rating: 2
Intel did the same thing with the first Pentium D's and look what happened. AMD does not have the means to slap two dual cores together and then six months later create a native quad. It worked for them with the X2's. Are these going to be called the X4's?

RE: What exactly is Tape Out
By psychobriggsy on 8/15/2006 1:46:41 PM , Rating: 2
Intel have already stated that 45nm will not be operational until Q1 2008, presumably for production manufacturing. That's only 2 and a bit years since the start of 65nm however, still a good rate of progress.

RE: What exactly is Tape Out
By kilkennycat on 8/15/2006 9:23:47 PM , Rating: 3
The name comes from the digital data tapes used to create the master patterns for the silicon and conductor layers on the silicon wafers from which the individual chips are cut after probe-testing. Tape out means that the chip design is complete for the target process (presumably 65nm in this case). Following first tape-out, masks are made and prototype wafers are run. With in-house mask-creation facilities, these two steps usually take 8-10 weeks. Can be shorter, but I'm sure that a lot of careful double-checking will go into a first run of a new part on a new process. After the wafers are run and probe-tested, the real hard work of checking out all of the logic and electrical functionality begins. Typically a new CPU needs several design/mask/fab/exhaustive-test cycles before it can be shipped. And some of these design changes are likely to include tweaks to improve process yields, as well as correcting logic problems. Hence, the estimate for the quad-core being customer-available mid-07 is not far off the mark.

RE: What exactly is Tape Out
By Questar on 8/15/2006 10:57:40 PM , Rating: 2
And some of these design changes are likely to include tweaks to improve process yields

You've got it correct except for that line. It would be insane to make design changes at this point to attempt to increase yield. Design to process rules are already complete long, long before a chip makes it to tape out. It would be a major bungle to have to respin a chip due to missing yield targets, and hugely costly. Hell, the delay could be a year.

RE: What exactly is Tape Out
By kilkennycat on 8/16/2006 1:49:33 PM , Rating: 2
The design tweaks to improve chip-yields are mostly related to tweaking logic to improve timing margins. Other tweaks are also applied to optimize power-consumption. Device and timing modeling for a new process is not initially as accurate as for a mature process; models continue to be honed as the process matures. Thus the tweaks in each mask-set update are usually a combination of logic fixes and yield/power-consumption optimizations.

RE: What exactly is Tape Out
By Phynaz on 8/16/2006 4:00:17 PM , Rating: 2
Device and timing modeling for a new process is not initially as accurate as for a mature process

Which is why no sane person (or company) would link a new chip design to a new manufacturing process.

hmm
By barjebus on 8/15/2006 11:11:04 AM , Rating: 2
I think that AMD's quad core is going to be superior to Intel's on the basis that it's a native core. Integration almost always leads to lower latencies. But the real disappointment is the lack of any kind of revolutionary architecture. Increasing page sizes and page table sizes is simply an inevitable change, as is increasing the bit size of many protocols etc. as technology moves forward and things get smaller and smaller.

RE: hmm
By Samus on 8/15/2006 12:57:34 PM , Rating: 2
It is the K8L after all, still 'K8'

AMD is busy on 'K9' if that's what you want to call it. In fact, they snatched up a few Intel engineers that were laid off recently to help out with it. We won't see it until 2009 though.

However, don't underestimate K8, it is an excellent scalable architecture. Look how long Intel forced Netburst down our throats, and it was complete crap from the get-go.

RE: hmm
By JonM on 8/15/2006 1:19:06 PM , Rating: 2
Unless I am mistaken, they named it K8L to avoid the term K9.

RE: hmm
By Engine of End on 8/15/2006 5:30:19 PM , Rating: 2
Actually, it appears both K9 and K10 are "dead."

Although, there IS a next-generation architecture on the horizon. Odds are it won't be called K9 or K10.

RE: hmm
By The Cheeba on 8/15/2006 6:23:07 PM , Rating: 2
Considering inquirer was the source for most of that stuff, I'd be surprised if it even existed at all. It's pretty common for the media to just say a company changed its plans rather than admit they got the story wrong in the first place.

Especially at the inq.

RE: hmm
By jarman on 8/16/2006 6:17:02 PM , Rating: 2
Yeah, since they were pretty much the first to release news of the pending AMD/ATI merger, when everyone thought they were insane. Wait a minute...

Keep on drinking that Kool-Aid pal.

RE: hmm
By shadowzz on 8/19/2006 10:34:07 AM , Rating: 2
Yeah, since they were pretty much the first to release news of the pending AMD/ATI merger, when everyone thought they were insane. Wait a minute...

Nope, Forbes was first. Then Inquirer said it would happen. Then they said it wouldn't. Then they said it would.

It's pretty typical over there -- they just have someone publish every possible outcome so that they can't be wrong. And yet surprisingly they still haven't correctly reported that Dell is going with AMD (didn't it happen in like May?)

RE: hmm
By Viditor on 8/16/2006 5:19:10 AM , Rating: 3
Actually, it appears both K9 and K10 are "dead."

Which is a good example of why wikipedia isn't really an authoritative source...the articles can be written by anybody.

RE: hmm
By coldpower27 on 8/16/2006 1:02:45 AM , Rating: 2
Well I certainly hope it will be superior to Intel's first attempt at Quad Core which is launching in Late 2006.

Intel will have a "native" solution sometime after they shift over to the 45nm process.

K8L?
By The Boston Dangler on 8/15/2006 8:34:15 AM , Rating: 2
Does this mean the chip is another evolution of the A64? Not that I find anything wrong with that, but it does exclude any next-generation advancements like what Intel has cranked out (i.e. I have no reason to dump my Opteron 165 system for AM2). I'd like to see AMD get busy with virtualization and some others, but leave out the Trusted Computing thx.

RE: K8L?
By Viditor on 8/15/2006 10:39:10 AM , Rating: 1
The press releases I've read from AMD call it their "Next-Generation" core, which means K8L...looks like the predictions that HKEPC made were off by at least 6 months (IIRC, they said the quad-core K8L wasn't due until 2008).

RE: K8L?
By coldpower27 on 8/16/2006 12:59:58 AM , Rating: 2
Well, from what I remember, HKEPC had roadmaps of server Quad Cores for Mid 2007. The K8L Quad Cores for desktop are the ones that are apparently not due till 2008.

RE: K8L?
By KristopherKubicki on 8/19/2006 10:31:44 AM , Rating: 2
Well, from what I remember, HKEPC had roadmaps of server Quad Cores for Mid 2007. The K8L Quad Cores for desktop are the ones that are apparently not due till 2008.

There were plans to have non-K8L 65nm Revision G quad-cores by mid 2007 as of May 2006 (think two Brisbane chips on a single package), but it seems like the company will just go with Rev H (K8L, native quad-core) instead. I'm not sure why the change, but AMD partner roadmaps don't go five quarters in advance like Intel so it's a little hard to track them.

RE: K8L?
By mamisano on 8/15/2006 10:50:36 AM , Rating: 2
A64 has not really changed too much over the years. Different revisions have added some enhancements but the core itself has not changed too much since its inception.

Now, the K8L is still based on A64, but supposedly with MANY additional features.
0. Native quad core
1. Hypertransport up to 5.2GT/s
2. Better coherency
3. Private L2, shared L3 cache that scales up.
4. Separate power planes and pstates for north bridge and CPU
5. 128b FPUs - see 14,15
6. 48b virtual/physical addressing and 1GB pages
7. Support for DDR2, eventually DDR3
8. Support for FBD1 and 2 eventually
9. I/O virtualization and nested page tables
10. Memory mirroring, data poisoning, HT retry protocol support
11. 32B instead of 16B ifetch
12. Indirect branch predictors
13. OOO load execution - similar to memory disambiguation
14. 2x 128b SSE units
15. 2x 128b SSE LDs/cycle
16. Several new instructions
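To give a feel for what item 6 (48b addressing and 1GB pages) implies, here is a short arithmetic sketch. It assumes binary units (1GB = 2^30 bytes) and compares against the 4KB page size standard on x86; the variable and function names are illustrative only.

```python
# Address-space arithmetic for 48-bit addressing with 4 KiB vs 1 GiB pages.

KIB = 2**10
GIB = 2**30
TIB = 2**40

ADDRESS_BITS = 48                       # from the feature list above
addressable_bytes = 2**ADDRESS_BITS     # 2^48 bytes = 256 TiB

def pages_needed(total_bytes, page_bytes):
    """Number of pages required to map total_bytes (rounded up)."""
    return -(-total_bytes // page_bytes)

SMALL_PAGE = 4 * KIB   # ASSUMPTION: conventional x86 base page size
LARGE_PAGE = 1 * GIB   # ASSUMPTION: "1GB" read as 2^30 bytes

print("Addressable:", addressable_bytes // TIB, "TiB")

# Mapping a 16 GiB working set, as imagined elsewhere in this thread.
working_set = 16 * GIB
print("4 KiB pages for 16 GiB:", pages_needed(working_set, SMALL_PAGE))
print("1 GiB pages for 16 GiB:", pages_needed(working_set, LARGE_PAGE))
```

Fewer, larger pages mean fewer page-table entries for the TLB to cover, which is presumably the motivation for the larger page size on big-memory servers.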

RE: K8L?
By Wwhat on 8/17/2006 8:00:52 PM , Rating: 2
AMD AM2 CPU's already support hardware virtualisation.

Downgrade?
By GoatMonkey on 8/15/2006 8:30:15 AM , Rating: 3
Isn't this a downgrade if most people already have 6 sided dice?

RE: Downgrade?
By Jellodyne on 8/15/2006 8:54:58 AM , Rating: 3
You ever step on a 1d4? Yee-ouch!

RE: Downgrade?
By marvdmartian on 8/15/2006 8:55:45 AM , Rating: 2
Yeah, the D&D geeks will definitely be laughing at this technology!!

RE: Downgrade?
By PrinceGaz on 8/16/2006 8:52:59 AM , Rating: 2
I have 20-sided dice!

By SilthDraeth on 8/23/2006 6:37:03 AM , Rating: 3
Zach, take this light hearted as it is meant. But maybe take it easy on us. I know I am no processor engineer, so these discussions though technical in nature, do help me understand a bit more of what is going on.

I mean you are telling some guy to go back to highschool because he isn't as cpu savvy as yourself, or other people. And his post was made in the form of a question, which meant he was asking for clarification/correction, or confirmation that he had the idea right.

Anyways, glad to have you on board. And if you have been here in the past, then welcome back too.

fandabydozy
By MadAd on 8/24/2006 3:06:08 PM , Rating: 2
oh quad core (yawn) yeah that's really useful- now if they made it a triple core with the 4th a parallel processor then I'd be made up.

and if dual core is anything to go by, we might get applications to use quad core by, oooh, 2010 perhaps?


RE: fandabydozy
By podknocker on 8/24/2006 4:34:13 PM , Rating: 1
Well, you may scoff at all this multicore 'nonsense', but I run the Enterprise Desktop Edition of Linux and this is a deeply multithreaded Operating System. A lot of my software, especially the graphics stuff, would be amazingly fast over 16 or 32 cores.

Many games run on Linux and more and more software is written to reap the benefits of parallel processing these days. It won't be long until most software, including games, is recompiled to take advantage of multiple socket computers and their associated multiple cores.

Quad, quad core for gamers?
By podknocker on 8/18/2006 9:27:59 PM , Rating: 1
On a related topic, if anyone would like to have a look at the Tyan site:

Does anyone know if this monster platform will allow 2 ATI Crossfire cards to run in 2 out of the 4 16 speed PCIe slots at positions 1 and 3 (from the memory)?

You'd think it would but will true Crossfire work on this new nVIDIA chipset?

Linux on 8 or 16 cores would be awesome on this board. There is a daughterboard for this that allows another bunch of chips and slots and memory etc. Unreal.

Would double to 32 cores, when the quaddies arrive!!!

Copyright 2016 DailyTech LLC.