Understanding AMD's "TLB" Processor Bug
December 5, 2007 11:56 AM
Much of AMD's bad luck over the last three months revolves around a nasty bug it just can't shake.
Erratum, to those in the hardware or software industry, is a nice way of saying "we missed a test case" during development and design.
The TLB erratum, also covered by The Tech Report, is AMD's iteration of Intel's F00F bug. The bug, which has been documented since at least early November, can cause a deadlock during recursive or nested cache writes.
How does the TLB erratum occur? All AMD quad-core processors utilize a shared L3 cache. When software uses nested memory pages, these processors can experience a race condition.
AMD's desktop product marketing manager Michael Saucier describes a race condition as a series of events "where the other guy wins who isn't supposed to win."
In the software world, a typical memory race condition occurs when the memory arbiter is instructed to overwrite an older block of memory, but write the old block of memory to somewhere else in cache. In the instance where two arbiters follow this same rule set, it's easy to see how a race condition can occur: both arbiters attempt to overwrite the same blocks of information, resulting in a deadlock.
From what AMD engineers would say, this example is very similar to what occurs with nested memory pages in virtualized machines on these K10 processors.
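As a loose software analogy for the two-arbiter scenario above (not a model of the actual K10 hardware), the sketch below has two threads each hold one block and then request the block the other is holding. The names `simulate_arbiters`, `block_a`, `block_b`, and the timeout value are all hypothetical; the timeout exists only so the demonstration reports the circular wait instead of hanging the way a real deadlock would.

```python
import threading

def simulate_arbiters(consistent_order, timeout=0.2):
    """Return {arbiter_name: True if that arbiter obtained both blocks}."""
    block_a, block_b = threading.Lock(), threading.Lock()
    in_step = threading.Barrier(2)  # keeps the two arbiters in lockstep
    results = {}

    def opposed(name, first, second):
        # Each arbiter grabs "its" block, then asks for the other one.
        with first:
            in_step.wait()  # both arbiters now hold one block each
            # A real deadlock would block forever here; the timeout is an
            # assumption added so this sketch can report the failure.
            got = second.acquire(timeout=timeout)
            results[name] = got
            in_step.wait()  # keep holding until the peer has tried too
            if got:
                second.release()

    def ordered(name):
        # Every arbiter takes block_a first, then block_b, so no
        # circular wait can form.
        with block_a:
            with block_b:
                results[name] = True

    if consistent_order:
        threads = [threading.Thread(target=ordered, args=(n,))
                   for n in ("arbiter_0", "arbiter_1")]
    else:
        threads = [
            threading.Thread(target=opposed, args=("arbiter_0", block_a, block_b)),
            threading.Thread(target=opposed, args=("arbiter_1", block_b, block_a)),
        ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    print("opposite order:", simulate_arbiters(consistent_order=False))
    print("same order:    ", simulate_arbiters(consistent_order=True))
```

With opposite acquisition orders neither arbiter ever obtains its second block. With a single agreed order both complete, which is the usual software remedy for this class of problem.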
AMD has since released a new BIOS patch for all K10 motherboards, including the often cited but rarely seen MSI K9A2 Platinum. This patch will reportedly result in at least a 10% reduction in general computing speed.
AMD partners say that all bulk shipments have been halted pending application screening based on the customer. Cray, for example, was allowed its latest allocation for machines that will not use these nested virtualization techniques. Other AMD corporate customers were told to use Revision F3 (K8) processors in the meantime.
The TLB erratum will be fixed in the B3 stepping of all AMD quad-core processors, Phenom included. However, AMD considers the B3 stepping a "March" item on its 2008 roadmap. Processors shipped between now and then will still carry the TLB bug, though with the BIOS workaround these machines will not experience a lockup.
The delayed Phenom 9700 is affected by the TLB bug, though AMD insiders say the upcoming 2.6 GHz Phenom 9900 is not affected. This indicates Phenom 9900 will carry the B3-stepping designation.
AMD's latest roadmap hints that its tri-core processors are merely quad-core processors with one core disabled. The company also indicated that it will introduce some of these tri-core processors with the L3 cache disabled. Removing the shared-L3 cache from the chip design eliminates the TLB bug.
In a likely related event, AMD's newest corporate roadmap schedules three Phenom processors for the first half of 2008, one of which is the Phenom 9700. The company will launch eleven new 65nm K8 processors in the same time period.
12/5/2007 5:22:57 PM
Is it possible that the actual reason for some of the inability of the Phenom to complete some of the tests done here at Anand the other day is due to this, rather than to some BIOS problem, at least in part?
RE: testing failures
12/5/2007 7:31:57 PM
AMD is doing their best to spin the Barcelona/Phenom situation as having this "one bad bug".
It is by no means the reality of the situation. The current steppings of Barcelona/Phenom are riddled with bugs: small, medium, and large.
The recent B2 stepping was so bad it was immediately dumped and work begun on a B3 stepping. It is not likely that further steppings will substantially fix the design. The Barcelona is a fabulous prototype design, but it is a long way away from being a polished production level design.
So AMD is going to plan B -- 65nm K8-based chips -- and will wait on volume shipments of K10 until the reworked Barcelona B+ is finished sometime 2H08.
There will be some small shipments of K10 chips between now and 2H08, but only the foolish will buy these machines. If you have clout with AMD, insist on a replacement policy.
Overall, K10 is just too much change at the wrong time. Split-plane power is a nice idea, but it obsoletes all existing motherboards. A massive change from K8 is warranted for massive gains, but the gains have been elusive and even when found, merely marginal.
AMD should have used their own technology and built a simple K8+ quad-core that glued together two dual-core processors. All the cores should have been glued together with HT3, giving the processor a massive jump in on-chip bandwidth while protecting backward compatibility. Bump the cache sizes up with a move to 65nm and you have a chip that required almost zero development effort and is ready to ride. A few tweaks to the K8 memory controller to support unbuffered ECC 800 MHz and ECC 1066 MHz DDR2 and you would still have a very competitive processor. On the desktop, the many fast low-latency DDR2 modules could also be supported.
Too bad. Brute force often works better than finesse. This is something that AMD should have learned from Intel by now.
Copyright 2016 DailyTech LLC.