Much of AMD's bad luck over the last three months revolves around a nasty bug it just can't shake

Erratum, to those in the hardware or software industry, is a nice way of saying "we missed a test case" during development and design. 

Yesterday, The Tech Report confirmed AMD's iteration of Intel's F00F bug.  The bug, which has been documented since at least early November, can cause a deadlock during recursive or nested cache writes. 

How does the TLB erratum occur?  All AMD quad-core processors utilize a shared L3 cache.  In instances where the software uses nested memory pages, the processor can experience a race condition. 

AMD's desktop product marketing manager Michael Saucier describes a race condition as a series of events "where the other guy wins who isn't supposed to win." 

In the software world, a typical memory race condition occurs when the memory arbiter is instructed to overwrite an older block of memory, but write the old block of memory to somewhere else in cache.  In the instance where two arbiters follow this same rule set, it's easy to see how a race condition can occur: both arbiters attempt to overwrite the same blocks of information, resulting in a deadlock.
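
As a rough software analogy only (this is not AMD's hardware logic, and every name here is hypothetical), the two-arbiter scenario can be sketched in Python with two threads that each hold one block and then ask for the block the other holds. A timeout stands in for the lockup so the sketch reports the stall instead of hanging.

```python
import threading


def simulate_two_arbiters(timeout=0.2):
    """Run two hypothetical 'arbiters' that claim two blocks in opposite order.

    Returns a dict mapping each arbiter's name to whether it managed to
    claim its second block before the timeout expired.
    """
    block_x = threading.Lock()
    block_y = threading.Lock()
    barrier = threading.Barrier(2)
    acquired = {}

    def arbiter(name, first, second):
        with first:
            # Wait until both arbiters hold their first block.
            barrier.wait()
            # Each arbiter now wants the block the other one is holding.
            got = second.acquire(timeout=timeout)
            acquired[name] = got
            # Keep holding the first block until both attempts have finished,
            # so neither arbiter can slip through at the last moment.
            barrier.wait()
            if got:
                second.release()

    threads = [
        threading.Thread(target=arbiter, args=("arbiter_a", block_x, block_y)),
        threading.Thread(target=arbiter, args=("arbiter_b", block_y, block_x)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return acquired


if __name__ == "__main__":
    print(simulate_two_arbiters())
```

Neither arbiter can make progress while the other holds the block it needs. Without the timeout, both would wait forever, which is the deadlock described above.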

From what AMD engineers would tell DailyTech, this example is very similar to what occurs with nested memory pages in virtualized machines on these K10 processors. 

AMD has since released a new BIOS patch for all K10 motherboards, including the often cited but rarely seen MSI K9A2 Platinum.  This patch, confirmed by DailyTech, will result in at least a 10% reduction in general computing speed. 

AMD partners tell DailyTech that all bulk Barcelona shipments have been halted pending application screening based on the customer.  Cray, for example, was allowed its latest allocation for machines that will not use these nested virtualization techniques.  Other AMD corporate customers were told to use Revision F3 (K8) processors in the meantime. 

The TLB erratum will be fixed in the B3 stepping of all AMD quad-core processors, including Phenom and Barcelona.  However, AMD considers the B3 stepping a "March" item on its 2008 roadmap.  Processors shipped between then and now will still carry the TLB bug, though with the BIOS workaround these machines will not experience a lockup. 

The delayed Phenom 9700 is affected by the TLB bug, though AMD insiders tell DailyTech the upcoming 2.6 GHz Phenom 9900 is not affected.  This indicates Phenom 9900 will carry the B3-stepping designation.

AMD's latest roadmap hints that its tri-core processors are merely quad-core processors with one core disabled. The company also indicated that it will introduce some of these tri-core processors with the L3 cache disabled.  Removing the shared-L3 cache from the chip design eliminates the TLB bug.

In a likely related event, AMD's newest corporate roadmap scheduled three Phenom processors for the first half of 2008, one of which is the Phenom 9700.  The company will launch eleven new 65nm K8 processors in the same time period.

Comments

By DesertCat on 12/5/2007 2:16:10 PM , Rating: 2
Not that I'm terribly thrilled about the idea of turning off this fix, but it sounds like the BIOS erratum fix may be something that can be turned on and off to regain the performance.

The guys at Techreport originally said that the BIOS makers might include a toggle for this fix. They later had to correct themselves because it sounds like AMD does not want this to be an optional BIOS setting. It does sound, however, like there may be a switch inside AMD's new Overdrive tweaking tool for turning the "fix" on and off.

It's certainly not ideal, but it may be interesting to see how the enthusiasts out there deal with this. For my part, I will do comparisons on the 9600 I ordered. I have the original Asus BIOS that enables Phenom support for my motherboard (BIOS 1302 for my M2N-SLI Deluxe MB - Nvidia 570 chipset). It will be interesting to see how much of a hit I actually see in my own applications. Even with the hit, I expect to see a substantial improvement over my current X2 3800+, especially in video encoding.

By Oregonian2 on 12/5/2007 2:20:38 PM , Rating: 2
A proper BIOS fix should have it look at the die rev and only apply the fix appropriately. I think the article stated the die rev ("or above") where the fix wouldn't need to be applied by the BIOS. It should then use that as the cutoff rev.
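
The revision check suggested in this comment could look roughly like the Python sketch below. It is illustrative only: real BIOS code would read the revision from the processor rather than from a string, the function names are hypothetical, and the assumption that steppings order by letter and then by number is ours. The B3 cutoff is the stepping the article says carries the fix.

```python
# Stepping in which the article says the TLB erratum is fixed.
FIXED_STEPPING = ("B", 3)


def parse_stepping(label):
    """Split a stepping label such as 'B2' into a comparable pair ('B', 2)."""
    label = label.strip().upper()
    return (label[0], int(label[1:]))


def needs_tlb_workaround(label):
    """Return True when the stepping is older than the fixed one.

    Assumes steppings sort by letter first and number second.
    """
    return parse_stepping(label) < FIXED_STEPPING


if __name__ == "__main__":
    for stepping in ("B2", "B3"):
        print(stepping, needs_tlb_workaround(stepping))
```

With a check like this, the workaround and its performance cost would apply only to parts older than the cutoff.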

By TomZ on 12/5/2007 2:34:55 PM , Rating: 1
In addition, such a BIOS patch should allow the user to enable/disable the workaround, so that an informed user can themselves decide whether the risk of the bug outweighs the 10-20% performance hit (or whatever the real figure ends up being).

Copyright 2016 DailyTech LLC.