It is reported that AMD is trying to track down as many as
3,000 Opteron processors which could experience
erratic behavior under high-temperature conditions. Processors
affected include a number of single-core Opteron processors
manufactured within the past six months.
The chips were shown to experience
higher-than-normal core temperatures when running in a
high-temperature environment. This caused the chips to flub some
floating-point calculations. From InformationWeek:
Because of the tests, AMD has changed
the screening process for rating the two product lines as the chips
come off the production line, Taylor said. As a result, some chips
that would have been rated with clock speeds of 2.8 GHz in the past
would be listed at 2.6 GHz, making them less likely to be used in
extreme computing environments.
This appears to be a separate issue from
the one reported earlier by The Register, which claimed that a bug in a
batch of Opteron processors produces incorrect results in
loops that run for millions of iterations. Coupled with high ambient
temperatures, the processor will corrupt data. The Register states:
The problem is believed to affect only a fraction - perhaps no more
than 3,000 individual CPUs - which managed to slip through AMD's
screening net. It is not known how this so-called 'test escape'
occurred, but it took place "in part of 2005 and early 2006", an AMD
spokesman said.
Although only a few processors are
defective, the fact that no one can pinpoint exactly which
batch of processors has the problem is troubling at best. AMD claims
measures have been put in place to prevent the bug from happening
again, but also stresses that the condition is not likely to happen in
financial environments.
Intel made similar claims during the early Pentium days with the now-infamous FDIV bug.