AMD has proven to be a tough opponent for Intel in the server processor marketplace. AMD said that its latest architecture, dubbed Shanghai, is now ready to go. Shanghai parts are quad-core processors targeted at the server market.
Shanghai is also AMD's first 45-nanometer processor. AMD's last major server architecture was Barcelona, which ran into significant problems in the market. One of the biggest was that the processors were delayed for eight months after their introduction.
Once the processors finally shipped, AMD had issues with their performance, and the CPUs suffered from other glitches. AMD promises that Shanghai will not be another Barcelona.
Pat Patla, AMD server and workstation business general manager, told CNET News, "We had some mis-starts in getting Barcelona to market and wanted to bring as much velocity to Shanghai as possible. Learn from our mistakes and, as a company, never do that again."
To help ensure that Shanghai succeeds and offers users a better experience than Barcelona, AMD put one engineer in charge of the entire Shanghai project. Patla continued, saying, "the product that we put in the hands of our partners is going to be of substantial stability so they can do lots of early validation."
To improve the performance of Shanghai over Barcelona, AMD is counting on several factors. The first is the move from Barcelona's 65nm process to the 45nm process used in Shanghai, which should allow for greater efficiency and better performance.
AMD is also increasing the cache memory from 2MB to 6MB to improve performance, and it says that instructions per clock cycle will increase as well. Shanghai will also use AMD's HyperTransport 3, which AMD expects to be validated in Q1 2009. A 45nm desktop Shanghai part will be offered in Q1 2009, according to CNET News.
quote: K8L's (Barcelona) two floating-point/SSE pipes give it two 128-bit SSE ops/cycle, and its FSTORE pipe can do another 128-bit SSE move per cycle, for a total of three per cycle peak. This is half of Conroe's peak theoretical throughput of six 128-bit SSE ops/cycle.
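
The arithmetic in the quoted comment can be checked with a short sketch. It uses only the per-cycle figures given in the quote; the constant and function names are hypothetical, chosen for illustration, and the result is a theoretical peak rather than a measured number.

```python
# Peak 128-bit SSE operations per cycle, using only the figures in the quote.
# All names below are illustrative assumptions, not vendor terminology.

BARCELONA_FP_SSE_PIPES = 2  # two floating-point/SSE pipes, one 128-bit op each
BARCELONA_FSTORE_PIPES = 1  # FSTORE pipe: one 128-bit SSE move per cycle
CONROE_PEAK_OPS = 6         # Conroe's peak as stated in the quote


def barcelona_peak_ops() -> int:
    """Sum the per-cycle contributions of the FP/SSE and FSTORE pipes."""
    return BARCELONA_FP_SSE_PIPES + BARCELONA_FSTORE_PIPES


def ratio_to_conroe() -> float:
    """Barcelona's quoted peak as a fraction of Conroe's quoted peak."""
    return barcelona_peak_ops() / CONROE_PEAK_OPS


if __name__ == "__main__":
    print("Barcelona peak ops/cycle:", barcelona_peak_ops())
    print("Fraction of Conroe peak:", ratio_to_conroe())
```

Peak figures like these count issue slots only; sustained throughput depends on the instruction mix, memory behavior, and clock speed, none of which the quote addresses.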