

AMD and ATI are already planning scalable designs for 2008
"Torrenza" platforms and unified GPU/CPU processors

AMD announced the $5.4B USD takeover of ATI earlier today, but the new company is already making large plans for the future.  Dave Orton, soon-to-be Executive Vice President of AMD's ATI Division, claimed that AMD and ATI would begin leveraging the sales of both companies by 2007.  However, a slide from the AMD/ATI merger documentation has already shown some interesting development plans for 2008.

Specifically, it appears as though AMD and ATI are planning unified, scalable platforms using a mixture of AMD CPUs, ATI chipsets and ATI GPUs.  This sort of multi-GPU, multi-CPU architecture is extremely reminiscent of AMD's Torrenza technology announced this past June, which allows low-latency communications between chipset, CPU and main memory. The premise for Torrenza is to open the channel for embedded chipset development from third-party companies. AMD said the technology is an open architecture, allowing what it called "accelerators" to be plugged into the system to perform special duties, similar to the way we have a dedicated GPU for graphics.

Furthermore, AMD President Dirk Meyer confirmed that the company's plans go beyond multi-processor platforms, stating "As we look towards ever finer manufacturing geometries we see the opportunity to integrate CPU and GPU cores together onto the same die to better serve the needs of some segments."  A clever DailyTech reader pointed out that AMD filed its first graphics-oriented patent just a few weeks ago.  The patent, titled "CPU and graphics unit with shared cache," seems to indicate that these pet projects at AMD are a little more than just pipe dreams.

During the AMD/ATI merger conference call, Meyer added that not too long ago, floating point processing was done on a separate piece of silicon.  Meyer claimed that the integration of the GPU into the CPU may not be too different from the earlier integration of the FPU into the CPU.

Bob Rivet, AMD's Chief Financial Officer, claims the combined company will save nearly $75M USD in licensing and development overlap in 2007 alone, and another $125M in 2008.  Clearly the combined development between the two companies has a few cogs in motion already.


Comments



By rrsurfer1 on 7/24/2006 9:38:01 AM , Rating: 4
*May* be???

We're talking shared, extremely fast cache. There's no better way to keep a GPU fed. The CPU will be able to closely work with the GPU on-die. There's no doubt in my mind this type of solution will not only be faster - but also much more efficient. If done correctly this could yield huge increases in performance, and decreases in overall power use. With ATI and AMD working together, this is more than possible. I can't wait.


By DallasTexas on 7/24/2006 9:55:18 AM , Rating: 2
I agree but I'll avoid saying "definitely" because discounting the discrete path is a bit premature.

My guess is that once physics acceleration takes root, the discrete graphics option will yield better results than integrated 3D graphics. Of course, physics will some day ALSO be integrated but we're talking about 2D/3D graphics in this thread - at least I was.

regards


By Merry on 7/24/2006 9:55:57 AM , Rating: 2
but then surely if you wanted to upgrade your graphics you'd need a new processor and/or motherboard?

I don't think many would be happy with that.


By rrsurfer1 on 7/24/2006 10:00:27 AM , Rating: 2
True, that is a downside. But if they can use the on-die nature of the GPU to destroy the discrete competition, not many people would have a problem with having it on-die. Especially if it takes discrete GPUs many generations to catch up. Conceivably, with low-latency, high-bandwidth access to shared cache and specialized CPU-GPU interaction, you could make a CPU/GPU that would be unmatched by anything that has to go through a bus, and the associated latency.


By Spoonbender on 7/24/2006 10:00:18 AM , Rating: 3
Except for one thing. The GPU doesn't work on ~4 MB data sets. It rushes through 250+ MB of data very quickly. So sharing cache with a CPU isn't an obvious improvement. But like the article said, it'll be great for specific customers. It could make for some nice low-power laptops with decent performance.


And about the FPU's disappearing, try rereading the article, especially the bits about the Torrenza platform. Looks like the FPU might be back with a vengeance... Full circle indeed. :)

I think the same might happen with CPU's. For everyday tasks, an integrated GPU might be a great solution. Lower costs, lower power consumption, low latency on CPU/GPU traffic.
But for "serious graphics", you'll still want to plop down a dedicated chip.


By rrsurfer1 on 7/24/2006 10:06:08 AM , Rating: 2
Good point. However, it races through *high-latency*, relatively low-bandwidth memory. Cache is much faster/higher bandwidth. There are optimizations you could use there that are impossible to implement with discrete solutions. But like you, I agree this would probably be most applicable, in the beginning, to low-power laptops.


By SexyK on 7/24/2006 10:17:09 AM , Rating: 3
I don't know why everyone is saying the latency will be lower with this dual socket setup. You're still going to need 256-512MB+ frame buffers and last time I checked, the memory integrated onto discrete graphics cards was WAY faster than the main system memory. In fact that's one of the benefits of discrete graphics, they can keep the memory near the chip and not use sockets etc, which makes routing easier and keeps the clock speeds up.... maybe they'll have a solution for this problem with this dual socket system, but I'm not holding my breath.


By rrsurfer1 on 7/24/2006 10:28:39 AM , Rating: 2
With a good integrated memory controller on-die this would cease to be a problem. If you look it up you'll find DDR2 and DDR3 have roughly comparable bandwidth. The reason is NOT because it's faster than system memory, it's because it's faster than going off the discrete GPU board, and through the memory controllers and system bus. With an ON-DIE (not dual socket as you stated) GPU, the memory could be shared with the system without the additional latency that discrete boards using system memory have to deal with.


By SexyK on 7/25/2006 12:26:08 AM , Rating: 2
quote:
by rrsurfer1 on July 24, 2006 at 10:28 AM

With a good integrated memory controller on-die this would cease to be a problem. If you look it up you'll find DDR2 and DDR3 have roughly comparable bandwidth. The reason is NOT because it's faster than system memory, it's because it's faster than going off the discrete GPU board, and through the memory controllers and system bus. With an ON-DIE (not dual socket as you stated) GPU, the memory could be shared with the system without the additional latency that discrete boards using system memory have to deal with.


Huh? I think you're confused. A 7900GTX has over 50GB/s of bandwidth between the memory and the GPU. An AM2 system even maxed out with DDR2-800 only has a theoretical max of ~12.8 GB/s of bandwidth. That is a LOT of ground to make up.
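
Both figures follow from the usual peak-bandwidth arithmetic: transfers per second times bytes per transfer. Here is a rough sketch in Python. The bus parameters are assumptions that are not stated in the thread (a 256-bit memory bus at 1600 MT/s effective for the 7900GTX, and two 64-bit DDR2-800 channels for AM2), and the function name is purely illustrative.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, bus_width_bits, channels=1):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    Illustrative helper, not taken from any vendor documentation.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer * channels / 1e9


# Assumed: 7900GTX with a 256-bit bus at 1600 MT/s effective
gpu_gb_s = peak_bandwidth_gb_s(1600, 256)

# Assumed: AM2 with dual-channel (2 x 64-bit) DDR2-800
system_gb_s = peak_bandwidth_gb_s(800, 64, channels=2)

print(f"GPU local memory: {gpu_gb_s:.1f} GB/s")     # 51.2 GB/s
print(f"System memory:    {system_gb_s:.1f} GB/s")  # 12.8 GB/s
print(f"Ratio:            {gpu_gb_s / system_gb_s:.1f}x")  # 4.0x
```

Under those assumptions the discrete card has roughly four times the theoretical memory bandwidth, before the CPU takes its own share of the system memory bus.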


By wingless on 7/24/2006 10:44:20 AM , Rating: 2
This is a good point and I'm worried about this too, but we all should know that DDR3 is on its way to the desktop in 2007 and 2008. Also having a CPU and GPU damn near plugged together like a LEGO on this HyperTransport bus may make things very fast. They may show us the coolest tech we've ever seen in 2008 and 2009.


By Clauzii on 7/25/2006 5:10:39 PM , Rating: 2
:O

That was a BIG framebuffer :O

I want that 11K x 11K resolution NOW :)


By Clauzii on 7/25/2006 5:13:21 PM , Rating: 2
... as a reply to this: "You're still going to need 256-512MB+ frame buffers..."


By SexyK on 7/25/2006 9:22:08 PM , Rating: 2
quote:
:O

That was a BIG framebuffer :O

I want that 11K x 11K resolution NOW :)


With AA and AF you can fill a 256-512MB frame buffer at much lower resolutions than that.
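
Both sides of this exchange can be checked with some rough arithmetic. The sketch below is only an estimate under simplifying assumptions: 4 bytes of color and 4 bytes of depth/stencil per sample, two resolved color buffers, no compression, and textures ignored entirely. The function name and the per-sample sizes are illustrative, not figures from the thread.

```python
import math

BYTES_PER_PIXEL = 4  # assumed 32-bit color


def render_target_bytes(width, height, msaa_samples=1, resolved_buffers=2):
    """Rough render-target memory estimate (hypothetical helper).

    Assumes 4 bytes color + 4 bytes depth/stencil per sample, plus
    `resolved_buffers` single-sample color buffers for display.
    """
    pixels = width * height
    multisampled = pixels * msaa_samples * (4 + 4)
    resolved = pixels * BYTES_PER_PIXEL * resolved_buffers
    return multisampled + resolved


# Pure 2D view: the largest square 32-bit image that fits in 512 MiB
side = math.isqrt(512 * 2**20 // BYTES_PER_PIXEL)
print(f"{side} x {side}")  # 11585 x 11585

# 3D view: 1600x1200 with 4x multisampling
mib = render_target_bytes(1600, 1200, msaa_samples=4) / 2**20
print(f"{mib:.1f} MiB")  # 73.2 MiB
```

Under these assumptions the "11K x 11K" figure is what a plain 2D buffer of 512 MiB works out to, while anti-aliasing multiplies the per-pixel cost at ordinary resolutions. The sketch leaves out textures and geometry, which also occupy the card's memory.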


By Clauzii on 7/26/2006 9:44:50 PM , Rating: 2
My fault :)

I was thinking 2D :(


By pnyffeler on 7/24/2006 10:06:38 AM , Rating: 3
With the advent of Windows Vista, lumping the CPU & GPU into the same memory pool will not only be feasible but also the next logical move. Before Vista, GPU's were more or less beyond the control of the OS, so in order for them to work, they needed to have their own supply of memory that they controlled themselves. That was either in the form of on-card memory or shared memory for built-in GPU's. As everyone knows, shared memory sucks because the bandwidth is too small.

Now enter Vista. The OS can now manage the GPU as it does for the CPU. That also means that it can regulate the memory allocated to the GPU, and having separate memory supplies for the CPU and GPU becomes wasteful. Currently, if the GPU isn't active, the CPU can't use the GPU's unused memory space, and vice versa. By giving the two processors access to the same memory, you can allocate memory use as needed to either, or, even cooler, you can point the GPU to directly read information that the CPU has just written.

Finally, with Vista being a 64-bit OS, you've eliminated the 4 GB memory limit, making it possible to stuff your rig with RAM. With 8 GB of RAM, you could have 3-4 GB allocated to your game of choice, 2 GB of the RAM allocated to the GPU to make it look really pretty, and still have enough RAM left over to keep all of your other programs happy.

Better start saving your allowances now....


By rrsurfer1 on 7/24/2006 10:11:07 AM , Rating: 2
Real good point.


By piraxha on 7/24/2006 12:54:13 PM , Rating: 2
The merging of CPUs and GPUs has already started, at VIA:

http://www.viaarena.com/default.aspx?PageID=5&Arti...

"To achieve this, VIA’s hardware strategy involves the explicit design of more performance per watt at the silicon level and more features per square inch at the platform level. To demonstrate this, Wenchi showed the fourth generation VIA processor named John. John features the CPU, chipset and graphics processor in the one package."

It should make for some interesting competition.