



  (Source: lol screenzors)

Compare this image with the image above. Note the missing components, which are present in the image above. The two images are combined to form the final image. The division of work is based on how fast each GPU is, and allows for near-linear scaling.  (Source: PC Perspective)

On the left, the full screen is displayed; on the right, one GPU's share of the workload. Note the scene on the right is missing the floor, which is being rendered on the other GPU.  (Source: PC Perspective)
New chips open the door to gaming rigs with a mix of ATI and NVIDIA cards

NVIDIA is busily plugging away with its 200 series and marketing various SLI solutions in the form of anything from a pair of 8000 or 9000 cards to its top end -- a pair of 280 GTXs.  AMD is similarly pushing its 4850/4870 CrossFire solutions along with CrossFire for its new dual-GPU 4870 X2 cards.  The key thing is AMD/ATI cards are not compatible with NVIDIA cards -- CrossFire and SLI are two different technologies.  Furthermore, most motherboards either support SLI or support CrossFire -- most don't do both.

Enter Lucid, also known as LucidLogix, a fabless semiconductor designer (meaning it outsources chip manufacturing to other companies' fabs, such as TSMC).  Lucid is far from a known name in the graphics industry, though that may soon change.  With backing from Intel Capital and over 50 patents, it has developed a technology that seems poised to rock the graphics industry.

The groundbreaking technology is titled the HYDRA Engine.  The accomplishment of the engine is nothing short of unbelievable to those who follow the graphics industry.  It uses hardware and software to allow virtually any AMD/ATI and NVIDIA GPU to work together and share workloads with the CPU, scaling programs almost linearly.  You could probably call the HYDRA Engine CrossFire-SLI, though you might run into a spot of legal trouble in trying to do so.

Lucid isn't just redeploying existing technologies -- it's improving on them.  AMD/ATI and NVIDIA use two technologies for their multi-GPU solutions.  One is split frame rendering, in which each card renders a part of the frame.  The drawback of this is that it requires synchronization of all texture and geometry data on both GPUs and thus memory bandwidth limitations from a single card remain.  The other solution commonly used is alternate frame rendering.  This approach also has a significant downside, in that it introduces latency in the time it takes to switch between GPU connections.
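To make the two existing approaches concrete, here is a minimal Python sketch of how work might be handed out under each. The function names and the even horizontal-band split are illustrative assumptions, not AMD/ATI's or NVIDIA's actual driver logic.

```python
def split_frame_assignments(frame_height, num_gpus):
    """Split frame rendering: every GPU draws one horizontal band of each frame."""
    band = frame_height // num_gpus
    bands = []
    for gpu in range(num_gpus):
        top = gpu * band
        # The last GPU absorbs any leftover rows when the height does not divide evenly.
        bottom = frame_height if gpu == num_gpus - 1 else top + band
        bands.append((gpu, top, bottom))
    return bands


def alternate_frame_assignments(num_frames, num_gpus):
    """Alternate frame rendering: whole frames are dealt out round-robin."""
    return [(frame, frame % num_gpus) for frame in range(num_frames)]


# Two GPUs sharing a 1080-row frame: (gpu, first_row, end_row) per band.
sfr = split_frame_assignments(1080, 2)
# Four consecutive frames on two GPUs: (frame, gpu) pairs.
afr = alternate_frame_assignments(4, 2)
```

In the split-frame case both GPUs touch every frame, so both need the full set of textures and geometry. In the alternate-frame case each GPU owns whole frames, so the output has to switch from one GPU to the other on every frame.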

The HYDRA Engine offers a hybrid solution.  The heart of the engine is a silicon chip that splits up the graphics workload in hardware.  Lucid has its own driver, which interfaces DirectX with the GPU vendors' drivers once the workload has been divided.  Information from games first gets passed to HYDRA's software, which splits it into tasks.  The set of tasks is then sent to Lucid's hardware, which divides the work among up to 4 GPUs.  A typical task might be rendering a specific part of a scene, adding lighting, or other common graphical chores.  After the GPUs finish their respective parts, the results are sent to one of the GPUs, which coalesces them into the final output.  The whole process is very fast.
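The flow described here (split the frame into tasks, distribute them across GPUs, merge the results on one GPU) can be sketched roughly as follows. Lucid has not published its scheduling logic, so the greedy "earliest finish" rule, the Gpu class, the cost numbers, and the task names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    name: str
    speed: float            # hypothetical relative throughput (work units per ms)
    busy_ms: float = 0.0    # time already committed on this GPU
    tasks: list = field(default_factory=list)


def dispatch(tasks, gpus):
    """Give each (task_name, cost) to the GPU that would finish it soonest."""
    # Placing the most expensive tasks first tends to balance the load better.
    for name, cost in sorted(tasks, key=lambda t: t[1], reverse=True):
        best = min(gpus, key=lambda g: g.busy_ms + cost / g.speed)
        best.busy_ms += cost / best.speed
        best.tasks.append(name)
    return gpus


def composite(gpus):
    """Stand-in for the final merge: gather every rendered task in one place."""
    return sorted(name for g in gpus for name in g.tasks)


# Hypothetical scene with made-up costs, rendered on two equally fast GPUs.
scene = [("floor", 4.0), ("walls", 3.0), ("character", 2.0), ("windows", 1.0)]
gpus = dispatch(scene, [Gpu("A", 1.0), Gpu("B", 1.0)])
final_frame = composite(gpus)
```

With these made-up costs, GPU A ends up with the floor and windows and GPU B with the walls and character, each busy for the same 5 ms. Giving one GPU a higher `speed` shifts more of the tasks its way, which is the kind of speed-based division the image captions describe.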

According to Lucid, the system has "virtually no CPU overhead and no latency" when compared to single-card solutions.  The approach is very different from AMD/ATI's and NVIDIA's in that it actually intercepts DirectX calls before sending them to the GPU and intelligently splits scenes up, as opposed to "brute force" rendering them.

While the engine is capable of cruder split frame rendering, which it performs well, it really shines when it splits the scene up with this custom logic.  Individual components in a scene -- say, part of the floor and the windows -- are sent to one GPU while other parts -- say, your character and the walls -- are sent to the other.  With virtually no additional overhead the entire scene is rendered nearly twice as fast.  Where SLI/CrossFire offer only 50-70 percent scaling at best, Lucid claims its solution is near 100 percent -- linear scaling.
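To put the scaling figures in perspective, the sketch below treats "scaling" as the percentage of a single card's frame rate that a second card adds. That definition, the function name, and the 60 fps baseline are assumptions chosen for illustration, not benchmark results.

```python
def dual_gpu_fps(single_gpu_fps, scaling_percent):
    """Frame rate with a second card that adds scaling_percent of one card's speed."""
    return single_gpu_fps * (100 + scaling_percent) / 100


baseline = 60  # hypothetical single-card frame rate
sli_low = dual_gpu_fps(baseline, 50)     # 90.0 fps
sli_high = dual_gpu_fps(baseline, 70)    # 102.0 fps
claimed = dual_gpu_fps(baseline, 100)    # 120.0 fps
```

On this reading, the gap between 70 percent and 100 percent scaling is 18 fps on a 60 fps baseline, and it would widen with every additional GPU if the claim of linear scaling held.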

One of the strongest points is that the engine is not reliant on specific graphics drivers.  Thus graphics cards and drivers can come and go, but the engine will still work. 

The hardware/software may find its way into graphics setups in two ways.  First, it could be added to motherboards to allow improved multi-GPU scaling and support for both AMD/ATI and NVIDIA cards.  Second, it could be deployed by card manufacturers on dual-GPU boards, such as the 4870 X2, in place of the standard PCIe bridge.

Unfortunately, there is one key catch with current technologies.  Current operating systems like Windows Vista support only one graphics card driver running at a time.  So until Microsoft allows AMD/ATI and NVIDIA drivers to coexist, a mixed AMD/ATI and NVIDIA system remains impossible and out of Lucid's hands, short of hacking the OS.

Even if this capability is never supported, though, if Lucid can merely live up to its claims, it will be a groundbreaking development in the graphics industry.  Not only will it allow for repurposing of old graphics cards, but it will render NVIDIA's SLI chipsets and ATI's CrossFire connectors essentially obsolete.  Further, if it is offered at a reasonable price, it would be hard for motherboard makers not to put one of these on their boards to let users choose between the two graphics giants.

CrossFire-SLI may be a bit in the future still, but Lucid is dreaming big, and NVIDIA and ATI/AMD better watch out.



Comments



if it's too good to be true, it's probably not true
By Dribble on 8/20/2008 11:21:51 AM , Rating: 5
While I am happy to be proved wrong, history tells us that unknown companies promising magical solutions generally don't deliver.




By Brandon Hill (blog) on 8/20/2008 11:22:52 AM , Rating: 5
You mean to tell me you don't have a Segway?


By jabber on 8/20/2008 11:31:49 AM , Rating: 5
Hopefully we'll be looking back in 5 years and laughing at how utterly retarded SLI etc. was.

"Yes folks used to buy two $500 graphics cards even though they knew that chances are they would only get a 20% boost over using just one card!"

Mmmm intelligent render load balancing (my description).

IRLB...sounds catchy.


By jabber on 8/20/2008 12:30:51 PM , Rating: 5
"No one is "retarded" for paying extra for more performance. "

Hmmmmmm careful now. I think you are on thin ice there.

Doesn't your first statement contradict your second statement above?


By Diesel Donkey on 8/20/2008 1:17:20 PM , Rating: 5
Right, because the only thing a Ferrari has on a Civic is higher top end speed...I think something is missing here.


By feraltoad on 8/21/2008 5:21:39 AM , Rating: 3
You're right. The Civic has cup-holders. The nice kind that can hold a Big Gulp or a Route44 Cherry Limeade. Plus, if you buy the kind of cup holder that hangs on the door, when the Ferrari's butterfly door opens the drink will fall out on your head. Boy, that would be embarrassing!


By sviola on 8/21/2008 9:22:58 AM , Rating: 3
The word you are missing is "p****-magnet" :)

(read it with Kazakhstan accent, please :D)


By mmntech on 8/20/2008 12:23:25 PM , Rating: 3
The whole idea of multi card systems became retarded when dual core CPUs were introduced. I would have thought we'd see GPUs with multi-cores on one die by now. I guess there's more financial incentive to sell the same item twice, rather than seeking more practical solutions. The X2 cards (two GPUs on one PCB) are starting to change that but they're doing nothing that 3DFX didn't already try 10 years ago.

This certainly is an interesting concept though it is not that revolutionary. I've heard of modders being able to coax Crossfire to run on SLI boards and vice versa. There was even a rumour that allowed SLI to be run on ASRock's PCIe/AGP boards. The real sticking point is the card mixing. I doubt AMD or nVidia is going to sit on their laurels when that piece of tech comes out since it takes away the multi-GPU monopoly both companies have.


By Silver2k7 on 8/20/2008 1:28:42 PM , Rating: 2
A hydra chip and 3-4 GPU chips on a single PCB: it's a nice dream. Who will be the first to realize it? ;)


By djc208 on 8/20/2008 2:27:36 PM , Rating: 5
Probably no one since you'd need a small nuclear reactor to power the monster. Then there's the cooling tower needed for heat dissipation (from the card, the reactor needs its own).


By StevoLincolnite on 8/20/2008 2:03:07 PM , Rating: 2
quote:
they're doing nothing that 3DFX didn't already try 10 years ago.


Actually ATI was the first Manufacturer to release a Dual GPU Single PCB Design called the "Rage Fury Maxx" Which was 2 Rage Fury Cards on the one PCB, released in 1999. (Holy Cow! 9 years ago! I feel Old, I had one of those cards when they first came out...)

Actually I think the "GPU" naming scheme only applied to cards which supported TnL? I can remember when the Geforce 256 was released how nVidia was claiming it as a "GPU" and described the difference against other non TnL capable cards, and ATI then called its cards "VPU's".

While 3dfx Muttered that TnL was Useless and Anti-Aliasing as well as its T-Buffer or F-Buffer was the way to go.

While Matrox held the 2D Image Quality crown, and S3 were having issues with its Savage Cards having TnL faults.

The Voodoo 5 which came much later had 2 VSA100 chips, and the Voodoo 5 6000 was going to have 4 chips and a separate TnL chip, however it never reached the market.


By gforcefan on 8/20/2008 7:10:26 PM , Rating: 2
Actually, you forgot about the obsidian x24. That was two voodoo2's on one board. And that came out a long time before the ATI.


By Oscarine on 8/20/2008 8:29:03 PM , Rating: 1
Before the X-24 there was SLI Voodoo 1....

QUANTUM3D SHIPS NEW HIGH-PERFORMANCE REALTIME 3D GRAPHICS ACCELERATOR
Scaleable Realtime 3D Accelerator for PCs Provides Industry Leading Performance

SANTA CLARA, CA - OCTOBER 21, 1997 - Quantum3D, Inc. announced today that it has begun shipping its new scaleable realtime 3D graphics accelerator, the Obsidian 100SB. A replacement for the company's dual-board Obsidian 100 series of products, the 100SB delivers equivalent performance in a single PCI slot along with a host of new integration features— at a much lower price. The 100SB joins the Obsidian family of products for visual simulation, training, coin-op, location based entertainment and game enthusiast applications. The Quantum3D Obsidian 100SB starts as low as $795 MSRP.

Based on a scan-line-interleaved, 4- or 6-chip implementation of 3Dfx Interactive's award-winning Voodoo Graphics chipset, the Obsidian 100SB employs the chipset's patent-pending “texture streaming” architecture to produce up to 2.4 Gigabytes per second of dedicated graphics memory bandwidth. This high level of low-latency bandwidth enables the 100SB to deliver filtered texture fill rate performance of 90 Megapixels per second, with trilinear or bilinear texture filtering with per pixel LOD mip mapping, z-buffering, alpha blending, perspective correction and per pixel fog enabled-- which significantly exceeds the performance delivered by all other PC graphics accelerators, as well as most graphics workstations and image generators, irrespective of cost.

On Gemini Technology's OpenGVS Real World Benchmark Release 2.0, the Obsidian 100SB-4440, when coupled with a 300 MHz Intel Pentium II PC, delivers an average frame rate of 97.1 frames per second--approximately four times the performance delivered by the graphics boards based on the new Evans & Sutherland/Mitsubishi/VSIS 3Dpro/2mp chipset (score of 26.2), five times the performance of the Silicon Graphics O2 (score of 15.4), and almost twice the performance of the Real3D Pro 1000 model 1400 (score of 59.9). Additional information on the OpenGVS Real World Benchmark Release 2.0 may be found at (http://www.opengvs.com).

"Falcon Northwest integrates only the fastest, most reliable and best supported hardware in the industry into our gaming PCs. Quantum 3D's new Obsidian 100SB has surpassed our standards on all counts, giving us twice the 3D performance of any other PC on the market," said Kelt Reeves, president of Falcon Northwest, makers of the Falcon Mach V Gaming PC. "We're pleased to offer it to our customers."

As an exclusive mode 3D accelerator, the 100SB's advanced pass-through design operates transparently with popular 2D/VGA windows accelerators. In addition, the 100SB offers the option of adding an integrated 2D/VGA capability by means of “MGV”-- a 2MB windows accelerator daughter card which eliminates the need for a second 2D/VGA graphics board, which in most systems frees up an additional PCI slot. Another new integration feature on the 100SB is SyncLock-- which enables developers and integrators to synchronize video refresh across up to 13 displays for wide field of view 3D applications and totally immersive environments. This new feature greatly reduces the occurrence of “beat frequencies” and other annoying artifacts that are distracting in multi-channel visual simulation, training, and entertainment applications. The Obsidian 100SB also features simultaneous RGB and NTSC/PAL “TV-out” output capability with support for both S-Video and composite formats. The new accelerator is optimized for running applications under leading primary 3D APIs, including Microsoft Direct3D, OpenGL and 3Dfx Interactive's Glide. The 100SB solution also has a unique on-board authentication feature designed to enhance the protection of coin-op games, visualization applications and other proprietary software from software pirates.

"With the Obsidian 100SB, I get a combination of SLI performance and competitive 2D with the MGV daughter card--all in one slot as opposed to three," says W. Garth Smith of MetaVR. "Quantum3D's new 100SB graphic accelerator together with MetaVR's run-time format enables our VRSG product to be a viable alternative to proprietary visual simulation image generators. With the power of P-II based systems and the 100SB, many of our customers have eliminated the need for expensive SGI systems for deployment applications


By goku on 8/20/2008 3:13:10 PM , Rating: 2
Dual core GPUs aren't needed like they are with CPUs. GPUs are inherently parallel and making them more parallel is in a sense making them "multicore". Multicore processors are just multiple processors on one die, not multiple (die/dice/dies?) with a processor per die hooked up with wires. The reason for this is that you can't add complexity to a CPU without having software specifically written for it, but on a GPU all you need are new drivers.

Since you can't add complexity to a CPU without any major software changes, OS changes, patches or whatever, you're limited in how to boost performance. Smaller processes allow for higher clock speeds which is what you've seen, but because 100 million transistors makes for a very small die at 32nm process, you end up with a high yield in silicon and you have to produce a lot MORE processors in order to break even in costs. So what do you do? Add more transistors. Well since you can't be adding instructions like SSE, MMX etc. because they take time for software to take advantage of and are limited in how much they can improve performance, you've got to find a way to use up more silicon space.

That is where multicores come in. So instead of one processor, you have four, on one die, connected together in a very crude way (Intel) and now you've just increased the amount of silicon you're using while theoretically increasing performance 400%, a win for you (since you need to use up the silicon) and win for the consumer who supposedly gets a massive performance increase (never happens).

For GPUs, you don't need to worry about multiple GPU cores. Why? Because GPUs are proprietary, there generally isn't any special software being written for them, no OS interaction with them (unlike a CPU), the only thing that bridges the OS and the GPU is a simple driver, a driver that can be rewritten at anytime. So because GPUs are inherently parallel, and you can add "X number of stream processors" or whatever concoction they think of next in order to boost performance (aside from a simple clock speed increase), there is no need for "Dual Core GPUs".

I'm not saying a Dual Core GPU is not possible, but what I am saying is that efforts towards a dual core GPU would be better spent making a better, more efficient GPU architecture, something that Intel and AMD can't, won't or don't need to do. (Cause they've already done it?)

Also one important thing you should remember is that GPUs run FAR hotter than CPUs, are FAR more complex (tons more wires hooking up into them), and use TONS more power than a CPU. So imagine doubling the amount of heat being produced in the exact same space and you end up with a disaster. But because CPUs have gotten cooler now, multicore works a lot better than it did for the P4 series as those ran hot even on the smaller processes.


By Garreye on 8/20/2008 7:44:17 PM , Rating: 2
I think another big problem with multicore GPUs is yields. GPUs are already huge chips and it's hard to get good fab yields on them as it is, and doubling or tripling the size probably wouldn't help matters.


By someguy123 on 8/20/2008 11:14:37 PM , Rating: 2