Intel says parallel software is more important for many-core CPUs like "Larrabee"

Multi-core processors have been in the consumer market for several years now. However, despite having access to CPUs with two, three, four, and more cores, users still have relatively few applications available that can take advantage of multiple cores. Intel is hoping to change that and is urging software developers to think parallel.

James Reinders, Intel director and chief evangelist for software development products, talked about thinking parallel in a keynote speech he delivered at the SD West conference recently. Reinders said, "One of the phrases I've used in some talks is, it's time for us as software developers to really figure out how to think parallel." He also says that developers who don’t think parallel will see their career options limited.

Reinders gave the attendees eight rules for thinking parallel from a paper he published in 2007, ComputerWorld reports. The eight rules are: think parallel; program using abstraction; program tasks, not threads; design with the option of turning off concurrency; avoid locks when possible; use tools and libraries designed to help with concurrency; use scalable memory; and design to scale through increased workloads.

He says that after half a decade of shipping multi-core CPUs, Intel is still struggling with how to use the available cores. The chipmaker is under increasing pressure from NVIDIA, which is leveraging a network of developers to program parallel applications to run on its family of GPUs. NVIDIA and Intel are embroiled in a battle to determine whether the GPU or the CPU will be the heart of future computer systems.

Programming for processors with 16 or 32 cores takes a different approach, according to Reinders. He said, "It's very important to make sure, if at all possible, that your program can run in a single thread with concurrency off. You shouldn't design your program so it has to have parallelism. It makes it much more difficult to debug."
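Reinders's advice to keep a single-threaded path, together with his rule to program tasks rather than threads, can be sketched in a few lines. The Python example below is a minimal illustration and is not taken from Intel's tools: the names `run_tasks` and `square` and the `workers` parameter are hypothetical. The work is expressed as tasks, and setting `workers` to 1 turns concurrency off so the same logic can be debugged serially.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """Stand-in for one independent unit of work (hypothetical)."""
    return n * n


def run_tasks(task, items, workers=1):
    """Apply task to every item and return the results in input order.

    workers=1 runs everything in the calling thread (concurrency off);
    workers>1 hands the same tasks to a thread pool.
    """
    if workers <= 1:
        return [task(item) for item in items]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, matching the serial path.
        return list(pool.map(task, items))


serial = run_tasks(square, range(8), workers=1)
parallel = run_tasks(square, range(8), workers=4)
print(serial == parallel)
```

Because both paths return results in the same order, a bug can first be reproduced with `workers=1` before any threading is involved.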

In the speech, Reinders talked about Intel Parallel Studio, a tool kit for developing parallel applications in C/C++ that is currently in beta. Reinders added, "The idea here [with] this project was to add parallelism support to [Microsoft's] Visual Studio in a big way."

Intel says that it plans to offer the parallel development kit to Linux programmers this year or early next year. The many-core CPU Reinders is referring to is the Larrabee processor. Intel provided some details on Larrabee in August of 2008.

One of the key features of Larrabee is that it will be the heart of a line of discrete graphics cards, a market Intel has not participated in. Larrabee is said to contain ten or more cores inside the discrete package. If Larrabee arrives in the form Intel talked about last year, it will compete directly against NVIDIA and ATI in the discrete graphics market.

NVIDIA is also rumored to be eyeing an entry into the x86 market. Larrabee will be programmable in the C/C++ languages, just as NVIDIA's GPUs are via the firm's CUDA architecture.


RE: What am I missing here?
By sinful on 3/14/2009 3:58:18 AM , Rating: 2
Now suppose it has a multi-threaded AI engine (let's say 4 threads). Whenever the time to render a screen comes up, a bunch of threads can compute the reaction of every character simultaneously, rather than one-by-one. So if you got 20 characters and need an average of 2 ms to determine the course of action per character, you use about 10 ms to compute that (20 characters x 2 ms each / 4 threads). That's parallel processing.

In theory, the parallel engine could handle 80 characters while maintaining the same performance level of the serial engine.

Of course, the multi-threaded solution has some overhead, and the numbers are simplified. And I'm not even talking about the mess you can get into if a bug sneaks its way into your data synchronization mechanism. But you should get the idea why a multi-threaded game engine has more potential than a single-threaded one.

But you're comparing practical *real-world* vs. *theoretical*, and no surprise, the theoretical sounds better.

In *theory*, space travel is simple. In reality, it's really complicated to actually do.
Lots and lots of cores offers amazing speedups in *theory*, but in reality it's extremely complicated to actually do.

For example, a counterpoint to your example:
Let's say the single thread can utilize the same data for each character. Instead of each additional character taking another 2ms, you might have the 1st one cost 2ms and the subsequent ones cost .5ms. Computing it serially, your 20 characters cost about 12ms.

In contrast, the multiple threads might not be completely independent - what happens when all 20 characters are in the same area? What other factors in the environment affect them? You also assume the total cost of the AI is 100% CPU bound, and not dependent on something else - like memory speed, etc. You also assume that all 20 AI threads would execute at the same time - requiring 20 cores. In reality, you might only have 8 cores, so not all characters can be computed simultaneously. In fact, if you've got 8 cores, you're stuck waiting for at least one core to compute the AI for 3 characters sequentially (i.e. 20 characters/8 cores = 2.5 characters per core = at least one core computing 3 characters). Thus, 3 characters x 2ms = 6ms - without your extra thread overhead.
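The load-balancing arithmetic here can be checked with a short sketch. It assumes every character costs the same amount of time and ignores thread overhead, as the simplified numbers in this thread do; the function names `characters_per_worker` and `frame_ms` are hypothetical.

```python
def characters_per_worker(characters, workers):
    """Split characters across workers as evenly as possible."""
    base, extra = divmod(characters, workers)
    # The first `extra` workers each take one additional character.
    return [base + 1 if w < extra else base for w in range(workers)]


def frame_ms(characters, workers, ms_per_character):
    """Ideal frame time: the busiest worker sets the pace (overhead ignored)."""
    return max(characters_per_worker(characters, workers)) * ms_per_character


print(characters_per_worker(20, 8))  # [3, 3, 3, 3, 2, 2, 2, 2]
print(frame_ms(20, 8, 2))            # 6
print(frame_ms(20, 4, 2))            # 10
```

With 8 workers, four of them finish after 2 characters and sit idle while the other four compute a third, which is why the result is 6 ms rather than 20 x 2 / 8 = 5 ms.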

After all is said and done, we'll say your multi-core approach now takes 8ms. Yes, 8ms is better than 12ms.... but how much is that going to help in the real world? And how much more complicated have you made things?

Granted, multiple cores offer some improvement - but it quickly reaches limits as to how far it scales, and to how much it can improve things.
As such, it only really shines when you have huge amounts of data to process - and the data is independent.

In your example, the benefits of saving 4ms (12ms down to 8ms) probably wouldn't be worth it (Even though it sounds great on paper!).

Now, if you're talking about 20,000 characters, then yes, it would help.

There's a HUGE difference between theory and reality.

RE: What am I missing here?
By Scrogneugneu on 3/15/2009 11:52:52 AM , Rating: 2
Do you know anything about programming?

Multiple threads can read from the same data just as a single thread can. Concurrency problems only happen when 2 threads want to write to the same memory location; any number of threads can read at once. The state of everything in the game can be read, but no change will happen to it until the next frame render, so each and every thread can read the same data at the same time.
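The read-only pattern described here can be sketched as follows. This is a minimal illustration under stated assumptions, not code from any game engine: the AI rule in `decide` and the name `next_frame` are made up for the example. Every task reads the same immutable snapshot of the current frame and returns its own result, so no two threads ever write to the same location.

```python
from concurrent.futures import ThreadPoolExecutor


def decide(index, snapshot):
    """Hypothetical AI rule: step toward the average position of all characters.

    Only reads the shared snapshot; never modifies it.
    """
    center = sum(snapshot) / len(snapshot)
    position = snapshot[index]
    if position < center:
        return position + 1
    if position > center:
        return position - 1
    return position


def next_frame(positions, workers=4):
    """Compute the next frame's positions from a read-only snapshot."""
    snapshot = tuple(positions)  # immutable: safe for any number of readers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task returns its own value, so there are no shared writes.
        return list(pool.map(lambda i: decide(i, snapshot), range(len(snapshot))))


current = [0, 4, 10, 10]
updated = next_frame(current)
print(updated)  # [1, 5, 9, 9]
```

In standard CPython the global interpreter lock keeps threads from speeding up CPU-bound work like this, so the sketch shows only the data-sharing pattern, not a performance gain.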

That's not to mention that I talked about a 4-thread engine, and you picked up and went with a 20-thread engine. If you want to compute 20 characters' actions, splitting it in 4 threads requires each thread to compute 5 characters sequentially. One thread per character is very, very wasteful.

Plus, the advantage I pointed out was that you could manage more AI resources in the same time. You can go from there and add a lot of complexity to the handling of the AI, thus ending up with a much more intelligent character. Suppose we do, and the computation time goes up to 5ms per character. By taking your own numbers (supposing the data reuse you speak of saves us 1.5ms per character), we end up with 5 + (19x3.5) = 71.5ms sequentially. Using your 20 threads example, that would be 3 characters x 5ms = 15ms.
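The two cost models being compared in this exchange can be written out as a small sketch. It assumes the serial engine pays full price for the first character and a reduced price for each later one, while the parallel engine gets no data reuse and no thread overhead; both are simplifications, and the function names are hypothetical.

```python
import math


def serial_ms(characters, first_ms, later_ms):
    """Serial cost: full price for the first character, reuse for the rest."""
    return first_ms + (characters - 1) * later_ms


def parallel_ms(characters, workers, ms_per_character):
    """Ideal parallel cost with no data reuse and no thread overhead."""
    return math.ceil(characters / workers) * ms_per_character


# Simple AI: 2 ms for the first character, 0.5 ms for each later one.
simple_serial = serial_ms(20, 2, 0.5)
simple_parallel = parallel_ms(20, 8, 2)

# Heavier AI: 5 ms for the first character, 3.5 ms for each later one.
heavy_serial = serial_ms(20, 5, 3.5)
heavy_parallel = parallel_ms(20, 8, 5)

print(simple_serial, simple_parallel)  # 11.5 6
print(heavy_serial, heavy_parallel)    # 71.5 15
```

Under these assumptions the gap widens from 11.5 ms versus 6 ms for the simple AI to 71.5 ms versus 15 ms for the heavier one.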

Threading isn't meant to gain tremendous speed on everything. It's meant to handle large workloads better. Nobody will implement threading on simple tasks, but the capacity to lower the additional cost per character on AI computation is huge. The same logic goes for physics.

Related Articles
Intel Talks Details on Larrabee
August 4, 2008, 12:46 PM

Copyright 2016 DailyTech LLC.