Sandia simulations reveal memory is the bottleneck for some multi-core processors

Years ago, the hallmark of processor performance was clock speed. As chipmakers hit the wall on how far they could push clock speeds, processor designs moved to multiple cores to increase performance. However, as many users can tell you, performance doesn't always increase as you add more cores to a system.

Benchmarkers know that, for some uses, a quad-core processor often offers less performance than a similarly clocked dual-core processor. The reason for this phenomenon, according to Sandia, is memory availability. Supercomputers have tried to increase performance by moving to multi-core processors, just as consumer processors have done.

The Sandia team has found that simply increasing the number of cores in a processor doesn't always improve performance, and past a certain point performance actually decreases. Sandia simulations have shown that moving from dual-core to quad-core processors offers a significant increase in performance. However, moving from four cores to eight offers only an insignificant gain, and moving from eight cores to 16 causes performance to drop.

Sandia team members ran their simulations using algorithms for deriving knowledge from large data sets. The team found that at 16 cores, the performance of the system was barely as good as the performance seen with two cores.

The problem, according to the team, is a lack of memory bandwidth combined with contention among the cores for each processor's available memory bus. The team uses a supermarket analogy to explain the problem: if two clerks check out your purchases, the process goes faster; add four clerks and things are quicker still.

However, if you add eight or 16 clerks, it becomes a problem not only to get your items to each clerk, but also to keep the clerks out of each other's way, and checkout ends up slower than it would be with fewer clerks. Team member Arun Rodrigues said in a statement, "To some extent, it is pointing out the obvious: many of our applications have been memory-bandwidth-limited even on a single core. However, it is not an issue to which industry has a known solution, and the problem is often ignored."

James Peery, director of Sandia's Computations, Computers, Information, and Mathematics Center, said, "The difficulty is contention among modules. The cores are all asking for memory through the same pipe. It's like having one, two, four, or eight people all talking to you at the same time, saying, 'I want this information.' Then they have to wait until the answer to their request comes back. This causes delays."
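
The shape of that scaling curve can be sketched with a toy model. This is not Sandia's simulation: the function name, the parameters, and every number below are illustrative assumptions chosen only to show how a fixed shared memory bus plus a contention penalty can make throughput rise, flatten, and then fall as cores are added.

```python
def modeled_throughput(cores, per_core_rate=1.0, bus_capacity=5.0,
                       contention_cost=0.04):
    """Toy estimate of relative throughput for a given core count.

    All parameters are hypothetical, not measured values:
      per_core_rate   - work one core could do with unlimited memory bandwidth
      bus_capacity    - most work the shared memory bus can feed per unit time
      contention_cost - efficiency lost per unit of demand the bus cannot serve
    """
    # Demand on the memory bus grows linearly with the number of cores.
    demand = cores * per_core_rate
    # The shared bus caps how much of that demand can actually be served.
    served = min(demand, bus_capacity)
    # Requests the bus cannot serve still compete for it and slow everyone down.
    unmet = demand - served
    efficiency = 1.0 / (1.0 + contention_cost * unmet)
    return served * efficiency


if __name__ == "__main__":
    for cores in (2, 4, 8, 16):
        rate = modeled_throughput(cores)
        print(f"{cores:2d} cores -> relative throughput {rate:.2f}")
```

With these assumed parameters, the model gains a lot from two to four cores, gains only a little from four to eight once the bus saturates, and loses ground from eight to 16 as unserved requests pile up. Different assumed values move the saturation point, so the numbers themselves carry no weight; only the overall shape mirrors the behavior the researchers describe.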

The researchers say that memory systems available today offer dramatically better performance than those available a year ago, but the fundamental memory problem remains.

Sandia and Oak Ridge National Laboratory (ORNL) are working together on a project intended to pave the way for exaflop supercomputing. ORNL currently has the fastest supercomputer in the world, called Jaguar, which was the first supercomputer to break the sustained petaflop barrier.

Comments

RE: 2x4
By SlyNine on 1/18/2009 11:52:39 PM , Rating: 2
You ever check out your CPU load on these games? It almost never hits 100% on any core if you are not running an 8800GT or better.

I had a 4200X2 939 for some time and also upgraded to the Q6600. Ran the 4200X2 with a 1900XT and then upgraded to an 8800GT, then upgraded to a Q6600 and got my other 8800GT card after a time.

The biggest performance increase by FAR was going to the 8800GT on the 4200X2; the second biggest was adding another 8800GT. The Q6600 upgrade was nice, but not nearly the jump.

The 1900XT > 7900GS, and the 4850 > 8800GT. You had an even bigger jump in video and yet say the CPU was your biggest jump. You, sir, are full of poop. Or you are running all your games at 1024x768 with no AA.

RE: 2x4
By RubberJohnny on 1/19/2009 7:11:04 AM , Rating: 2
What res were you running? CPU load in the games I mostly play (TF2, BF2 and Company of Heroes) was maxing out some of the time at 1920x1200 with the X2 4200+ (BF2 at 1600x1200). When I turned off AA, things did not really get any smoother, so it must have been the CPU holding the 4850 back. Things didn't get acceptably smooth until I lowered the res back to 1280x1024.

However, when I got the Q6600 I could run all of those games on full quality at 1920x1200 and they were silky smooth (OK, maybe COH isn't silky on full ;). Perhaps my X2 4200+ system had some other hardware issue (it was a clean XP build), but I always felt that the 4850 didn't perform like I expected until I paired it with the Q6600.

RE: 2x4
By SlyNine on 1/19/2009 11:53:10 PM , Rating: 2
Well, I'm going to eat a little crow; I have not played or tested those games.

I play FC2, Crysis, America's Army, SupCom (Supreme Commander), Gears of War (big improvement going to the Q6600), Grid, and Stalker.

But still, in all my games the biggest improvement was going to the better video card. Perhaps you were having some other problem that was fixed with your Q6600 setup.

You have a much better setup moving forward than you did with the 4200X2, so no need to worry.

Copyright 2016 DailyTech LLC.