39 comment(s) - last by ttowntom on Aug 16 at 8:46 AM

Is the general purpose processor here to stay? Or is it just a blip on the future architectural landscape?

Back in 2001, I raised a few eyebrows with the prediction that, by the year 2020, desktop processors would contain a hundred or more cores, many of them specialized to certain tasks. Today, with both Intel and AMD striving towards that goal, this prediction seems rather...mundane. So let's look a little further into the future.

Is there any effective limit on the number of individual processors one can put on a chip? 1,000? 10,000? Certainly a thousand cores by 2030 seems likely. Right now, software developers seem to be having a hard time writing code that uses more than one core effectively, but this will change. Those programmers will die out, if need be, to make way for the new breed.

Specialized cores are inevitable for one simple reason. If you design a chip for a specific task, the same number of transistors can perform that task 10-100 times faster. Sometimes more. These kinds of performance gains can't be ignored.

If a 100-core chip has room for a few specialized cores, what will our thousand-core monster look like? Certainly some just for physics, some for media operations, and one or more graphics processors -- why have a separate 3D accelerator card when you have transistors to burn?

But what else? Why not include some cores designed specifically to run certain applications? Include a core for rendering HTML and running JavaScript...and you have a hardware-accelerated web browser. Why not one for word processing? In the future, will Intel and AMD be bowing to pressure from Microsoft to include "Office 2030" cores in their chips? Certainly a few designed just for the Windows kernel seem inevitable. Throw in a few hundred cores, each specialized to a different set of algorithms (sorting, string searching, etc.) and one has to ask -- will we need any generalized processing? Just bounce a thread around from core to core, depending on its current needs.

Does this mean the general-purpose processor will eventually go the way of the dinosaur?  One can argue that, while the cores themselves are specialized, the chip as a whole still isn't...and you'd be at least half-right.  But it seems clear that on an architectural level, we are eventually going to have to rethink the concept of "general purpose" processing.


Comments


By Tyler 86 on 8/11/2006 11:40:45 PM , Rating: 2
quote:
Today, with both Intel and AMD striving towards that goal, this prediction seems rather...mundane. So let's look a little further into the future.

Is there any effective limit on the number of individual processors one can put on a chip? 1,000? 10,000? Certainly a thousand cores by 2030 seems likely. Right now, software developers seem to be having a hard time writing code that uses more than one core effectively, but this will change. Those programmers will die out, if need be, to make way for the new breed.


What the hell? Loony tunes written all over this one...

Sure, parallelization is here to stay, but there comes a point when you hit diminishing returns, and stacking cores together in a hurry, you're looking to hit it face first...

The multi-core strategy began with supercomputers which employed multiple processors working together...

That's all fine and dandy, but it brought with it ridiculous overhead... That overhead is compounded with multiple cores per bus, without direct paths to memory (not so problematic with AMD's integrated memory controller).

Refining, enhancing, and adapting the architecture to the demands of the present, and phasing out the old is the way it's going to go -- the way it's been going since it began.

"Dual-core" technology solved a here-and-now economic price-to-performance problem, and the concept of multiple cores will be around forever, because of the inherent economic security in mass production of the same article...

There is a line where the overhead to manage the cores on the die becomes greater than the benefit of the cores themselves. Now, I don't know exactly where the line is, but I'm fairly certain it's some relatively high power of two plus one.
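
The diminishing-returns argument above can be pictured with Amdahl's law plus a per-core coordination cost. The sketch below is illustrative only: `speedup` and `best_core_count` are hypothetical helpers, and the parallel fraction (95%) and overhead per extra core (0.05% of the single-core run time) are assumed numbers, not measurements of any real chip.

```python
def speedup(cores: int, parallel_fraction: float = 0.95,
            overhead_per_core: float = 0.0005) -> float:
    """Amdahl-style speedup with an assumed linear coordination overhead."""
    serial = 1.0 - parallel_fraction            # work that cannot be split
    parallel = parallel_fraction / cores        # work divided across cores
    overhead = overhead_per_core * (cores - 1)  # assumed cost of managing extra cores
    return 1.0 / (serial + parallel + overhead)


def best_core_count(max_cores: int = 1024, **kwargs) -> int:
    """Core count in 1..max_cores that gives the highest modeled speedup."""
    return max(range(1, max_cores + 1), key=lambda n: speedup(n, **kwargs))


if __name__ == "__main__":
    for n in (1, 2, 8, 32, 128, 1024):
        print(f"{n:5d} cores -> speedup {speedup(n):.2f}")
    print("best core count under these assumptions:", best_core_count())
```

With these assumed numbers the modeled speedup peaks at a few dozen cores and then falls as the overhead term takes over. Where the peak lands depends entirely on the two parameters.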

Instead of 'simply' more cores, more implicitly unordered, explicitly synchronous, and more predictably asynchronous feature enhancements/instructions will be added, and the old truly simple scalar functions will be phased out.

I do agree that the current 'legacy' general purpose portion of modern ALUs will evaporate with time, but there will always be a general purpose capable portion of a CPU, be it primary, dedicated, peripheral, or emulated.

quote:
now, software developers seem to be having a hard time writing code that uses more than one effectively

This is simply not true. In the vast majority of cases where multiple CPUs are supported at all by a developer, the operations performed are limited by both supply and demand.
Supply, encoding & encryption;
For the vast majority of products in this area, parallelization took days, and the performance benefits were vast.
Video games, demand;
Certainly, better multiple-CPU support will get you better physics and higher framerates when playing at 640x480 with low graphical quality, pushing the limit of your CPU to some ridiculously high number like 200fps, 400fps, even 500fps+... but should you crank that up, the CPU limitation bows out to the GPU limitation.




By Tyler 86 on 8/11/2006 11:49:05 PM , Rating: 2
DAMMIT, I hit post by accident, trying to rewrite this portion... NO EDIT BUTTON!

Supply, encoding & encryption;
For the vast majority of products in this area, parallelization took days, and the performance benefits were vast. Multiple-core CPUs took a gigantic leap in the charts... faster-than-realtime processing is commonplace. On workloads with more data and less complex algorithms, though, such as those to be employed on Blu-ray, HD-DVD, and dual-layer DVD, the problem will become memory limited on current dual-core systems before it becomes CPU limited. Current DVDs are only 4.7 GB total, but these new content delivery systems are going to be breaching 4.7 GB per second, or more with holographic storage... Intel and AMD are averting this as best they can, by moving up the memory chain.

It's the way the system works, supply and demand. This has nothing to do with hundreds of cores, or whatever. It's just the economy.


By Tyler 86 on 8/11/2006 11:57:00 PM , Rating: 2
quote:
Specialized cores are inevitable for one simple reason. If you design a chip for a specific task, the same number of transistors can perform that task 10-100 times faster. Sometimes more. These kinds of performance gains can't be ignored.

If a 100-core chip has room for a few specialized cores, what will our thousand-core monster look like?


Oh, I see what you're imagining, it's a market driven vision on top of engineering potential, although it should be the other way around...

You're confusing cores for parallelism.
The FPU slash MMX unit is separate from the ALU, and the XMM/SSE unit is yet another separate unit... they're individual cores in that respect, and they can perform instructions concurrently... and we've had this parallelism since they integrated the x87 FPU into the x86 CPU way, way back in the day.


By Tyler 86 on 8/12/2006 12:02:08 AM , Rating: 2
quote:
Why not include some cores designed specifically to run certain applications? Include a core for rendering HTML and running JavaScript...and you have a hardware-accelerated web browser. Why not one for word processing?

To summarily answer, simply, because there's something better.
It's been employed for quite some time... Field programmable logic units. I hear they've got them up to 2 GHz plus nowadays... perhaps you've seen them in the news. They're also getting their own avenue in AMD's HTX interface, I hear...


By foxglove on 8/12/2006 6:02:00 AM , Rating: 2
quote:
Loony tunes written all over this one... Sure, parallelization is here to stay, but there comes a point when you hit diminishing returns

You missed the main point of the article. When you have room for that many cores, it's better to do the job with a few cores of specialized logic rather than trying to split the job equally across a thousand identical processors. The diminishing returns you talk about are what drive the process.

Go and read it again, maybe it'll make more sense.


By Tyler 86 on 8/12/2006 12:13:51 PM , Rating: 2
Alright, I read it in its entirety...

quote:
... Just bounce a thread around from core to core, depending on its current needs.

Does this mean the general-purpose processor will eventually go the way of the dinosaur? One can argue that, while the cores themselves are specialized, the chip as a whole still isn't...and you'd be at least half-right. But it seems clear that on an architectural level, we are eventually going to have to rethink the concept of "general purpose" processing.


We've already been doing that with the general purpose ALU, the FPU/MMX unit, and the XMM/SSE unit... it isn't exactly 'bouncing between specialized cores', but it accomplishes the same effect...

I don't see 'few cores' anywhere in the article...

A thousand identical cores would be easier to design, produce, and even use... In order to get the most performance out of the silicon you're working with, though, you have to specialize...

If that's the point of the article, there sure was a lot of bush beating in there.


By Tyler 86 on 8/12/2006 12:16:33 PM , Rating: 2
Ah I see, "few specialized cores"...
I don't think we'll see more than the 1:3 general-to-specialized ratio we have now.


By Trisped on 8/14/2006 1:33:29 PM , Rating: 2
But the larger the number of cores on a package, the higher the failure rate. Look at the Cell processor with its 10-20% yield of fully working 8-core processors.


By masher2 (blog) on 8/14/2006 3:30:25 PM , Rating: 2
> "the large the number of cores on a package, the higher the falure rate..."

Yields are an inverse function of die size, not core count. If you have a chip of a given size, it'll have essentially the same yield, whether it contains 10 massive cores or 1000 small ones (ignoring some slight fluctuations due to layout differences).

However, the more cores you have, the less any single failure affects you. IBM is already considering the idea of a self-healing Cell processor, in which a spare SPE automatically takes over in the event of a failure. In a traditional processor, any failure typically faults the entire chip. On the other hand, our thousand-core example might well work fine with a hundred or more failed cores, albeit slightly more slowly.
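
The yield argument above can be sketched with the first-order Poisson yield model, Y = exp(-A * D), where A is die area and D is defect density. This is a simplification: the area and defect-density values are assumptions chosen for illustration, the helper names are hypothetical, and real yield models also account for defect clustering and for shared logic such as interconnects.

```python
import math


def die_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Poisson yield model: probability that a region of silicon has no defects."""
    return math.exp(-area_cm2 * defects_per_cm2)


def yield_with_spares(area_cm2: float, defects_per_cm2: float,
                      cores: int, required: int) -> float:
    """Probability that at least `required` of `cores` equal-sized cores work.

    Assumes the die is split evenly into independent cores and ignores
    uncore logic; both are simplifying assumptions.
    """
    p = die_yield(area_cm2 / cores, defects_per_cm2)  # yield of one core
    return sum(
        math.comb(cores, k) * p**k * (1.0 - p) ** (cores - k)
        for k in range(required, cores + 1)
    )


if __name__ == "__main__":
    area, density = 2.0, 0.5  # assumed: 2 cm^2 die, 0.5 defects per cm^2
    print("all 10 of 10 cores good:    ", yield_with_spares(area, density, 10, 10))
    print("all 1000 of 1000 cores good:", yield_with_spares(area, density, 1000, 1000))
    print("at least 900 of 1000 good:  ", yield_with_spares(area, density, 1000, 900))
```

Under this model the chance of a fully working die depends only on total area, not on how many cores it is divided into, while tolerating some failed cores raises the usable yield sharply.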


By Tyler 86 on 8/15/2006 1:53:25 PM , Rating: 2
True... then you'd have to come up with an entirely new marketing scheme;

100-200 working cores, $100
200-400 working cores, $200
400-600 working cores, $250
600-780 working cores, $300
780-900 working cores, $350
900-1000 working cores, $400

Bleh, then you run into 'unlockers', 'overclockers', relabeling, speed binning problems... I'm sure they can be worked out.

Yes, larger dies directly affect the chance of defect.
The interconnects on such a complex 'x86-64' core would take up quite an amount of space, and if they're laid out in a linear bus-like fashion similar to Cell or graphics chips, damage to the bus could cause a very high loss count, even if the physical cores are intact... if they go for dedicated interconnects between cores, or even every 2 cores, or 4 cores, they end up with a very big die size problem with 1,000 cores.


By Tyler 86 on 8/15/2006 2:00:17 PM , Rating: 2
Maybe the defect problem will be worked out down to 1% or less of yield one day... when that day comes, general purpose in multiplex could be the optimal CPU design... but definitely not on the current x86 implementation.

Perhaps Intel's idea of 'pluggable' dies will one day come to fruition, and they'll produce many individual cores with full-way interconnectivity that can be 'plugged' into the CPU... that would greatly cut their potential losses, and that's what their objective has been for a very long time - for better or for worse.
I think that's how AMD took the lead while they did.


Not so fast...
By lemonadesoda on 8/11/2006 9:24:27 PM , Rating: 3
<my view>

I thoroughly disagree with some posters about the future of multi-core processors reaching 1000s of cores. While such processors may exist in the future, they will be RARE, and not used in mainstream applications. There are a number of reasons:

1./ There are only a very FEW computational tasks that gain from independent multi-parallelization, multi-core. These tasks tend not to be what Mr. Consumer, or Mr. Office worker, is doing.

2./ Most numerical computational tasks benefit significantly from DEPENDENT multi-parallelization, usually implemented through vector register processing. This is different from multi-core. Vector based processors are seen in special purpose computational mainframes etc.

3./ We humans tend to think, talk, drive, analyse, communicate, in a linear fashion. The typical task we do is linear. Any aid to our work can therefore work in a linear fashion. (Note this is a rather rough and simplistic argument, and is based on the concept of work aids, where steps in the process do not require multi-independent computational ability)

4./ DEPENDENT multi-parallelization has a complexity that increases by (n-1)! (n minus one factorial). Getting the processors to communicate with each other, both at a hardware level, and a software level, increases at, essentially, an exponential level.

5./ Developing software becomes harder, more time-consuming, and therefore more expensive, when developing multi-core applications. There is therefore going to be a point of diminishing returns from a cost-benefit viewpoint. For most applications the number of processors will be in the single digits, not the thousands.

6./ For applications like server-database, there is a bus constraint. Very quickly, the interconnect is saturated (communicating with memory, HDD, network), with increasing processors.

***

So although SUPER-PI and other benchmark calculations may look good under actual or theoretical 1000s of processors, in real-life commercial applications, the marginal benefit will quickly diminish... And at some point the "gain" will diminish faster than the "cost" of the hardware, software, interconnect, etc.

</my view>
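
As a point of comparison for the communication-cost argument in item 4: the (n-1)! figure is the commenter's own estimate, but even the plain count of point-to-point links grows quickly with core count. The sketch below compares two textbook topologies; the function names are hypothetical and nothing here describes a real chip's interconnect.

```python
def full_mesh_links(cores: int) -> int:
    """Links needed if every core has a dedicated connection to every other core."""
    return cores * (cores - 1) // 2


def ring_links(cores: int) -> int:
    """Links needed if each core is connected only to its neighbours in a ring."""
    if cores <= 1:
        return 0
    if cores == 2:
        return 1
    return cores


if __name__ == "__main__":
    for n in (2, 4, 10, 100, 1000):
        print(f"{n:5d} cores: full mesh {full_mesh_links(n):7d} links, ring {ring_links(n):5d} links")
```

A full mesh grows quadratically, which is why large designs tend to trade direct links for shared or multi-hop interconnects, at the cost of latency and contention.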




RE: Not so fast...
By Tyler 86 on 8/12/2006 12:35:40 AM , Rating: 2
quote:
3./ We humans tend to think, talk, drive, analyse, communicate, in a linear fashion. The typical task we do is linear. --snip--

If you truly believe this, sir, perhaps you are the opposite of ADHD; overtly linear.

I find myself constantly following multiple trains of thought, many possibilities, with many revisions, and constantly having one intersect another mid-thought -- even as I write this paragraph, I have revised perhaps 20 to 30, well, 31 now, ok, well 36 .. 37... times this first sentence... some of which occurred at the same time, and required a separate train of thought to ... sequentiate? :)

To contemplate the matter of human nonlinearity typically begets confusion, so I'm afraid I'm just going to leave it well enough alone, with this as its punctuation; humans are definitely not linear.

Perhaps you don't realise it, but you probably do it too.

Ever heard of a subconscious? Supposedly, everyone has one.

On 4., I agree entirely that diminishing returns will become all too apparent with continual core-stitching... because they already have.

On 6., In complex database processing, the bus constraint is less a concern than processing performance -- the more cores the merrier, there... however, complex database processing is fairly uncommon compared to massive, yet sequential and mundane parallel data processing... obviously, more bandwidth is the solution there.

On 5., Developing software becomes harder, because the features added are neglected abstraction, but with sufficient human element, this is remedied. Parallelization can easily be automated, although interface change is fairly difficult. That is the iceberg the Itanic struck. For better or worse though, a better architecture is better. It's the way we're going to have to go, even if migration takes a century or so, beginning with x86-64, perhaps skipping 2 or 3 alternative architecture generations, before ridding the architecture of its legacy components - but the demand will be there.

The holy grail as presented by the potential of quantum computation will be that strange point at which the question is answered before it can be asked. :P


RE: Not so fast...
By lemonadesoda on 8/12/2006 8:42:39 PM , Rating: 2
Tyler. Thanks for some good additional thoughts. However, I don't think the points I raised were negated in their relevance to mainstream computing. Remember that we are not looking at the 1 in a 1000 or 1 in a million application. We are looking at mainstream applications. I therefore challenge your reply:

3./ Can you think of 3 common and typical situations where a PC used by Mr Consumer, and 3 common and typical situations where Mr Office, would benefit from a 1000 core rather than, say, 10 core, processor? Please, no answers, like DivX your entire DVD collection in parallel. I'm looking for work-aid applications for Mr Office, and commonly repeated tasks that Mr Consumer does on his/her PC. And that's Mr Consumer, not Mr Whizz-Bang-Tech-Head. And from the perspective of "client" or "terminal" PC, not "server".

6./ Agreed for today's processors (although there is already an issue regarding integrated vs. external memory controllers on Quad core CPUs). I disagree that bus constraint is irrelevant when you are dealing with 1000 cores. The amount of data transfer between cores, memory, HDD, and network connection, will be orders of magnitude greater than today.

5./ I think that the typical application that is capable to run effectively on a 1000 core processor will essentially run 90% of the time in a single, or scheduling core, and 9% of the time in a small handful of cores, and less than 1% in anything more than 10 let alone 1000 cores.

The PC will spend most of the time idling at the UI, perhaps with a few processes running quietly in the background. For most user interactivity on the PC, the application will schedule tasks that will, at best, utilise a few cores, and occasionally, there will be a task that can benefit and has been programmed to leverage the 1000 cores. That task will last a moment and we are back to idling at the UI. Unless the marginal cost of a core is near zero, then the cost/benefit of a system is going to be where the majority of computational time is being spent. And I truly believe that for mainstream applications, this is going to be in the "can count on your fingers" number of cores, and not the 1000s+.
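
The 90% / 9% / 1% split in point 5 can be turned into a rough estimate. The sketch below treats those percentages as fractions of single-core run time and assumes the "small handful" phase can use at most 8 cores; both readings are assumptions made for illustration, and `run_time` is a hypothetical helper.

```python
# Each phase: (fraction of single-core run time, most cores the phase can use).
# The 0.90 / 0.09 / 0.01 split comes from the comment above; the limit of 8
# cores for the "handful" phase is an assumed value.
PHASES = [(0.90, 1), (0.09, 8), (0.01, 1000)]


def run_time(cores: int, phases=PHASES) -> float:
    """Relative run time on `cores` cores (1.0 equals the single-core time)."""
    return sum(fraction / min(cores, max_useful) for fraction, max_useful in phases)


if __name__ == "__main__":
    t10, t1000 = run_time(10), run_time(1000)
    print(f"10 cores:   {t10:.5f}")
    print(f"1000 cores: {t1000:.5f}")
    print(f"gain from 10 to 1000 cores: {(t10 / t1000 - 1) * 100:.2f}%")
```

Under these assumptions, going from 10 to 1000 identical cores shortens the total run time by well under 1%, which is the cost/benefit point being made. Specialized cores would change the picture by shrinking the per-phase times themselves, which this model does not capture.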

***

Your choice of answering the points out-of-order was rather nice... but you still answered them in a sequential if not linear fashion. ;-P


RE: Not so fast...
By Garreye on 8/15/2006 11:38:50 AM , Rating: 2
I think you're thinking too much in terms of programs that are run now. Obviously it's going to be difficult to point out situations that would use 1000 cores when most applications today for Mr. Office are for the most part single threaded. I don't think the idea behind having a 1000-core processor would be to utilize all of them at once in any situation. It's about specialization of cores so that no matter what you're doing on your computer, performance is maximized. As we continue to shrink process size and there's more space on the die, it's no longer gonna be about making sure that every nm of the CPU's core is being used all the time and in the most effective way, because space won't be a limiting factor like it is now.
So I agree with you when you say:
quote:
For most user interactivity on the PC, the application will schedule tasks that will, at best, utilise a few cores

but as the cores that it utilizes will be optimized for specific tasks, it will perform them very quickly.
As for having 1000 cores vs 10 cores, that is very hard to predict. I think it would all depend on adoption by the programming community, and so it will be dependent on how hard it is to program for the increasing number of cores. It seems to me that more development tools will be created to make massively multi-threaded programming much easier than it is now. Either that, or the processor will follow the idea behind CISC design and handle distributing the task to different cores so the programmer doesn't have to worry too much about this.
quote:
I disagree that bus constraint is irrelevant when you are dealing with 1000 cores. The amount of data transfer between cores, memory, HDD, and network connection, will be orders of magnitude greater than today.

I definitely agree with you on this. Bus constraint will be a huge factor with a large number of cores; communication/coordination between cores will be increasingly important.
quote:
Your choice of answering the points out-of-order was rather nice... but you still answered them in a sequential if not linear fashion

We answer questions in a linear fashion due to the medium in which we are communicating. That doesn't mean our minds work linearly at all. As we are typing, our mind is thinking about what we are typing, what we are going to type after, numerous other things on this subject, and many other things happening around us. If we were to try to put all our thoughts down in the order in which they come to us, it would be pretty much impossible, as we think different things in parallel and think much faster than we can type (or even speak), and the result would be for the most part incomprehensible to other people reading it.


RE: Not so fast...
By Tyler 86 on 8/15/2006 1:39:34 PM , Rating: 2
I'd rather have represented my thoughts in a flow chart, that way it would be more linear, and less sequential.
Although, you'd have branch points, whereas you cannot in a paragraph, without some serious explanation...

It is altogether easier for a human to perceive linear information than nonlinear information, although there are those cryptographic savants floating around...

Following your topic #3;

An Excel spreadsheet that generates graphs from plugins that go to SQL servers, and disk based Access .MDBs, through ODBC, or directly...
Each thread issued asynchronously gets scheduled to a new processor.
Process Explorer (www.sysinternals.com) showed at one point there were 783 threads, although in truth only 38 to 40 of them were attempting to execute at the same time...
In such a situation, a 40 core processor would be more efficient than a 10 core processor, but only in the most ridiculous fashion of milliseconds to nanoseconds difference.

I argued below that it isn't simply the number of cores a processor has that dictates its parallelization; our current 'single core' processors have 3 processing units: the general purpose arithmetic and logic unit (ALU), the floating point unit slash multimedia extension unit, and the streaming SIMD (single instruction, multiple data) extension unit...
Applications, if written correctly, can make use of all 3.
However, it's even easier to make use of just additional wider general purpose registers (see x64 technology) even across multiple cores.

It's very annoying for me to look at complaints about not optimally parallelized applications... Even single threaded applications that can be launched twice gain a benefit.

I know too damn well that typical office workers are likely to triple-click every Excel spreadsheet, and open 13 instances of Outlook... in that way, they too, can benefit from multiple cores.

I believe that covers client & terminal... as for thin-clients, true terminals, there is very little effect on anything more than 4 cores, as you have only 4 things to do at once - interface logic, audio, video, and networking.

On 6;

The gift of parallelization is that once you have 2 things that work correctly, synchronously, you can do 3 of them, and so forth. This doesn't just apply to computation, but also storage, and bandwidth.

A 1000 core processor with 1000 times the interconnect count with 1000 times the memory devices, interleaved and multi-channeled.... possibly even all in one piece of hardware, a 'stick', if miniaturization makes it down that small... or hell, 1000 core processor directly integrated to 1000 memory devices...

It's not irrelevant, it is a challenge, maybe even an 'entire obstacle', but there aren't as many rocket-scientist level requirements to overcome in the bus as there are in actual miniaturization of the CPU.

On 5;
Each task can take more than 4 cores... heck...
You're not a programmer, I'm assuming? ...
Every loop and iteration of a loop can be issued a thread.
You can essentially have multiple threads dedicated to issuing new threads...
Today, that's a bad idea, because the resources could better be spent on actually doing what small work needs to be done -- diminishing returns.
On tomorrow's processors, loading multiple small XML files to memory, sifting through certain XML files, evaluating a script function, rendering to the UI, accessing video, audio, and peripheral hardware, and buffering for what may be to come, not to mention predictive artificial intelligence that may be assisting the end user, however simple (Microsoft's Clippy) or complex...
New methods of input & interface, perhaps physically 3D, by voice, by expression, by brainwaves, or what have you...
It's not too far of a stretch to have just such an environment where processing on all 1,000 cores for more than mere seconds at a time for general activity becomes a reality.
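
The "every iteration of a loop can be issued a thread" idea from point 5 can be sketched with a thread pool. This is an illustration in Python, not the commenter's code: `process_item` and `parallel_loop` are hypothetical names, and the squaring step merely stands in for real per-iteration work. In standard CPython builds the global interpreter lock keeps CPU-bound threads from running in parallel, so the sketch shows the structure rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def process_item(item: int) -> int:
    """Stand-in for the body of one loop iteration (hypothetical workload)."""
    return item * item


def parallel_loop(items, max_workers: int = 4) -> list:
    """Run each iteration as a pooled task and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map schedules one task per item and preserves the input order.
        return list(pool.map(process_item, items))


if __name__ == "__main__":
    print(parallel_loop(range(10)))
```

Reusing a fixed pool rather than creating one thread per iteration is the usual way to limit the scheduling overhead that the comment itself calls a bad idea on today's hardware.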

The .NET framework is becoming more and more popular, even into the venue of 3D graphics -- I think it's quite ridiculous, but that doesn't make it go away. It's a manifestation of abstraction, it's very easy to read, to write, and to think in. It can be run-time compiled, and because the compiler can be updated, it can be constantly optimized to use whatever resources are at its disposal, if at all possible - although it'd still not be as efficient as native code, it wouldn't exactly be much overhead to incur in a 1,000 core environment.

You never know though; we could be talking about a 400 MHz, 1,000-core, monstrously energy-efficient processor here, too...

Parallelization doesn't necessarily mean activity, sometimes it's there just as a failover, or alternative.

On the topic of human nonlinearity;
This is just one of those branches on the flow chart, not to be assigned temporally to have occurred after I have written what I have written above.
Now, did I really say all I had to say? I'm not certain, and I don't know if I can be without getting feedback.
Sometimes thoughts in my mind repeat themselves after a certain delay after I've finished, as sort of an echo that feels like an afterthought, which cues me to revise what I've written... it's the way the human mind works, it introduces uncertainty in an attempt to assure itself certainty. It works, typically; although in extreme cases you see this very obviously in individuals suffering from paranoia, that it can work too well, to the point of insanity.
That brings me to this 'revelation'; there are all kinds of mental conditions that clearly express the nature of humanity's nonlinearity, such as common ADHD, and less common multiple personality disorder.
Now that I think I'm crazy, due to the common uncertainty insertion in my attempt to assure myself, I can sit back and laugh, and not care if I said all I had to say.
Content in my confusion, I submit. :P


modularization
By Kuroyama on 8/10/2006 12:18:51 PM , Rating: 2
From a previous Dailytech article

A major push for AMD’s K8L design is in “modular” component design – meaning everything from L3 cache to memory controllers are developed as individual components and linked together with reusable, robust designs.

Perhaps once a workable modularization approach is implemented then (exaggerating a bit) putting in a new core might merely be a matter of designing the computational bits and linking it in as a module. Could be pointing towards the world of more and more specialized cores.




RE: modularization
By rrsurfer1 on 8/10/2006 1:29:03 PM , Rating: 2
In a way, it's already done like this. CPUs are basically coded in hardware description languages that link together digital modules, then optimized. I think it's certain that the level of specialization will go up as the process size goes down. They currently have limited space so general purpose was best. With more space you can begin to look at optimized "cores" for specific purposes.

This is also how the human brain works essentially. The mind has numerous "special-purpose" neural networks that function in parallel, with a central bus, if you will, coordinating them. Certain tasks are extremely fast because the brain has dedicated subprocessors, other tasks are difficult for us because of the lack of dedicated "hardware".


RE: modularization
By pepangelist on 8/10/2006 7:49:24 PM , Rating: 2
Man!
You REALLY need to get out more.
The brain is not only a computation device, but a much more complex machine driven by tiny bolts of electricity and chemistry we haven't even begun to touch. All we know comes from "colored areas" on a CAT scan that light up if you are having fun, eating chocolate, thinking about a memory or whatever experiment scientists devise. But, as with the weather, men are still very hard to predict. Wouldn't you agree?

Anyway, your analogy wasn't bad at all, just incredibly incomplete.


RE: modularization
By rrsurfer1 on 8/10/2006 9:18:12 PM , Rating: 2
The computational theory of mind actually makes sense, and more and more research is showing it to be valid. The reason "men are hard to predict" is because the brain IS complicated, but research is proving it's just a complicated computer with many sub-processors. It's not just "cat scans" (or fMRIs) that show this activity, it's studies of abnormal individuals, such as people with brain disorders or damage.

And if you do some reading you'll see scientists do know quite a bit about the mind and more is being discovered every day.

I know my analogy is incomplete - you'd need thousands of pages to make it "complete." I still think it's a valid analogy.


RE: modularization
By hstewarth on 8/11/2006 12:55:15 AM , Rating: 2
I think you're absolutely right; this has been done and is constantly evolving. Think about even the early days of the 8088: it had an 8087 math coprocessor, which was later merged into the 486 chip. Later came MMX and SSE instructions.

Also, GPUs have sets of processors, and some systems have RAID processors to handle I/O.

But to truly take advantage of these extra cores, the application must be programmed to take advantage of them. It doesn't matter how many are there, how they are organized, or which manufacturer they come from. If the software doesn't take advantage of them, then why do it?


Definitely for workstations
By Donegrim on 8/10/2006 5:45:01 AM , Rating: 2
I definitely think this could catch on for specialized workstations, probably much sooner than it will for general PCs. I could imagine the big pro software vendors selling their own CPUs after commissioning the big chip makers to make them a customized core or two. Imagine going to Steinberg for your next DAW and buying a CPU with 4 "vst cores" built in. Or going to NewTek for a 3D workstation and getting a CPU with a "LightWave core" and maybe 8 more "render cores". I imagine this sort of thing will be really expensive to start with, but it could well become the norm.




RE: Definately for worstations
By tuteja1986 on 8/10/2006 10:21:33 AM , Rating: 2
But we really need to standardize these cores before Intel, AMD, and IBM start to release different solutions and programmers become confused about which solution to follow. I really think Microsoft needs to develop a physics API before it gets out of control.


RE: Definitely for workstations
By Ringold on 8/10/2006 8:07:18 PM , Rating: 2
I think it won't just be workstations. I think it's an awfully safe bet that as soon as AMD can pull it off we'll have our GPU sitting a few nanometers away from the other cores. Laptops would love it too.. come to think of it, any reason why they wouldn't be the FIRST to get such a processor? Integrated graphics don't have to be great, they just have to avoid being horrible, and it'd be one less large chunk of hardware to keep powered, right?

When though, I don't know. I'd hope it'd be within 2 or 3 years. If it takes another 4 or 5, I'd be pretty disappointed.. can hardly wait to bump up the FSB or equivalent and simultaneously OC all types of performance across the board.. and all under one waterblock!

Any speculation from you, masher, on the timeframe?


RE: Definitely for workstations
By masher2 (blog) on 8/10/06, Rating: 0


future cpus
By Demoulous on 8/10/2006 7:03:09 AM , Rating: 2
Given that we have seen the amalgamation of the vertex, pixel, and fragment shaders into the unified shader on the GPU, why would any future design waste precious space on specific tasks? It is more likely that they will unify with FPGA-like technologies and have a fully reprogrammable architecture, where any 'core' can be reprogrammed on the fly. An FFT accelerator for SETI or comms and a unified shader for GPU work are but a tiny fraction of the things a reconfigurable CPU can hardware-accelerate.




RE: future cpus
By masher2 (blog) on 8/10/2006 11:35:36 AM , Rating: 3
> "Why would any future design waste precious space on specific tasks..."

An excellent question and, at present geometry sizes, they won't. But when we start talking about the 8nm node and beyond, that space isn't going to be nearly so precious. In fact, chip designers are going to have more real estate than they know what to do with.


I don't think so...
By maxvelocity on 8/11/2006 9:09:20 AM , Rating: 2
Well, the general prediction I would agree with, of course, but not the conclusions. Why would anybody want to have HTML or JavaScript, or any such thing, hardwired in specialized circuits? You get the extra speed, of course, but as soon as any of these standards change your specialized core becomes useless.

Besides, the extra juice you can squeeze out is not so important, as the speed of the general purpose core will increase as well. For many of the examples given here, fast enough is good enough.

Probably, some functions like graphics rendering, networking and such will be hardwired, because here speed really is king. For other applications I would choose 990 general purpose cores (and 10 hardwired) any day of the week.




RE: I don't think so...
By foxglove on 8/11/2006 9:50:22 AM , Rating: 2
I don't think HTML standards will be changing much 25 years from now. And if they do, a special core would still accelerate the 99.9% that didn't change. The new stuff would need to run on one of the general cores.


Not quite
By Trisped on 8/14/2006 1:25:46 PM , Rating: 2
The reason behind the 1 core setup was that it is cheaper to make one core that can do everything than 100 that can only do 1 thing. The 1 core will always be working, where 99 of the 100 will not (because you are only doing one type of task).
There is also an advantage from the programming standpoint: all multi-core tasks can be made linear and stuck on one core, but not all tasks can be split onto multiple cores. For instance, if you are working with fractals, where the output of the last computation is the input for the next, it doesn't matter how many cores you have, you can only use one at a time.

The real and only reason we are going from 1 core to multi core is that demand for processing power is growing much faster than the technology to supply that power. As a result, we split the processor into 2 or more parts, allowing us that many more times the power. And once you have more than 1 core, why keep them all general? You know that x% is going to be used for this type of calculation, so why not give that its own core that will get the answer quicker and easier than if you used the general core?

In a beautiful world we would have one chip that did everything, connected by a million pins directly to each external device and drive, as well as providing all video and audio I/O services.




RE: Not quite
By ttowntom on 8/15/2006 3:37:19 PM , Rating: 2
quote:
if you are working with fractals, where the output of the last computation is the input for the next, it doesn't matter how many cores you have, you can only use one at a time.
Nope. Fractals are one of the easiest parallel processing tasks there is. You can calculate each pixel independently, so using 1000 cores to generate something like the Mandelbrot set would be very easy.
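
To make the per-pixel independence concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the names `escape_time`, `render_row` and `render`, the tiny 16x8 grid, and the coordinate window are all hypothetical choices, not anything from the article. The inner iteration for a single pixel really is sequential, as the quoted post says, but no pixel needs any other pixel's result, so rows can be handed to workers in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def escape_time(cx, cy, max_iter=50):
    """Iterations before z = z*z + c leaves |z| <= 2, or max_iter if it never does."""
    z = 0j
    c = complex(cx, cy)
    for i in range(max_iter):
        z = z * z + c  # sequential within one pixel: each step needs the last
        if abs(z) > 2.0:
            return i
    return max_iter


def render_row(job):
    """Compute one row; depends only on its own (y, width, height) arguments."""
    y, width, height = job
    cy = -1.5 + 3.0 * y / (height - 1)
    return [escape_time(-2.0 + 3.0 * x / (width - 1), cy) for x in range(width)]


def render(width=16, height=8, workers=4):
    """Farm rows out to a pool of workers; pool.map returns them in row order."""
    jobs = [(y, width, height) for y in range(height)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_row, jobs))
```

Threads are used here only to show that the work splits cleanly; in CPython they will not actually speed up this loop, and a real implementation would use processes, native code, or a GPU. The point is that `render(workers=1)` and `render(workers=4)` produce the same image.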



Arbitrary Conclusion
By mindless1 on 8/16/2006 3:05:54 AM , Rating: 2
quote:
Back in 2001, I raised a few eyebrows with the prediction that, by the year 2020, desktop processors would contain a hundred or more cores, many of them specialized to certain tasks.

Of course they did; it was a wild, unfounded speculation that ignored the fact that computers already have multiple processing units across the various buses, but that is not what a CENTRAL PROCESSING UNIT is meant to be. A core philosophy is its high programmability.

quote:
Is there any effective limit on the number of individual processors one can put on a chip? 1,000? 10,000? Certainly a thousand cores by 2030 seems likely. Right now, software developers seem to be having a hard time writing code that uses more than one effectively, but this will change. Those programmers will die out, if need be, to make way for the new breed.

Yes, the effective limit is the amount of space, the process size, and how effective such cores could be at the smallest process size possible, given that this will limit the transistor area for each. It goes back to the core of why they're programmable rather than specialized: it allows focus on primary tasks. Software developers are having a hard time because they're not sufficiently motivated; the market profit for it is small.

Of course this will change, because as more multi-core systems enter the market, the customer base grows. The idea of programmers dying out is foolish; THEY know far better than you what they've been ordered to write for money.

quote:
Specialized cores are inevitable for one simple reason. If you design a chip for a specific task, the same number of transistors can perform that task 10-100 times faster. Sometimes more. These kind of performance gains can't be ignored.

In certain areas this is true: highly compute-bound, popular things such as video decoding, gaming and the like. The larger issue with lesser functions is that if you have 100,000 (for example) specialized cores but the user only does 10 demanding things simultaneously, higher performance would have been had for those 10 things from dedicating fewer, larger cores than from those tiny specialized ones. You'd potentially have the tenfold increase in performance per transistor with the small dedicated core, but with fewer cores you'd have perhaps 1000 times as many transistors per demanding use.
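
The trade-off in the paragraph above can be put into rough numbers. The following Python sketch is a back-of-the-envelope model, not a measurement: the function names, the one-billion transistor budget, the choice of 100 general cores (which gives the "1000 times as many transistors" figure), and above all the `scaling_exponent` parameter are assumptions made for illustration. An exponent of 1.0 assumes performance grows linearly with transistor count; 0.5 is a more pessimistic square-root assumption.

```python
def per_task_throughput(transistors_per_core, efficiency, scaling_exponent=1.0):
    """Toy model: throughput = efficiency * transistors ** scaling_exponent."""
    return efficiency * transistors_per_core ** scaling_exponent


def general_vs_specialized(total_transistors=1e9,
                           n_special=100_000,   # many tiny specialized cores
                           n_general=100,       # fewer, larger general cores
                           special_gain=10.0,   # assumed 10x per-transistor advantage
                           scaling_exponent=1.0):
    """Ratio of one large general core's throughput to one tiny specialized core's."""
    special = per_task_throughput(total_transistors / n_special,
                                  special_gain, scaling_exponent)
    general = per_task_throughput(total_transistors / n_general,
                                  1.0, scaling_exponent)
    return general / special
```

Under the linear assumption, `general_vs_specialized()` comes out at about 100 in favour of the larger general core; with `scaling_exponent=0.5` the advantage shrinks to roughly 3. Either way the result depends entirely on the assumed numbers, which is really the point of the argument.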

Even now we see this: Intel and AMD didn't go dual core for the cited reason, they did it because they can't ramp up a single core indefinitely. They could do quad core on PCs now, if they felt it was a marketable venture. What's more profitable in server markets than using quad cores? Selling two CPUs.

Further, dedicating a range of specialized cores requires years of research and work based upon certain assumptions such as that everyone would still use MPEG4/2. Oops, now MPEG4/10 is catching on. Assuming that Netscape's browser is the winner. Oops, IE took over. Same with Java or any other technology, including Windows itself. General purpose cores are the answer to a broad and changing market. Having more than one is not evidence that very large numbers of specialized cores will be as useful.




RE: Arbitrary Conclusion
By ttowntom on 8/16/2006 8:46:36 AM , Rating: 2
quote:
The idea of programmers dying out is foolish

The article didn't say programmers were going to die out, just programmers who couldn't write multithreaded code.

quote:
based upon certain assumptions such as that everyone would still use MPEG4/2. Oops, now MPEG4/10 is catching on
Hardware video acceleration is already common, even now when video standards are changing very quickly. In 20-30 years, they'll be changing very slowly, if at all. Besides, a processor designed to speed up MPEG4/2 is probably going to work pretty well at MPEG4/10 also.


By sxr7171 on 8/12/2006 9:22:51 PM , Rating: 3
I think what you say is very likely the direction things are heading. DailyTech as a site will become better with writers such as yourself.




new platform
By kattanna on 8/10/2006 11:39:41 AM , Rating: 2
What I have been waiting for is the day when you can mix and match various cores as needed. You would have an interconnect board where you would plug in various processing cores as easily as you change a CD in/out of a drive nowadays.

Also, move away from electrical to optical interconnects. I wouldn't want to design a board with, say, 16 processing slots/sockets; it would be a wiring nightmare. But if each socket/slot was optically connected, the design could be much simpler. Basically, just make the "motherboard" more of an optical interconnect backplane.

Memory and peripherals would also connect optically to the backplane the same way.

That would allow a person to get a backplane, and when they need a truly faster system they could buy a new processor module and memory module that had more advanced optical multiplexing abilities to increase bandwidth between the 2, while the backplane wouldn't need to be changed.




By pepangelist on 8/10/2006 7:53:12 PM , Rating: 2
In ascending importance and all running in parallel:

- CD Burning
- MPEG1/2/3/4... encoding and decoding
- Most important: User Interface hardware. I don't want to ever have a computer that is unresponsive again, ever.




A thousand chips in one motherboard.
By RexRed on 8/11/2006 12:46:21 AM , Rating: 2
This would certainly give one much headroom for whatever computing was at hand. With this level of raw speed and threshold, would there be as much need to compartmentalize tasks? If so, it would logically still run along the traditional lines of audio/video and foreground/background maintenance and utilities.

Whatever would be available for the main CPU would also mirror the same level of technology available for audio chips and video chips and even handheld chips. Once someone can convert a high resolution 2 hour movie in, say, 2 minutes or less to whatever format they want, what is there really left for computers to achieve for the basic consumer? Their audio and video needs are now complete. Multimedia will be mature "soon". The plasma screen will not need to expand much larger for the home user. The largest surface a video feed would need to cover in a room is all six or so walls (including floor and ceiling). Hi-res has already reached those resolutions, and many upscale PCs can already handle those loads. Not very long ago computers could not even handle audio streams well.

So what is next? Well, it comes back to raw computing power. Projects like the ones that help fight cancer by linking an online network of computers. This may become very big someday as more and more of our genetic/biological survival may depend on it.




By hstewarth on 8/12/2006 7:09:39 PM , Rating: 2
I just thought of something: this idea is sort of the opposite of the idea of unified shaders in GPUs.




By rogerv on 8/13/2006 11:12:03 AM , Rating: 2
There is one need that could drive massively multi-core CPUs - artificial intelligence for domestic robots. That's the one consumer item that will remain challenging and could drive further CPU advancement.

As the population continues to age and people live longer, there will be a need for sophisticated domestic robots that can be companions and caretakers to the elderly.




Copyright 2009 DailyTech LLC.