The Quest for More Processing Power, Part One: "Is the single core CPU doomed?"
by Johan De Gelas on February 8, 2005 4:00 PM EST
"What you have seen is a public demonstration of 4 GHz silicon straight off our manufacturing line. We have positive indications to be able to take Netburst to the 10 GHz space."
"While architectural enhancements are important, Intel intends to continue its lead in raw speed. Otellini demonstrated a new high-frequency mark for processors, running a Pentium 4 processor at 4.7 GHz."
The first assertion was made at IDF Spring 2002, and the second press release was broadcast after Fall IDF 2002. Fast forward to the beginning of 2004, and we read in the Prescott presentation: "2005-2010: the era of thread level parallelism and multi-core CPU technology." What happened to "the 10 GHz space"?
Fig 1. "2005-2010: the era of thread level parallelism and multi-core CPU technology."
The presentation of the new 6xx Prescott even states that Intel is now committed to "Adding value beyond GHz". This sounds like Intel is no longer interested in clock speeds, let alone 10 GHz CPUs.
Already, the hype is spreading: dual core CPUs offer a much smoother computing experience; processing power will quickly increase from about 5 gigaflops to 50 gigaflops; and so on. It is almost as if higher clock speeds and extracting more ILP (Instruction Level Parallelism), which has been researched for decades now, are no longer important.
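The ILP ceiling that researchers keep running into comes down to dependency chains: a superscalar CPU can only overlap instructions that do not depend on each other's results. A minimal Python sketch (illustrative only, not hardware code) shows the same reduction written two ways; the instruction counts are identical, but the dependency-chain depths differ, and it is the depth that bounds how much parallelism the hardware can extract:

```python
# Two ways to sum n numbers. A superscalar core can overlap independent
# adds, but each add in the serial chain must wait for the previous
# result, so chain depth (not instruction count) limits the speedup.

def serial_sum(xs):
    """Dependency-chain depth: len(xs) - 1; every add waits on the last."""
    acc = 0
    for x in xs:
        acc = acc + x  # acc depends on the previous acc: no ILP here
    return acc

def tree_sum(xs):
    """Dependency-chain depth: ~log2(len(xs)); adds within one round
    are mutually independent, so hardware could issue them in parallel."""
    xs = list(xs)
    while len(xs) > 1:
        # Pairwise adds in this round share no dependencies.
        xs = [xs[i] + xs[i + 1] for i in range(0, len(xs) - 1, 2)] + \
             ([xs[-1]] if len(xs) % 2 else [])
    return xs[0]

data = list(range(1, 9))
print(serial_sum(data), tree_sum(data))  # both print 36
```

Compilers and out-of-order engines effectively try to turn the first form into something closer to the second, but real code is full of chains that cannot be restructured this way, which is why ILP gains have flattened.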
At the same time, we are hearing that "Netburst is dead, Tejas is cancelled and AMD's next-generation K9 project is pushed back." Designs built for high clock speeds and IPC (Instructions Per Clock) are no longer regarded as heroes, but as black sheep. They are held responsible for all the sins of the CPU world: exploding power dissipation, diminishing performance increases and exorbitant investments in state-of-the-art fabs to produce these high clock speed chips. A Prescott or Athlon 64 CPU in your system is out of fashion. If you want to be trendy, get a quad core Pentium M, also known as Whitefield, made in India.
To the point
I am exaggerating, of course. A good friend of mine, Chris Rijk, said: "PR departments have no 'middle gears': they either hype something to great lengths, or not at all." Trying to understand what is really going on is the purpose of this article. We are going to take a critical look at what the future CPU architectures have to offer. Is the traditional approach of increasing IPC and clock speed to get better performance doomed? Does multi-core technology overcome the hurdles that were too high for the single-core CPUs? Are multi-core CPUs the best solution for all markets? Will multi-core CPUs make a difference in the desktop and workstation market?
In this first installment, we explore the problems that the current CPU architectures face. The intention is to evaluate whether the solution proposed by Intel and other manufacturers is a long-term one, one that really solves those problems. We will also investigate one CPU in particular, the Intel Prescott. The article is divided into four chapters:
- Chapter 1 - The brakes on CPU power: the problems that CPU architects face today: wire delay, power and the memory wall.
- Chapter 2 - Why single core CPUs are no longer "cool": the reason why Intel and others propose dual core as a solution to these problems.
- Chapter 3 - Containing the epidemic problems: whether or not these problems can be solved without dual core.
- Chapter 4 - The Pentium 4 crash landing: a case study of the Intel Prescott.
Although Intel is undeniably the industry leader in the CPU market, that doesn't always mean its proposed solutions are the right ones. Remember MMX, for example, a technology that was supposed to turn the (x86-based) PC into a multimedia monster. In hindsight, the critics were right: MMX was little more than a marketing stunt to make people upgrade.
The first implementation of hyperthreading on Intel's Foster Xeon (Willamette Xeon) was turned off by default by all OEMs. And hyperpipelined CPUs with 30+ stages turned out to be an impressive, but pretty bad idea.
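Why a 30+ stage pipeline disappoints can be put in rough numbers. Every mispredicted branch flushes the pipeline, so the penalty grows with pipeline depth and eats into the clock-speed gains that the deep pipeline was built to deliver. All figures below are illustrative assumptions, not measured Prescott data:

```python
# Back-of-the-envelope cost of deep pipelines (illustrative numbers):
# a mispredicted branch flushes the pipeline, so the penalty in cycles
# is roughly the pipeline depth.

def branch_cpi(base_cpi, branch_freq, mispredict_rate, pipeline_depth):
    """Average CPI including branch-misprediction flushes, with the
    flush penalty approximated by the pipeline depth in cycles."""
    return base_cpi + branch_freq * mispredict_rate * pipeline_depth

# Hypothetical comparison: 20-stage vs. 31-stage pipeline,
# 20% of instructions are branches, 5% of those are mispredicted.
short = branch_cpi(1.0, 0.20, 0.05, 20)  # -> 1.20 CPI
deep = branch_cpi(1.0, 0.20, 0.05, 31)   # -> 1.31 CPI
print(f"20-stage: {short:.2f} CPI, 31-stage: {deep:.2f} CPI")
```

Under these assumptions, if the deeper pipeline buys a 1.3x higher clock, net throughput improves by only about 1.3 x (1.20/1.31) ≈ 1.19x, and the gap widens as branch behavior worsens.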
In other words, not every hyped technology has turned out to be beneficial for the customer. Millions of customers are still waiting for the rich Internet content that is enabled by, and runs so much faster on, the Netburst architecture...
Zak - Wednesday, August 22, 2007 - link
I seem to remember reading, probably a couple of years ago, about research into superconductivity at "normal" temperatures. Right now superconductivity occurs only at extremely low temperatures, right? If materials were developed that achieve the same at normal temperatures, it would solve a lot of these issues, like wire delay and power loss, wouldn't it?
Tellme - Monday, February 21, 2005 - link
Carl, what I meant was that soon we might not see much improved performance with multi-cores either, because the data arrives too late at the processor for quick execution. (That is true for single cores as well.)
Did you check the link?
Their idea is simple:
"If you can't bring the memory bandwidth to the processor, then bring the processors to the memory."
Currently, the processor spends most of its time waiting for the data to be processed.
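The waiting that the comment above describes is easy to quantify with the standard memory-stall model. The numbers below are illustrative assumptions (not measurements of any real chip), but they show why even a small cache-miss rate leaves a fast core idle most of the time:

```python
# Back-of-the-envelope memory-wall arithmetic (illustrative numbers):
# DRAM latency is hundreds of CPU cycles, so a small miss rate
# dominates the average cycles per instruction.

def effective_cpi(base_cpi, miss_rate, miss_penalty_cycles):
    """Average cycles per instruction once memory stalls are included."""
    return base_cpi + miss_rate * miss_penalty_cycles

# Hypothetical fast core: ideal CPI of 0.5, ~300-cycle DRAM access,
# 2% of instructions missing all caches.
cpi = effective_cpi(0.5, 0.02, 300)
stall_fraction = (cpi - 0.5) / cpi
print(f"effective CPI: {cpi:.1f}, cycles stalled: {stall_fraction:.0%}")
# effective CPI 6.5 -> roughly 92% of cycles spent waiting on memory
```

Doubling the number of cores sharing the same memory bus does nothing to this per-core picture, which is the bandwidth wall the commenter is pointing at.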
carl0ski - Saturday, February 19, 2005 - link
#61 I thought the P4 already had memory bandwidth problems.
AMD has a temporary workaround (the on-die memory controller), which helps when multiple CPUs/dies use the same FSB to access the RAM.
Intel has proposed multiple FSBs, one per CPU/die.
Does anyone know if that means they will need separate RAM DIMMs for each FSB? Because that would make for an expensive system.
carl0ski - Saturday, February 19, 2005 - link
[quote]59 - Posted on Feb 12, 2005 at 11:28 AM by fitten
#57 What was the performance comparison of the 1GHz Athlon vs. the 1GHz P3? IIRC, the Athlon was faster by some margin. If this was the case, then there was a little more than tweaking that went on in the Pentium-M line. Because they started out looking at the P3 doesn't mean that what they ended up with was the P3 with a tweak here or there. :)[/quote]
#59 Didn't the 1 GHz P3 run 133 MHz SDRAM on a 133 MHz FSB?
The 1 GHz Athlon had a nice DDR 266 FSB to support it.
Tellme - Monday, February 14, 2005 - link
Nice article.
I think dual cores will soon hit the wall, i.e. memory bandwidth.
Hopefully, memory and processors will be integrated in the near future.
ceefka - Monday, February 14, 2005 - link
Though still a little too technical for me, it makes a good read.
It's good to know that Intel has eaten its words and realized it had to go back to the drawing board.
I believe that sooner rather than later, multi-core will mean 4 - 8 cores, providing the power to emulate everything that is not necessarily native, like running Mac OS X on an AMD or Intel box. In other words, the CELL will meet its match.
fitten - Saturday, February 12, 2005 - link
#57 What was the performance comparison of the 1GHz Athlon vs. the 1GHz P3? IIRC, the Athlon was faster by some margin. If this was the case, then there was a little more than tweaking that went on in the Pentium-M line. Because they started out looking at the P3 doesn't mean that what they ended up with was the P3 with a tweak here or there. :)
avijay - Friday, February 11, 2005 - link
EXCELLENT article! One of the very best I've ever read. Nice to see all the references at the end as well. Could someone please point me to Johan's first article at AT? Thanks.
fishbreath - Friday, February 11, 2005 - link
For those of you who don't actually know this:
1) The Dothan IS a Pentium 3. It was tweaked by Intel in Israel, but its heart and soul is just a PIII.
1b) All P4s have hyperthreading in them, and always have had. It was a fuse feature that was not announced until there were applications to support it. But anyone who has HT and Windows XP knows that Windows simply has a smoother 'feel' when running on an HT processor!
2) Complex array processors are already in the pipeline (no pun intended). However, the lack of an operating system or language to support them means that they will make their first appearance in dedicated applications such as H.264 encoders.
blckgrffn - Friday, February 11, 2005 - link
Yay for Very Large Scale Integration (more than 10,000 transistors per chip)! :) I wonder when the historians will put down in the history books that we have hit the fifth generation of computing...