Superscalar to the Rescue

If deepening the pipeline gives us higher clock speeds and more instructions being worked on at a time, but at the expense of lower performance when things aren’t working optimally, what other options do we have for increasing performance?

Instead of going deeper, what about making our chip wider? In our previous example only a single instruction could be active at any given stage in the pipeline - what if we removed that limitation?

A superscalar processor is one that allows multiple instructions to be active at any given stage in the pipeline. Through some duplication of resources you can now have two or more instructions at the same stage at the same time. The simplest superscalar implementation is a dual-issue design, where two instructions can go down the pipe in parallel. Today’s Core 2 and Core i7 processors are four-issue (four instructions go down the pipe in parallel); the high end hasn’t been dual-issue since the days of the original Pentium processor.

The benefits of a superscalar chip are obvious: a dual-issue design can potentially double the number of instructions completed per clock. Combine that with a reasonably pipelined, high clock speed architecture and you have the makings of a high performance processor.

The drawbacks are also obvious; enabling a multi-issue architecture requires more transistors, which drive up die size (cost) and power (heat). Only recently have superscalar designs made their way into mobile devices thanks to smaller and cooler switching transistors (e.g. 45nm). You also have to worry even more about keeping the CPU fed with instructions, which means larger caches, faster memory buses and clever architectural tricks to extract as much instruction level parallelism as possible. A dual-issue chip is a waste if you can’t keep it fed consistently.
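
To make the "keeping it fed" point concrete, here is a minimal Python sketch of a toy in-order issue model. It is not a model of any real core: the function name `issue_cycles`, the one-cycle result latency and the dependency lists are all assumptions made for illustration. It simply counts how many cycles it takes to issue a stream of instructions at a given issue width.

```python
def issue_cycles(deps, width):
    """Count cycles needed to issue a stream in order on a toy machine.

    deps[i] lists the indices of earlier instructions that instruction i
    needs results from. Assumption: a result is usable one cycle after
    its producer issues, so a consumer cannot issue in the same cycle.
    """
    issued_at = {}   # instruction index -> cycle in which it issued
    cycle = 0
    i = 0
    while i < len(deps):
        cycle += 1
        slots_used = 0
        while i < len(deps) and slots_used < width:
            # In-order issue: stall the rest of this cycle if a producer
            # issued in this same cycle.
            if any(issued_at[d] == cycle for d in deps[i]):
                break
            issued_at[i] = cycle
            i += 1
            slots_used += 1
    return cycle


# Eight independent instructions: no instruction waits on another.
independent = [[] for _ in range(8)]
# Eight instructions where each one needs the result of the previous one.
chain = [[]] + [[k - 1] for k in range(1, 8)]

for name, stream in (("independent", independent), ("chain", chain)):
    for width in (1, 2):
        print(name, "width", width, "->", issue_cycles(stream, width), "cycles")
```

With independent instructions the dual-issue configuration finishes in half the cycles. With the dependent chain the second issue slot sits idle and the wider machine gains nothing, which is why extracting instruction level parallelism matters so much.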

Raw Clock Speed

The previous two examples of architectural enhancements are major improvements in design. To design a modern day CPU with more pipeline stages or to go from a single to dual-issue design takes a team years to implement; these are not trivial improvements.

A simpler path to improving performance is to just increase the clock speed of the CPU. In the first example I provided, our CPU could only run as fast as the most complex pipeline stage allowed it. In the real world however, there are other limitations to clock speed.

Manufacturing issues alone can severely limit clock speed. Even though an architecture may be capable of running at 1GHz, the transistors used in making the chip may only be yielding well at 600MHz. Power is also a major concern. A transistor usually has a range of switching speeds. Our hypothetical 45nm process may be able to run at 300MHz at 0.95V or 600MHz at 1.3V; higher frequencies generally mean higher voltage, which results in higher power consumption - a big issue for mobile devices.
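
The voltage and frequency trade-off can be sketched with the standard first-order approximation for dynamic CMOS power, P ≈ C·V²·f. The snippet below plugs in the hypothetical 45nm operating points from the paragraph above. It ignores leakage power and assumes the switched capacitance is the same at both points, so treat the result as a rough illustration rather than a measurement; the function name is made up for this example.

```python
def relative_dynamic_power(voltage, freq_mhz, base_voltage, base_freq_mhz):
    """Dynamic power relative to a baseline, using P ~ C * V^2 * f.

    The capacitance term C cancels because it is assumed identical
    at both operating points.
    """
    return (voltage / base_voltage) ** 2 * (freq_mhz / base_freq_mhz)


# Hypothetical operating points from the text: 300MHz at 0.95V, 600MHz at 1.3V.
power_ratio = relative_dynamic_power(1.3, 600, 0.95, 300)
speed_ratio = 600 / 300
perf_per_watt_ratio = speed_ratio / power_ratio

print(f"power: {power_ratio:.2f}x, clock: {speed_ratio:.1f}x, "
      f"perf/watt: {perf_per_watt_ratio:.2f}x")
```

Under these assumptions, doubling the clock costs roughly 3.7 times the dynamic power, so performance per watt drops to a little over half of the baseline.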

The iPhone’s processor is based on a SoC that can operate at up to 600MHz, but for power (and battery life) reasons Apple/Samsung limit the CPU core to 412MHz. The architecture can clearly handle more, but the balance of power and battery life gates us. In general, increasing clock speed alone isn’t a desirable way to improve performance in a mobile device like a smartphone because your performance per watt doesn’t improve tremendously, if at all.

In terms of sheer performance however, just increasing clock speed is preferred to deepening your pipeline and increasing clock speed. With no increase in pipeline depth you don’t have to worry about keeping any more stages full; everything just works faster if you increase your clock speed.

The key takeaway here is that you can’t just look at clock speed when it comes to processors. We learned this a long time ago in the desktop space, but it seems that it’s getting glossed over in the smartphone market. A 400MHz dual-issue core is going to be a better performer than a 500MHz single-issue core with a deeper pipeline, and the 528MHz processor in the iPod Touch is nowhere near as fast as the 600MHz processor in the iPhone 3GS.
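
A back-of-the-envelope way to compare such cores is to multiply clock speed by the average number of instructions completed per clock (IPC). The IPC figures below are purely hypothetical placeholders, not measurements of any real chip, and the function names are invented for this sketch. The point is only that a lower-clocked core wins whenever its IPC advantage exceeds the clock speed ratio.

```python
def instructions_per_second(clock_mhz, ipc):
    """Sustained throughput: clock rate times average instructions per clock."""
    return clock_mhz * 1_000_000 * ipc


def break_even_ipc_ratio(slow_clock_mhz, fast_clock_mhz):
    """IPC advantage the slower-clocked core needs just to tie."""
    return fast_clock_mhz / slow_clock_mhz


# Hypothetical IPC values chosen only for illustration.
dual_issue = instructions_per_second(400, ipc=1.3)
single_issue = instructions_per_second(500, ipc=0.8)

print("dual-issue faster:", dual_issue > single_issue)
print("IPC ratio needed to tie:", break_even_ipc_ratio(400, 500))
```

In this sketch the 400MHz core only needs to complete 1.25 times as many instructions per clock as the 500MHz core to match it; anything beyond that is a net win despite the lower clock.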


  • psonice - Tuesday, July 7, 2009 - link

    My understanding is that the iphone 3gs GPU is actually a 535, not a 520. At least, this is the current understanding among iphone developers, and there's an SGX535 driver on the phone to support that. The extra power might explain the hit on battery life when playing games.

    Real numbers are pretty hard to come by, but it seems the 535 is roughly 4x faster than the 520. If so, that's a massive upgrade rather than just a decent one. The 535 also supports HD video decoding where the 520 doesn't - not that apple seem to be supporting it if it does.

    I heard too that the palm pre has a 530 GPU, which is 2x faster than the 520. That puts the iphone a long way ahead for graphics instead of behind.

    One thing in the article I really disagree with btw: you say that the phone makers should provide detailed specs. I think they shouldn't, as it's not helpful at all for the average buyer. If you go into a shop without having much clue and ask for an iphone because it's the latest thing, and the shop assistant says "well this is like an iphone, but it runs 200mhz faster" you'll end up buying the "better" phone based on the spec sheet, even if it's running win mobile 5.

    I was in Japan a while back, and they tend to buy phones based on the spec sheets there. The phones all compete on having the most features. They're all really big and HORRIBLE to actually use. None of that please!

    I think apple actually get their commercials right with the iphone on the whole: show somebody actually using the phone to do stuff. If the other manufacturers did the same, that would be a perfect way to compare.
  • christinme7890 - Thursday, July 9, 2009 - link

    I agree with you holistically. There are not many people in this world that even understand the specs. Not to mention when it comes to specs, and the person has no clue, they end up getting the one with the highest numbers. This is bad. I think you are right in saying that the way apple works their commercials is perfect for people. They show people all the great apps that they could use and they say that ALL of these apps can be on one phone.

    This is why I hate the Best buy MS commercials where the kid goes into the BB and buys a PC instead of a mac. The person always buys the computer with the best specs and care little about the OS, which is what they will be using. Windows, imo after using a Mac for a year, sucks in comparison to Mac. I rarely have a problem with a mac. I sit in class everyday and watch all the pc people have startup errors and os sleep or hibernation errors. I can close my mac and KNOW WITHOUT A DOUBT that it will wake up totally fine. Not to mention it wakes up seamlessly without load screens or anything. I will not compare the two but for business and usability the MAC gets my vote and I think if Apple does their commercials for the macs just as great. Sure most people are still using MS but that is because MS strong arms people into buying their stuff everytime you buy a Computer (not to mention Apple is very strict with their software and rightly so).
  • Anand Lal Shimpi - Tuesday, July 7, 2009 - link

    Ooh, very interesting - do you have any links to discussions on the 535 being in the 3GS?

    I don't think end users need to be bombarded with specs, but I think there needs to be more information put out about these things. We shouldn't have to play guessing games about clocks and specs; don't market them, but don't hide them either - that's my thinking.

    Take care,
  • BlazingDragon - Tuesday, July 7, 2009 - link

    Anand, here it is:
  • Anand Lal Shimpi - Tuesday, July 7, 2009 - link

    Very interesting - thanks guys, I've updated the article.

    Take care,
  • ltcommanderdata - Tuesday, July 7, 2009 - link

    It should probably also be noted that the MBX-Lite supports OpenGL ES 1.1 as implemented by Apple not just OpenGL ES 1.0. I believe it's Android's implementation that currently only supports OpenGL ES 1.0.

    It's also been reported that the iPhone OS 3.1 betas include improvements to the OpenGL stack that include additional OpenGL extensions. Whether these are focused on OpenGL ES 2.0 and the SGX or are also for OpenGL ES 1.1 and the MBX remains to be seen. Although on the issue of reducing market segmentation, it'd be great if Apple could implement the OpenGL ES 1.1 Extension Pack although I don't know if the MBX-Lite can actually support it in hardware.
  • BlazingDragon - Tuesday, July 7, 2009 - link

    Anand, here it is:
    iPhone 3GS Has More Powerful PowerVR SGX 535 GPU?
  • kelmerp - Tuesday, July 7, 2009 - link

    I'm trying to decide between the MyTouch or a jailbroken iphone.
  • sxr7171 - Wednesday, July 8, 2009 - link

    JB iPhone vs. MyTouch? They're not even in the same league. Pre vs. iPhone is a comparison.
  • pennyfan87 - Tuesday, July 7, 2009 - link


    i love your writing and tech analysis.

    but please, drop the fanboyism.
    3 articles on such a minor upgrade? please.

    more SSD stuff please.
