A8’s CPU: What Comes After Cyclone?

Despite the importance of the CPU in Apple’s SoC designs, it continues to be surprising just how little we know about their architectures, even years after the fact. Even though the CPU was so important that Apple saw the need to create their own custom design, and then produced two architectures in the span of just two years, they are not fond of talking about what they have done with those architectures. This, unfortunately, is especially the case at the beginning of an SoC’s lifecycle, and for A8 it isn’t going to be any different.

Overall, from what we can tell the CPU in the A8 is not a significant departure from the CPU in A7, but that is not a bad thing. With Cyclone Apple hit on a very solid design: use a wide, high-IPC design with low latency in order to reach high performance levels at low clock speeds. By keeping the CPU wide and the clock speed low, Apple was able to hit their performance goals without having to push the envelope on power consumption, as lower clock speeds help keep CPU power use in check. It’s all very Intel Core-like, all things considered. Furthermore, given that Cyclone was a forward-looking design with ARMv8 AArch64 capabilities and already strong performance, Apple does not face the same pressure to overhaul their CPU architecture that other current ARMv7 CPU designers do.


Close Up: "Enhanced Cyclone"

As a result, from the information we have been able to dig up and the tests we have performed, the A8 CPU is not radically different from Cyclone. To be sure there are some differences that make it clear that this is not just a Cyclone running at slightly higher clock speeds, but we have not seen the same kind of immense overhaul that defined Swift and Cyclone.

Unfortunately Apple has tightened up on information leaks and unintentional publications more than ever with A8, so the amount of information coming out of Apple about this new core is very limited. In fact this time around we don’t even know the name of the CPU. For the time being we are calling it "Enhanced Cyclone" – it’s descriptive of the architecture – but we’re fairly certain that it does have a formal name within Apple to set it apart from Cyclone, a name we hope to discover sooner rather than later.

In any case, one of the things we do know about Enhanced Cyclone is that, unlike Apple’s GPU of choice for A8, the CPU has seen a significant reduction in die size coming from the 28nm A7 to the 20nm A8. Chipworks’ estimates put the die size of Cyclone at 17.1mm² versus 12.2mm² for Enhanced Cyclone. On a relative basis this means that Enhanced Cyclone is 71% the size of Cyclone, which even after accounting for less-than-perfect area scaling still means that Enhanced Cyclone is a relatively bigger CPU composed of more transistors than Cyclone was. It is not dramatically bigger, but it’s bigger to such a degree that it’s clear that Apple has made further improvements over Cyclone.
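To make the die size reasoning concrete, here is a small Python sketch of the arithmetic. The two areas are the Chipworks estimates quoted above. The "ideal" shrink is our own simplifying assumption: it treats the 28nm and 20nm node names as literal linear dimensions, which real processes do not honor, so it should be read as a best-case bound rather than a measurement. The function names are ours, purely for illustration.

```python
# Die-area estimates quoted from Chipworks in the text above.
CYCLONE_AREA_MM2 = 17.1           # A7 CPU block, 28nm
ENHANCED_CYCLONE_AREA_MM2 = 12.2  # A8 CPU block, 20nm


def relative_area(new_mm2: float, old_mm2: float) -> float:
    """Return the new block's area as a fraction of the old block's area."""
    return new_mm2 / old_mm2


def ideal_shrink(new_node_nm: float, old_node_nm: float) -> float:
    """ASSUMPTION: perfect scaling, with area shrinking by the square of the node ratio."""
    return (new_node_nm / old_node_nm) ** 2


observed = relative_area(ENHANCED_CYCLONE_AREA_MM2, CYCLONE_AREA_MM2)
ideal = ideal_shrink(20, 28)

# A value above 1.0 means the block is larger than an unchanged design
# would be under the (optimistic) perfect-scaling assumption.
growth_vs_ideal = observed / ideal

print(f"observed area ratio:       {observed:.1%}")
print(f"ideal-shrink area ratio:   {ideal:.1%}")
print(f"size relative to pure shrink: {growth_vs_ideal:.2f}x")
```

Because real 20nm area scaling falls short of the ideal, the last figure overstates how much the design itself grew. It still points the same direction as the text: Enhanced Cyclone takes more area than a straight shrink of Cyclone would.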

The question of the moment is what Apple has put their additional transistors and die space to work on. Some of that is no doubt the memory interface; as we’ve seen earlier, L3 cache access times are nearly 20ns faster in our benchmarks. But if we dig deeper things start becoming very interesting.
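For a sense of scale, a time saving can be converted into core clock cycles. The short Python sketch below does that conversion. It treats "nearly 20ns" as a round 20ns, which is our simplification, and `ns_to_cycles` is a hypothetical helper name.

```python
def ns_to_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Convert nanoseconds to core clock cycles (1 GHz equals 1 cycle per ns)."""
    return latency_ns * clock_ghz


# ASSUMPTION: "nearly 20ns" is rounded up to 20ns for illustration.
saved_ns = 20.0

cycles_at_a8_clock = ns_to_cycles(saved_ns, 1.4)  # A8 runs at 1.4GHz
cycles_at_a7_clock = ns_to_cycles(saved_ns, 1.3)  # A7 runs at 1.3GHz

print(f"{saved_ns:.0f}ns is about {cycles_at_a8_clock:.0f} cycles at 1.4GHz")
print(f"{saved_ns:.0f}ns is about {cycles_at_a7_clock:.0f} cycles at 1.3GHz")
```

In other words, a saving of this size is worth a couple dozen core cycles on every L3 access, which is why it can matter even when the core itself changes little.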

Apple Custom CPU Core Comparison
| | Apple A7 | Apple A8 |
| CPU Codename | Cyclone | "Enhanced Cyclone" |
| ARM ISA | ARMv8-A (32/64-bit) | ARMv8-A (32/64-bit) |
| Issue Width | 6 micro-ops | 6 micro-ops |
| Reorder Buffer Size | 192 micro-ops | 192 micro-ops? |
| Branch Mispredict Penalty | 16 cycles (14 - 19) | 16 cycles (14 - 19)? |
| Integer ALUs | 4 | 4 |
| Load/Store Units | 2 | 2 |
| Addition (FP) Latency | 5 cycles | 4 cycles |
| Multiplication (INT) Latency | 4 cycles | 3 cycles |
| Branch Units | 2 | 2 |
| Indirect Branch Units | 1 | 1 |
| FP/NEON ALUs | 3 | 3 |
| L1 Cache | 64KB I$ + 64KB D$ | 64KB I$ + 64KB D$ |
| L2 Cache | 1MB | 1MB |
| L3 Cache | 4MB | 4MB |

First and foremost, in much of our testing Enhanced Cyclone performs very similarly to Cyclone. Accounting for the fact that A8 is clocked at 1.4GHz versus 1.3GHz for A7, in many low-level benchmarks the two perform as if they are the same processor. Based on this data it looks like the fundamentals of Cyclone have not been changed for Enhanced Cyclone. Enhanced Cyclone is still a very wide architecture that issues six micro-ops, and branch misprediction penalties are similar enough that it’s likely we’re looking at the same pipeline length.

However from our low-level tests two specific features stand out: integer multiplication and floating point addition. When it comes to integer multiplication, Cyclone had a single multiplication unit and each multiply took four cycles to execute. On Enhanced Cyclone those operations now measure in at three cycles. But more surprising is the total integer multiplication throughput rate; integer multiplication performance has now more than doubled. While this doesn’t give us enough data to completely draw out Enhanced Cyclone’s integer pathways, all of the data points to Enhanced Cyclone doubling up on its integer multiplication units, meaning Apple’s latest architecture now has two such units.

Meanwhile floating point addition shows similar benefits, though not as great as integer multiplication. Throughput is such that there appears to still be three FP ALUs, but like integer multiplication the instruction latency has been reduced. Apple has managed to shave off a cycle on FP addition, so it now completes in four cycles instead of five. Both of these improvements indicate that Enhanced Cyclone is not identical to Cyclone – the additional INT MUL unit in particular – making them very similar but still subtly different CPU architectures.
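The latency and throughput observations above can be sketched as a rough Python model. This is a back-of-the-envelope illustration, not a description of Apple's actual pipelines: `OpProfile` and its fields are hypothetical names of ours, the model assumes every unit is fully pipelined and can start one operation per cycle, and it assumes all three FP ALUs can execute additions. The cycle counts, unit counts and clock speeds are the ones given in the text and table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpProfile:
    latency_cycles: int  # cycles before a dependent op can use the result
    units: int           # execution units able to start this op
    clock_ghz: float     # core clock speed

    def latency_ns(self) -> float:
        """Wall-clock latency of one operation."""
        return self.latency_cycles / self.clock_ghz

    def peak_ops_per_ns(self) -> float:
        """ASSUMPTION: each unit is fully pipelined, starting one op per cycle."""
        return self.units * self.clock_ghz

    def dependent_chain_ns(self, n_ops: int) -> float:
        """Time for n_ops where each op waits on the previous result."""
        return n_ops * self.latency_ns()


# Figures from the article: A7 at 1.3GHz, A8 at 1.4GHz.
int_mul_a7 = OpProfile(latency_cycles=4, units=1, clock_ghz=1.3)
int_mul_a8 = OpProfile(latency_cycles=3, units=2, clock_ghz=1.4)
fp_add_a7 = OpProfile(latency_cycles=5, units=3, clock_ghz=1.3)
fp_add_a8 = OpProfile(latency_cycles=4, units=3, clock_ghz=1.4)

mul_throughput_ratio = int_mul_a8.peak_ops_per_ns() / int_mul_a7.peak_ops_per_ns()
add_throughput_ratio = fp_add_a8.peak_ops_per_ns() / fp_add_a7.peak_ops_per_ns()

print(f"INT MUL latency: {int_mul_a7.latency_ns():.2f}ns -> {int_mul_a8.latency_ns():.2f}ns")
print(f"FP ADD latency:  {fp_add_a7.latency_ns():.2f}ns -> {fp_add_a8.latency_ns():.2f}ns")
print(f"INT MUL peak throughput ratio: {mul_throughput_ratio:.2f}x")
print(f"FP ADD peak throughput ratio:  {add_throughput_ratio:.2f}x")
```

Under these assumptions, a second multiplier plus the clock bump yields slightly more than twice the peak multiply throughput, while FP addition throughput grows only with the clock. Code that is a long chain of dependent operations benefits from the latency cut instead, which is what `dependent_chain_ns` models.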


Apple iPhone Performance Estimates: Over The Years

Outside of these low-level operations, most other aspects of Enhanced Cyclone seem unchanged. L1 cache remains at 64KB I$ + 64KB D$ per CPU core, where it was most recently doubled for Cyclone. For L2 cache Chipworks believes that there may be separate L2 caches for each CPU core, and while L2 cache bandwidth is looking a little better on Enhanced Cyclone than on Cyclone, it’s not a “smoking gun” that would prove the presence of separate L2 caches. And of course, the L3 cache stands at 4MB, with the aforementioned improvements in latency that we’ve seen.

To borrow an Intel analogy once more, the layout and performance of Enhanced Cyclone relative to Cyclone is quite similar to Intel’s more recent ticks, where smaller feature improvements take place alongside a die shrink. In this case Apple has their die shrink to 20nm; meanwhile they have made some small tweaks to the architecture to improve performance across several scenarios. At the same time Apple has made a moderate bump in clock speed from 1.3GHz to 1.4GHz, but it’s nothing extreme. Ultimately while two CPU architectures do not constitute a pattern, if Apple were to implement tick-tock then this is roughly what it would look like.

Moving on, after completing our low-level tests we also wanted to spend some time comparing Enhanced Cyclone with its predecessor on some high level tests. The low-level tests can tell us if individual operations have been improved while high level tests can tell us something about what the performance impact will be in realistic workloads.

For our first high level benchmark we turn to SPECint2000. Developed by the Standard Performance Evaluation Corporation, SPECint2000 is the integer component of their larger SPEC CPU2000 benchmark. Designed around the turn of the century, officially SPEC CPU2000 has been retired for PC processors, but with mobile processors roughly a decade behind their PC counterparts in performance, SPEC CPU2000 is currently a very good fit for the capabilities of Cyclone and Enhanced Cyclone.

SPECint2000 is composed of 12 benchmarks which are then used to compute a final peak score, though in our case we’re more interested in the individual results.

SPECint2000 - Estimated Scores
| Benchmark | A8 | A7 | % Advantage |
| 164.gzip | 842 | 757 | 11% |
| 175.vpr | 1228 | 1046 | 17% |
| 176.gcc | 1810 | 1466 | 23% |
| 181.mcf | 1420 | 915 | 55% |
| 186.crafty | 2021 | 1687 | 19% |
| 197.parser | 1129 | 947 | 19% |
| 252.eon | 1933 | 1641 | 17% |
| 253.perlbmk | 1666 | 1349 | 23% |
| 254.gap | 1821 | 1459 | 24% |
| 255.vortex | 1716 | 1431 | 19% |
| 256.bzip2 | 1234 | 1034 | 19% |
| 300.twolf | 1633 | 1473 | 10% |

Keeping in mind that A8 is clocked 100MHz (~7.7%) higher than A7, all of the SPECint2000 benchmarks show performance gains above and beyond the clock speed increase, indicating that every benchmark has benefited in some way. Of these benchmarks MCF, GCC, PerlBmk and GAP in particular show the greatest gains, at anywhere between 23% and 55%. Roughly speaking anything that is potentially branch-heavy sees some of the smallest gains while anything that plays into the multiplication changes benefits more.
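The clock speed adjustment described above can be sketched in a few lines of Python. The scores are the estimated SPECint2000 results from our table. Dividing each score by the clock speed is a simplification that assumes performance scales linearly with frequency, which memory-bound tests will not strictly follow. SPEC derives its overall score from a geometric mean of the individual results, so the sketch uses the same kind of average for its single summary ratio. All function names here are our own.

```python
import math

A7_CLOCK_GHZ = 1.3
A8_CLOCK_GHZ = 1.4

# (A8 score, A7 score), copied from the estimated SPECint2000 results above.
SCORES = {
    "164.gzip": (842, 757),
    "175.vpr": (1228, 1046),
    "176.gcc": (1810, 1466),
    "181.mcf": (1420, 915),
    "186.crafty": (2021, 1687),
    "197.parser": (1129, 947),
    "252.eon": (1933, 1641),
    "253.perlbmk": (1666, 1349),
    "254.gap": (1821, 1459),
    "255.vortex": (1716, 1431),
    "256.bzip2": (1234, 1034),
    "300.twolf": (1633, 1473),
}


def raw_gain(a8: float, a7: float) -> float:
    """Fractional advantage of A8 over A7 as measured."""
    return a8 / a7 - 1.0


def per_clock_gain(a8: float, a7: float) -> float:
    """Fractional advantage after dividing each score by its clock speed.

    ASSUMPTION: performance scales linearly with clock frequency.
    """
    return (a8 / A8_CLOCK_GHZ) / (a7 / A7_CLOCK_GHZ) - 1.0


def geomean(values) -> float:
    """Geometric mean of an iterable of positive numbers."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


per_clock = {name: per_clock_gain(a8, a7) for name, (a8, a7) in SCORES.items()}
overall_ratio = (
    geomean(a8 for a8, _ in SCORES.values())
    / geomean(a7 for _, a7 in SCORES.values())
)

for name, (a8, a7) in SCORES.items():
    print(f"{name:12s} raw {raw_gain(a8, a7):6.1%}  per-clock {per_clock[name]:6.1%}")
print(f"geomean A8/A7 ratio: {overall_ratio:.3f}")
```

Every per-clock figure stays positive, which is the same point the text makes: even the smallest gains exceed what the clock speed increase alone would explain.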

MCF, a combinatorial optimization benchmark, ends up being the outlier here by far. Given that these are all integer benchmarks, it may very well be that MCF benefits from the integer multiplication improvements the most, as its performance comes very close to tracking the 2X increase in multiplication throughput. This also bodes well for any other kind of work that is similarly bounded by integer multiplication performance, though such workloads are not particularly common in the real world of smartphone use.

Our other set of comparison benchmarks comes from Geekbench 3. Unlike SPECint2000, Geekbench 3 is a mix of integer and floating point workloads, so it will give us a second set of eyes on the integer results along with a take on floating point improvements.

Geekbench 3 - Integer Performance
| Workload | A8 | A7 | % Advantage |
| AES ST | 992.2 MB/s | 846.8 MB/s | 17% |
| AES MT | 1.93 GB/s | 1.64 GB/s | 17% |
| Twofish ST | 58.8 MB/s | 55.6 MB/s | 5% |
| Twofish MT | 116.8 MB/s | 110.0 MB/s | 6% |
| SHA1 ST | 495.1 MB/s | 474.8 MB/s | 4% |
| SHA1 MT | 975.8 MB/s | 937 MB/s | 4% |
| SHA2 ST | 109.9 MB/s | 102.2 MB/s | 7% |
| SHA2 MT | 219.4 MB/s | 204.4 MB/s | 7% |
| BZip2Comp ST | 5.24 MB/s | 4.53 MB/s | 15% |
| BZip2Comp MT | 10.3 MB/s | 8.82 MB/s | 16% |
| BZip2Decomp ST | 8.4 MB/s | 7.6 MB/s | 10% |
| BZip2Decomp MT | 16.5 MB/s | 15 MB/s | 10% |
| JPG Comp ST | 19 MP/s | 16.8 MP/s | 13% |
| JPG Comp MT | 37.6 MP/s | 33.3 MP/s | 12% |
| JPG Decomp ST | 45.9 MP/s | 39 MP/s | 17% |
| JPG Decomp MT | 89.3 MP/s | 77.1 MP/s | 15% |
| PNG Comp ST | 1.26 MP/s | 1.14 MP/s | 10% |
| PNG Comp MT | 2.51 MP/s | 2.26 MP/s | 11% |
| PNG Decomp ST | 17.4 MP/s | 15.1 MP/s | 15% |
| PNG Decomp MT | 34.3 MP/s | 29.6 MP/s | 15% |
| Sobel ST | 71.7 MP/s | 58.1 MP/s | 23% |
| Sobel MT | 137.1 MP/s | 112.4 MP/s | 21% |
| Lua ST | 1.64 MB/s | 1.34 MB/s | 22% |
| Lua MT | 3.22 MB/s | 2.64 MB/s | 21% |
| Dijkstra ST | 5.57 Mpairs/s | 4.04 Mpairs/s | 37% |
| Dijkstra MT | 9.43 Mpairs/s | 7.26 Mpairs/s | 29% |

Geekbench’s integer results are overall a bit more muted than SPECint2000’s, but there are still some definite high points and low points among these benchmarks. Crypto performance, AES aside, is among the lesser gains, while Sobel and Dijkstra are among the largest, at 21% to 23% and 29% to 37% respectively. Interestingly in the case of Dijkstra, this does make up for the earlier performance loss Cyclone saw on this benchmark in the move to 64-bit.

Geekbench 3 - Floating Point Performance
| Workload | A8 | A7 | % Advantage |
| BlackScholes ST | 7.85 Mnodes/s | 5.89 Mnodes/s | 33% |
| BlackScholes MT | 15.5 Mnodes/s | 11.8 Mnodes/s | 31% |
| Mandelbrot ST | 1.18 GFLOPS | 929.4 MFLOPS | 26% |
| Mandelbrot MT | 2.34 GFLOPS | 1.85 GFLOPS | 26% |
| Sharpen Filter ST | 981.7 MFLOPS | 854 MFLOPS | 14% |
| Sharpen Filter MT | 1.94 GFLOPS | 1.7 GFLOPS | 14% |
| Blur Filter ST | 1.41 GFLOPS | 1.26 GFLOPS | 11% |
| Blur Filter MT | 2.78 GFLOPS | 2.49 GFLOPS | 11% |
| SGEMM ST | 3.83 GFLOPS | 3.44 GFLOPS | 11% |
| SGEMM MT | 7.48 GFLOPS | 6.4 GFLOPS | 16% |
| DGEMM ST | 1.87 GFLOPS | 1.68 GFLOPS | 11% |
| DGEMM MT | 3.61 GFLOPS | 3.14 GFLOPS | 14% |
| SFFT ST | 1.77 GFLOPS | 1.59 GFLOPS | 11% |
| SFFT MT | 3.47 GFLOPS | 3.18 GFLOPS | 9% |
| DFFT ST | 1.68 GFLOPS | 1.47 GFLOPS | 14% |
| DFFT MT | 3.29 GFLOPS | 2.93 GFLOPS | 12% |
| N-Body ST | 735.8 Kpairs/s | 587.8 Kpairs/s | 25% |
| N-Body MT | 1.46 Mpairs/s | 1.17 Mpairs/s | 24% |
| Ray Trace ST | 2.76 MP/s | 2.23 MP/s | 23% |
| Ray Trace MT | 5.45 MP/s | 4.49 MP/s | 21% |

While the low-level floating point tests we ran earlier didn’t show as significant a change in the architecture’s floating point performance as they did for integer performance, our high level benchmarks show that floating point tests are actually faring rather well. This goes to show that not everything can be captured in low level testing, especially less tangible aspects such as instruction windows. More importantly though, this shows that Enhanced Cyclone’s performance gains aren’t just limited to integer workloads but cover floating point as well.

Overall, even without a radical change in architecture, thanks to a combination of clock speed increases, architectural optimizations, and memory latency improvements, Enhanced Cyclone as present in the A8 SoC is looking like a solid step up in performance from Cyclone and the A7. Over the next year Apple is going to face the first real competition in the ARMv8 64-bit space from Cortex-A57 and other high performance designs, and while it’s far too early to guess how those will compare, at the very least we can say that Apple will be going in with a strong hand. More excitingly, most of these performance improvements build upon Apple’s already strong single-threaded IPC, which means that in those stubborn workloads that don’t benefit from multi-core scaling Apple is looking very good.

531 Comments

  • ninjaroll - Tuesday, September 30, 2014

    You have to tell yourself that not everyone commenting is around your age range. I feel like a lot of the haters are about 12-18 years of age. If they're older...well.. I feel bad for them.
  • akdj - Saturday, October 4, 2014

    I think you nailed it ninja
    The 'age & ambiguity' question. As well as age, the differences between generations and what 'we' had available (I'm 44) when growing up vs the 'gen Y' mid 20s-mid 30s (I think the higher end still appreciates both or the 'big three' for what they are ..not crapping on what 'they're not'). Then we've got the 'kids'. I'd even go so far as to say 12-21/22 year olds. As the 22 year old SAW the impact the iPhone had in 2007, the 'actual' Android fight 'start' in 2008 at 15 & 16. They were graduating and going into college or vocational training when the iPad broke and the Xoom filled (tablet computing). They've seen in their 'formative' years the evolution of HiDPI displays and developed personal opinions about their extremely 'personal' devices (I've got teenagers! Yikes, believe you me when I say their 'personal' devices:))
    Baby boomers, X & late Y didn't have cell phones growing up. Drug dealers and executives had pagers and computers were computers. They weren't 'connected' with the Internet (mainstream) and we paid a LOT of money for our Apple or Microsoft software and OS Updates. The incredible sea change Apple and Google have brought to the consumers and the masses regardless of income levels, location in the world and/or from developing countries ...their penetration is significant. Obviously there are countries with their own restrictions, etc... But maybe they're the 'smart' ones for now...look at what the NSA/Patriot Act has done for the USA and her relationships even with our closest allies!
    We're still at the infancy of 'mobile' comms/computers and connectivity. These iPhones ARE computers. The G3, Z3, S5 or 5s/6/6+!!! All of them. I think as we age, we remember. It's easier not to take for granted the way technology has empowered our lives, folded the world in half, and the incredible benefits and convenience we enjoy OR despise with 'cellular' phones, phabs n tabs! At times they feel more like a leash than freedom. When you're working and paying a mortgage or two, car payments and student loans (from two decades ago or current kids going into post Ed), groceries and 'energy' (from gasoline to heating gas, cooling electricity or your battery in your cell phone of choice), groceries and your kids' entry fees, new 'cleats' and mitts, pads and summer camps....THEN you'll get it. I'd bet dollars to donuts (such a dumbass saying, very unhip I know;))
    As you age, technology will continue to evolve. Much of what we enjoy today is a direct and absolutely traceable line to developments during the 'Cold War'. Whether Russian or American, Chinese (anyone see their Olympics in Beijing? The opening and closing ceremonies, etc? Kind of brings a new meaning to 'made in China' than it had when I was younger. IMHO they blew London completely outta the water ymmv as always)
    Point being there aren't 35+ folks on this board waging this ridiculous Holy War between OEMs or OS's. There ARE paid folks from both sides as again, social media in the last decade (another 'new') has become JUST as important as their thirty and sixty second TV spots, sponsorships or product placement in movies! It's HUGE. & IMHO a VERY important and crucial element in a free internet society to have sites like Anand's ...that he's passed along to Brian and Ryan and the rest of the crew. I've been here for years and have ALWAYS found what I've come for. Objective measurements and subjective reviews. We're all human. If we're reviewing a product it's in our nature to 'add' our opinions now and then
    To me, as a user of OS X and Windows, UNIX, Android and iOS ...I feel like ANYone limiting themselves so blindly to what the 'enemy' is doing is ignorant, young and/or unemployed (if the latter, I feel for you if you're looking...but if you're lurking on forums like these zealots are they're NOT looking for employment. If you're out of work, you can spend 40-50 hours a week 'Looking' and in most developed nations...in other words ANYone that would criticize the other camp and not appreciate what they've already got)
    At the end of the day, it's Samsung making Apple work harder. Cupertino making MtView work harder and ALL of them starting to reap the rewards Microsoft seemed to 'leak' off over the past 10-15 years. We're no longer in an X86, workstation at your desk on the 'intranet' to collaborate with a fax machine to send the final product. If you don't remember those days, it's tough to take these complaints seriously as my 5s and from the time I've spent with the 6/6+, my Air and retina mini all have a 'place' in my life. And every ONE of them is faster with quicker connectivity and MORE software available than at ANY point in my life and I'm only, hopefully half way to the finish line. As you age, you'll understand what I'm saying
    That said ...If you're 44 & @ (mom's) home, in the basement, without a gig, and feeding your spiders backing these DBags arguing 'physical, objective, and factual' measurements of performance in the review....it's YOU that needs to reexamine your life and priorities.
    Love is family, kids, coaching and watching them grow, through good times and bad. It's the iPhone, the S5 or Note 3 you're carrying that's capturing those memories. Ten years from NOW, there won't be a 'lightning' connector. An iPhone 6+ or Note7/G7 or Z8! USB will be dead and history is indicative of the evolving future, only us 'old folk' will be using Facebook ...but giving it our BEST shot to 'learn' the new and HIP MySpace, Netscape, AOL or today's Twitter and Facebooks
    Remember kids, it's US, and my parents (your grandparents) that built this shit for you. Not YOU! You're reaping the benefits of the fruits of our labor. If you don't get out from behind the 600 dpi display you're so passionate about ...or get out of the house, learn how to ride a bike like Tony Hawk, snowboard like the 'Tomato' or innovate like Gates, Jobs and that snot nose kid from Stanford....young 'Zuke', your futures are going to suck
    Don't be a slave to your tools. Let them work for you, choose what best fits your idea and vision and occupation and you'll find out soon there's a helluva lot more to life than MHz, GB of RAM, and PPI determining what you can and can't see. As your ears fail you so, too, will your eyes and damned if I can't tell the difference between the '6' &6+, the new HTC or my Note 3/5s, Air or mini! All different, ALL a helluva lot better than my green/orange monochrome displays I was 'working' on in the early 90s, how incredible '16 color displays' were and the transition from cathode ray tube 'monitors' to LCD and LED/AMOLED/Plasma displays showed us the difference between our VHS tapes, 480p DVD collection and the BluRay, 1080p displays. Now packaging those pixels into the palm of your hand is absolutely, and genuinely AMAZING. Nothing short of true miracles in engineering
    My dad graduated in 1972 with his bachelors in electrical engineering. Did it with a slide rule and drafting kit he's still got today and the same kit myself and two of my three younger brothers took through engineering school with our TI calcs that did it all (early 90s), and my first 286 after my Apple IIe & IIc run. As a baby with the 8086 processor perhaps those of us born in the early to mid 70s and earlier are more 'appreciative' today than the younger generation. We're more patient, we've gained wisdom and most importantly we 'lived' without the Internet, with corded 'dial' phones (when I was a kid we had a party line...and only had to dial FOUR digits locally lol! Small town in northwest Montana). To me, I really HOPE there's youngsters as intrigued by ALL forms of operating systems and is the new 'Edison, Tesla, Carnegie or Jobs/Zuke/Gates' of a future era. Redesigning in his or her 13 year old mind an OS that's a 'learning OS'. Through the millions of lines of code to boot to the desk, half can be eliminated as it learns YOUR usage and 'needs' ..that conforms to the individual and their needs ...regardless of how basic or how 'tough'.
    We run and have for over 20 years an audio and video production business. My wife and I are both experienced, high performance rated pilots and live in Alaska. It's paramount we fly with the business as we're living in an area nearly the physical size of the entire lower 48 with over 3 million lakes (sorry Minnesota, but we do only have just over 10,000 rivers;))---& more coastline than the ENTIRE CONUSA. With a pair of roads. No access without a plane or boat, or big balls and a four wheeler or snow 'machine' (it's Alaskan for snowmobile;)). We've been lucky enough to work with plenty of the largest cable and network broadcasters on documentaries and 'real TV' (not reality). Whether following the Troopers, fishing for crabbing on the ocean, flying into single resident 'zip codes' in the dead of winter with 2400 pounds of heating fuel in the plane with ya, it can be a kick in the ass and iOS has changed our operations in the last half decade for the better. Filing my FP, deciding how much fuel, traffic and weather conditions as well as updated Jep charts, plates and diversions ...it's become my kneeboard, fifty LB flight bag, manuals and checklists, as well as maintenance and troubleshoot instructions ...ADSB and TCAS, 3D terrain mapping and traffic following, it's a BIG change. While the Note 3 works GREAT for sketching rigging points with structural engineers, etc. The rMBP has been an absolute Home Run for us as has the new Mac Pro at the studios. We use several HP and Dell workstations as well, both systems are awesome and I think I'm one of the few enjoying Win 8.1 ...bought an HP 2in1 for about $750 and I've got a 13" core i5 slate with an SSD.
    Way TL/DR; youngsters don't be afraid to open your eyes and think for yourself. Try everything. Use what you need and stay away from the internet when you've made your choice for a couple of weeks:). Something better is ALWAYS around the corner but each and every choice available today is better than yesterday's. Guaranteed
  • timbo24 - Tuesday, September 30, 2014

    Great review, thanks for the hard work.
  • gevorg - Tuesday, September 30, 2014

    Very nice to see audio tests, just another thing that makes Anandtech reviews unique.
  • paul4na - Wednesday, October 1, 2014

    Unique? If you want proper phone reviews with detailed benchmarks then go to GSMArena.
  • doobydoo - Wednesday, October 1, 2014

    LOL. You made a funny.
  • slatanek - Tuesday, September 30, 2014

    still no sign of Windows 10 event...
  • Chaser - Tuesday, September 30, 2014

    As objectively as I could I took up Costco on their 14 day return policy and tried the iPhone 6. I owned the first iPhone and had been Android flagship type ever since.

    Bottom line: compared to Android the iPhone does less. After 5+ years the interface is STILL the same square blobs that float on the screen. No shortcuts. No app widgets. Install a new app and it is placed in the next open spot with all the other square blobs. I like how I can use shortcuts for my higher use apps but hide others in the application folder with Android.

    No notification LED. My new G3 I can color code that notification to know just by sight what type of alert has popped up on my phone. Text, email, Facebook, more. With Apple you get a flash of the camera, if it's upside down. What a joke.

    Despite Apple "allowing" Swiftkey's new keypad it's a paltry joke compared to Android's version. Chrome can also be installed but make no mistake, any links through email or text will open the default browser Safari. iMessage is still the default text client with no alternatives that provide the same functionality.

    It's simple. While the Apple faithful will buy their new tech darling phones the boring, long in the tooth Apple interface does less. Android offers far more customization and openness. Back my gold iPhone 6 went to Costco and now I love my new G3. Sigh...maybe another 5 years.
  • Parhel - Tuesday, September 30, 2014

    The customization and openness is exactly what turns me off of Android. It's not the only thing, but it's the main thing. I don't want that in a phone. I spend 10 hours a day coding and troubleshooting at work. For a phone, I want something that's already set up for me. Something that I barely need to even look at to use. I don't want to tinker with it.
  • Chaser - Tuesday, September 30, 2014

    I tinker with nothing if I choose. However I'd rather have those choices than Apple's divine vision of how my phone should operate.
