Another year, another TechDay from Arm. Over the last several years Arm’s event has come as clockwork in the May timeframe and has every time unveiled the newest flagship CPU and GPU IPs. This year is no exception as the event is back on the American side of the Atlantic in Austin Texas where Arm has one of its major design centres.

Two years ago during the unveiling of the Cortex A73 I had talked a bit more about Arm’s CPU design teams and how they’re spread across locations and product lines. The main design centres for Cortex-A series of CPUs are found in Austin, Texas; Cambridge, the United Kingdom, and Sophia-Antipolis in the south of France near Nice. For the last two years the Cortex A73 and Cortex A75 were designs that mainly came out of the Sophia team while the Cortex A53 and more recently the A55 were designs coming out of Cambridge. This means that we haven’t seen any recent designs coming out of Austin and the last of the “Austin family” of CPUs were the A57 and A72.

The project being worked on in Austin had been hyped up for several years – I remember even as early as the A73 release back in 2016 the company had pulled forward some elements from an advanced future microarchitecture on the back-end pipelines, especially on the FP/SIMD side. The Cortex A75 was further remarked as pulling more elements from this new mysterious project.

Today we can finally unveil what the Austin team has been working on – and it’s a big one. The new Cortex A76 is a brand new microarchitecture which has been built from scratch and lays the foundation for at least two more generations for what I’ll call “the second generation of Austin family” of CPUs.

The Cortex A76 is important for Arm for a design perspective as it represents a new start from a clean sheet. It’s rare for IP claim to be able to do this as it represents a great resource and time investment and if it weren’t for the Sophia design team taking over the steering wheel for the last two generations of products it wouldn’t have been reasonable to execute. The execution of the CPU design teams should be emphasised in particular as Arm claims this is the 5th generation “annual beat” product where the company delivers a new microarchitecture every new year. Think of it as an analogue to Intel’s past Tick-Tock strategy, but rather Tock-Tock-Tock for Arm with steady CAGR (compound annual growth rate) of 20-25% every generation coming from µarch improvements.

So what is the Cortex A76? In Arm’s words, it’s a “laptop-class” performance processor with mobile efficiency. The vision of the A76 as a laptop-class processor had been emphasised throughout the TechDay presentation so it seems Arm is really taking advantage of the large performance boost of the IP to cater to new market segments such as the emerging “Always connected PCs” which Qualcomm is spearheading with their SoC platforms.

The Cortex A76 microarchitecture has been designed with high performance while maintaining power efficiency in mind. Starting from a clean sheet allowed the designers to remove bottlenecks throughout the design and to break previous microarchitectural limitations. The focus here was again maximum performance while remaining within energy efficiency that is fit for smartphones.

In broad metrics, what we’re promised in actual products using the A76 is the follows: a 35% performance increase alongside 40% improved power efficiency. We’ll also see a 4x improvements in machine learning workloads thanks to new optimisations in the ASIMD pipelines and how dot products are handled. These figures are baselined on A75 configurations running at 2.8GHz on 10nm processes while the A76 is projected by Arm to come in at 3GHz on 7nm TSMC based products.

The new CPU is naturally still compatible with DynamIQ’s common cluster topology and Arm envisions designs to be paired with Cortex A55s as the little more power efficient CPUs. The configuration scalability of the DynamIQ IP again was reiterated and we were presented with example configurations such as 1+7 or 2+6 with either Cortex A75 or A76 CPU IP. This presentation slide was one of the rare ones where Arm referred to the area size of the A76, pointing out that the A75 still had better PPA and thus might still be a valid design choice for companies, depending on their needs. One comparison that was made during the event is that in terms of area, three A76’s with larger caches would fit inside the size of a Skylake core – all while within 10% of the IPC of the Intel CPU, but obviously there’s also process node scaling considerations to take into account.

A standout claim is that Arm aims to outperform the competition at half the area and half the power. Arm was slightly beating around the bush here in what it considers the competition, but generally the answer was that it was considering everybody the competition. Taking into account Intel, AMD or Samsung it’s actually not that hard to imagine Arm beating them in PPA as historically the company always had the smallest CPU designs and that directly translates into more efficient microarchitectures.

Before we get into more detailed breakdowns of the performance and power improvements and what I’m expecting to happen into products, let’s see the microarchitectural improvements on the core and how Arm managed to extract this much performance while maintaining power efficiency.

Cortex A76 µarch - Frontend
Comments Locked

123 Comments

View All Comments

  • lmcd - Friday, June 1, 2018 - link

    They're at the mercy of chipmakers. The only companies that would buy such a core have already left reference designs behind. Everyone else wants small, cheap chips, so much so that we've had A53-only designs in the entire middle-and-lower range. Will anyone even use the A76? I don't know if that's guaranteed.
  • vladx - Friday, June 1, 2018 - link

    Read the last part of the article, it's almost guaranteed next Kirin is skipping the A75.and going directly to A76. I think Huawei is done playing catch-up.
  • darkich - Friday, June 1, 2018 - link

    It's so frustrating how even you people who are into SoC's already forget that Apple was basically cheating the customers with secret huge compromises, just to be able to put unbalanced and owerpowered cores in the iPhones.
  • Zoolookuk - Friday, June 1, 2018 - link

    Wow, I've seen some seriously bad anti-Apple comments over the last 30 years, but this is probably the best one yet. A10 and A11 are not unbalanced and not 'cheating' customers. Anyone with half a brain can the history of this advantage Apple has started with A7, which was the first 64-bit ARM-based SoC in phones. Ever since they, they've been consistently 2 generations ahead of the competition, and that gap shows no sign of closing.

    The comments below this that 'at 3ghz' the (still unreleased) A76 would 'only need a 20% boost' to match last year's A11 is pretty funny. So a chip already at its thermal and power limit "only" needs to be overclocked by 20% to match a chip designed two years ago running 40% slower.
  • techconc - Tuesday, June 5, 2018 - link

    Actual device performance easily disproves your claim. Your comment isn't helpful and you would have more credibility if you at least attempted to justify the claim you made.
  • jjj - Friday, June 1, 2018 - link

    In reality Apple is the one behind but won't bother to explain, just remember the core count for Apple's solutions.

    Anyway, A76 at 3GHz and 750mW per core would require less than a 20% boost in clocks to match Apple's A11 in Geekbench.
    Apple has only 2 large cores and when including the 4 small ones, the SoC throttles hard under 100% load
    If that is what you want, A76 should be able to deliver something close enough when configured as dual core with stupid high clocks. A SoC vendor could push A76 to 2W-3W per core instead of 0.75W and get clocks as high as possible.
    But maybe it's better to have more than enough perf, 4 cores and sane efficiency.
  • Lolimaster - Friday, June 1, 2018 - link

    Please stop using geekbench as a comparison tool specially between 2 different ARM ecosystems.
  • jjj - Friday, June 1, 2018 - link

    You have results for Apple on anything else?
  • Elstar - Friday, June 1, 2018 - link

    Because, crazy as it sounds, most ARM's customers don't want fast off the shelf designs, at least at first. ARM's whole business model is rather simple: they sell simple, affordable, efficient, and feature rich reference designs as a "gateway drug". Once you get hooked on their ecosystem, then they charge a lot for nontrivial customization.
  • name99 - Friday, June 1, 2018 - link

    This is like asking "why doesn't Apple come out with a 5GHz A12?" It's not so much that they can't as that this does not make sense for their business model.

    You can SAY that you want (and would be willing to pay for) Apple level's of performance, but is that really true? God knows Android people complain all the time about how expensive Apple products are, and they MOSTLY buy the midrange phones, not the high end phones. Essential just went bust assuming that more people want to pay for high end Android phones than actually exist.

Log in

Don't have an account? Sign up now