Last week, Apple unveiled its new-generation MacBook Pro laptop series, a new range of flagship devices that brings significant updates for the company’s professional and power-user audience. The new devices particularly differentiate themselves in that they’re now powered by two new entries in Apple’s own silicon line-up, the M1 Pro and the M1 Max. We covered the initial reveal in last week’s overview article of the two new chips, and today we’re getting the first glimpses of the performance we can expect from the new silicon.

The M1 Pro: 10-core CPU, 16-core GPU, 33.7bn Transistors

Starting off with the M1 Pro, the smaller sibling of the two, the design appears to be a new implementation of the first generation M1 chip, but this time designed from the ground up to scale to larger sizes and higher performance. The M1 Pro in our view is the more interesting of the two designs, as it offers nearly everything that power users will deem generationally important in terms of upgrades.

At the heart of the SoC we find a new 10-core CPU setup in an 8+2 configuration, with 8 performance Firestorm cores and 2 efficiency Icestorm cores. We had indicated in our initial coverage that Apple’s new M1 Pro and Max chips appear to be using a similar, if not the same, generation of CPU IP as the M1, rather than updating to the newer generation cores used in the A15. We can seemingly confirm this, as we’re seeing no apparent changes in the cores compared to what we discovered on the M1 chips.

The CPU cores clock up to a 3228MHz peak, but vary in frequency depending on how many cores are active within a cluster, clocking down to 3132MHz with 2 cores active, and to 3036MHz with 3 or 4 cores active. I say “per cluster” because the 8 performance cores in the M1 Pro and M1 Max consist of two 4-core clusters, both with their own 12MB L2 caches, and each able to clock its CPUs independently of the other, so it’s actually possible to have four active cores in one cluster at 3036MHz and one active core in the other cluster running at 3.23GHz.
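To make the per-cluster behaviour concrete, here is a minimal Python sketch that maps the number of active cores in each performance cluster to the peak clocks quoted above. The table and function names are purely illustrative, and treating a cluster with no active cores as having no clock is an assumption of this sketch rather than something we measured.

```python
from typing import List, Optional

# Peak P-core clock (MHz) by number of active cores in one 4-core cluster,
# using the figures quoted in the text above.
P_CLUSTER_PEAK_MHZ = {1: 3228, 2: 3132, 3: 3036, 4: 3036}


def cluster_peak_mhz(active_cores: int) -> Optional[int]:
    """Return the peak clock for one performance cluster.

    Returns None for an idle cluster (an assumption of this sketch).
    """
    if active_cores == 0:
        return None
    if active_cores not in P_CLUSTER_PEAK_MHZ:
        raise ValueError("a performance cluster has between 0 and 4 active cores")
    return P_CLUSTER_PEAK_MHZ[active_cores]


def chip_peak_mhz(active_per_cluster: List[int]) -> List[Optional[int]]:
    """Each cluster clocks independently, so look each one up separately."""
    return [cluster_peak_mhz(n) for n in active_per_cluster]


# Four busy cores in one cluster, a single busy core in the other.
print(chip_peak_mhz([4, 1]))  # [3036, 3228]
```

The lookup is done per cluster rather than per chip because the two clusters clock independently: a lightly loaded cluster keeps its higher peak even while the other is fully occupied.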

The two E-cores in the system clock at up to 2064MHz. As opposed to the M1, there are only two of them this time around; however, Apple still gives them their full 4MB of L2 cache, the same as on the M1 and A-derivative chips.

One major feature of both chips is their much-increased memory bandwidth and wider memory interfaces – the M1 Pro features 256-bit LPDDR5 memory at 6400MT/s speeds, corresponding to 204GB/s of bandwidth. This is significantly higher than the M1 at 68GB/s, and also generally higher than competitor laptop platforms, which still rely on 128-bit interfaces.
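The bandwidth figures follow directly from bus width and transfer rate, and a short Python sketch shows the arithmetic. The function name is illustrative, and the M1's 128-bit LPDDR4X-4266 configuration is an assumption brought in for comparison rather than something stated above. The 512-bit case simply doubles the M1 Pro's interface.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 10^9 bytes).

    bytes per transfer = bus width / 8
    bandwidth          = bytes per transfer * mega-transfers per second / 1000
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts / 1000


# 256-bit LPDDR5-6400, as on the M1 Pro.
print(f"{peak_bandwidth_gbs(256, 6400):.1f} GB/s")  # 204.8 GB/s

# 512-bit LPDDR5-6400, i.e. the same memory on a doubled interface.
print(f"{peak_bandwidth_gbs(512, 6400):.1f} GB/s")  # 409.6 GB/s

# Assumed M1 configuration: 128-bit LPDDR4X-4266.
print(f"{peak_bandwidth_gbs(128, 4266):.1f} GB/s")  # 68.3 GB/s
```

These are theoretical peaks at the DRAM interface; how much of that bandwidth any single block on the chip can actually draw is a separate question.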

We’ve been able to identify the “SLC”, or system level cache as we call it, as coming in at 24MB for the M1 Pro and 48MB on the M1 Max. That is a bit smaller than what we initially speculated, but it makes sense given the SRAM die area – representing a 50% increase over the per-block SLC on the M1.

 

The M1 Max: A 32-Core GPU Monstrosity at 57bn Transistors

Above the M1 Pro we have Apple’s second new M1 chip, the M1 Max. The M1 Max is essentially identical to the M1 Pro in terms of architecture and in many of its functional blocks – but what sets the Max apart is that Apple has equipped it with much larger GPU and media encode/decode complexes. Overall, Apple has doubled the number of GPU cores and media blocks, giving the M1 Max virtually twice the GPU and media performance.

The GPU and memory interfaces are by far the most differentiated aspects of the chip: instead of a 16-core GPU, Apple doubles things up to a 32-core unit. On the M1 Max which we tested for today, the GPU runs at up to 1296MHz – quite fast for what we consider mobile IP, but still significantly slower than what we’ve seen from the conventional PC and console space, where GPUs can now run at up to around 2.5GHz.

Apple also doubles up on the memory interfaces, using a whopping 512-bit wide LPDDR5 memory subsystem – unheard of in an SoC and even rare amongst historical discrete GPU designs. This gives the chip a massive 408GB/s of bandwidth – how this bandwidth is accessible to the various IP blocks on the chip is one of the things we’ll be investigating today.

The memory controller caches are at 48MB in this chip, allowing for theoretically amplified memory bandwidth for various SoC blocks as well as reducing off-chip DRAM traffic, thus also reducing power and energy usage of the chip.

Apple’s die shot of the M1 Max was a bit weird initially in that we weren’t sure if it actually represents physical reality – especially on the bottom part of the chip, where we had noted what appears to be a doubled up NPU – something Apple doesn’t officially disclose. A doubled up media engine makes sense as that’s part of the features of the chip; however, until we can get a third-party die shot to confirm that this is indeed what the chip looks like, we’ll refrain from speculating further in this regard.

493 Comments

  • zodiacfml - Monday, October 25, 2021 - link

    Nice, it falls where I expected it since the announcement. Apple is now playing with the 5nm process like it is nothing.
  • LuckyWhale - Monday, October 25, 2021 - link

    Seems like a hasty and rushed article by a fanboy. So lacking (perhaps on purpose) in real-world general benchmarks; Could have added some encoding or compression, image manipulation, etc. benchmarks for several competing systems.
  • Hifihedgehog - Monday, October 25, 2021 - link

    > Seems like a hasty and rushed article by a fanboy.

    Unlike Ian Cutress, who is down to earth and a blast to send questions or pose hypotheses to, the author on occasion has been very rude and condescending if you disagree. Never mind that statements like this one from Andrei about ARM would make you question his motives: "No - Apple is indeed special and many Arm ISA things happen because of Apple." ARM ISA has been progressing more and more autonomously and independently from Apple since the 2013 arm64 contribution. Apple indeed has a 1-2 year lead over the rest of industry thanks to bets made around a decade ago, but they are not heaven's never-failing gift to humanity. Statements like this from Andrei should give you all the knowledge you need about his bias.
  • ikjadoon - Monday, October 25, 2021 - link

    I think the AT team work together on various pieces. Ian is, after all, the primary Intel & AMD author here, so his work is seemingly used here, too.

    >many Arm ISA things happen because of Apple

    You realise that statement has been directly validated by Apple employees...publicly? See Shac Ron, who commented in the very thread you're referring to.

    It's not that unexpected...Arm Ltd. was literally founded by Acorn, Apple and VLSI. An Apple VP was appointed as Arm's first CEO. Former Arm employees have confirmed that Apple was responsible for Arm's name removing all mention of "Acorn". Literally from Wikipedia, mate:

    >The company was founded in November 1990 as Advanced RISC Machines Ltd and structured as a joint venture between Acorn Computers, Apple, and VLSI Technology. Acorn provided 12 employees, VLSI provided tools, Apple provided $3 million investment. Larry Tesler, Apple VP was a key person and the first CEO at the joint venture.

    Then this...

    >Apple indeed has a 1-2 year lead

    If you think Intel (and AMD) need just 1-2 years to overcome a 4x perf/watt gap, I'm not sure how long you have read AnandTech or followed this industry. If that timeline was anything close to true, we should see AMD & Intel matching M1 perf/watt next month, right? M1 launched a year ago.

    >they are not heaven's never-failing gift to humanity

    Don't think anyone has made that conclusion, have they? These are CPU reviews: there's data and there's straightforward conclusions from the data.
  • Hifihedgehog - Monday, October 25, 2021 - link

    If you are referencing the data, then you would observe it is a 2X efficiency advantage, not your grossly exaggerated 4X. Unless, of course, you are referencing Apple’s 3080 Laptop claims. The numbers here show 3060 gaming performance. Adobe Premiere Pro, meanwhile, which has been enhanced for M series silicon since July, is only showing numbers on par with RTX 3050 Ti coming out of M1 Max. Let’s be objective but Andrei needs to broaden his benchmark horizons:
    https://twitter.com/TheRichWoods/status/1452639861...
  • ikjadoon - Tuesday, October 26, 2021 - link

    lmao: what kind of basic YouTube comment is this? You're thoroughly confused. And ignored every other silly point that was neatly debunked...

    1) Nope: XDA admits in their actual review that Premiere Pro hasn't yet been updated to activate the M1 Pro / Max video engines, lmao, and that's why it appeared "slow". Do you just copy-paste tweets that agree with you with zero critical review? XDA made it obvious in their review (which you happily refused to link).

    https://www.xda-developers.com/apple-macbook-pro-2...

    >I checked with Apple on the jarring discrepancy between Final Cut Pro and Adobe Premiere Pro rendering times (1:35 vs 21:11!) and a representative from Apple said it’s because Adobe Premiere Pro has not been optimized to use the M1 Pro/Max’ ProRes hardware for video encoding.

    2) You simply cannot read and I'm done wasting my time. Thanks for making me write this out, though: much easier to debunk other illiterate YouTube commenters. 😂

    Cinebench R23 ST: M1 Max has 4.7x higher perf/W than the i9-11980HK
    Cinebench R23 MT: M1 Max has 2.6x higher perf/W than the i9-11980HK
    SPEC2017 502 ST: M1 Max has 2.9x higher perf/W than the i9-11980HK
    SPEC2017 502 MT: M1 Max has 4.0x higher perf/W than the i9-11980HK
    SPEC2017 511 ST: M1 Max has 3.5x higher perf/W than the i9-11980HK
    SPEC2017 511 MT: M1 Max has 2.9x higher perf/W than the i9-11980HK
    SPEC2017 503 ST: M1 Max has 2.4x higher perf/W than the i9-11980HK
    SPEC2017 503 MT: M1 Max has 6.3x higher perf/W than the i9-11980HK

    The geometric mean perf/W improvement is 3.5x. :) Thank you for the laughs, though. Now I know who actually doesn't read the articles!

    If you want objectivity, then bring real data next time. Please find someone else to "debate" lmao. See you once Intel & AMD release their "just 1-2 years behind" M1 competitors in the next two months.

    Will eagerly await AnandTech's review...
  • ikjadoon - Monday, October 25, 2021 - link

    Did you read the article, though? AnandTech tested every single thing you asked for...

    All benchmarks need to be standardized between arm vs x86, Windows vs macOS, etc. What validated, repeatable cross-ISA, cross-OS "real-world" software do you suggest?

    Today, that is SPEC2017...which you would have understood had you read the article:

    557.xz_r = compression
    525.x264_r = encoding
    538.imagick_r = image manipulation

    They also ran PugetBench's Premiere Pro, one of the few Apple Silicon-native production applications.
  • michael2k - Monday, October 25, 2021 - link

    They did: That's what SPEC2017 was and they compared it to the Ryzen 5980HS and Core i9-11980HK
  • vladx - Monday, October 25, 2021 - link

    Yes at this point it's quite obvious Andrei is a big Apple fanboy with how he tries to oversell the performance of the Max SoC with synthetic benchmarks.
  • FurryFireball - Monday, October 25, 2021 - link

    For games, why don’t you use one that was converted for M1? I know World of Warcraft was made native for M1 by Blizzard, so that would give you a better idea of how well it would do in gaming. WoW can still beat down a 3080 Ti if you max everything out, so it’s not a slouch of a game to benchmark with.
