Dissecting Intel's EPYC Benchmarks: Performance Through the Lens of Competitive Analysis
by Johan De Gelas & Ian Cutress on November 28, 2017 9:00 AM EST
Enterprise & Cloud Benchmarks
Below you can find Intel's internal benchmarking numbers. The EPYC 7601 is the reference (performance = 1), the 8160 is represented by the light blue bars, and the top-of-the-line 8180 is shown in dark blue. On a performance-per-dollar basis, the light blue bars are the ones worth watching.
Java benchmarks are typically unrealistically tuned, so the writing is on the wall when an experienced benchmark team cannot make the Intel 8160 shine: it is highly likely that the AMD 7601 is faster in real life.
The Node.js and PHP runtime benchmarks are very different. Both are open source server frameworks used to generate, for example, dynamic page content. Intel uses a client load generator to generate a real workload. In the case of the PHP runtime, MariaDB (a MySQL derivative) 10.2.8 is the backend.
In the case of Node.js, MongoDB is the database. A Node.js server spawns many different single-threaded processes, which is rather ideal for the AMD EPYC processor: each process's data stays close to the core it runs on. These benchmarks are much harder to skew towards a certain CPU family. In fact, Intel's benchmarks seem to indicate that the AMD EPYC processors are pretty interesting alternatives. Surely if Intel can only show a 5% advantage with a 10% more expensive processor, chances are that they perform very much alike in the real world. In that case, AMD has a small but tangible performance-per-dollar advantage.
The DPDK layer 3 Network Packet Forwarding benchmark is what most of us know as routing IP packets. This benchmark is based upon Intel's own Data Plane Development Kit, so it is not a valid benchmark to use for an AMD/Intel comparison.
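To make the performance-per-dollar argument concrete, here is a minimal sketch of the arithmetic. It assumes the round figures quoted above (a 5% performance lead at a 10% higher price) and treats the EPYC 7601 as the baseline for both performance and price; the function name is ours, purely for illustration.

```python
def perf_per_dollar(relative_perf: float, relative_price: float) -> float:
    """Relative performance divided by relative price (baseline = 1.0 for both)."""
    if relative_price <= 0:
        raise ValueError("relative_price must be positive")
    return relative_perf / relative_price


# Baseline: EPYC 7601 at performance 1.0 and price 1.0.
epyc = perf_per_dollar(1.00, 1.00)

# Xeon: 5% faster at a 10% higher price (the round figures from the text,
# not measured values).
xeon = perf_per_dollar(1.05, 1.10)

# How much more performance each dollar buys on the baseline part.
amd_advantage = epyc / xeon - 1.0

print(f"Xeon perf/$ relative to EPYC: {xeon:.3f}")
print(f"EPYC perf/$ advantage: {amd_advantage:.1%}")
```

With these inputs the Xeon lands at roughly 0.95 of the EPYC's performance per dollar, which is the "small but tangible" gap: a little under 5% in AMD's favor.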
We'll discuss the database HammerDB, NoSQL and Transaction Processing workloads in a moment.
The second largest performance advantage was recorded by Intel when testing the distributed object caching layer memcached. As Intel notes, the benchmark was not a processing-intensive workload, but rather a network-bound workload. As AMD's dual socket system is seen as a virtual 8-socket system, due to the way that AMD has put four dies onto each processor and each die has a subset of PCIe lanes linked to it, AMD is likely at a disadvantage.
Intel's example of network bandwidth limitations in a pseudo-socket configuration
Suppose you have two NICs, which is very common. The data of the first NIC will, for example, arrive in NUMA node 1, Socket 1, only to be accessed by NUMA node 4, Socket 1. As a result, there is some additional latency incurred. In Intel's case, you can redirect a NIC to each socket. With AMD, this has to be locally programmed, to ensure that the packets that are sent to each NIC are processed on the matching virtual node, although this might incur additional slowdown.
The real question is whether you should bother to use a 2S system for Memcached. After all, it is a distributed cache layer that scales well over many nodes, so we would prefer a more compact 1S system anyway. In fact, AMD might have an advantage, as in the real world Memcached systems are more about RAM capacity than network or CPU bottlenecks. Missing the additional RAM-as-cache is much more dramatic than waiting a bit longer for a cache hit from another server.
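As an illustration of what "locally programmed" can look like in practice, the sketch below pins the current process to the CPUs of the NUMA node that a given NIC hangs off. It is a rough example under several assumptions: a Linux host that exposes NIC locality under /sys/class/net/<iface>/device/numa_node and per-node CPU lists under /sys/devices/system/node, and helper names (parse_cpulist, nic_numa_node, pin_to_nic_node) that are ours, not part of any tool Intel or AMD used. It only sets CPU affinity; memory placement and IRQ steering would need separate handling.

```python
import os
from pathlib import Path
from typing import Optional, Set


def parse_cpulist(text: str) -> Set[int]:
    """Parse a Linux cpulist string such as "0-3,8-11" into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def nic_numa_node(iface: str) -> Optional[int]:
    """Return the NUMA node sysfs reports for a NIC, or None if unknown."""
    path = Path("/sys/class/net") / iface / "device" / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    # The kernel reports -1 when it has no locality information.
    return node if node >= 0 else None


def pin_to_nic_node(iface: str) -> Optional[Set[int]]:
    """Pin the calling process to the CPUs local to the NIC's NUMA node.

    Returns the CPU set used, or None if locality could not be determined.
    """
    node = nic_numa_node(iface)
    if node is None:
        return None
    try:
        cpulist = Path(f"/sys/devices/system/node/node{node}/cpulist").read_text()
    except OSError:
        return None
    cpus = parse_cpulist(cpulist)
    if not cpus:
        return None
    os.sched_setaffinity(0, cpus)  # Linux-only; 0 means the calling process
    return cpus
```

A worker that services one NIC would call pin_to_nic_node with that interface's name (for example "eth0", a placeholder) before it starts handling traffic, and fall back to the default scheduling if the function returns None.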
The virtualization benchmark is the most impressive for the Intel CPUs: the 8160 shows a 37% performance improvement. We are willing to believe that all the virtualization improvements have found their way inside the ESXi kernel and that Intel's Xeon can deliver more performance. However, most virtualization systems run out of DRAM before they run out of CPU processing power. The benchmarking scenario also carries a big question mark: according to the footnotes to the slides, Intel achieved this victory by placing 58 VMs on the Xeon 8160 setup versus 42 VMs on the EPYC 7601 setup. This is a very odd approach to this benchmark.
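A quick back-of-the-envelope check shows why the VM counts matter. The sketch below uses only the two VM counts and the 37% figure quoted above; it assumes, and this is our assumption rather than anything stated in Intel's footnotes, that the reported score is an aggregate across all VMs, so dividing by the VM count gives a rough per-VM figure.

```python
xeon_vms = 58               # VMs placed on the Xeon 8160 setup (from the footnotes)
epyc_vms = 42               # VMs placed on the EPYC 7601 setup (from the footnotes)
reported_advantage = 0.37   # Intel's claimed lead for the 8160

# How many more VMs the Xeon setup was running.
vm_ratio = xeon_vms / epyc_vms

# Assumption: the score is an aggregate over all VMs. Under that assumption,
# this is the Xeon's per-VM result relative to the EPYC's per-VM result.
implied_per_vm = (1.0 + reported_advantage) / vm_ratio

print(f"VM count ratio: {vm_ratio:.2f}x")
print(f"Implied per-VM result (Xeon vs EPYC): {implied_per_vm:.2f}x")
```

Under that assumption, the ratio of VM counts (about 1.38x) accounts for essentially the entire reported lead, leaving the per-VM result at roughly parity. It does not tell us whether the EPYC setup could have hosted more than 42 VMs, which is exactly the open question.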
Of course, the fact that the EPYC CPU has no track record is a disadvantage in the more conservative (VMware based) virtualization world anyway.
sharath.naik - Tuesday, November 28, 2017
Epyc single socket 32 core/64 thread CPU is ~$2000. There is no Intel equivalent here, which is disappointing, as the single socket systems are only ~22 cores max with no 205 watt parts.
IGTrading - Tuesday, November 28, 2017
You're talking nonsense mate :)
I'd pay extra to have extra physical cores when I'm speccing a server holding VMs, but AMD gives us more cores for less money.
I also love AMD's RAID which works absolutely great and it's free while Intel's is annoyingly a paid-for solution.
Intel doesn't say one peep about Full Encrypted RAM, because they don't have it.
Intel doesn't say a peep about power consumption because their platform loses in every test.
Intel doesn't say a peep about EPYC 1.1 or EPYC Plus or whatever which will be a drop-in upgrade for the current platforms.
I was put in the shitty situation of speccing Xeon based machines because the per-core licenses were extremely expensive and the Xeon solution is offering us better performance, but other than this situation, we're doing everything to avoid working with Intel.
We still have servers that started out with dual Opterons and grew to Hexa-Core over the years.
That saved our clients a ton of money and their jaws dropped when we advised that they need to move back to Xeon if they want to upgrade (EPYC was still 2 years away then).
It may be fashionable as a young lad to root for the "cool winner" like Ferrari, Bugatti or Intel, but when you've worked multiple decades in the industry and had to swallow all the crap Intel was pulling, you start rooting for the little guy.
ddrіver - Tuesday, November 28, 2017
Paying anywhere between $12K-$50+K more per machine just to have the Intel logo tends to add up. Ending up with up to 200W more per machine also incurs some extra costs.
If you said the cost fades when compared to licensing costs of many software solutions I would understand. But the metal itself... no, the extra cost for that Xeon is either stupidity or protection tax.
Geranium - Tuesday, November 28, 2017
How much server software is really using AVX-512? Can you give us a list (excluding AI and machine learning apps, because those run better on GPUs/dedicated hardware)?
SaltyVincent - Wednesday, November 29, 2017
I haven't come across any personally, but something else to add is the amount of heat these chips generate when running AVX-512 under load. Running any AVX benchmarks on Intel chips usually results in throttling.
deltaFx2 - Wednesday, November 29, 2017
"The whole "pricetag" thing is not really an issue": No? Is that why the volume sales in the server market is the mid-section of the former Xeon E5? Wouldn't people be buying top end E7s (Platinum in today's lingo)? Of course pricetag matters, and matters even more when you're deploying tens of thousands of nodes.
Ro_Ja - Tuesday, November 28, 2017
Head title needs a wee bit edit.
negusp - Tuesday, November 28, 2017
Your comment needs a big bit edit.
Ryan Smith - Tuesday, November 28, 2017
Head title? I'm not sure I follow.
IGTrading - Tuesday, November 28, 2017
These TSX instructions have a lot in common with AMD's own proposed ASF instructions which were discussed 3 years before TSX.
Don't you think so?