Section by Ian Cutress

Ice Lake Xeon Processor List 

Intel is introducing around 40 new processors across the Xeon Platinum (8300 series), Xeon Gold (6300 and 5300 series), and Xeon Silver (4300 series) lines. Xeon Bronze no longer exists with Ice Lake. Much like the previous generation, the first digit (8/6/5/4) signifies the series, and the second digit (3) indicates the generation. Beyond that, the final two digits are largely meaningless, as before.

That being said, there is a significant change. In the past, Platinum/Gold/Silver also indicated socket support, with Platinum supporting up to 8P configurations. This time around, as Ice Lake does not support 8P, all the processors support only up to 2P, with a few select models being uniprocessor only. This makes the Platinum/Gold/Silver segmentation somewhat arbitrary, serving mainly to indicate which performance/price bracket a processor sits in.

On top of this, Intel is adding more suffixes to the equation. If you work with Xeon Scalable processors day in and day out, you now need to differentiate a Q processor from a P processor, and an S processor from an M processor. There’s a handy list down below.

SKU List

The easiest way to digest all of this is to jump into the deep end with the processor list. RCP stands for recommended customer price (at 1k-unit volumes), and SGX GB indicates how large Software Guard Extensions enclaves can be – either 8 GB, 64 GB, or 512 GB per CPU.

Intel 3rd Gen Xeon Scalable
Ice Lake Xeon Only
SKU      Cores w/HT   Base Freq (MHz)   1T Freq (MHz)   nT Freq (MHz)   L3 (MB)   TDP (W)   SGX (GB)   RCP (1ku)   DC PMM
Xeon Platinum (8x DDR4-3200)
8380   40 2300 3400 3000 60 270 512 $8099 Yes
8368 Q 38 2600 3700 3300 57 270 512 $6743 Yes
8368   38 2400 3400 3200 57 270 512 $6302 Yes
8362   32 2800 3600 3500 48 265 64 $5488 Yes
8360 Y 36 2400 3500 3100 54 250 64 $4702 Yes
8358 P 32 2600 3400 3200 48 240 8 $3950 Yes
8358   32 2600 3400 3300 48 250 64 $3950 Yes
8352 Y 32 2200 3400 2800 48 205 64 $3450 Yes
8352 V 36 2100 3500 2500 54 195 8 $3450 Yes
8352 S 32 2200 3400 2800 48 205 512 $4046 Yes
8352 M 32 2300 3500 2800 48 185 64 $3864 Yes
8351 N 36 2400 3500 3100 54 225 64 $3027 Yes
Xeon Gold 6300 (8x DDR4-3200)
6354   18 3000 3600 3600 39 205 64 $2445 Yes
6348   28 2600 3500 3400 42 235 64 $3072 Yes
6346   16 3100 3600 3600 36 205 64 $2300 Yes
6342   24 2800 3500 3300 36 230 64 $2529 Yes
6338 T 24 2100 3400 2700 36 165 64 $2742 Yes
6338 N 32 2200 3500 2700 48 185 64 $2795 Yes
6338   32 2000 3200 2600 48 205 64 $2612 Yes
6336 Y 24 2400 3600 3000 36 185 64 $1977 Yes
6334   8 3600 3700 3600 18 165 64 $2214 Yes
6330 N 28 2200 3400 2600 42 165 64 $2029 Yes
6330   28 2000 3100 2600 42 205 64 $1894 Yes
6326   16 2900 3500 3300 24 185 64 $1300 Yes
6314 U 32 2300 3400 2900 48 205 64 $2600 Yes
6312 U 24 2400 3600 3100 36 185 64 $1450 Yes
Xeon Gold 5300 (8x DDR4-2933)
5320 T 20 2300 3500 2900 30 150 64 $1727 Yes
5320   26 2200 3400 2800 39 185 64 $1555 Yes
5318 Y 24 2100 3400 2600 36 165 64 $1273 Yes
5318 S 24 2100 3400 2600 36 165 512 $1667 Yes
5318 N 24 2100 3400 2700 36 150 64 $1375 Yes
5317   12 3000 3600 3400 18 150 64 $950 Yes
5315 Y 8 3200 3600 3500 12 140 64 $895 Yes
Xeon Silver (8x DDR4-2666)
4316   20 2300 3400 2800 30 150 8 $1002  
4314   16 2400 3400 2900 24 135 8 $694 Yes
4310 T 10 2300 3400 2900 15 105 8 $555  
4310   12 2100 3300 2700 18 120 8 $501  
4309 Y 8 2800 3600 3400 12 105 8 $501  
Q = Liquid Cooled SKU
Y = Supports Intel SST-PP 2.0
P = IaaS Cloud Specialised Processor
V = SaaS Cloud Specialised Processor
N = Networking/NFV Optimized
M = Media Processing Optimized
T = Long-Life and Extended Thermal Support
U = Uniprocessor (1P Only)
S = 512 GB SGX Enclave per CPU Guaranteed (...but not all 512 GB are labelled S)
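
For anyone who scripts against SKU lists, here is a minimal, purely illustrative sketch (in Python; not an Intel tool, and the function and dictionary names are our own) of how the naming rules and suffix legend above map a 3rd Gen Xeon model number to its series, generation, and suffix meaning.

```python
# Illustrative decoder for 3rd Gen (Ice Lake) Xeon Scalable model numbers,
# based on the segmentation rules and suffix legend above. Not an Intel tool.

SUFFIXES = {
    "Q": "Liquid Cooled SKU",
    "Y": "Supports Intel SST-PP 2.0",
    "P": "IaaS Cloud Specialised Processor",
    "V": "SaaS Cloud Specialised Processor",
    "N": "Networking/NFV Optimized",
    "M": "Media Processing Optimized",
    "T": "Long-Life and Extended Thermal Support",
    "U": "Uniprocessor (1P Only)",
    "S": "512 GB SGX Enclave per CPU Guaranteed",
}

# First digit maps to the tier; 5xxx and 6xxx are both Gold.
TIERS = {"8": "Platinum", "6": "Gold", "5": "Gold", "4": "Silver"}

def decode_sku(sku: str) -> dict:
    """Decode e.g. '8352Y' or '6330' into tier, generation, and suffix."""
    sku = sku.strip().upper()
    digits, suffix = sku[:4], (sku[4:] or None)
    if len(digits) != 4 or not digits.isdigit():
        raise ValueError(f"unexpected model number: {sku!r}")
    tier = TIERS.get(digits[0], "unknown tier")
    return {
        "model": digits,
        "tier": f"Xeon {tier} ({digits[0]}{digits[1]}00 series)",
        # The second digit indicates the generation: 3 = 3rd Gen (Ice Lake).
        "generation": "3rd Gen (Ice Lake)" if digits[1] == "3" else f"Gen {digits[1]}",
        "suffix": (SUFFIXES.get(suffix, "unknown suffix") if suffix
                   else "standard part (no suffix)"),
    }

if __name__ == "__main__":
    for s in ("8380", "8352Y", "6338N", "4310T"):
        print(s, "->", decode_sku(s))
```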

The peak turbo on these processors is 3.7 GHz, which is much lower than what we saw with the previous generation. Despite this, Intel seems to be keeping prices reasonable, and it is enabling Optane support through most of the stack, except for the Silver processors (which have their own single exception).

New suffixes include Q, for a liquid-cooled processor model with higher all-core frequencies at 270 W; Intel said this part came about based on customer demand. The T processors offer long-life and extended thermal support, which usually means -40ºC to 125ºC operation – useful when working at the poles or in other extreme conditions. The M/N/P/V specialized processors, according to our chat with Lisa Spelman, GM of the Xeon and Memory Group, are the focal points for software stack optimizations. Users who want focused hardware that can extract 2-10%+ more performance from their specific workload can get these processors, for which the software will be specifically tuned. Lisa stated that while all processors will receive uplifts, the segmented parts are the ones those uplifts will be targeted at – this means managing turbo behaviour against the use case and adapting code for it, which can only really be done for a known turbo profile.

Competition

It’s hard not to notice that the server market over the last couple of years has become more competitive. Not only is Intel competing with its own high market share, but x86 alternatives from AMD have scored big wins when it comes to per-core performance, and Arm implementations such as the Ampere Altra can enable unprecedented density at competitive performance as well. Here’s how they all stand, looking at top-of-stack offerings.

Top-of-Stack Competition
AnandTech      EPYC 7003      Amazon Graviton2      Ampere Altra      Intel Xeon
Platform Milan Graviton2 QuickSilver Ice Lake
Processor 7763 Graviton2 Q80-33 8380
uArch Zen 3 N1 N1 Sunny Cove
Cores 64 64 80 40
TDP 280 W ? 250 W 270 W
Base Freq 2450 2500 3300 2300
Turbo Freq 3500 2500 3300 3400
All-Core ~3200 2500 3300 3000
L3 Cache 256 MB 32 MB 32 MB 60 MB
PCIe 4.0 x128 ? 4.0 x128 4.0 x64
Chipset On CPU ? On CPU External
DDR4 8 x 3200 8 x 3200 8 x 3200 8 x 3200
DRAM Cap 4 TB ? 4 TB 4 TB
Optane No No No Yes
Price $7890 N/A $4050 $8099

At 40 cores, Intel does look a little behind, especially as Ampere is already at 80 cores at a higher frequency and will launch a 128-core Altra Max version very shortly. That means Ampere will be able to offer more cores in a single socket than Intel can in two. Intel’s competitive advantages, however, are its large installed base and decades of software optimization, as well as new security features and the breadth of its total offering to the market.

On a pure x86 level, AMD launched Milan only a few weeks ago with its new Zen 3 core, which has been highly impressive. Using a chiplet-based approach, AMD has over 1000 mm² of silicon to spread across 64 high-performance cores and massive amounts of IO, compared to Intel’s roughly 660 mm² monolithic die. AMD integrates the chipset functionality into its IO die, whereas Intel keeps it external, which saves a good amount of idle power. Top-of-stack pricing between AMD and Intel is similar now, however AMD is also focusing on the mid-range with products like the 75F3, which really impressed us. We’ll see what Intel can respond with.
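
For a very rough sense of scale, here is some purely illustrative arithmetic combining the list prices and core counts from the top-of-stack table above. List prices rarely reflect negotiated deals, so treat these dollars-per-core figures as ballpark numbers only, not as any vendor’s own positioning.

```python
# Rough dollars-per-core figures using the list prices and core counts from
# the top-of-stack table above (Graviton2 is excluded as it is not sold).
parts = {
    "AMD EPYC 7763":       (7890, 64),
    "Ampere Altra Q80-33": (4050, 80),
    "Intel Xeon 8380":     (8099, 40),
}

for name, (price_usd, cores) in parts.items():
    per_core = price_usd / cores
    print(f"{name:22s} ${price_usd:>5} / {cores} cores = ${per_core:,.0f} per core")
```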

In our numbers today, we’ll be comparing Intel’s top-of-stack to everyone else. The battle royale of behemoths.

Gen on Gen Improvements: ISO Power

It is also important to look at what Intel is offering generationally in a like-for-like comparison. Intel’s 28-core, 205 W configuration from the previous generation Cascade Lake is a good stake in the ground, and the Intel Xeon Gold 6258R is the dual-socket equivalent of the Platinum 8280 – we reviewed the two and they performed identically.

For this review, we’ve limited the 40-core Xeon Platinum 8380 to 205 W to see the effect on performance. Perhaps more directly comparable, we also have the Xeon Gold 6330, which is a direct 28-core, 205 W replacement.
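
There are several ways to hold a 270 W part to 205 W, BIOS-level power/TDP limits being the most common. Purely as an illustration, and not a description of how our testbed was actually configured, the sketch below shows how a package power cap can be set on Linux through the RAPL powercap sysfs interface, assuming root access and that the first package domain is exposed as intel-rapl:0.

```python
# Illustrative sketch: cap CPU package power to 205 W via the Linux RAPL
# powercap interface. Requires root; the path assumes the first package
# domain appears as intel-rapl:0 (a 2P system will also have intel-rapl:1).
# This is not necessarily how the review system was configured.

from pathlib import Path

RAPL_PKG = Path("/sys/class/powercap/intel-rapl:0")

def set_package_power_limit(watts: int) -> None:
    """Set the long-term (constraint 0) package power limit, in watts."""
    limit_uw = watts * 1_000_000          # the interface works in microwatts
    (RAPL_PKG / "constraint_0_power_limit_uw").write_text(str(limit_uw))
    (RAPL_PKG / "enabled").write_text("1")

def read_package_power_limit() -> float:
    """Return the current long-term package power limit in watts."""
    return int((RAPL_PKG / "constraint_0_power_limit_uw").read_text()) / 1_000_000

if __name__ == "__main__":
    set_package_power_limit(205)          # e.g. run the 40-core 8380 at 205 W
    print(f"package-0 long-term limit: {read_package_power_limit():.0f} W")
```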

Intel Xeon Comparison: 3rd Gen vs 2nd Gen
2P, 205 W vs 205 W
                    Xeon Gold 6330     Xeon Platinum 8352Y   Xeon Gold 6258R
Cores / Threads     28 / 56            32 / 64               28 / 56
Base Freq           2000 MHz           2200 MHz              2700 MHz
ST Freq             3100 MHz           3400 MHz              4000 MHz
MT Freq             2600 MHz           2800 MHz              3300 MHz
L2 + L3 Cache       35 MB + 42 MB      40 MB + 48 MB         28 MB + 38.5 MB
TDP                 205 W              205 W                 205 W
PCIe                PCIe 4.0 x64       PCIe 4.0 x64          PCIe 3.0 x48
DRAM Support        8 x DDR4-3200      8 x DDR4-3200         6 x DDR4-2933
DRAM Capacity       4 TB               4 TB                  1 TB
Optane              200-series         200-series            100-series
Optane Capacity     4 TB Optane        4 TB Optane           1.5 TB Optane
  per Socket        + 2 TB DRAM        + 2 TB DRAM           + 1 TB DDR4-2666
SGX Enclave         64 GB              64 GB                 None
Socket Support      1P, 2P             1P, 2P                1P, 2P
UPI Links           3 x 11.2 GT/s      3 x 11.2 GT/s         3 x 10.4 GT/s
Price (1ku)         $1894              $3450                 $3950

So the 6330 might seem like the natural fit; however, the 8352Y feels like the better comparison given that it is closer in price to the outgoing 6258R and offers more performance. Intel is promoting a +20% raw performance boost with the new generation, which is important here, because the 8352Y still loses 500 MHz to the previous generation in all-core frequency. The 8352Y and 6330 make up for it with extra features, such as more DDR4 channels and faster supported memory, PCIe 4.0, Optane 200-series support, SGX enclaves, and faster UPI links.
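
Some quick, purely illustrative arithmetic using the core counts and all-core frequencies from the table above helps frame that trade-off: the 8352Y’s four extra cores roughly cancel its 500 MHz all-core deficit, so any net +20% gain has to come from per-clock, memory, and platform improvements rather than raw cores-times-clocks.

```python
# Back-of-the-envelope arithmetic using the figures in the table above:
# 8352Y (Ice Lake) vs 6258R (Cascade Lake), both at 205 W.
old_cores, old_allcore_mhz = 28, 3300    # Xeon Gold 6258R
new_cores, new_allcore_mhz = 32, 2800    # Xeon Platinum 8352Y

core_ratio = new_cores / old_cores                  # ~1.14x more cores
freq_ratio = new_allcore_mhz / old_allcore_mhz      # ~0.85x the all-core clock
raw_throughput_ratio = core_ratio * freq_ratio      # ~0.97x cores * clocks

print(f"cores: {core_ratio:.2f}x, all-core clock: {freq_ratio:.2f}x, "
      f"cores x clocks: {raw_throughput_ratio:.2f}x")
# At ~0.97x raw cores-times-clocks, any net +20% has to come from per-clock
# (IPC), memory bandwidth, and platform gains rather than frequency.
```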

This review includes a few 6330 numbers that we’ve been able to run in the short time we’ve had the system.

Comments

  • Drazick - Wednesday, April 7, 2021 - link

    The ICC compiler has a much better vectorization engine than the one in GCC. It will usually generate better vectorized code, especially for numerical code.

    But the real benefit of ICC is its companion libraries: VSML, MKL, IPP.
  • Oxford Guy - Wednesday, April 7, 2021 - link

    I remember that custom builds of Blender done with ICC scored better on Piledriver as well as on Intel hardware. So, even an architecture that was very different was faster with ICC.
  • mode_13h - Thursday, April 8, 2021 - link

    And when was this? Like 10 years ago? How do we know the point is still relevant?
  • Oxford Guy - Sunday, April 11, 2021 - link

    How do we know it isn't?

    Instead of whingeing, why not investigate the issue if you're actually interested?

    Bottom line is that, just before the time of Zen's release, I tested three builds of Blender done with ICC and all were faster on both Intel and Piledriver (a very different architecture from Haswell).

    I asked why the Blender team wasn't releasing its builds with ICC since performance was being left on the table but only heard vague suggestions about code stability.
  • Wilco1 - Sunday, April 11, 2021 - link

    This thread has a similar comment about quality and support in ICC: https://twitter.com/andreif7/status/13808945639975...
  • KurtL - Wednesday, April 7, 2021 - link

    This is absolutely untrue. There is not much special about AOCC, it is just an AMD-packaged Clang/LLVM with a few extras, so it is not a SPEC compiler at all. Neither is it true for Intel. Sites that are concerned about getting the most performance out of their investments often use the Intel compilers. It is a very good compiler for any code with good potential for vectorization, and I have seen it do miracles on badly written code that no version of GCC could do.
  • Wilco1 - Wednesday, April 7, 2021 - link

    And those closed-source "extras" in AOCC magically improve the SPEC score compared to standard LLVM. How is it not a SPEC compiler just like ICC has been for decades?
  • JoeDuarte - Wednesday, April 7, 2021 - link

    It's strange to tell people who use the Intel compiler that it's not used much in the real world, as though that carries some substantive point.

    The Intel compiler has always been better than gcc in terms of the performance of compiled code. You asserted that that is no longer true, but I'm not clear on what evidence you're basing that on. ICC is moving to clang and LLVM, so we'll see what happens there. clang and gcc appear to be a wash at this point.

    It's true that lots of open source Linux-world projects use gcc, but I wouldn't know the percentage. Those projects tend to be lazy or untrained when it comes to optimization. They hardly use any compiler flags relevant to performance, like those stipulating modern CPU baselines, or link time optimization / whole program optimization. Nor do they exploit SIMD and vectorization much, or PGO, or parallelization. So they leave a lot of performance on the table. More rigorous environments like HPC or just performance-aware teams are more likely to use ICC or at least lots of good flags and testing.

    And yes, I would definitely support using optimized assembly in benchmarks, especially if it surfaced significant differences in CPU performance. And probably, if the workload was realistic or broadly applicable. Anything that's going to execute thousands, millions, or billions of times is worth optimizing. Inner loops are a common focus, so I don't know what you're objecting to there. Benchmarks should be about realizable optimal performance, and optimization in general should be a much bigger priority for serious software developers – today's software and OSes are absurdly slow, and in many cases desktop applications are slower in user-time than their late 1980s counterparts. Servers are also far too slow to do simple things like parse an HTTP request header.
  • pSupaNova - Wednesday, April 7, 2021 - link

    "today's software and OSes are absurdly slow, and in many cases desktop applications are slower in user-time than their late 1980s counterparts." a late 1980's desktop could not even play a video let alone edit one, your average mid range smartphone is much more capable. My four year old can do basic computing with just her voice. People like you forget how far software and hardware has come.
  • GeoffreyA - Wednesday, April 7, 2021 - link

    Sure, computers and devices are far more capable these days, from a hardware point of view, but applications, relying too much on GUI frameworks and modern languages, are more sluggish today than, say, a bare Win32 application of yore.
