Benchmark Configuration & Methodology

This review is mostly focused on performance. We have included the Xeon E5-2697 v2 (12 cores at 2.7-3.5GHz) and the Xeon E5-2650L v2 (10 cores at 1.7-2.1GHz) to characterize the performance of the high-end and lower-midrange new Xeons. That way, you can get an idea of where the rest of the 12- and 10-core Xeon SKUs will land. We also have the previous generation E5-2690 and E5-2660 so we can see the improvements the new architecture brings. This also allows us to gauge how competitive the Opteron "Piledriver" 6300 is.

Intel's Xeon E5 server: R2208GZ4GSSPP (2U chassis)

CPU: Two Intel Xeon E5-2697 v2 (2.7GHz, 12 cores, 30MB L3, 130W)
     Two Intel Xeon E5-2690 (2.9GHz, 8 cores, 20MB L3, 135W)
     Two Intel Xeon E5-2660 (2.2GHz, 8 cores, 20MB L3, 95W)
     Two Intel Xeon E5-2650L v2 (1.7GHz, 10 cores, 25MB L3, 70W)
RAM: 64GB (8x 8GB) Samsung M393B1K70DH0-CK0 DDR3-1600
     or 128GB (8x 16GB) Micron MT36JSF2G72PZ DDR3-1866
Internal disks: 2x Intel SSD 710 200GB (MLC)
Motherboard: Intel Server Board S2600GZ "Grizzly Pass"
Chipset: Intel C600
BIOS version: SE5C600.86B (August 6, 2013)
PSU: Intel 750W DPS-750XB A (80 Plus Platinum)

The Xeon E5 CPUs have four memory channels per CPU and support up to DDR3-1866, so our dual-CPU configuration gets eight DIMMs, one per channel, for maximum bandwidth. The typical BIOS settings can be found below.
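As a quick sanity check, the peak theoretical bandwidth of these DIMM populations can be worked out from the channel count and transfer rate; this is a back-of-the-envelope sketch, not a measured figure:

```python
# Peak theoretical memory bandwidth for a fully populated dual-socket system.
# DDR3 moves 8 bytes per channel per transfer, so bandwidth scales with
# transfer rate (MT/s) times channels times sockets.
def peak_bandwidth_gb_s(mt_per_s, channels_per_cpu=4, cpus=2, bytes_per_transfer=8):
    return mt_per_s * 1e6 * bytes_per_transfer * channels_per_cpu * cpus / 1e9

print(peak_bandwidth_gb_s(1866))  # Xeon E5 v2 with DDR3-1866: ~119.4 GB/s
print(peak_bandwidth_gb_s(1600))  # Opteron 6300 with DDR3-1600: ~102.4 GB/s
```

Real-world throughput will of course land well below these ceilings.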

Supermicro A+ Opteron server: 1022G-URG (1U chassis)

CPU: Two AMD Opteron "Abu Dhabi" 6380 (2.5GHz, 16 cores)
     Two AMD Opteron "Abu Dhabi" 6376 (2.3GHz, 16 cores)
RAM: 64GB (8x 8GB) Samsung M393B1K70DH0-CK0 DDR3-1600
Motherboard: Supermicro H8DGU-F
Internal disks: 2x Intel SSD 710 200GB (MLC)
Chipset: AMD SR5670 + SP5100
BIOS version: v2.81 (10/28/2012)
PSU: Supermicro PWS-704P-1R 750W

The same is true for the latest AMD Opterons: eight DDR3-1600 DIMMs for maximum bandwidth. You can check out the BIOS settings of our Opteron server below.

C6 is enabled and Turbo Core (CPB mode) is on.

Common Storage System

To minimize the number of variables between our tests, we use our common storage system to provide LUNs via iSCSI. The applications are placed on a RAID-50 LUN of ten Cheetah 15K.5 disks inside a Promise JBOD J300, connected to an Adaptec 5085 PCIe controller. For the more demanding applications (Zimbra, phpBB), storage is provided by a RAID-0 of Micron P300 SSDs attached to a 6Gbps Adaptec 72405 PCIe RAID controller.
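To put that RAID-50 LUN in perspective, here is a small sketch of its usable capacity; the two-span layout and the 300GB per-disk capacity are assumptions for illustration, not figures from our setup:

```python
# Hypothetical RAID-50 capacity estimate: ten disks arranged as two five-disk
# RAID-5 spans striped together (RAID 0 across the spans). The 300GB per-disk
# capacity and the two-span layout are assumptions, not figures from the article.
def raid50_usable_gb(total_disks, spans, disk_gb):
    disks_per_span = total_disks // spans
    # Each RAID-5 span gives up one disk's worth of capacity to parity.
    return spans * (disks_per_span - 1) * disk_gb

print(raid50_usable_gb(total_disks=10, spans=2, disk_gb=300))  # 2400 GB usable
```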

Software Configuration

All vApus testing is done on VMware vSphere 5.1 (ESXi 5.1). All VMDKs are thick provisioned, independent, and persistent. The ESXi power policy is set to "Balanced" unless otherwise indicated. All other testing is done on Windows Server 2008 R2 SP1 Enterprise; unless noted otherwise, we use the "High Performance" power plan there.
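For anyone replicating the Windows side of this setup, the "High Performance" plan can also be applied from a script; a minimal sketch using the built-in powercfg tool (run from an elevated prompt):

```python
# Sketch: activate the "High Performance" power plan on Windows Server 2008 R2.
# SCHEME_MIN is powercfg's built-in alias for the High Performance scheme.
import subprocess

subprocess.run(["powercfg", "/setactive", "SCHEME_MIN"], check=True)

# Confirm which plan is active now.
result = subprocess.run(["powercfg", "/getactivescheme"],
                        capture_output=True, text=True, check=True)
print(result.stdout)
```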

Other Notes

Both servers are fed by a standard European 230V (16 Amps max.) power line. The room temperature is monitored and kept at 23°C by our Airwell CRACs. We use the Racktivity ES1008 Energy Switch PDU to measure power consumption. Using a PDU for accurate power measurements might seem pretty insane, but this is not your average PDU. The measurement circuits of most PDUs assume that the incoming AC is a perfect sine wave, but it never is. The Racktivity PDU, however, measures true RMS current and voltage at a very high sample rate: up to 20,000 measurements per second for the complete PDU.
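To illustrate why true RMS sampling matters, the sketch below compares real power (the mean of instantaneous v x i) with the apparent power you would get from RMS voltage and current alone, using a made-up, peaky current waveform of the kind a switch-mode PSU draws:

```python
# Why true RMS sampling matters: with a distorted (non-sinusoidal) current
# draw, real power (mean of instantaneous v*i) differs from the apparent
# power V_rms * I_rms. The waveform shapes below are made up for illustration.
import numpy as np

t = np.linspace(0, 0.02, 2000, endpoint=False)       # one 50Hz period
v = 230 * np.sqrt(2) * np.sin(2 * np.pi * 50 * t)     # clean 230V mains voltage
# A switch-mode PSU tends to draw current in narrow peaks, not a clean sine:
i = np.where(np.abs(np.sin(2 * np.pi * 50 * t)) > 0.8,
             4.0 * np.sign(np.sin(2 * np.pi * 50 * t)), 0.0)

v_rms = np.sqrt(np.mean(v ** 2))
i_rms = np.sqrt(np.mean(i ** 2))
real_power = np.mean(v * i)        # what a true RMS power meter reports (W)
apparent_power = v_rms * i_rms     # volt-amperes
print(f"real: {real_power:.0f} W, apparent: {apparent_power:.0f} VA, "
      f"power factor: {real_power / apparent_power:.2f}")
```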

Comments

  • Kevin G - Tuesday, September 17, 2013

    Odd that Intel went the 3 die route with Ivy Bridge-EP. It was no surprise that the low end would be a variant of the 6 core Ivy Bridge-E found in the Core i7-4900 series. Apple leaked that the lineup would scale to 12 cores. The surprise is a native 10 core part and the differences between it and the 12 core design.

    Judging from the diagrams, Intel altered its internal ring bus for connecting cores. One ring orbits all three columns of cores while another connects two columns. Thus the cores in the middle column have better latency for coherency, as they have fewer stops on the ring bus to reach any core. The outer columns should have similar latency to the native 10 core chip for coherency: fewer cores to stop at but longer traces on the die between columns.

    Not disclosed is how the 12 core chip divides its cache. Previously each core would have 2.5 MB of L3 cache that was more local than the rest of the L3 cache. The middle column may have access to L3 cache on both sides.

    The usage of dual memory controllers on the 12 core die is interesting. I wonder what measurable differences it produces. I'd fathom tests with a mix of reads/writes (i.e. databases) would show the greatest benefit, as a concurrent read and write may occur. In a single socket configuration, enabling NUMA may produce a benefit. (Actually, how many single socket 2011 boards have this option?)
  • madmilk - Tuesday, September 17, 2013

    It looks like each ring is connected to two columns. One ring goes around all three, but does not connect to the center column.
  • JlHADJOE - Tuesday, September 17, 2013

    I'm guessing the 12-core might see action in the 8P segment, which is well overdue for an update.
  • psyq321 - Tuesday, September 17, 2013

    There will be 15-core E7 8xxx v2 CPUs based on the same IvyTown architecture.

    As Intel is not showing the die-shot of a 12 core Ivy EP, I wonder if the 15-core EX and 12-core EP are using the same 3x5 die.
  • Kevin G - Tuesday, September 17, 2013

    The memory controller interfaces are different between Ivy Bridge-EP and Ivy Bridge-EX. The EP uses DDR3 in all of its forms (vanilla, ECC, buffered ECC, LR ECC), whereas the EX version is going to use a serial interface similar in concept to FB-DIMMs. There will be two types of memory buffers for the EX line, one for DDR3 and later another that will use DDR4 memory. No changes need to be made to the new EX socket to support both types of memory.
  • Brutalizer - Tuesday, September 17, 2013

    I would have expected this newest Intel 12-core CPU to perform better. For instance, in the Java SPECjbb2013 benchmark it gets 35,500 and 4,500. However, the Oracle SPARC T5 gets 75,700 and 23,300, which totally demolishes the x86 CPU. Have the x86 CPUs not improved that much in comparison to SPARC? Does the x86 still lag behind?
    https://blogs.oracle.com/BestPerf/entry/20130326_s...
  • JohanAnandtech - Tuesday, September 17, 2013

    Be careful when you compare inflated, for-marketing-purposes results with independent "limited optimization" results ;-)
  • Phil_Oracle - Friday, February 21, 2014

    What do you mean by "inflated for marketing purposes"? SPECjbb2013 is clearly a real-world, recent benchmark that's fully audited by all vendors on the SPEC committee. If you make such claims, surely you have some evidence?
  • extide - Tuesday, September 17, 2013

    Don't forget those T5s run at TDPs in the 200-300W range... If you clocked up one of these babies to those power levels, I am sure it would be >= the T5.
  • Kevin G - Tuesday, September 17, 2013

    TDPs are indeed higher on the SPARC side, but not as radically as you indicate. Generally they do not consume more than 200W. (Unfortunately Oracle doesn't give a flat power consumption figure for just the CPU; this is just an estimate based upon their total system power calculator. For reference, the POWER7 is 200W and the POWER7+ is 180W.)
