The 32-Core, 64-Thread Beast: QSSC-S4R

The hefty 50 kg QSSC-S4R server found its way to our lab. The ODM (Original Design Manufacturer) is the Taiwanese firm Quanta, which designed the server jointly with Intel. The 4U server is equipped for maximum expandability with 10 PCIe slots, quad gigabit Ethernet onboard, and 64 DIMM slots.

The enormous number of DIMM slots is the result of using eight separate memory boards. Each memory board carries two memory buffers and eight DIMM slots.

A 7+1 hot-swap, redundant fan module setup cools this system. Notice that the disk system is not in front of the cooling, as it is in most server systems. That is a plus, as the disks should not get the coldest air: disks perform best at moderate temperatures (30-40°C, 86-104°F), as the lower viscosity of the grease in the rotation motor puts less stress on the rotating components. Google’s study also suggests that disks should be kept at a higher temperature than the rest of the server.

The CPUs and DIMMs, however, should be kept as cool as possible to reduce leakage power. The fans are well positioned: the memory boards and the heatsinks of the CPUs right behind them get the coolest air. In the back of the server you find the motherboard. You can see that the heatsinks on the 7500 chipset receive extra airflow.

Four 850W high-efficiency power supplies feed this massive machine in a 2+2 or 3+1 configuration. You can find more detailed information about this QSSC-S4R server here. The other benchmarked configurations are identical to those listed on this page.
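To make the two redundancy modes concrete, here is a minimal sketch of the arithmetic behind "2+2" and "3+1". It assumes the usual N+M reading (N supplies carry the load, M are spares) and uses only the 850W nameplate rating. The function name is hypothetical, and the real power budget depends on supply efficiency and input voltage, which this sketch ignores.

```python
def usable_capacity_watts(active: int, spare: int, watts_each: int = 850) -> int:
    """Nameplate power the system can count on in an active+spare setup.

    Only the `active` supplies are budgeted for the load; the `spare`
    supplies exist so that this many units can fail without an outage.
    """
    if active < 1 or spare < 0:
        raise ValueError("need at least one active supply and no negative spares")
    return active * watts_each


for active, spare in ((2, 2), (3, 1)):
    watts = usable_capacity_watts(active, spare)
    print(f"{active}+{spare}: {watts} W budget, survives {spare} failed supplies")
# prints:
# 2+2: 1700 W budget, survives 2 failed supplies
# 3+1: 2550 W budget, survives 1 failed supplies
```

The trade-off is visible in the numbers: 3+1 offers a larger power budget, while 2+2 tolerates the loss of two supplies (for example, a whole power feed).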

51 Comments

  • fynamo - Wednesday, August 11, 2010 - link

    WHERE ARE THE POWER CONSUMPTION CHARTS??????

    Awesome article, but complete FAIL because of lack of power consumption charts. This is only half the picture -- and I dare to say it's the less important half.
  • davegraham - Wednesday, August 11, 2010 - link

    +1 on this.
  • JohanAnandtech - Thursday, August 12, 2010 - link

    Agreed. But it wasn't until a few days before I was going to post this article that we got a system that is comparable. So I kept the power consumption numbers for the next article.
  • watersb - Wednesday, August 11, 2010 - link

    Wow, you IT Guys are a cranky bunch! :-)

    I am impressed with the vApus client-simulation testing, and I'm humbled by the complexity of enterprise-server testing.

    A former sysadmin, I've been an ignorant programmer for lo these past 10 years. Reading all these comments makes me feel like I'm hanging out on the bench in front of the general store.

    Yeah, I'm getting off your lawn now...
  • Scy7ale - Wednesday, August 11, 2010 - link

    Does this also apply to consumer HDDs? If so is it a bad idea to have an intake fan in front of the drives to cool them as many consumer/gaming cases have now?
  • JohanAnandtech - Thursday, August 12, 2010 - link

    Cold air comes from the bottom of the server aisle, sometimes as low as 20°C (68°F), and gets blown at high speed over the disks. Several studies now show that this is not optimal for an HDD. In your desktop, the temperature of the air that is blown over the HDD should be higher, as the fans are normally slower. But yes, it is not good to keep your hard disk at temperatures lower than 30°C. Use hddsentinel or speedfan to check on this. 30-45°C is acceptable.
  • Scy7ale - Monday, August 16, 2010 - link

    Good to know, thanks! I don't think this is widely understood.
  • brenozan - Thursday, August 12, 2010 - link

    http://en.wikipedia.org/wiki/UltraSPARC_T2
    2 sockets =~ 153GHz
    4 sockets =~ 306GHz
    Like the T1, the T2 supports the Hyper-Privileged execution mode. The SPARC Hypervisor runs in this mode and can partition a T2 system into 64 Logical Domains, and a two-way SMP T2 Plus system into 128 Logical Domains, each of which can run an independent operating system instance.

    why SUN did not dominate the world in 2007 when it launched the T2? Besides the two 10G Ethernet ports built into the processor, they had the most advanced architecture that I know of; see
    http://www.opensparc.net/opensparc-t2/download.htm...
  • don_k - Thursday, August 12, 2010 - link

    "why SUN did not dominate the world in 2007 when it launched the T2?"

    Because it's not actually that good :) My company bought a few T2s and after about a week of benchmarking and testing it was obvious that they are very very slow. Sure you get lots and lots of threads but each of those threads is oh so very slow. You would not _want_ to run 128 instances of solaris, one on each thread, because each of those instances would be virtually unusable.

    We used them as webservers.. good for that. Or file servers that you don't need to do any cpu intensive work.

    The theory is fine and all but you obviously have never used a T2 or you would not be wondering why it failed.
  • JohanAnandtech - Thursday, August 12, 2010 - link

    "http://en.wikipedia.org/wiki/UltraSPARC_T2
    2 sockets =~ 153GHz
    4 sockets =~ 306GHz"

    You are multiplying threads times clockspeed. IIRC, the T2 is a fine-grained multithreaded CPU where 8 (!!) threads share two pipelines of *one* core.

    Compare that with the Nehalem core where 2 threads share 4 "pipelines" (sustained decode/issue/execution/retire) per cycle. So basically, a dual socket T2 is nothing more than 16 relatively weak cores which can each execute 2 instructions per clock cycle at the most, or 32 instructions per cycle in total. The only advantage of having 8 threads per core is that (with enough independent software threads) the T2 is able to come relatively close to that kind of throughput.

    A dual six-core Xeon has a maximum throughput of 12 cores x 4 instructions or 48 instructions per cycle. As the Xeon has only 2 threads per core, it is less likely that the CPU will ever come close to that kind of output (in business apps). On the other hand, it performs excellently when you have some amount of dependent threads, or simply not enough threads in parallel. The T2 will only perform well if you have enough independent threads.
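The two ways of counting in this exchange can be put side by side in a short sketch. Core counts, threads per core and issue widths are taken from the reply above. The 1.2 GHz T2 clock is an assumption, inferred from the "153GHz" figure rather than stated anywhere in the thread, and the function names are hypothetical. These are peak numbers only and say nothing about sustained performance on real workloads.

```python
ASSUMED_T2_CLOCK_GHZ = 1.2  # assumption: back-derived from the "153GHz" figure


def threads_times_clock_ghz(sockets: int, cores_per_socket: int,
                            threads_per_core: int, clock_ghz: float) -> float:
    """The naive metric from the first comment: hardware threads x clock."""
    return sockets * cores_per_socket * threads_per_core * clock_ghz


def peak_instructions_per_cycle(sockets: int, cores_per_socket: int,
                                issue_width: int) -> int:
    """Peak throughput as counted in the reply: cores x instructions per cycle."""
    return sockets * cores_per_socket * issue_width


# Dual-socket T2: 8 cores per socket, 8 threads per core, 2 instructions per cycle.
t2_naive = threads_times_clock_ghz(2, 8, 8, ASSUMED_T2_CLOCK_GHZ)
t2_peak = peak_instructions_per_cycle(2, 8, issue_width=2)

# Dual six-core Xeon: 4 instructions per core per cycle.
xeon_peak = peak_instructions_per_cycle(2, 6, issue_width=4)

print(f"T2 threads x clock: {t2_naive:.1f} GHz")
print(f"T2 peak: {t2_peak} instructions/cycle")      # 32
print(f"Xeon peak: {xeon_peak} instructions/cycle")  # 48
```

Counting threads inflates the T2 figure because the eight threads of a core share that core's two pipelines; counting issue slots per core removes the inflation.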
