Understanding the SeaMicro Architecture

The secret begins at the motherboard level. It all starts with a single-core Intel Atom Z530 (1.6GHz, Silverthorne) and US15 chipset (Poulsbo). Astute readers will recognize this as the previous-generation Intel Atom platform for MIDs, codenamed Menlow. SeaMicro chose single-core Menlow rather than the newer Pine Trail platform in order to hit its power targets. Moorestown would probably be a good fit as well, but the chips only recently started shipping. Hanging off Poulsbo is 2GB of DDR2 memory.

SeaMicro excludes the I/O hub and instead connects its custom ASIC over Poulsbo’s PCIe x2 interface. The custom ASIC emulates all I/O features: everything from SATA to Gigabit Ethernet is handled by the SeaMicro chip. As far as the Atom CPU is concerned, it has a bunch of I/O devices that hang off of Poulsbo. The virtualized I/O is a key part of making SeaMicro’s technology work.

Those three chips (+ DRAM) make up the basic building block of an SM10000 server. They occupy a PCB area about the size of a credit card: 2.2” x 3”. Since this basic building block is physically autonomous, SeaMicro refers to it as a single server.

SeaMicro then takes eight of these server building blocks and puts them on a card measuring 5” x 11”. Instead of using one SeaMicro ASIC per Atom, the ratio is one ASIC per two Atom processors.


Eight Atom "servers", four SeaMicro ASICs and a 32-lane electrical PCIe interface to the rest of the box

Each one of these cards has a pair of electrical PCIe x16 connectors that plug into the SM10000’s back plane.

A single SM10000 can support up to 64 of these cards, which is how you end up with 512 Atom CPUs in a 10U chassis. Intra-system communication occurs over a multidimensional torus bus interface. The link is built by connecting all of the SM ASICs together, allowing each Atom server to communicate with any other server in the system.

Despite being well connected, the server architecture doesn’t support shared memory (each Atom has exclusive access to its 2GB of DRAM). The torus interface is instead used to share the virtualized I/O amongst all of the servers. If server/CPU 0 wants to access the virtual HDD on server 206, it can. Each hop takes 8 microseconds so it’s fairly low latency for storage and network I/O but not fast enough for memory.
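
To get a feel for what 8 microseconds per hop adds up to, here is a rough Python sketch of hop counting in a torus. The 8 x 8 x 4 layout is purely an assumption chosen so that the product matches the 256 ASICs in a full chassis; SeaMicro hasn't disclosed the real dimensions here, and the function names are made up for illustration.

```python
# Hop count and latency between two nodes in a torus (a grid whose
# edges wrap around in every dimension).
# ASSUMPTION: the 8 x 8 x 4 layout used below is hypothetical; the article
# only says the fabric is a "multidimensional torus" linking the ASICs.
HOP_LATENCY_US = 8  # per-hop latency quoted in the article


def torus_hops(src, dst, dims):
    """Minimum number of hops between two coordinates in a torus."""
    hops = 0
    for s, d, size in zip(src, dst, dims):
        delta = abs(s - d)
        # Each dimension wraps around, so take the shorter direction.
        hops += min(delta, size - delta)
    return hops


def torus_latency_us(src, dst, dims):
    """One-way latency in microseconds, counting only the per-hop cost."""
    return torus_hops(src, dst, dims) * HOP_LATENCY_US


if __name__ == "__main__":
    dims = (8, 8, 4)  # hypothetical: 8 * 8 * 4 = 256 ASICs
    print(torus_hops((0, 0, 0), (7, 4, 2), dims))        # 1 + 4 + 2 = 7
    print(torus_latency_us((0, 0, 0), (7, 4, 2), dims))  # 7 * 8 = 56
```

Under that assumed layout even a far corner of the fabric is tens of microseconds away, which squares with the point above: fine for disk and network traffic, far too slow for memory.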

Since each Atom CPU is paired with 2GB of memory, the total machine has a terabyte of DDR2 memory. But like I said earlier, the memory isn’t shared, so you have a 2GB maximum limit on each server. This in itself imposes a restriction on the type of applications you’ll run on an SM10000. If you need more than 2GB of memory per server in your rack, the SM10000 isn’t for you.
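
The chassis totals follow directly from the per-card numbers quoted above. A quick Python sketch of the arithmetic (the constant and function names are my own, not SeaMicro's):

```python
# Per-card and per-server figures taken from the article.
CARDS_PER_CHASSIS = 64
SERVERS_PER_CARD = 8
ASICS_PER_CARD = 4
DRAM_PER_SERVER_GB = 2


def chassis_totals(cards=CARDS_PER_CHASSIS):
    """Return server, ASIC and DRAM totals for a chassis with `cards` cards."""
    servers = cards * SERVERS_PER_CARD
    return {
        "servers": servers,
        "asics": cards * ASICS_PER_CARD,
        "dram_gb": servers * DRAM_PER_SERVER_GB,
    }


if __name__ == "__main__":
    # Fully populated: 512 servers, 256 ASICs, 1024 GB (1 TB) of DRAM.
    print(chassis_totals())
```

The terabyte figure is an aggregate only. No single server ever sees more than its own 2GB.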

Poulsbo’s memory controller doesn’t support ECC, which is fine for MIDs but can be a problem for some enterprise customers. SeaMicro claims that most of its customers aren’t bothered by the lack of ECC. There’s no hope for future ECC support unless Intel eventually embraces the Atom platform for servers.

Networking

SeaMicro not only wants to replace some of your server hardware with its boxes, but also some of Cisco’s networking equipment. A single SM10000 is designed to replace your top-of-rack switch.

The idea is you’d take the uplink provided to your backbone and plug it directly into one of the ports on the back of the SM10000. All load balancing, terminal server and switching functionality is handled by the SM10000 itself. It’s all Linux based so you should be able to add a firewall as well.

On the back of the machine you’ll see rows of Ethernet ports, up to 64 to be exact. Each of the cards carrying these ports has its own CPU that handles all of the network functionality of the server. It helps SeaMicro justify the pricing of the server as you’re replacing not only your server hardware but also some expensive networking gear.

Each server has a physical Gigabit Ethernet interface on it. A fully populated SM10000 can have up to 64 Gigabit Ethernet ports, or it can be configured to have 16 10GbE ports. If you don’t need that much bandwidth you can just use the Ethernet ports you need.
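
For a sense of scale, here is a back-of-the-envelope Python sketch of how much external bandwidth each Atom server would get if all 512 pushed traffic out of the box at once. It assumes an even split across servers and ignores any limits inside the fabric, so treat it as an upper bound under those assumptions and not a measured figure. The function name is hypothetical.

```python
SERVERS = 512  # fully populated SM10000


def per_server_mbps(ports, gbps_per_port, servers=SERVERS):
    """Even share of total external bandwidth per server, in Mbps.

    ASSUMPTION: every server is active and the uplinks are split evenly;
    internal fabric limits are ignored.
    """
    total_mbps = ports * gbps_per_port * 1000
    return total_mbps / servers


if __name__ == "__main__":
    print(per_server_mbps(64, 1))   # 64 x GbE   -> 125.0 Mbps per server
    print(per_server_mbps(16, 10))  # 16 x 10GbE -> 312.5 Mbps per server
```

For the lightly loaded web servers this box is aimed at, a worst-case share in the low hundreds of Mbps is unlikely to be the bottleneck.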

The networking is fully virtualized so each Atom “server” gets its own IP address and thinks it has its own connection to the outside world.

Storage

SeaMicro’s ASIC virtualizes four SATA ports per Atom processor. The SM10000 can support up to 64 physical 2.5” HDDs or SSDs. The customer configures the machine to determine which four physical disks, or slices of disks, map to each Atom CPU.
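
The disk-to-server mapping can be pictured as a small table. The Python sketch below is only an illustration of the idea: the `Slice` type, the `validate_mapping` helper and the rule that slices may not overlap are all my assumptions, not SeaMicro's actual configuration format.

```python
from dataclasses import dataclass

MAX_DISKS = 64        # physical 2.5" drive bays, per the article
PORTS_PER_SERVER = 4  # virtual SATA ports per Atom, per the article


@dataclass(frozen=True)
class Slice:
    """A region of one physical disk (hypothetical representation)."""
    disk: int      # physical disk index, 0..63
    start_gb: int  # offset of the slice on that disk
    size_gb: int   # length of the slice


def validate_mapping(mapping):
    """Check a {server_id: [Slice, ...]} mapping; raise ValueError if invalid.

    ASSUMPTION: slices handed to different virtual ports must not overlap.
    """
    used = {}  # disk index -> list of (start, end) ranges already assigned
    for server, slices in mapping.items():
        if len(slices) > PORTS_PER_SERVER:
            raise ValueError(f"server {server}: more than {PORTS_PER_SERVER} ports")
        for s in slices:
            if not 0 <= s.disk < MAX_DISKS:
                raise ValueError(f"server {server}: no such disk {s.disk}")
            end = s.start_gb + s.size_gb
            for start_other, end_other in used.get(s.disk, []):
                if s.start_gb < end_other and start_other < end:
                    raise ValueError(f"server {server}: overlap on disk {s.disk}")
            used.setdefault(s.disk, []).append((s.start_gb, end))


if __name__ == "__main__":
    # Two servers sharing disk 0 in non-overlapping 10GB slices.
    validate_mapping({0: [Slice(0, 0, 10)], 1: [Slice(0, 10, 10)]})
    print("mapping ok")
```

Carving one drive into many small slices is what lets a couple hundred OS-only web servers share a handful of disks.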

The SM ASIC emulates RAID-0, but nothing more. SeaMicro states this is because its target market is to replace dozens of simple servers that have limited or no storage. If you’re replacing a couple hundred web servers that only use their storage for OS and little else, the SeaMicro approach makes sense.

OS

Linux is fully supported today, but there’s currently no official Windows support. SeaMicro claims the box works just fine running a VM with Windows Server installed; however, Microsoft doesn’t officially support the configuration. SeaMicro is working with Microsoft on fixing that, but for now, if you want support, you need to be running Linux.

53 Comments

  • CharonPDX - Monday, June 14, 2010 - link

    Except the ENTIRE POINT of this is maximum compatibility with minimum power draw. It is *NOT* meant for maximum power, at all.

    Intel chips are standard. (They could have used an AMD Geode, or Via Nano, same effect.) They run all standard server OSes, all standard software. If you have to get custom-written server software, there goes your money savings!

    Yes, someone could re-compile their software for ARM, but this isn't meant for audiences that recompile their software. It's meant for audiences that need a lot of low-end servers.
  • code65536 - Tuesday, June 15, 2010 - link

    Um, why are you citing flops? Unless this server is being used for scientific computing (or other applications along that sort of line) (for which this is unsuitable anyway for a variety of other reasons), you don't care about how many floating-point operations this thing can do; the number of flops is totally irrelevant. You only care about the integer rate.
  • Shining Arcanine - Tuesday, June 15, 2010 - link

    I was probably wrong to call it flops, as what I did was multiply the number of instructions per clock (2) by the clock and the number of cores.
  • Calin - Tuesday, June 15, 2010 - link

    This is consumer-driven equipment and consumer-driven requirements.
    Many customers would like quite a bit of RAM in their servers, but the processing need would be small enough. Also, they would like to use "industry-standard" software (SQL, Apache, PHP, ...) for Linux, and it might not be completely and totally supported under ARM.
    Yes, using ARM processors would give you better everything (I'm not sure about allowing access to 2GB of RAM), but it would be like trying to sell a Formula 1 car to someone living in a swamp.
  • Shining Arcanine - Tuesday, June 15, 2010 - link

    ARM is completely supported by Linux:

    http://www.gentoo.org/doc/en/handbook/handbook-arm...
    http://www.debian.org/ports/arm/

    I am running Linux on my Linksys NSLU2 and I can run just about whatever application I want on it.
  • MySchizoBuddy - Tuesday, July 6, 2010 - link

    ARM server using the 4-core Cortex A9 the would have 8 teraflops of computing power.
    What's the source of this number, or how did you calculate it?
  • chromal - Monday, June 14, 2010 - link

    Or you could instead embrace virtualization and oversubscribe the hardware a little. I'm not sure who is in the market for a $100000+ machine that doesn't even offer basic enterprise features like ECC memory. Seems like a solution in search of a problem. I'm sure the problem is real and 'out there,' but I'm also sure that the specific instances that wouldn't better be accommodated by other technology are niche, indeed...

    Myself, I'd rather have one good Xeon X5550 CPU than 24 crappy Atoms.
  • vol7ron - Monday, June 14, 2010 - link

    I'm curious to see how this will pan out.

    What would the ideal server type be? A web server w/ little computational processes?

    I'm trying to think how little the computational processes would be. Would something like a blog/forum based enterprise or maybe an eStore be the ideal? I'm guessing something like a gameserver would not be suitable for this type of technology, nor would something like eBay that's continuously calculating the difference in time?

    I'm also curious how this would scale down in price. How much would something like ~20 cores (w/ less memory) run? Something like this seems nice because it seems more economical to add on to.

    As stated, it'd be nice to see this in other varieties (ARM-based) with ECC support. For some reason I get the feeling SeaMicro has been at this a while and the Cortex A9 may not have been available when this project was started. Though, I think the A9 also lacks certain instructions that the Intel does provide.
  • spazoid - Monday, June 14, 2010 - link

    This is some very interesting hardware, but I see a problem with the money savings comparison.

    If you have a Dell R610 running at 100%, you can't replace it with X amount of Atom CPU's until you hit the same SPECint performance. You'd need a lot of these Atom CPU's to equal one Quad core Xeon, and seeing as they can't work together in any way other than access each other's virtual harddrives, the comparison is totally ridiculous.

    Yes, for something like web servers or similar where the CPU usage on a quad core CPU is very low, this could work, but I don't see any good reason for not just virtualizing such a server, which gives you many advantages that this setup simply cannot provide.
