Open source code allows your operating system and application stack to be recompiled for different systems.
Today, with many applications being migrated into the cloud, good performance per watt is paramount in keeping power costs down.
x86 traditionally has not been optimized for best per-Watt performance – Intel is catching up with Atom, especially with the BayTrail SoC for the mobile application area. For microservers Intel has introduced the C2000 “Avoton” Atom SoC.
Let’s look at a couple of alternatives for modern cloud computing.
The ARM architecture is quite big already in the mobile market, getting more and more into the desktop markets (take the Raspberry Pi for instance), and now it’s taking big strides towards servers.
ARM Contender #1: Calxeda
Calxeda developed one of the first (or maybe THE first) ARM-based server module solutions. Their design, the “EnergyCore ECX-1000”, is based on 4 x ARMv7 Cortex A9 cores (32 bit), running at 1.1 – 1.4 GHz.
Each board has one memory slot for up to 4 GB of RAM (remember, 32 bit!), four SATA ports per socket, and five 10 GBit/s on-board LAN ports. They were specified at 1.5 W power usage per core, and 5 W per node.
Calxeda originally planned to produce a “Midway” chip with 40 bit memory addressing. Socket compatible with the ECX-1000, it would have been able to address 16 GB of memory.
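The memory limits follow from the address width. As a rough illustration (a sketch only, not taken from Calxeda's documentation; the function name is made up for this example), the theoretical ceiling for an n-bit physical address is 2^n bytes:

```python
GIB = 2 ** 30  # bytes per GiB


def addressable_bytes(address_bits: int) -> int:
    """Theoretical maximum of byte-addressable memory for a given address width."""
    return 2 ** address_bits


# Compare the 32 bit ECX-1000 with a 40 bit design such as "Midway".
for bits in (32, 40):
    print(f"{bits} bit addressing: {addressable_bytes(bits) // GIB} GiB")
```

The 16 GB quoted for Midway is far below the 40 bit ceiling, so it was presumably a limit of the memory controller or the board rather than of the address width.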
According to this article, Calxeda was looking to provide a 15 – 20 x price/performance advantage over “traditional” server processors. This article claims Calxeda was also looking at a 5x – 10x performance / Watt increase.
Dell has built a server based on the Calxeda board architecture and donated it to the Apache Software Foundation, so that they can tune Apache, Hadoop, Cassandra, … for the architecture. In this server architecture, up to 360 ECX-1000 nodes can be put in a 4U chassis.
HP has also tested the waters with its experimental Redstone ARM server, based on Calxeda technology. It allows up to 288 ECX-1000 nodes in 4U of rack space.
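To put the two chassis side by side, here is a small back-of-the-envelope calculation. It uses only the node counts quoted here and the 5 W per node figure from Calxeda's specification. The helper names are invented for this sketch, and the power number covers the nodes alone (no fans, switches, drives or power supply losses).

```python
RACK_UNITS = 4        # both chassis are 4U
WATTS_PER_NODE = 5    # per-node figure quoted for the ECX-1000

# Maximum node counts as quoted in the text.
chassis_nodes = {"Dell": 360, "HP Redstone": 288}


def nodes_per_rack_unit(nodes: int, rack_units: int = RACK_UNITS) -> float:
    """Node density per rack unit."""
    return nodes / rack_units


def total_node_power(nodes: int, watts_per_node: float = WATTS_PER_NODE) -> float:
    """Power drawn by the nodes alone, in watts."""
    return nodes * watts_per_node


for name, nodes in chassis_nodes.items():
    print(f"{name}: {nodes_per_rack_unit(nodes):.0f} nodes per U, "
          f"{total_node_power(nodes)} W for the nodes alone")
```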
Avantek announced machines based on the Calxeda architecture at the end of 2012, with a 3U base machine (four ECX-1000 cards, some disk drives) weighing in at about 4000 GBP (about 4900 €), and a fully “loaded” machine with 48 cards, giving 192 cores and 192 GB of memory, plus a mix of disk and flash, at about 40.000 GBP (about 49.000 €). Here’s Avantek’s info page, which also has a comparison to the Xeon E5450 on it.
“Ten times the performance at the same power in the same space”.
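The Avantek figures can be checked with a little arithmetic. The sketch below assumes four cores and 4 GB per card, which is what 48 cards giving 192 cores and 192 GB implies. The price per core is only a rough indicator, since the quoted prices also include chassis, disks and flash.

```python
CORES_PER_CARD = 4   # assumption: implied by 48 cards giving 192 cores
GB_PER_CARD = 4      # assumption: implied by 48 cards giving 192 GB


def totals(cards: int):
    """Return (cores, memory in GB) for a machine with the given number of cards."""
    return cards * CORES_PER_CARD, cards * GB_PER_CARD


def price_per_core(price_gbp: float, cards: int) -> float:
    """Approximate system price per core in GBP."""
    return price_gbp / (cards * CORES_PER_CARD)


# (cards, approximate price in GBP) as quoted for the two configurations.
configs = {"base": (4, 4000), "fully loaded": (48, 40000)}

for name, (cards, price) in configs.items():
    cores, memory_gb = totals(cards)
    print(f"{name}: {cores} cores, {memory_gb} GB, "
          f"about {price_per_core(price, cards):.2f} GBP per core")
```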
Calxeda ran out of money in mid-December 2013, and it looks as though they are shutting down operations. The intellectual property may very well be bought by Dell or HP. The company had roughly 125 employees by the time the news hit, and it had raised about 90 – 100 Million $ in venture funding. (Have a look at the article to see an actual Calxeda card, with the SATA ports next to each core). Calxeda was also backed by ARM Holdings.
Tilera has its own RISC based design (non-ARM), including many cores (up to 72) in one SoC, interconnected with the “iMesh” non-blocking interconnect, with “Terabits of on-chip bandwidth”. The cores can be programmed in ANSI C/C++ or Java. Linux runs on the system – support for the Tilera architecture was added in October 2010, with version 2.6.36 of the Linux kernel. The CPU series itself was launched in October 2011.
Facebook claimed that in their tests the Tilera architecture was about four times more power efficient than the Intel x86 architecture. The workload they ran was memcached.
Router & Wireless company MikroTik has a product called “Cloud Core Router” which is based on a 36 core Tilera CPU. To give you an idea of its cost: the router retails (depending on the version) for about 1000 € including VAT.
Have a look at this page to see the Cloud Core Router. Tilera also has some evaluation platforms of their own.
ARM Contender #2: Marvell ARMADA XP
This is a series of multicore processors (quad-core ARM). The XP apparently stands for “extreme performance”.
Marvell powers Dell’s “Copper” ARM server.
Chinese search giant Baidu has deployed these.
ARM Contender #3: AMD
AMD’s getting on the ARM bandwagon. I always liked that company (and despite my criticism of it these days, I also like Intel!) – they are not disappointing me!
The Opteron A1100 is based on the Cortex A57, the first ARMv8 core with true 64 bit addressing.
The octo-core version of the Opteron A1100 is claimed to be “two to four times faster” when compared with the Opteron X2150, which has four x86-64 Jaguar cores. This is an interesting comparison, because both are targeted to be available on the Moonshot platform (see below).
The TDP of the octo-core version of the A1100 is 25 W. It contains two 10 GbE ports, eight SATA 6G ports, and eight PCIe 3.0 lanes.
Development platforms based on the Opteron A1100 should be available soon. On the developer board, the chip can address up to 4 x 32 GB of memory (four DIMM slots).
AMD predicts that in 2019 the ARM platform will take up about 25 % of the server market.
The Moonshot platform
Calxeda’s modules (EnergyCore, see above) were also intended to be used for this platform.
HP also uses Intel’s Atom chips for Moonshot. They plan to use Avoton for it (see below for more information about Avoton).
The first Moonshot system is the Moonshot 1500. It takes up 4.3 rack units and holds 45 ProLiant Moonshot Atom S1200 servers, an Ethernet switch and some more gear. Prices start at 50.605 €.
HP also wants to offer TI’s KeyStone chips, which include many DSPs. These are interesting, for instance, for content delivery networks (transcoding).
With the BayTrail SoC being targeted at the mobile market, Intel has introduced a different SoC for microservers, which is called Avoton (the Atom C2000 series being the first representatives). They also have an SoC called Rangeley, which shares some of the Avoton platform and manufacturing process, but is targeted at the communications / networking market.
Avoton has eight CPU cores based on the new Silvermont microarchitecture – the first true reworking of the Atom architecture since its beginnings. Intel finally introduced out-of-order execution for it.
Configured with two DIMMs per channel, a single Avoton node can support up to 64 GB RAM. It supports four Gigabit Ethernet connections – but no 10 GBit connection.
They have integrated power control tightly into the chip, and have made sensible tradeoffs – for instance, wake-up latency has not been compromised, to avoid dropped packets and the like.
They have a choice of different products based on Avoton and Rangeley, ranging from two cores and 6 W, clocked at 1.7 GHz, to eight cores and 20 W, clocked at 2.4 GHz.
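A crude way to compare these parts is power per core. The sketch below simply divides the quoted package power by the core count, using the Avoton/Rangeley range above and the 25 W TDP quoted earlier for the octo-core Opteron A1100. Treat the result with caution: the vendors' power figures are not measured the same way, and they include uncore and I/O, not just the cores.

```python
# (cores, watts) as quoted in the text; labels are just for this example.
parts = {
    "Avoton/Rangeley low end": (2, 6),
    "Avoton/Rangeley high end": (8, 20),
    "Opteron A1100 (octo-core)": (8, 25),
}


def watts_per_core(cores: int, watts: float) -> float:
    """Quoted package power divided evenly across the cores."""
    return watts / cores


for name, (cores, watts) in parts.items():
    print(f"{name}: {watts_per_core(cores, watts):.3f} W per core")
```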
Figures released by Intel indicate that the Atom C2750 (2.4 GHz, 8 cores, 20 Watt) easily outperforms Marvell’s ARMADA XP (1.33 GHz, 4 cores, A9) and Calxeda’s ECX-1000 (1.4 GHz, 4 cores, A9) in memory bandwidth and general purpose computing. I agree with the article that AMD’s Opteron A1100, with its Cortex A57 cores and true 64-bit addressing, will be the real rival for Avoton, the one it should be compared against.
Intel is targeting the C2000 at “cold storage” applications. Have a look at this PDF to read more about it.
The C2750 supports Intel’s virtualization feature (VT-x), but not VT-d apparently (which is used to “pass through devices” to the virtualized system, e.g. graphics cards, …)
Have a look at these charts. They even measure against a Raspberry Pi!
The Atom C2750’s list price is 171 $.
Supermicro has a motherboard, the A1SAi-2750F, which integrates the C2750.
This board is available at about 340 € including VAT in Germany. It has 4 DDR-3 SO-DIMM slots, 1 x PCIe 2.0 x8, 1 x VGA, 2 x 2 x USB 3.0, 2 x USB 2.0, 4 x Gigabit LAN, as well as 2 x 6 Gbit/s SATA and 4 x 3 Gbit/s SATA.
It is a Mini-ITX board.
This is another option, but more expensive, and with only 2 GbE ports.