Cloud = x86 and open source

From a high-level perspective, the basic architecture of Facebook is not that different from other high performance web services.

However, Facebook is the poster child of the new generation of Cloud applications. It's hugely popular and very interactive, and as such it requires much more scalability and availability than your average website that mostly serves up information.

The "Cloud Application" generation did not turn to the classic high-end redundant platforms with heavy Relational Database Management Systems. A combination of x86 scale-out clusters, open source websoftware, and "no SQL" is the foundation that Facebook, Twitter, Google and others build upon.

However, facebook has improved several pieces of the Open Source software puzzle to make them more suited for extreme scalability. Facebook chose PHP as its presentation layer as it is simple to learn, write, and read. However, PHP is very CPU and memory intensive.

According to Facebook’s own numbers, PHP is about 39 times slower than C++ code. Thus it was clear that Facebook had to solve this problem first. The traditional approach is to rewrite the most performance critical parts in C++ as PHP Extensions, but Facebook tried a different solution: the engineers developed HipHop, a source code transformer. Hiphop transforms the PHP source code into faster C++ code and compiles it with g++.

The next piece in the Facebook puzzle is Memcached. Memcached is an in-RAM object caching system with some very cool features. Memcached is a distributed caching system, which means a memcached cache can span many servers. The "cache" is thus in fact a collection of smaller caches. It basically recuperates unused RAM that your operating system would probably waste on less efficient file system caching. These “cache nodes” do not sync or broadcast and as a result the memory cache is very scalable.

Facebook quickly became the world's largest user of memcached and improved memcached vastly. They ported it to 64-bit, lowered TCP memory usage, distributed network processing over multiple cores (instead of one), and so on. Facebook mostly uses memcached to alleviate database load.

Facebook Technology Overview The Facebook Open Compute Servers
POST A COMMENT

62 Comments

View All Comments

  • npp - Thursday, November 03, 2011 - link

    What you're talking about is how efficient the power factor correction circuits of those PSUs are, and not how power efficient the units their self are... The title is a bit misleading. Reply
  • NCM - Thursday, November 03, 2011 - link

    "Only" 10-20% power savings from the custom power distribution????

    When you've got thousands of these things in a building, consuming untold MW, you'd kill your own grandmother for half that savings. And water cooling doesn't save any energy at all—it's simply an expensive and more complicated way of moving heat from one place to another.

    For those unfamiliar with it, 480 VAC three-phase is a widely used commercial/industrial voltage in USA power systems, yielding 277 VAC line-to-ground from each of its phases. I'd bet that even those light fixtures in the data center photo are also off-the-shelf 277V fluorescents of the kind typically used in manufacturing facilities with 480V power. So this isn't a custom power system in the larger sense (although the server level PSUs are custom) but rather some very creative leverage of existing practice.

    Remember also that there's a double saving from reduced power losses: first from the electricity you don't have to buy, and then from the power you don't have to use for cooling those losses.
    Reply
  • npp - Thursday, November 03, 2011 - link

    I don't remember arguing that 10% power savings are minor :) Maybe you should've posted your thoughts as a regular post, and not a reply. Reply
  • JohanAnandtech - Thursday, November 03, 2011 - link

    Good post but probably meant to be a reply to erwinerwinerwin ;-) Reply
  • NCM - Thursday, November 03, 2011 - link

    Johan writes: "Good post but probably meant to be a reply to erwinerwinerwin ;-)"

    Exactly.
    Reply
  • tiro_uspsss - Thursday, November 03, 2011 - link

    Is it just me, or does placing the Xeons *right* next to each other seem like a bad idea in regards to heat dissipation? :-/

    I realise the aim is performance/watt but, ah, is there any advantage, power usage-wise, if you were to place the CPUs further apart?
    Reply
  • JohanAnandtech - Thursday, November 03, 2011 - link

    No. the most important rule is that the warm air of one heatsink should not enter the stream of cold air of the other. So placing them next to each other is the best way to do it, placing them serially the worst.

    Placing them further apart will not accomplish much IMHO. most of the heat is drawn away to the back of the server, the heatsinks do not get very hot. You also lower the airspeed between the heatsinks.
    Reply
  • harrkev - Thursday, November 03, 2011 - link

    You should look again at the sine-wave plots. Power factor has more to do with the phase of the current and not so much how much like a sine-wave it looks like.

    As an example, a purely capacitive or purely inductive load will have a perfect sine wave current (but completely out of phase with the voltage), but have a power factor very close to zero...

    So, those graphs do not really tell us much unless you actually crank the numbers to calculate the real power factor.

    http://en.wikipedia.org/wiki/Power_factor#Non-line...
    Reply
  • ezekiel68 - Thursday, November 03, 2011 - link

    On page 2:

    "The next piece in the Facebook puzzle is that the Open Source tools are Memcached."

    In fact, the tools are not memchached. Instead, software objects from the PHP/c++ stack, programmed by the engineers, are stored in Memcached. Side note - those in the know pronounce it "mem-cache-dee", emphasizing with the last syllable that it is a network daemon. (similar to how the DNS server "bind" is pronounced "bin-dee") So the next piece is Memcached, but the tools are not 'memcached'.
    Reply
  • JohanAnandtech - Thursday, November 03, 2011 - link

    That is something that went wrong in the final editing by Jarred. Sorry about that and I feel bad about dragging Jarred into this, but unfortunately that is what happened. As you can see further, "Facebook mostly uses memcached to alleviate database load", I was not under the impression that the "Open Source tools are Memcached. " :-) Reply

Log in

Don't have an account? Sign up now