Facebook’s scaling challenge
Before we get into the details, here are a few figures to give you an idea of the scale that Facebook has to deal with:
- Facebook serves 570 billion page views per month (according to Google Ad Planner).
- There are more photos on Facebook than all other photo sites combined (including sites like Flickr).
- More than 3 billion photos are uploaded every month.
- Facebook’s systems serve 1.2 million photos per second. This doesn’t include the images served by Facebook’s CDN.
- More than 25 billion pieces of content (status updates, comments, etc.) are shared every month.
- Facebook has more than 30,000 servers (and this number is from last year!).
Software that helps Facebook scale
In some ways Facebook is still a LAMP site, but it has had to extend its stack with many other components and services, and change how it uses the existing ones.
- Facebook still uses PHP, but it has built a compiler for it so it can be turned into native code on its web servers, thus boosting performance.
- Facebook uses Linux, but has optimized it for its own purposes (especially in terms of network throughput).
- Facebook uses MySQL, but primarily as a persistent key-value store, moving joins and logic onto the web servers since optimizations are easier to perform there (on the “other side” of the Memcached layer).
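To make the last point more concrete, here is a minimal sketch of the general pattern: a store that is only ever queried by key, a cache in front of it, and the "join" done in application code. It is an illustration under stated assumptions, not Facebook's implementation. Plain Python dicts stand in for MySQL and Memcached, the names KeyValueStore, CachedStore and friend_names are hypothetical, and Python is used only for brevity (Facebook's web tier is PHP).

```python
from typing import Any, Dict, Hashable, List, Optional, Tuple

# Hypothetical data: two "tables", each addressed only by primary key.
USERS: Dict[int, str] = {1: "alice", 2: "bob", 3: "carol"}
FRIEND_IDS: Dict[int, List[int]] = {1: [2, 3], 2: [1], 3: [1]}


class KeyValueStore:
    """Stand-in for the database: key lookups only, no joins."""

    def __init__(self, tables: Dict[str, Dict[Hashable, Any]]) -> None:
        self.tables = tables
        self.reads = 0  # counts how often the backing store is hit

    def get(self, table: str, key: Hashable) -> Optional[Any]:
        self.reads += 1
        return self.tables[table].get(key)


class CachedStore:
    """Stand-in for the cache layer: check the cache, fall back to the store."""

    def __init__(self, backing: KeyValueStore) -> None:
        self.backing = backing
        self.cache: Dict[Tuple[str, Hashable], Any] = {}

    def get(self, table: str, key: Hashable) -> Optional[Any]:
        cache_key = (table, key)
        if cache_key in self.cache:
            return self.cache[cache_key]
        value = self.backing.get(table, key)
        if value is not None:
            self.cache[cache_key] = value  # populate the cache on a miss
        return value


def friend_names(store: CachedStore, user_id: int) -> List[str]:
    """Application-side join: one lookup for the id list, one per friend."""
    names = []
    for friend_id in store.get("friend_ids", user_id) or []:
        name = store.get("users", friend_id)
        if name is not None:
            names.append(name)
    return names


if __name__ == "__main__":
    db = KeyValueStore({"users": USERS, "friend_ids": FRIEND_IDS})
    store = CachedStore(db)
    print(friend_names(store, 1))  # first call reads from the backing store
    print(friend_names(store, 1))  # second call is served from the cache
    print("backing store reads:", db.reads)
```

The trade-off in this kind of design is that the application issues several simple lookups instead of one SQL join, but each lookup is cacheable on its own and the database only has to do cheap primary-key reads.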
Then there are the custom-written systems, like Haystack, a highly scalable object store used to serve Facebook’s immense number of photos, or Scribe, a logging system that can operate at the scale of Facebook (which is far from trivial).
But enough of that. Let’s present (some of) the software that Facebook uses to provide us all with the world’s largest social network site.
HIPHOP FOR PHP
HADOOP AND HIVE
Other things that help Facebook run smoothly
GRADUAL RELEASES AND DARK LAUNCHES