Ruby on Rails Scalability – Is it a Problem?
Nick Wilson caught my eye on his new podcasting blog, communicontent.com, when he asked “Ruby on Rails doesn’t scale?” It’s no secret that we’ve been shifting the NotSleepy shop from a PHP-on-the-front, Java-on-the-back setup to Ruby on Rails for the past few months. In fact, we started a new RoR training side business with David Black and have been building a new suite of tools from scratch in Rails. The experience has been nothing short of revolutionary for me and has been more liberating than switching from briefs to boxers in my teens. So when someone claimed I’m not going to be able to scale this thing, I felt my heart sink.
Unfortunately Nick doesn’t go into detail as to what the specific bottleneck was for CrazyEgg, but after speaking with a few friends and reading DHH’s entry on scaling Rails, I’m thinking it’s an issue of an architect who doesn’t know how to scale his app or has coded the system poorly. David Hansson’s point is that if you keep state out of the app level and stick to shared nothing there, then you can scale out almost indefinitely by adding more app nodes, much like LiveJournal, eBay and even Google do. David’s worst-case example (with numbers he admits are arbitrary) assumes Rails handles only half as many requests per second as PHP, so you would spend roughly $500 a month for 2.6 million requests/day on Rails versus $250 on PHP. If you’ve built a Rails app, you know this cost is inconsequential compared to the additional man hours needed to build, maintain, and enhance a PHP app.
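To sanity-check that arithmetic, here is a rough sketch in Python. The inputs are the arbitrary figures from David's comment quoted at the end of this post ($500/month per app server, 30 requests/second on Rails, 60 on the other stack, a $5K/month programmer). The function names are my own, and I'm assuming you pay for whole servers, so treat it as a back-of-the-envelope check rather than a capacity plan.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60   # 86,400
SERVER_COST = 500                # dollars per month, figure from David's comment
PROGRAMMER_COST = 5000           # dollars per month ($60K/year)


def requests_per_day(requests_per_second):
    """Daily capacity of one app server running flat out."""
    return requests_per_second * SECONDS_PER_DAY


def monthly_cost(daily_requests, requests_per_second):
    """Monthly hosting bill, assuming whole servers only (my assumption)."""
    servers = math.ceil(daily_requests / requests_per_day(requests_per_second))
    return servers * SERVER_COST


def break_even_productivity(daily_requests, slow_rps, fast_rps):
    """Fraction by which the programmer must be more productive on the
    slower stack to pay for the extra hardware."""
    extra = monthly_cost(daily_requests, slow_rps) - monthly_cost(daily_requests, fast_rps)
    return extra / PROGRAMMER_COST


if __name__ == "__main__":
    print(requests_per_day(30))                           # one Rails box per day
    print(monthly_cost(5_000_000, 30))                    # Rails bill at 5M requests/day
    print(monthly_cost(5_000_000, 60))                    # other stack at 5M requests/day
    print(break_even_productivity(5_000_000, 30, 60))     # required productivity gain
```

One Rails box at 30 requests/second works out to 2,592,000 requests/day, which is where the "2.6 million" comes from. At 5 million requests/day the hosting gap is $500 a month, or 10% of the programmer's cost.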
There aren’t a lot of mega-traffic sites running on Rails yet, but CrazyEgg certainly isn’t touching the traffic 43things brings in daily. Also, a friend of mine runs SoulCast, and while I don’t know his daily stats, I imagine he does as many page views a day as CrazyEgg, and I’ve never seen the site lag.
Holy crap, what a small internet world it is! OK, after blogging this I was sending a few emails to reach out to some folks I hung out with at Pubcon Vegas last week. I didn’t get a card from Neil Patel, so in the process of hunting down his email I discovered that he is CTO for the company that built CrazyEgg, and now I feel like an ass. Sorry Neil. I’m absolutely sure you are a skillful architect, and I look forward to hearing more details.
I didn’t want to create a new post, so I decided to append to the bottom of this one. I just discovered David’s take on scalability in the middle of a very long thread. It’s a great synopsis and worth your time if you are still on the fence:
I've said it before, but it bears repeating: There's nothing interesting about how Ruby on Rails scales. We've gone the easy route and merely followed what makes Yahoo!, LiveJournal, and other high-profile LAMP stacks scale high and mighty. Take state out of the application servers and push it to database/memcached/shared network drive (that's the whole Shared Nothing thang). Use load balancers between your tiers, so you have load balancers -> web servers -> load balancers -> app servers -> load balancers -> database/memcached/shared network drive servers. (Past the entry point, load balancers can just be software, like haproxy). In a setup like that, you can add almost any number of web and app servers without changing a thing.
Scaling the database is the "hard part", but still a solved problem. Once you get beyond what can be easily managed by a decent master-slave setup (and that'll probably take millions and millions of pageviews per day), you start doing partitioning. Users 1-100K on cluster A, 100K-200K on cluster B, and so on. But again, this is nothing new. LiveJournal scales like that. I hear eBay too. And probably everyone else that has to deal with huge numbers.
So the scaling part is solved. What's left is judging whether the economics of it are sensible to you. And that's really a performance issue, not a scalability one. If your app server costs $500 per month (like our dual Xeons do) and can drive 30 requests/second on Rails and 60 requests/second on Java/PHP/.NET/whatever (these are totally arbitrary numbers pulled out of my...), then you're faced with the cost of $500 for 2.6 million requests/day on the Rails setup and $250 for the same on the other one.
Now. How much is productivity worth to you? Let's just take a $60K/year programmer. That's $5K/month. If you need to handle 5 million requests/day, your programmer needs to be 10% more productive on Rails to make it even. If he's 15% more productive, you're up $250. And this is not even considering the joy and happiness programmers derive from working with more productive tools (nor that people have claimed to be many times more productive).
Of course, the silly math above hinges on the assumption that the whatever stack is twice as fast as Rails. That's a very big if. And totally dependent on the application, the people, and so on. Some have found Rails to be as fast or faster than comparable "best-of-breed J2EE stacks" -- see http://weblog.rubyonrails.com/archives/2005/04/04/justingehtland-is-back-with-numbers-to-back-it-up/
The point is that the cost per request is plummeting, but the cost of programming is not. Thus, we have to find ways to trade efficiency in the runtime for efficiency in the "thought time" in order to make the development of applications cheaper. I believe we've long since entered an age where simplicity of development and maintenance is where the real value lies.
David Heinemeier Hansson, Tuesday, July 12, 2005
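The partitioning David mentions ("Users 1-100K on cluster A, 100K-200K on cluster B") can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the cluster names, the `cluster_for_user` helper, and the choice to put user 100,000 on cluster A (the quote leaves the boundary ambiguous) are all hypothetical, and nothing here reflects how LiveJournal or anyone else actually implements it.

```python
USERS_PER_CLUSTER = 100_000

# Hypothetical cluster names; in practice each would map to a database
# master-slave group with its own connection settings.
CLUSTERS = ["A", "B", "C"]


def cluster_for_user(user_id):
    """Return the cluster holding this user's data, by ID range.

    Assumed boundaries: users 1..100,000 live on cluster A,
    100,001..200,000 on cluster B, and so on.
    """
    if user_id < 1:
        raise ValueError("user IDs start at 1")
    index = (user_id - 1) // USERS_PER_CLUSTER
    if index >= len(CLUSTERS):
        # Growing past the last range means adding a cluster, not rehashing.
        raise LookupError(f"no cluster provisioned for user {user_id}")
    return CLUSTERS[index]


if __name__ == "__main__":
    for uid in (1, 100_000, 100_001, 250_000):
        print(uid, "->", cluster_for_user(uid))
```

The appeal of range partitioning is that existing users never move: when you run out of room you append a new cluster to the list, and the app servers stay stateless because the lookup depends only on the user ID.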