I haven't been posting very much recently because I've sadly been working my tail off. I very much enjoy what I do, but there are just weeks (months ... years ...) where it's a non-stop grind to get everything done. Recently, that's meant launching a new VPS platform, but that work was interrupted by a breakdown of our customer MySQL infrastructure.
Our setup is a bit different than most. Since we don't run a typical box-by-box web hosting architecture, we don't simply have a thousand boxes with each one running Apache and MySQL. Instead, we have a really robust pooled architecture for everything except MySQL, which just isn't something that's very poolable. For MySQL, we've got some big boxes with a bunch of memory and some fast disks that handle our MySQL load. But, slowly over time, performance had degraded.
When you'd hop onto a box and look at the transactions per second or number of queries, nothing looked terribly out of the ordinary. Yet the load would be huge, and performance would be pretty bad. Our team brought up some new boxes, shuffled customers between them to even the load out, moved our backup processing onto the hot spare replicated boxes (to take even more load off the disks) and things were better.
But they weren't better enough. (I know, awesome English, eh?)
We started just watching the processlist, looking for the culprit. And after about 5 minutes, it was obvious.
Motherfrakking phpBB spam.
phpBB is written in a really shitty way. Not the forum part, necessarily, which works when it's not being exploited. But the search part is awful. For every word in every post (unless you've got a smart list of words to ignore), it throws entries in some big tables so that when you search for "foobar", it can tell you every post that contains that word. That's a fine design for a small board with a tiny amount of traffic. But as your board grows, even legitimately, that table can become hundreds of thousands of rows long (or more!) and inserts and selects can become extremely slow.
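To make that concrete, here's a toy sketch of the word-per-post index idea in Python. This is not phpBB's actual code or schema; `index_post`, `search`, and the `STOP_WORDS` list are made-up names for illustration, and an in-memory dict stands in for the big tables. The point is just how the number of entries scales with the number of distinct words in each post.

```python
import re
from collections import defaultdict

# Hypothetical "words to ignore" list; a real board would use a much longer one.
STOP_WORDS = {"the", "a", "and", "of", "to"}

def index_post(index, post_id, text, stop_words=STOP_WORDS):
    """Record one (word -> post_id) entry per distinct word; return entries added."""
    words = set(re.findall(r"[a-z0-9]+", text.lower())) - stop_words
    for word in words:
        index[word].add(post_id)
    return len(words)

def search(index, word):
    """Return the ids of every post that contains the word."""
    return sorted(index.get(word.lower(), set()))

index = defaultdict(set)
rows = index_post(index, 1, "The foobar widget is great")
rows += index_post(index, 2, "Has anyone tried foobar and the widget?")
print(search(index, "foobar"))  # [1, 2]
print(rows)                     # 9

# One spam post stuffed with a 5000-word list adds 5000 entries by itself.
spam_rows = index_post(index, 3, " ".join(f"w{i}" for i in range(5000)))
print(spam_rows)                # 5000
```

Two normal posts cost nine entries. A single wordlist post costs thousands, and every one of those is an insert against a table that's already huge.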
It's ten times worse when the only thing putting content into it is spammers who are just flooding it with huge wordlists multiple times per second. Now, all of a sudden, you've got this single board showing up in your processlist five times, with each entry running for 30, 40, 50 seconds. One of those boards can cause some extra load on a server.
When you've got ten or twenty, it can bring the server to a halt. Literally. I popped onto a server where the load was near 10. I turned off 40 phpBB boards getting spammed. The load dropped to less than 1 and stayed there.
After some quick thinking, we identified a bunch of boards that were getting spammed and turned them off. One of our engineers built a brilliant little monitoring script that can identify phpBB boards in the processlist and shut them off if they show up at a high enough frequency with those awful queries (you know them when you see them, believe me). All told, we've turned off maybe 12k boards in the past 2 weeks, and haven't heard a single complaint.
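I can't share the real script, but the detection half of the idea looks roughly like this Python sketch. Everything here is an assumption for illustration: the function name `boards_to_disable`, the thresholds (borrowed from the "five times, 30+ seconds" pattern described above), and the `search_word` marker, which assumes the search tables have that string in their names. The rows are dicts keyed by the `db`, `Time`, and `Info` columns that `SHOW FULL PROCESSLIST` returns; fetching them and actually shutting a board off are left out.

```python
from collections import Counter

def boards_to_disable(processlist, min_seconds=30, min_hits=5, marker="search_word"):
    """Return databases with at least min_hits slow queries that mention marker.

    processlist: iterable of dicts with 'db', 'Time' and 'Info' keys,
    shaped like rows from SHOW FULL PROCESSLIST. Thresholds and marker
    are illustrative guesses, not production values.
    """
    hits = Counter()
    for row in processlist:
        info = (row.get("Info") or "").lower()   # Info is NULL for idle threads
        seconds = row.get("Time") or 0
        if row.get("db") and marker in info and seconds >= min_seconds:
            hits[row["db"]] += 1
    return sorted(db for db, count in hits.items() if count >= min_hits)

# Made-up snapshot: one board hammering the search tables, two innocent bystanders.
sample = [
    {"db": "forum1", "Time": 30 + 5 * i, "Info": "INSERT INTO phpbb_search_wordmatch ..."}
    for i in range(5)
]
sample.append({"db": "shop", "Time": 60, "Info": "SELECT * FROM orders"})
sample.append({"db": "forum2", "Time": 2, "Info": "SELECT word_id FROM phpbb_search_wordlist"})

print(boards_to_disable(sample))  # ['forum1']
```

Requiring both a minimum runtime and a minimum number of simultaneous hits keeps a busy but healthy board (like `forum2` above) from getting flagged just because it runs a search query.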
Why? Because these boards were set up by users who then forgot about them. And there they sat, for months or years, collecting spam, draining resources. Basic negligence on the part of users caused a huge server load, which then caused those same customers to call in and complain.
It feels like we've got this mostly under control, except for the user side. We need to figure out a way to get people to realize that the things they install on their site can be exploited and lead to security issues (on their site), performance issues (for everyone), and can suck up the resources they pay for.
But yeah, it sucks when you work about an 80-hour week because people forget about their phpBB install, and the folks who wrote phpBB decided that they'd build the most stupidly designed search setup of all time.
So, when on April 8th my Twitter looked like this:
now you know why.