Why Doesn't Flickr Do Automated Tests?

Sam Newman is surprised to learn that Flickr doesn’t use many automated tests because, “being such a high profile and successful application I assumed there would be a more mature approach towards automated testing.”

Actually, that surprises me a bit too, but I think I know why they don’t really care that much about testing. My take is that they’re not just another bunch of outdated dot-com-era cowboys throwing code at the wall to see what sticks, the kind who get criticized for not testing their stuff and told to do better by the usual agilist (rhymes with pugilist?) at conferences and in papers. There might be good reasons not to bother testing, and I’ll try to explain them here, even knowing this post is going to be a bit inflammatory. First, let me propose two different scenarios, with the numbers representing some imaginary unit of cost during a given period, after which a new project phase starts:

Automated tests: 80

Ceteris paribus, and assuming the code was well written by competent developers, which the Flickr folks surely are, a table like that sounds plausible: A uses automated testing to find regressions and bugs, saving on maintenance, while B spends solely on maintenance, and both end up with the same results (in that phase, at least).

In A, the company has to spend money upfront to get the testing suite up and running so it doesn’t have to spend as much on future maintenance. Now break out the pitchforks and torches and quote me to death, but this is a bad economic decision unless the cost of writing the automated tests turns out to be lower than the cost of all the maintenance work, which would otherwise be spread out over time. The risk of that cost not being lower seems, at least in my experience, much higher than what most companies are willing to accept, hence the good results of this approach we hear about from most, if not all, agilists.
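To make the trade-off concrete, here’s a tiny back-of-the-envelope sketch. All the numbers in it are made up on the spot (they’re not Flickr’s, and not from the table above, beyond borrowing the 80); the point is only to show how the comparison works once you account for the maintenance cost being spread over future phases:

```python
# A minimal sketch of the cost comparison; every figure below is a
# hypothetical "imaginary unit of cost" per phase, not real data.

def total_cost(upfront_tests, maintenance_per_phase, phases, discount=0.9):
    """Upfront cost paid now, plus maintenance spread over future phases,
    discounted because a unit of cost paid later hurts less than one paid today."""
    spread = sum(maintenance_per_phase * discount**p for p in range(1, phases + 1))
    return upfront_tests + spread

# Scenario A: pay for an automated test suite now, cheaper maintenance later.
a = total_cost(upfront_tests=80, maintenance_per_phase=10, phases=5)

# Scenario B: skip the suite, pay more maintenance every phase.
b = total_cost(upfront_tests=0, maintenance_per_phase=30, phases=5)

print(f"A (tests upfront): {a:.0f}")   # ~117
print(f"B (no tests):      {b:.0f}")   # ~111
```

With these made-up numbers the no-tests strategy comes out slightly ahead; stretch the project horizon or widen the maintenance gap and the test suite pays for itself instead. That uncertainty is exactly the risk I’m talking about.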

Now quote this bit too: I am a fan of automated testing, because I’ve seen how maintaining software sucks every last drop of happiness out of my poor soul after a few months of doing it. I’d much rather write fresh code, even code that never reaches the production environment, like test code, than go through another frustrating day hunched over a debugger. I’m sure Sam and most of you reading this are in the same boat.

B is, in my theory, what Flickr is doing: less upfront cost and more maintenance cost over time. As a startup (well, before they were acquired by Yahoo, anyway), this makes sense: the whole point of a startup is that you can do riskier things, and at some point they guessed that automatically testing anything but the most significant bits (smoke tests?) wasn’t as important as getting code out the door, fast, and obsessively listening and reacting to user feedback. This probably required keeping an insane level of attention to detail and commitment, which is quite rare, I might add, but it’s a great part of what I attribute their success to.

As Joel Spolsky put it, almost three years ago, about a similar situation:

You need some kind of economic model to decide where to spend your limited resources. You can’t make sensible decisions reliably by saying things like “load testing is a no-brainer” or “the server will probably survive.” Those are emotional brain droppings, not analysis. And in the long run we scientists will win.

So, I’m not sure who’s right here: I know most of the projects I’ve participated in would have been (and some actually were) completely fucked without automated tests and a lot of work keeping test coverage up. But then, I’ve never worked for any startup quite like Flickr.