Well, I had… plenty of times. So I created a tiny little tool to help fight that. Encoding Detector is an aptly named tool (if I do say so myself) that recursively detects the encoding of the files in a project directory and errors out if anything seems fishy.
I’ve only ever used it with Ant, but it should be a breeze to set up and install on any project. All you need to do is call the main script against a directory, like:
$ python encoding-detector.py src
You’ll need a somewhat recent Python installed (anything that came out in the last 5 years should be OK), and that’s about it. Multiple directories can be passed as arguments. Fixing the errors is sometimes as easy as adding a few UTF-8 characters to a flagged file to boost the detector’s confidence a bit.
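If you're curious what a check like this boils down to, here is a minimal sketch of the idea. It is not the actual code of encoding-detector.py. The function names, the allowed encodings and the 0.8 confidence threshold are all assumptions made up for illustration. The only real API it relies on is chardet's `detect()`, which takes bytes and returns a dict with `encoding` and `confidence` keys.

```python
import os


def find_suspicious_files(directories, detect=None,
                          allowed=("utf-8", "ascii"), min_confidence=0.8):
    """Walk each directory and return (path, encoding, confidence) tuples
    for files whose detected encoding looks off.

    `allowed` and `min_confidence` are illustrative defaults, not values
    taken from the real tool.
    """
    if detect is None:
        # Imported lazily so another detector can be injected instead.
        import chardet
        detect = chardet.detect

    suspicious = []
    for directory in directories:
        for root, _dirs, files in os.walk(directory):
            for name in sorted(files):
                path = os.path.join(root, name)
                with open(path, "rb") as handle:
                    data = handle.read()
                if not data:
                    continue  # empty files carry no encoding information
                guess = detect(data)
                encoding = (guess.get("encoding") or "unknown").lower()
                confidence = guess.get("confidence") or 0.0
                # Flag unexpected encodings and low-confidence guesses alike.
                if encoding not in allowed or confidence < min_confidence:
                    suspicious.append((path, encoding, confidence))
    return suspicious


def main(argv):
    """Print every suspicious file; return a non-zero code if any exist."""
    problems = find_suspicious_files(argv[1:])
    for path, encoding, confidence in problems:
        print("%s: %s (confidence %.2f)" % (path, encoding, confidence))
    return 1 if problems else 0
```

Hooking it up as a script would just be a matter of calling `sys.exit(main(sys.argv))`, so that a build step fails whenever something is flagged. The low-confidence branch is also why adding a few unambiguous UTF-8 characters to a file can make an error go away: the detector gets more evidence to work with.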
This hack would definitely not be possible without Mark Pilgrim’s amazing chardet library. The code, as always, is on GitHub, and you can also grab a neat little tarball here.
It takes about 5 minutes to set up on your project, and you can thank me later ;)