Running Checkstyle at Full Throttle on a Legacy Codebase

 TLDR;

So in summary:

You can turn on all checks for new checkins on a horrible old legacy code base in checkstyle by using checkstyle to generate a supressions file to ignore all the old mistakes.

Get new version of Checkstyle

Download the patched version here.

See the patch in Sourceforge here.

To run simply add the the following new task in your build file to generate the suppressions file from the build.xml here:

<target name="generate.suppressions"
 ...

Then you’ll need to add the following to your existing checkstyle task.

<property key="checkstyle.suppressions.file" 
file="target/suppressions.xml"/>

The Story

I think checkstyle is fantastic, because I work in a large team of developers on a large codebase, and I want to know we’re all on the same page with our coding styles.

“But it shouldn’t matter,” you might argue. “All it does is slow you down. That’s not providing benefits or reducing risks.” Fair enough. Suppose you find the following in your codebase.

...
if (inputString == "Expected String") {
...

Should the developer responsible have known? Should the code reviewer have picked it up? Is everybody on your team on the same page that This Is Not A Good Idea?

The team discussion

“But”, they argue [and here comes the broken window argument.] “This pattern is already all over the code base. The developer was just copying and pasting.”

[At this point you might groan internally and consider changing jobs. Alternatively you can acknowledge that there is real money in maintaining legacy codebases for profitable businesses, and try and find the opportunity in the murkiness.]

“That’s not a good enough reason.” You will respond. “There has to be a way for our tools to support this scenario.”

Enter Checkstyle

So checkstyle can run some checks over your code base. It comes with lots of fancy checks and it feels like it would address this situation.

So you run it with the default set of checks. You get back several thousand errors. So you alter the checks file to just the checks you’re interested in: StringLiteralEqualityCheck. Still your codebase comes back with 50 errors.

The Legacy Push-back

“We can do this,” you hope internally. We only have to touch 50 classes. Now imagine that this system has 20 000 users a day, which collectively transact about $1B a week through your system. The tolerance the organisation has for risk and defects is pretty close to zero, and changes requiring a regression test to system components not on the project will be looked on unfavourably by your project manager and user acceptance test manager. This Is Not The Way Forward.

“Well we can just use the supressions file!” you exuberantly exclaim. For these 50 errors, it should only take An Hour Of My Time to Manually Type In This XML. Because manually typing 150 lines of XML is the reason you took that Computer Science degree – right?

A New Way Forward

“Hang on a minute,” you think. “This doesn’t feel right. What I wanted was a way of working with my team where we can agree on a way of not writing crappy code. Here am I using an hour of my time to make one measly check work. What about the other checks?”

“Surely someone has done this before?” A quick search of Google and StackOverflow says no.

“What I want is for checkstyle to generate the supressions file for me, for all checks if I want. Surely generating a supressions file is about the same complexity as generating an errors file – and it already does that.”

Cracking open the checkstyle codebase one evening, it was surprisingly easy to do just that.

The Benefits

What you want is instant feedback for your developers. As soon as they save a file and run a checkstyle task in eclipse, or check it in and get an email from the Build Server – they should get some indication of whether their new code matches up with the existing codebase.

What you want is checkstyle running at full-throttle. You want all the checks possible. You don’t want to be held back by having to change legacy code to get there. You want the new code to be in line. This helps you get there.

One thought on “Running Checkstyle at Full Throttle on a Legacy Codebase

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.