When Code Duplication is not Code Duplication

Duplicating code is a bad thing. Any engineer worth his salt knows that the more you repeat yourself, the more difficult it will be to maintain your code. We’ve enshrined this in a well-known principle called the DRY principle, where DRY is an acronym standing for Don’t Repeat Yourself. So code duplication should be avoided at all costs. Right?

At work I recently came across an interesting case of code duplication that merits more thought, and shows how there is some subtlety needed in application of every coding guideline, even the bedrock ones.

Consider the following CSS, which is a simplified version of a common scenario.

.title {
  color: #111111;
}
.text-color-gray-1 {
  color: #111111;
}

This looks like code duplication, right? If both classes are applying the same color, then they do the same thing. If the do the same thing, then they should BE the same thing, right?

But CSS and markup in general presents an interesting case. Are these rules really doing the same thing? Are they both responsible for making the text gray? No.

The function of these two rules is different, even though the effect is the same. The first rule styles titles on the website, while the second rule styles any arbitrary div. The first rule is a generalized style, while the second rule is a special case override. The two rules do fundamentally different things.

Imagine a case where we optimized those two classes by removing the title class and just using the latter class. Then the designer changes the title color to dark blue. To change the title color, the developer now has to replace each occurrence of .text-color-gray-1 where it styles a title. So, by optimizing two things with different purposes, the developer has actually made more work.

It’s important to recognize in this case that code duplication is not always code duplication. Just because these two CSS classes are applying the same color doesn’t mean that they are doing the same thing. In this case, the CSS classes are more like variables than methods. They hold the same value, but that is just a coincidence.

What looks like code duplication is not actually code duplication.

But… what is the correct thing?

There is no right answer here. It’s a complex problem. You could solve it in lots of different ways, and there are probably three or four different approaches that are equally valid, in the sense that they result in the same amount of maintenance.

The important thing is not to insist that there is one right way to solve this problem, but to recognize that blithely applying the DRY principle here may not be the path to less maintenance.

How Can Comments be Controversial?

Comments add information to your code. They don’t impact the execution of that code. So how can they be bad? I believe that more commenting is better, and that comments are a vital tool in maintaining an agile codebase, but I’ve met many experienced developers who argue with me about comments.

The Pragmatic Programmer is an excellent book. It outlines best practices that have solidified over the last thirty years or so as software development has grown as a field. The authors share these practices as a series of tips that are justified by real world anecdotes. In a section on comments the authors say

“The DRY principle tells us to keep the low-level knowledge in the code, where it belongs, and reserve the comments for other, high-level explanations. Otherwise, we’re duplicating knowledge, and every change means changing both the code and the comments. The comments will inevitably become out of date, and untrustworthy comments are worse than no comments at all”

They make the point that obsolete comments are worse than no comments at all. This is undoubtably true, but it has been bastardized by many programmers as an excuse for not commenting. I’ve heard this excuse, along with two others to justify a lack of comments in code.

  1. The code is self-commenting.
  2. Out of date comments are worse than no comments.
  3. You shouldn’t rely on comments.

In this post I want to address each of these points. I think that thorough commenting speeds up the rate of consumption of code, and that obsolete comments are a failure in maintenance, rather than in comments in general.

1. The code is self-commenting

This is the excuse I hear most often from developers who don’t comment their code. The thinking is that clear variable and method names replace the need for comments. There are situations where this is true. For very short methods with a single purpose, I see that no additional comments may be necessary.

But most methods are not very short, and most methods achieve several things that relate to one overall purpose. So in the first place, very few methods fit the bill.

This is not a valid complaint because good comments only make the code easier to read. While the code may be legible on its own, a comment that explains the purpose of a method, or the reason for
a string operation, can only make it easier to read. The author of Clean Code agrees due to the amount of time we spend reading code.

“the ratio of time spent reading vs. writing is well over 10:1. We are constantly reading old code as part of the effort to write new code. Because this ratio is so high, we want the reading of code to be easy, even if it makes the writing harder.”

Because we spend more time reading code than writing it, a responsible programmer knows that the extra comment in a method which takes very little time to write may well save another programmer many hours of learning.

2. Out of date comments are worse than no comments

To refute this argument, lets take cars as an analogy. If you buy a car, then drive it for twenty thousand miles without an oil change, it will break down. If you did that, the fault wouldn’t be in cars in general. Rather, the fault is in the lack of maintenance.

Similary, obsolete comments are not a failure of comments in general, but a failure of the programmer who updated the code without updating the comments. That other programmers have bad practices does not justify continuing that bad practice.

Obsolete comments are a broken window. If you see them, it is your job to fix them.

So while obsolete comments ARE worse than no comments, that argument is an argument against bad coding practices, not comments in general.

3. You shouldn’t rely on comments

The argument that a programmer shouldn’t rely on comments is a subtle dig at another programmer’s skill. They’re saying “a good programmer can understand this code with no comments.”

There is a grain of a valid point there. If you consider yourself a senior programmer, then you should be accustomed to wading through large volumes of bad code and parsing out the real meaning and functionality.

But even for senior programmers, comments make it easier to read and understand the code. It’s much easier to interpret a one line summary of a block of code than to parse that block line by line. It’s much easier to store that one line in your head than it is to store the whole block in your head.

The real reason why this is the worst of excuses is that not all programmers working on your code are senior programmers. Some are green graduates straight out of college. Yes, a more experienced programmer could understand the code without comments, but it will cost that graduate hours to figure out what they could have gotten in minutes with good comments.

My Commenting Principle

Comment your code so that a green programmer can understand it.

Or…

Comment your code so that your boss can understand it.

What does it take to make your code readable by a green graduate? Imagine them going through it line by line. If there are no comments, then they need to understand every function call, and every library routine. They need to understand a bit about your architecture and date model.

Now imagine that every single line is commented.

In that scenario, a green programmer can read your file by reading one comment after the next. If they need more information, they can still drill into the code and get the details, but they can understand the overview just through comments.

The closer you can get to that, the better.

Don’t listen to the arguments against thorough commenting habits. Just point the detractors here.