The Fallacy of DRY

DRY, standing for Don’t Repeat Yourself, is a well-known design principle in the software development world.

It is not uncommon for removal of duplication to take center stage via mantras such as “Repetition is the root of all evil”. Yet while duplication is often bad, the well intended pursuit of DRY often leads people astray. To see why, let’s take a step back and look at what we want to achieve by removing duplication.

The Goal of Software

First and foremost, software exists to fulfill a purpose. Your client, which can be your employer, is paying money because they want the software to provide value. As a developer it is your job to provide this value as effectively as possible. This includes tasks beyond writing code to do whatever your client specifies, and might best be done by not writing any code. The creation of code is expensive. Maintenance of code and extension of legacy code is even more so.

Since creation and maintenance of software is expensive, the quality of a developers work (when just looking at the code) can be measured in how quickly functionality is delivered in a satisfactory manner, and how easy to maintain and extend the system is afterwards. Many design discussions arise about trade-offs between those two measures. The DRY principle mainly situates itself in the latter category: reducing maintenance costs. Unfortunately applying DRY blindly often leads to increased maintenance costs.

The Good Side of DRY

So how does DRY help us reduce maintenance costs? If code is duplicated, and it needs to be changed, you will need to find all places where it is duplicated and apply the change. This is (obviously) more difficult than modifying one place, and more error prone. You can forget about one place where the change needs to be applied, you can accidentally apply it differently in one location, or you can modify code that happens to the same at present but should nevertheless not be changed due to conceptual differences (more on this later). This is also known as Shotgun Surgery. Duplicated code tends to also obscure the structure and intent of your code, making it harder to understand and modify. And finally, it conveys a sense of carelessness and lack of responsibility, which begets more carelessness.

Everyone that has been in the industry for a little while has come across horrid procedural code, or perhaps pretend-OO code, where copy-paste was apparently the favorite hammer of its creators. Such programmers indeed should heed DRY, cause what they are producing suffers from the issues we just went over. So where is The Fallacy of DRY?

The Fallacy of DRY

Since removal of duplication is a means towards more maintainable code, we should only remove duplication if that removal makes the code more maintainable.

If you are reading this, presumably you are not a copy-and-paste programmer. Almost no one I ever worked with is. Once you know how to create well designed OO applications (ie by knowing the SOLID principles), are writing tests, etc, the code you create will be very different from the work of a copy-paste-programmer. Even when adhering to the SOLID principles (to the extend that it makes sense) there might still be duplication that should be removed.The catch here is that this duplication will be mixed together with duplication that should stay, since removing it makes the code less maintainable. Hence trying to remove all duplication is likely to be counter productive.

Costs of Unification

How can removing duplication make code less maintainable? If the costs of unification outweigh the costs of duplication, then we should stick with duplication. We’ve already gone over some of the costs of duplication, such as the need for Shotgun Surgery. So let’s now have a look at the costs of unification.

The first cost is added complexity. If you have two classes with a little bit of common code, you can extract this common code into a service, or if you are a masochist extract it into a base class. In both cases you got rid of the duplication by introducing a new class. While doing this you might reduce the total complexity by not having the duplication, and such extracting might make sense in the first place for instance to avoid a Single Responsibility Principle violation. Still, if the only reason for the extraction is reducing duplication, ask yourself if you are reducing the overall complexity or adding to it.

Another cost is coupling. If you have two classes with some common code, they can be fully independent. If you extract the common code into a service, both classes will now depend upon this service. This means that if you make a change to the service, you will need to pay attention to both classes using the service, and make sure they do not break. This is especially a problem if the service ends up being extended to do more things, though that is more of a SOLID issue. I’ll skip going of the results of code reuse via inheritance to avoid suicidal (or homicidal) thoughts in myself and my readers.

DRY = Coupling

— A slide at DDDEU 2017

The coupling increases the need for communication. This is especially true in the large, when talking about unifying code between components or application, and when different teams end up depending on the same shared code. In such a situation it becomes very important that it is clear to everyone what exactly is expected from a piece of code, and making changes is often slow and costly due to the communication needed to make sure they work for everyone.

Another result of unification is that code can no longer evolve separately. If we have our two classes with some common code, and in the first a small behavior change is needed in this code, this change is easy to make. If you are dealing with a common service, you might do something such as adding a flag. That might even be the best thing to do, though it is likely to be harmful design wise. Either way, you start down the path of corrupting your service, which now turned into a frog in a pot of water that is being heated. If you unified your code, this is another point at which to ask yourself if that is still the best trade-off, or if some duplication might be easier to maintain.

You might be able to represent two different concepts with the same bit of code. This is problematic not only because different concepts need to be able to evolve individually, it’s also misleading to have only a single representation in the code, which effectively hides that you are dealing with two different concepts. This is another point that gains importance the bigger the scope of reuse. Domain Driven Design has a strategic pattern called Bounded Contexts, which is about the separation of code that represents different (sub)domains. Generally speaking it is good to avoid sharing code between Bounded Contexts. You can find a concrete example of using the same code for two different concepts in my blog post on Implementing the Clean Architecture, in the section “Lesson learned: bounded contexts”.

DRY is for one Bounded Context

— Eric Evans

Conclusion

Duplication itself does not matter. We care about code being easy (cheap) to modify without introducing regressions. Therefore we want simple code that is easy to understand. Pursuing removal of duplication as an end-goal rather than looking at the costs and benefits tends to result in a more complex codebase, with higher coupling, higher communication needs, inferior design and misleading code.

Leave a Reply