A Domain-Based Look at Don't Repeat Yourself
In a Practical Application of DRY, I discuss what Sandi Metz had to say about code duplication and the cost of using the wrong abstraction. Recently, I exchanged a few tweets with Steve Streeting (@stevestreeting) about an observation he made in relation to code duplication. That discussion prompted me to revisit DRY, and this is an attempt to clarify my thinking on it.
Steve challenged the wisdom of seeking a unified vision of a system to support two tasks instead of accepting some code duplication between different systems when those systems could do these tasks better.
DRY is domain based (i.e., has a bounded context), but influences structure (i.e., prevents code duplication). Its goal is to retain knowledge in a meaningful way. Meaningful in the sense it is
I looked at the notion of authoritative and thought a better word to use was definitive. Then I rejected definitive in favor of authoritative because:
Definitive implies complete knowledge. Authoritative is better because it implies there is a single source of this knowledge and supports the notion of the development of a shared resource to store, reflect upon and revise that knowledge.
DRY seeks to create an authoritative source of knowledge for a concept. That source may or may not be definitive because the information it captures may be incomplete.
So DRY fuses two concepts together: the collection of knowledge and its representation. Breaking it into its base forms you get:
Single Point of Truth (SPOT) is about a singular, unambiguous and authoritative source of knowledge. It applies to the domain the knowledge comes from.
Once And Only Once (OAOO) is about representation. It applies structure to knowledge. Since OAOO comes into existence via refactoring it is also the process by which incremental improvement occurs (see note 1).
Importantly, SPOT is a knowledge activity that identifies and organizes information, whereas OAOO is a structural activity that incrementally improves the implementation without changing what it does.
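To make the split concrete, here is a minimal sketch. Everything in it is hypothetical: the shipping rule, the threshold, the fee and the function names are assumptions I made up for illustration. The constants and `shipping_fee` play the role of SPOT (one authoritative place for the rule), and the two callers show OAOO (the rule is represented once and reused rather than re-implemented).

```python
from decimal import Decimal

# SPOT: one authoritative place where the (hypothetical) domain rule lives.
# Assumed rule: orders at or above the threshold ship free.
FREE_SHIPPING_THRESHOLD = Decimal("50.00")
FLAT_SHIPPING_FEE = Decimal("4.99")


def shipping_fee(order_total: Decimal) -> Decimal:
    """The single representation of the shipping rule."""
    if order_total >= FREE_SHIPPING_THRESHOLD:
        return Decimal("0.00")
    return FLAT_SHIPPING_FEE


# OAOO: both callers reuse the one representation instead of
# repeating the threshold comparison themselves.
def checkout_total(order_total: Decimal) -> Decimal:
    return order_total + shipping_fee(order_total)


def cart_banner(order_total: Decimal) -> str:
    if shipping_fee(order_total) == 0:
        return "Free shipping applied"
    remaining = FREE_SHIPPING_THRESHOLD - order_total
    return f"Add {remaining} more for free shipping"
```

If the rule changes, the knowledge is revised in one place. Getting to this shape from two copies of the comparison would be a refactor in the OAOO sense: the structure improves while the observable behavior stays the same.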
DRY effectively organizes a domain. This raises the question: what is a domain? I think about it this way:
The two overlap. In fact, code might be reused in multiple knowledge domains. For example, the same mathematical function library might be used in a physics and a financial application. The knowledge domains for these applications are likely to differ significantly and are unlikely to be shared.
The converse is also true: the physics application will generate libraries, say on particle physics, that have no meaning in the financial application.
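The physics and finance example can be sketched roughly as follows. The function names and the choice of exponential change as the shared math are my own assumptions, picked only to illustrate the point: the low-level function carries no domain meaning, while each application wraps it in knowledge the other has no use for.

```python
import math


# Shared, domain-neutral code: reusable by any application.
def exponential_change(initial: float, rate: float, elapsed: float) -> float:
    return initial * math.exp(rate * elapsed)


# Physics knowledge (hypothetical): decay is driven by a half-life.
def remaining_mass(initial_grams: float, half_life_years: float, years: float) -> float:
    decay_constant = math.log(2) / half_life_years
    return exponential_change(initial_grams, -decay_constant, years)


# Financial knowledge (hypothetical): continuously compounded interest.
def continuously_compounded(principal: float, annual_rate: float, years: float) -> float:
    return exponential_change(principal, annual_rate, years)
```

The code is shared, but half-lives mean nothing to the financial application and interest rates mean nothing to the physics one. The knowledge domains stay separate.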
Back to what Steve had to say. To paraphrase, he recommends the rule of three and advises generalizing the lowest levels first and not jumping in with generalizations at the higher levels too soon (or at all). He warns that it’s common to spot some shared implementation details, try to force higher level concepts together and then watch things fall apart as they diverge.
His original statement refers to codebase, tasks and systems. So
codebase == implementation and structuring it is what OAOO is for.
system == knowledge and structuring it is what SPOT is for.
DRY is about organizing knowledge for a system to support a codebase.
Steve also talks about the goal of a unified vision for this system and how it’s difficult to achieve in practice. In his example, the systems are related, but cannot be unified without adding complexity and reducing the performance of the tasks. He advises against going after a unified vision in this example.
I asked Steve for insight on questions to ask that lead to better decision making. I can apply DRY to his answer, particularly if I keep SPOT and OAOO top of mind.
The divergence of higher-level concepts represents a misunderstanding of SPOT. That is, you coupled different domain concepts and took different points of truth and munged them together. You failed to properly recognize the oneness of those concepts.
The commonality of lower level concepts represents an execution of OAOO. That is, refactor the oneness at the edges of the systems, but don’t get carried away.
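A small sketch of what this looks like in code, under assumptions of my own: the two concepts (usernames and promo codes), their rules and the helper name are all hypothetical. The two validators started out looking identical, which is the moment it is tempting to merge them. Keeping them as separate points of truth, and sharing only the low-level check, lets one of them diverge without dragging the other along.

```python
# Lowest-level commonality: safe to express once (OAOO).
def is_alnum_with_length(text: str, minimum: int, maximum: int) -> bool:
    return text.isalnum() and minimum <= len(text) <= maximum


# Two higher-level concepts, each with its own point of truth (SPOT).
def is_valid_username(name: str) -> bool:
    return is_alnum_with_length(name, 3, 20)


def is_valid_promo_code(code: str) -> bool:
    # This concept has since diverged (assumed rule): any letters
    # in a promo code must be upper case.
    return is_alnum_with_length(code, 3, 20) and code.isupper()
```

Had the two validators been folded into one higher-level function because they once shared an implementation, the upper-case rule would have needed a flag or a special case, and that is where things start to fall apart.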
DRY is about creating authoritative sources of knowledge. Authoritative might be definitive but it needn’t be. The role of DRY then is to help guide you through knowledge acquisition and organization as you learn and as the system evolves.
If you don’t, Steve provides a heuristic (the rule of three) so you know how long to delay these decisions. (In reality, it’s not the count that matters. It’s the knowing. That’s the hard part.)
So an answer on what questions to ask about code duplication in the higher level concepts is:
Ask if you know enough about the two SPOTs (concepts) containing this code.
The emphasis is on understanding the concepts. If they differ, the code might not be duplicated at all. The apparent duplication may just be a temporary property of the implementation.
Recognize that DRY is about creating an authoritative resource for these concepts.
Authoritative means it’s singular, but it may not be complete. See what you can do to determine whether these concepts are in fact similar.
An answer on what questions to ask about code duplication in the lowest level concepts is to use the rule of three and apply OAOO.
Thanks to Steve Streeting for sharing his insight.
1: I’m using refactoring in the same way Martin Fowler uses it: no functional changes occur during a refactor, and a refactor is done only when supported by tests.
2: I’m abusing this definition of domain code. The original definition refers to code substituted out so it can be mocked. The abuse is that I’m applying this notion to the implementation itself. Importantly, OAOO changes domain code.