For most developers, refactoring legacy code is a painful task. For a strange few, myself included, the process can also be oddly gratifying, akin to the satisfaction one feels cleaning up a messy room or restoring an old car. Wherever you fall on that spectrum, knowing when to refactor, and when not to, is a core competency of a good Software Engineer. Like many things in Software Engineering, making that call is more of an art than a science. While we can’t derive a single formula to decide when or how aggressively to refactor, we can understand the tradeoffs involved and then apply this understanding thoughtfully to specific situations.
As a thought experiment, let’s imagine the Worst Case Scenario™. You have just inherited a very messy, non-idiomatic legacy codebase with no test coverage and no access to the former developers. The app is live in production, and the product owner wants to add features and release new versions in addition to maintaining the production app. How do you handle a situation like this?
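Before choosing any strategy, many teams facing exactly this scenario start by pinning down what the untested legacy code actually does today with characterization tests. This is not a prescription from the scenario above, just a common hedge; a minimal sketch in Python, where `calculate_invoice_total` is a hypothetical stand-in for any inherited, untested function:

```python
# Characterization ("golden master") testing: record the legacy code's
# current behavior so later refactoring can be checked against it.

def calculate_invoice_total(items, tax_rate):
    # Imagine this is the messy legacy implementation we inherited.
    total = 0
    for price, qty in items:
        total += price * qty
    return total + total * tax_rate

def test_characterization():
    # These expected values were captured by *running* the legacy code,
    # not derived from a spec -- they pin current behavior, bugs and all.
    assert calculate_invoice_total([(10.0, 2)], 0.5) == 30.0
    assert calculate_invoice_total([], 0.5) == 0.0

test_characterization()
print("legacy behavior pinned")
```

With even a coarse safety net like this in place, every option discussed below becomes less risky, because regressions introduced while refactoring show up as failing pins rather than production bugs.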
At one extreme, you could simply implement new features in the app following the patterns (or anti-patterns) established in the legacy code without doing any refactoring. No technical debt would be paid off in this approach; in fact, more debt would be accrued. The time to add the next feature would be short, but the time to add the Nth new feature would be increasingly long as the project began to buckle under the weight of technical debt. Because you’d be making the minimal set of changes to implement the next feature, the risk of regression is the lowest. Over time, however, the likelihood of bugs increases.
At the opposite extreme you could refactor the entire application, freezing new feature addition until the codebase was “good.” The time to add the next feature is extremely long, as long as it takes to refactor all the code, but the time to add the Nth feature is very short, because the beautiful refactored code you’ve produced is now a joy to maintain. On the other hand, because you’ve touched every piece of code in the app, the risk of regression for the next feature is very high, but your improved architecture reduces the likelihood of future bugs.
As is usually the case, choosing either extreme is the wrong answer, but considering them helps us understand the tradeoffs involved. In reality, you’ll have to refactor as you add new features, but how high do you turn the refactoring dial? That is the important question.
- Refactoring less means the time to implement the very next feature is shorter, but the time to implement the Nth feature increases.
- Refactoring more means the time to implement the very next feature is longer, but the time to implement the Nth feature decreases.
- Refactoring less means the likelihood of regression is lower, but the likelihood of a buggy app in the long term increases.
- Refactoring more means the immediate likelihood of regression is higher, but the likelihood of other bugs cropping up in the long term is lower.
Armed with this understanding of the tradeoffs involved, it’s now up to you to apply this knowledge to any given situation. How urgent is the next batch of features? Is there a strong QA team in place likely to catch regressions? Is the app likely to remain in production for the very long term? All of these questions have to be considered.
I’d add that while there is no one answer, there is a rule of thumb I believe applies in almost all situations: try not to accrue new technical debt. At a minimum, refactor just enough that your changes now won’t have to be unwound later. You may not have the luxury of being able to fix what’s already there, but let your additions light a path for how things could be improved in the future.
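One concrete way to follow that rule of thumb is to put a seam between new code and the legacy mess: the new feature lives in a clean, tested module and touches the old code only through a thin adapter. The sketch below uses entirely hypothetical names (`LegacyUserStore`, `UserRepository`, `greet_active_user`) to illustrate the shape of the idea, not a specific implementation:

```python
# Hypothetical example: a legacy class exposes cryptic field names.
# Rather than calling it directly from the new feature (and spreading
# the legacy pattern), we wrap it in a thin adapter. The new feature
# depends only on the adapter, so it won't need to be unwound when the
# legacy code is eventually refactored.

class LegacyUserStore:
    """Stand-in for inherited code we can't fix yet."""
    def fetch(self, uid):
        return {"id": uid, "nm": "Ada", "actv": 1}  # cryptic legacy fields

class UserRepository:
    """The seam: translates legacy shapes into a clean model."""
    def __init__(self, store):
        self._store = store

    def get_user(self, user_id):
        raw = self._store.fetch(user_id)
        return {"id": raw["id"], "name": raw["nm"], "active": bool(raw["actv"])}

def greet_active_user(repo, user_id):
    """New feature: written against the clean interface only."""
    user = repo.get_user(user_id)
    return f"Hello, {user['name']}!" if user["active"] else None

repo = UserRepository(LegacyUserStore())
print(greet_active_user(repo, 42))  # -> Hello, Ada!
```

The adapter costs a few extra lines today, but it keeps the new feature’s risk of regression low (the legacy code is untouched) while lighting a path: when someone finally refactors `LegacyUserStore`, only the adapter has to change.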