Understanding code smells and how refactoring can help
Code smells can be the canary in the coal mine for poor coding. And poor coding is a sign that refactoring is called for. Let's explore how to look at and deodorize code smells.
One thing that most application developers and testers eventually encounter, especially when working with complex applications or across large teams, is code smells. These are tangible and observable indications that there is something wrong with an application's underlying code that could eventually lead to serious failures and kill an application's performance.
Typical examples of code smells include the following:
- duplicate code
- dead code
- long methods
- long parameter list
- unnecessary primitive variables
- data clumps
Particularly "smelly" code could be inefficient, nonperformant, complex and difficult to change and maintain. While code smells may not always indicate a particularly serious problem, following them often leads to discoveries of decreased code quality, drains on application resources or even critical security vulnerabilities embedded within the application's code. At the least, investigating a smell requires teams to perform some in-depth tests on the code -- and that investigation often reveals critical areas in the code that need remedial work.
Let's explore how software teams can identify code smells and use them to maintain a higher degree of code cleanliness. We also examine how techniques such as code refactoring and regression testing play integral roles in dealing with code smells and fixing the underlying problems.
What causes code smells
Put simply, code smells are a result of poor or misguided programming. These blips in the application code can often be directly traced to mistakes made by the application programmer during the coding process. Typically, code smells stem from a failure to write the code in accordance with necessary standards. In other cases, the documentation required to clearly define the project's development standards and expectations was incomplete, inaccurate or nonexistent.
There are many situations that can cause code smells, such as improper dependencies between modules, an incorrect assignment of methods to classes, or needless duplication of code segments. Code that is particularly smelly can eventually cause profound performance problems and make business-critical applications difficult to maintain.
Keep in mind, however, that a code smell is not an actual bug -- it's likely that the code still compiles and works as expected. Code smells are simply indications of potential breaches of code discipline and design principles. That said, it's possible that the source of a code smell may cause cascading issues and failures over time. A code smell is also a good indicator that a code refactoring effort is in order.
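To make that distinction concrete, here is a small Python sketch of code that runs correctly yet shows two of the smells listed earlier: duplicate code and dead code. The function names and the tax rate are hypothetical, invented only for this illustration.

```python
TAX_RATE = 0.2  # hypothetical rate, chosen only for illustration


def invoice_total(prices):
    # Smell: this summing-and-taxing logic is repeated in quote_total below.
    subtotal = 0
    for price in prices:
        subtotal += price
    return round(subtotal * (1 + TAX_RATE), 2)


def quote_total(prices):
    # Smell: a copy of invoice_total. A change made to one copy
    # is easy to forget in the other.
    subtotal = 0
    for price in prices:
        subtotal += price
    return round(subtotal * (1 + TAX_RATE), 2)


def legacy_discount(total):
    # Smell: dead code. Nothing in this program calls the function.
    return total * 0.9


# Both functions work and agree with each other, so no test would fail.
print(invoice_total([10, 20]) == quote_total([10, 20]))
```

Nothing here is broken, which is exactly why smells like these tend to survive until someone has to change the tax logic in two places.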
Eliminating code smells with refactoring
Code refactoring is one of the most effective ways to eliminate code smells and maintain good code hygiene. Refactoring is a restructuring process that attempts to make code cleaner, more concise and more efficient without altering its core functionality. Regular refactoring helps ensure that code meets a team's guidelines and aligns with a defined architecture.
The best time to refactor code is before adding updates or new features to an application, as it is good practice to clean up existing code before programmers add any new code. Another good time to refactor code is just after a team has deployed code into production. At that point, developers often have more time than usual to clean up code before they're assigned a new task or project.
One caveat to refactoring is that teams must make sure there is complete test coverage before refactoring an application's code. Otherwise, the refactoring process may simply restructure broken pieces of the application for no gain, and the team has no reliable way to confirm that the restructured code still behaves as it did before. Regular refactoring is not a good idea when facing tight release schedules -- the tests required take significant amounts of time and may prevent the team from releasing the application on schedule.
There are plenty of tooling options available to help detect code smells and automate parts of the code refactoring process, including SonarQube, Visual Studio IntelliCode, Rider and Eclipse IDE. Many of these tools enable programmers to execute code restructuring alongside the actual development process, which can help teams speed up release cadences when needed.
Refactoring techniques for code smells
Refactoring encompasses a number of specific code hygiene practices. When it comes to eliminating code smells, however, there are three particularly effective techniques: one that focuses on methods, another that focuses on the method calls and a third that focuses on classes.
The first technique, composing, aims to eliminate redundant methods. There are two distinct ways developers can do this:
- Extraction. Code is broken down into smaller blocks. A fragment is then isolated, extracted and placed into a separate method.
- Inlining. Broken or unnecessary methods are identified, as well as the calls to those methods. Each method call is replaced by the method's actual code, and the original method is deleted.
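Both moves can be sketched in a few lines of Python. The receipt and shipping functions below are hypothetical and exist only to show the shape of each change: the first pair extracts a fragment of a longer function into its own function, and the second pair replaces a call to a trivial helper with the helper's body.

```python
# Extraction, before: one function both totals the order and formats it.
def format_receipt_before(items):
    lines = []
    total = 0
    for name, price in items:
        total += price
        lines.append(f"{name}: {price:.2f}")
    lines.append(f"TOTAL: {total:.2f}")
    return "\n".join(lines)


# Extraction, after: the total calculation is isolated in its own function.
def order_total(items):
    return sum(price for _, price in items)


def format_receipt(items):
    lines = [f"{name}: {price:.2f}" for name, price in items]
    lines.append(f"TOTAL: {order_total(items):.2f}")
    return "\n".join(lines)


# Inlining, before: a helper that adds a call but no clarity.
def is_bulk_order(items):
    return len(items) > 10


def shipping_fee_before(items):
    return 0 if is_bulk_order(items) else 5


# Inlining, after: the call is replaced by the helper's body,
# and is_bulk_order can then be deleted.
def shipping_fee(items):
    return 0 if len(items) > 10 else 5
```

In both cases the behavior is unchanged; only where the logic lives is different.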
Simplifying method calls
The next technique is to simplify method calls that, over time, have become buried in large amounts of code that are daunting to work with. Programmers have several ways to simplify method calls, including the following:
- adding or removing certain parameters;
- renaming methods with ambiguous names;
- separating queries from modifiers, so that a method either returns a value or changes state, but not both;
- parameterizing methods and introducing parameter objects;
- removing the methods that assign objects certain values; and
- replacing the parameter with explicit methods or calls.
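Two of the items in this list, introducing a parameter object and separating a query from a modifier, can be illustrated with a short Python sketch. The `Address` and `Inbox` classes and all of their methods are hypothetical names made up for this example.

```python
from dataclasses import dataclass


# Before: a long parameter list in which street, city and postcode
# always travel together.
def format_label_before(name, street, city, postcode):
    return f"{name}\n{street}\n{city} {postcode}"


# After: the group of related values becomes a parameter object.
@dataclass(frozen=True)
class Address:
    street: str
    city: str
    postcode: str


def format_label(name, address):
    return f"{name}\n{address.street}\n{address.city} {address.postcode}"


class Inbox:
    def __init__(self, messages):
        self._messages = list(messages)

    # Before: a single call both answers a question and changes state.
    def take_unread_count_before(self):
        count = len(self._messages)
        self._messages.clear()
        return count

    # After: the query has no side effects ...
    def unread_count(self):
        return len(self._messages)

    # ... and the modifier changes state without returning a value.
    def mark_all_read(self):
        self._messages.clear()
```

With the query separated out, callers can ask for `unread_count()` as often as they like without accidentally emptying the inbox.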
Refactoring by abstraction
Finally, refactoring by abstraction comes into play when large chunks of code contain duplications or redundancies. There are two techniques that constitute this approach, both of which focus on class inheritance:
- Pull up. The code behind methods that are shared among an entire group of subclasses is extracted into a superclass.
- Push down. Method code that lives within a superclass but is only used by a few of the subclasses is pushed down to those respective subclasses.
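As a rough Python sketch of the push-down direction, assume a hypothetical `Employee` hierarchy in which only salespeople have a sales quota. The class names and the quota figure are invented for this example.

```python
# Before: the superclass carries a method that only one subclass needs.
class EmployeeBefore:
    def __init__(self, name):
        self.name = name

    def sales_quota(self):
        # Only meaningful for salespeople, yet every subclass inherits it.
        return 100_000


class EngineerBefore(EmployeeBefore):
    pass


class SalespersonBefore(EmployeeBefore):
    pass


# After: the method is pushed down to the subclass that actually uses it.
class Employee:
    def __init__(self, name):
        self.name = name


class Engineer(Employee):
    pass


class Salesperson(Employee):
    def sales_quota(self):
        return 100_000
```

After the change, an `Engineer` no longer exposes a `sales_quota` method that makes no sense for it, while `Salesperson` behaves exactly as before.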
Consider an example with two classes: FileLogger and DbLogger. As the names suggest, FileLogger is responsible for logging data to a file, and DbLogger logs data to a database. Each class has its own IsLogMessageValid method.
The IsLogMessageValid method returns true if the log message is valid and false if it is not. In this case, a log message is not considered valid if it is null or an empty string. Likewise, the log message is considered invalid if it contains any sensitive data, such as a Social Security or credit card number.
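A minimal Python sketch of this starting point might look like the following. It is an illustration under stated assumptions rather than a reproduction of any original listing: the method is spelled `is_log_message_valid` to follow Python naming conventions, the two regular expressions are simplified, hypothetical stand-ins for real sensitive-data detection, and in-memory lists stand in for the file and the database.

```python
import re

# Simplified, hypothetical patterns. Real sensitive-data detection
# needs far more care than two regular expressions.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


class FileLogger:
    def __init__(self):
        self.lines = []  # stands in for a real file

    def is_log_message_valid(self, message):
        if message is None or message == "":
            return False
        if SSN_PATTERN.search(message) or CARD_PATTERN.search(message):
            return False
        return True

    def log(self, message):
        if self.is_log_message_valid(message):
            self.lines.append(message)


class DbLogger:
    def __init__(self):
        self.rows = []  # stands in for a real database table

    def is_log_message_valid(self, message):
        # The same validation logic, written a second time.
        if message is None or message == "":
            return False
        if SSN_PATTERN.search(message) or CARD_PATTERN.search(message):
            return False
        return True

    def log(self, message):
        if self.is_log_message_valid(message):
            self.rows.append(message)
```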
Unfortunately, this approach is a little redundant. Programmers would need to write the same logic twice -- once for each of the two classes -- to check if the log messages are valid. A better way is to refactor these two classes and create an abstract class that both inherit from. Moving the IsLogMessageValid method into that abstract class means the validation logic is written only once, which helps mitigate potential code redundancy.
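One way the refactored design could look in Python is sketched below, using the standard `abc` module. The `BaseLogger` name, the `_write` hook, the simplified regular expressions and the in-memory lists are all assumptions made for this illustration.

```python
import re
from abc import ABC, abstractmethod

# Simplified, hypothetical patterns. Real sensitive-data detection
# needs far more care than two regular expressions.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


class BaseLogger(ABC):
    def is_log_message_valid(self, message):
        # The validation logic now lives in exactly one place.
        if message is None or message == "":
            return False
        if SSN_PATTERN.search(message) or CARD_PATTERN.search(message):
            return False
        return True

    def log(self, message):
        if self.is_log_message_valid(message):
            self._write(message)

    @abstractmethod
    def _write(self, message):
        """Persist the message; each subclass decides where."""


class FileLogger(BaseLogger):
    def __init__(self):
        self.lines = []  # stands in for a real file

    def _write(self, message):
        self.lines.append(message)


class DbLogger(BaseLogger):
    def __init__(self):
        self.rows = []  # stands in for a real database table

    def _write(self, message):
        self.rows.append(message)
```

Each subclass now contains only the part that genuinely differs, namely where the message is written. A change to the validation rules is made once in `BaseLogger` and applies to both loggers.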
The role of testers in refactoring
It's critical to ensure that code is testable, and covered by tests, before refactoring begins. That coverage includes -- but is not limited to -- unit tests, regression tests and integration tests. To this end, it helps to involve the testing team throughout the refactoring process, as tests might fail once development teams start to change the code. Regression testing is particularly important in refactoring, as it ensures that the application's functionality remains intact.
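As a small illustration of what such a safety net can look like, the Python sketch below uses the standard `unittest` module to pin down the current behavior of a function before anyone restructures it. The `slugify` function and its test cases are hypothetical, chosen only to keep the example short.

```python
import unittest


def slugify(title):
    # Hypothetical function that is about to be refactored.
    return "-".join(title.lower().split())


class SlugifyRegressionTest(unittest.TestCase):
    # Each test records behavior the refactored code must preserve.

    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Code Smells 101"), "code-smells-101")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  long   methods "), "long-methods")

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")
```

Saved in a test module, a suite like this can be run with `python -m unittest` before and after each refactoring step. If a test that passed before the change fails afterward, the refactoring has altered behavior rather than just structure.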
Refactoring should be a recurring activity, and it's critical that it be a collaborative effort. By regularly performing code hygiene tasks together, software teams can create centralized strategies to identify code smells and learn from mistakes. It's also valuable for testers to learn about the code restructuring process, since it can help them improve existing test cases and procedures. This may even help them write better test automation code and cut down on manual maintenance tasks.