19 March 2014
Over the last couple of years, I have changed my approach to projects and coding considerably. This has been largely due to newer tools like git, learning from my coworkers at Summa, and moving to a more open-minded environment where it was possible to experiment with new technologies and ideas. Still, anyone who has ever had me review their code in the past knows that the first thing I do is run a static analysis tool and dive straight into the method with the highest cyclomatic complexity. More recently, I have found that if a project gets to the point where I need to employ this technique, the project is already headed down a bad path. That being said, cyclomatic complexity is still an extremely important metric to be aware of and to use as an indicator of problems.
Cyclomatic complexity is simple to comprehend at a high level. The higher the value, the more paths there are through the method and the more complex the method is. For example, if a method contains a single "if" statement (e.g. if val == true), then there are two paths through the method (one for true and one for false) and its cyclomatic complexity is 2. If you add a second "if" statement, the complexity rises to 3 and, depending on how it relates to the first "if", there could be as many as 4 distinct paths through the method (2**2=4). Strictly speaking, the metric counts linearly independent paths, which works out to the number of decision points plus one, so the number of actual paths can grow much faster than the metric itself. If you then add a loop into the mix, the situation gets even more complex. The more complex the method is, the more likely it is to contain bugs and the harder it is to test for those bugs. Added complexity also makes the method more difficult for others to read, which adds to the potential for misunderstanding and increases the likelihood that even more bugs will be introduced.
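To make the counting concrete, here is a rough sketch of how a tool might compute the metric. It is written in Python for brevity and is not how Visual Studio or any particular plug-in does it. The function name and the set of node types treated as decision points are my own simplifying assumptions.

```python
import ast
import textwrap

# Node types counted as one decision point each (a simplification,
# not the exact rule set of any real metrics tool).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate the metric as 1 + the number of decision points."""
    tree = ast.parse(textwrap.dedent(source))
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has three operands and adds two extra branches.
            complexity += len(node.values) - 1
    return complexity


ONE_IF = """
def f(val):
    if val:
        return 1
    return 0
"""

TWO_IFS = """
def g(a, b):
    total = 0
    if a:
        total += 1
    if b:
        total += 2
    return total
"""

print(cyclomatic_complexity(ONE_IF))   # 2
print(cyclomatic_complexity(TWO_IFS))  # 3
```

Note that the second function scores 3 even though its two independent "if" statements allow 4 distinct paths. The metric grows linearly with decision points while the number of paths that tests need to cover can grow much faster.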
As software professionals, our job is to manage complexity and to communicate our intentions to others. The smaller and less complex you make your methods, the easier it is to accomplish both of these. It might not be possible to completely eliminate conditions from your methods, but experience and the Single Responsibility Principle make it pretty clear that conditions should be minimized in your methods. A good rule of thumb is that if you have to use the word "and" in your method name to accurately describe everything it does (e.g. "ValidateAndSaveCustomerOrder"), then it is doing too much.
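As an illustration of the "and" rule of thumb, the sketch below splits a hypothetical "ValidateAndSaveCustomerOrder" into separate steps. It is in Python for brevity, and every name in it (the order type, the validation rules, and the list standing in for a data store) is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerOrder:
    customer_id: int
    items: list = field(default_factory=list)


def validate_customer_order(order):
    """Return a list of problems; an empty list means the order is valid."""
    errors = []
    if order.customer_id <= 0:
        errors.append("missing customer")
    if not order.items:
        errors.append("order has no items")
    return errors


def save_customer_order(order, store):
    """Persist the order; 'store' is a plain list standing in for a repository."""
    store.append(order)


def submit_customer_order(order, store):
    """Coordinate the two steps without containing the details of either."""
    errors = validate_customer_order(order)
    if errors:
        return errors
    save_customer_order(order, store)
    return []
```

Each function can now be named without "and", tested on its own, and changed for a single reason.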
All of this being said, even the most disciplined developers in our industry will sometimes make compromises and cut corners to make a deadline or quickly fix a customer issue. These lapses in professionalism and residual technical debt will often go forgotten. In teams made of less experienced developers or on projects with unrealistic deadlines, this tendency is even stronger. For these scenarios, cyclomatic complexity can be an excellent tool for identifying problem areas so that you can make your application code cleaner and more professional. It is a tool in your tool belt.
This approach of singling out and/or monitoring areas that need improvement is not limited to your application code. It also applies to your tests. According to Robert C. Martin in his book Clean Code, "...having dirty tests is equivalent to, if not worse than, having no tests." Tests are essential not only for verifying your code, but also for communicating within the team. In this paradigm, it is absolutely essential to maintain your test/specification code and **keep it as clean as your application code.** Therefore, cyclomatic complexity matters in tests as well.
Leveraging Cyclomatic Complexity
Having been a .NET developer for most of my career, I am most familiar with that stack and with Visual Studio. Therefore, I will focus on those tools here. However, there are certainly comparable tool sets for other technologies.
Assessing the Whole Solution
Whether you are joining a green-field project that has already begun, maintaining a legacy application, or simply monitoring the quality of your application code, it is a good idea to assess your code base. This can be a manual spot check or a part of your build with breaking upper limits. Since Visual Studio 2010, the Code Metrics Viewer for Visual Studio plug-in has been available. This plug-in generates many different metrics, including cyclomatic complexity, directly through Visual Studio. There are also ways to export these metrics via the command line (perhaps during the build process) that you can use to put limits on what your team finds acceptable.
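A build step with a breaking upper limit could look roughly like the sketch below. It is written in Python and assumes the metrics have already been exported to a simple "method,complexity" CSV. That layout, the limit of 10, and the method names are all invented for illustration and are not the actual export format of any Visual Studio tool.

```python
import csv
import io

MAX_COMPLEXITY = 10  # assumed team limit; choose whatever your team agrees on


def find_offenders(csv_text, limit=MAX_COMPLEXITY):
    """Return (method, complexity) pairs whose complexity exceeds the limit."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["method"], int(row["complexity"]))
        for row in reader
        if int(row["complexity"]) > limit
    ]


def gate(csv_text, limit=MAX_COMPLEXITY):
    """Return a process exit code: 0 passes the build, 1 breaks it."""
    offenders = find_offenders(csv_text, limit)
    for method, value in offenders:
        print(f"{method}: complexity {value} exceeds {limit}")
    return 1 if offenders else 0


# Hypothetical exported metrics.
SAMPLE = """method,complexity
OrderService.Submit,4
OrderService.Recalculate,23
"""
```

In a real build, the script would read the exported file and pass the result of gate(...) to sys.exit so that the build server marks the run as failed.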
One of my favorite ways to keep myself honest is the Code Metrics plug-in. This plug-in calculates the cyclomatic complexity of your method and puts it in a colored circle next to the method definition. If the indicator is red, you should pay attention and refactor before you commit your code to your source control repository. When the indicator is green, that method's complexity is within a reasonable range. While there are plenty of other ways to self-monitor this (e.g. TDD), I find this tool to be an excellent reminder and a good mentoring tool.
After you locate methods with high complexity, then what do you do? As is often the case, it depends on the situation. Michael Feathers recently blogged about one such scenario and how he reduced its complexity. However, there are many situations and many ways to reduce complexity in each of them. The focus of this article is limited to the identification of complexity and not the actual reduction of it. However, here are a few things that one might consider:
- Break down methods into smaller methods - This might seem obvious, but it can be difficult on its own. When breaking down methods, you need to guard against the complexity simply being spread out to the rest of the class. When this happens, the class typically loses cohesion and it too needs to be broken down into smaller classes. This division is as much an art as it is a science. Learning to do it correctly comes with practice and by observing others.
- Test Driven Development - TDD is more about improving the design and cleanliness of your code than it is actually about testing. By slowly making complex code testable, you will ultimately improve its structure and reduce its overall complexity.
- Use Linq - When your method has multiple loops to process data, Linq can offer a much more readable syntax. If you are a Java developer, similar functionality is coming in Java 8.
- Use Dependency Injection (DI) - Simple "switch" statements can be readable when small, but they fundamentally violate the Single Responsibility Principle, which states that a method should have only one reason to change. Switch statements have multiple reasons to change: a "case" could be added, removed, or modified (i.e. 3 reasons to change). By using a standard interface, creating an implementation for each "case", and associating each implementation with its condition in your dependency injection framework, you can abstract this complexity away. This also has the added benefit of allowing fakes or mocks to be injected during testing.
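The dependency injection suggestion above can be sketched as follows. This is a minimal Python illustration rather than a real DI framework: a plain mapping plays the role of the container, and the shipping classes, prices, and keys are all made up for the example.

```python
from typing import Mapping, Protocol


class ShippingCalculator(Protocol):
    """The standard interface that every former 'case' implements."""

    def cost(self, weight_kg: float) -> float: ...


class GroundShipping:
    def cost(self, weight_kg: float) -> float:
        return 5.0 + 1.0 * weight_kg


class AirShipping:
    def cost(self, weight_kg: float) -> float:
        return 10.0 + 2.5 * weight_kg


class OrderPricer:
    """Contains no switch: it only looks up the injected implementation."""

    def __init__(self, calculators: Mapping[str, ShippingCalculator]):
        self._calculators = calculators

    def shipping_cost(self, method: str, weight_kg: float) -> float:
        return self._calculators[method].cost(weight_kg)


# Composition root: the one place where conditions map to implementations.
pricer = OrderPricer({"ground": GroundShipping(), "air": AirShipping()})
print(pricer.shipping_cost("air", 2.0))  # 15.0
```

Adding a new shipping method now means writing one new class and registering it, while OrderPricer stays untouched. In a test, the same constructor accepts a mapping of fakes instead of the real calculators.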
Not Getting Overwhelmed
If the project that you are working on has acquired a lot of technical debt, it can be overwhelming to generate a huge report of problem areas that tells you and your team everything that was done wrong. This is especially true when your team does not have enough time to fix even a small percentage of the issues. Do not fall into this trap. Tackling everything at once is a battle that no one can win. Even having the team swarm on it for a week will usually not make much of a dent, and it can be demoralizing. Complexity and other technical debt take a long time to accumulate, and they will take time to eliminate. In my experience, the best thing to do is clean up the code around the area you are working on. In other words, if your code depends on modules that are unstable and/or a mess, spend some time cleaning things up and backfilling tests before you use them. It does not have to be a full refactor; just leave it a little better than you found it.
Also, instead of just focusing on all of the code that is already complex and untested, keep new code from getting into this condition. Use test driven development to force your code to be testable and therefore not overly complex; use plug-ins like Code Metrics to remind yourself when you are getting out of hand, and monitor your code with static complexity tools during the build process.