The original idea behind continuous integration, a term coined by Grady Booch back in 1991, was that software builds often fail, and performing a build is cheap. Ergo, with more frequent builds, one can find and fix errors earlier in the development process. Today, development environments catch syntax errors as the programmer types, many languages are interpreted, version control is widespread, and CI catches merge errors or true integration problems between multiple developers.
Continuous integration (CI) with insufficient testing will result in a continuous pipeline of buggy code that must be corrected, rebuilt, rerun and redeployed. This after-the-fact extra testing is not only inefficient, it's expensive. Worse, economy of labor suggests batching that work so that a "testing phase" happens every few weeks -- exactly the batch-style process CI was meant to replace.
A typical CI/CD pipeline simultaneously integrates multiple check-ins through a series of stages, performs software builds and associated tests, and optimizes production deployments with minimal human intervention, at a rate of two to 10 times a day or more. A lack of automated checks in this pipeline essentially prevents continuous delivery or deployment (CD), and ultimately forces a retreat to a waterfall-style software build process.
Why test automation is important in the pipeline
Some teams implement automated unit tests, just not in the CI/CD pipeline. Invariably, some new hire makes a code change and doesn't run the automated checks before committing and pushing the code. Older, more experienced developers end up going back to fix the tests. Over time, engineers simply do not "trust" the unit tests, and the test suite is thrown away or forgotten.
Higher-level API and GUI tests also suffer from this problem of disconnection from the pipeline. Releases must be coordinated as an activity: someone in a more test-like role creates a test environment, runs the tests, looks at the results, performs maintenance, reruns the tests, and then either reports bugs or passes the build. If there are any showstopper bugs, the process happens all over again. This leads to automation delay, which in my experience easily adds up to between four hours and four business days.
When an organization invests in "test automation," including a tool and labor, that multi-day delay is not what they thought they were buying. What they envisioned are benefits from effective automated checks inside of a CI/CD pipeline.
Benefits, costs and risks of tests in the pipeline
The primary benefit of automated checks in CI/CD is fast feedback and isolation. When a change causes a check to fail, the tool knows exactly which change it was, and can email the person who introduced the error so it can be fixed. With a CI process, that fix is easy because the change is explicit, perhaps between five and 100 lines of code. In a test-driven development commit-loop, the change could be smaller still.
In engineering terms, there is almost no cost to such work. Perhaps the programmers check their email a bit more often. Running unit tests as part of the pipeline is certainly cheaper than manually running the unit test suite in the development environment and waiting.
Larger teams that have a pipeline in the cloud may pay for the virtual machines to build everything, or pay to run a small data center to do it on premises. This is especially true if each build actually creates a virtualized test environment, including web servers and databases. The larger that environment, the more frequently code is pushed into version control and the more extensive the tests, the higher the overall cost to rent a cloud-based CI/CD pipeline.
Like any other method, a shoddy, simplistic implementation, especially one that allows exceptions and rule-breaking, will lead to a poor outcome. Teams may ignore unit test failures, or consider some tests flaky and devalue or dismiss them: "It's okay if feature set 'Mirror' fails, but feature sets 'Object' and 'Reppledepple' must pass." In an environment where some rule-breaking is okay, developers learn not to trust test results and not to worry if the CI fails. This phenomenon is known as the broken window effect or the normalization of deviance. Eventually, someone -- usually a manager -- sees the failing tests as false, "bad news" that "makes the team look bad," and orders that the failures be commented out or deleted.
Practical tips for tests in the CI/CD pipeline
The variety of technologies built to support an organization -- web, mobile, desktop, pure API, a little mainframe, etc. -- makes it impossible to create a simple stepladder to tests within the pipeline. Assuming a web application context, though, we can generalize a few steps.
- Start with effective CI. Obviously, first you need to implement a continuous integration setup that checks code out and performs a build. This also includes a strategy to branch and merge code, and a feedback loop when CI fails.
- Isolate code units. Adding unit tests to a legacy app without restructuring it will likely involve a great deal of pain. The various components will depend on each other, making a "unit test" two dozen lines of setup, a function call and an assertion. Instead, start making good, testable code units when and where changes are introduced. This may require mocks, stubs and dependency injection.
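As a sketch of dependency injection, consider a hypothetical `InvoiceService` that would normally construct its own database access. Injecting the repository lets a unit test substitute a stub, so the check needs no database and no setup. The class, the tax rule and the method names here are all invented for illustration.

```python
import unittest
from unittest import mock

class InvoiceService:
    """The repository is injected rather than constructed internally,
    so a test can pass in a stub instead of a real database connection."""

    def __init__(self, repository):
        self.repository = repository

    def total_due(self, customer_id):
        # Hypothetical business rule: 8% tax on the stored total.
        return round(self.repository.get_total(customer_id) * 1.08, 2)

class InvoiceServiceTest(unittest.TestCase):
    def test_total_includes_tax(self):
        repo = mock.Mock()
        repo.get_total.return_value = 100.00  # stubbed -- no database required
        self.assertEqual(InvoiceService(repo).total_due("c-42"), 108.00)
```

The test is one stub, one call and one assertion -- not two dozen lines of setup.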
- Rely on unit tests, but explore others. The next element to add is probably unit tests, which are an easy entryway into testing in the CI/CD pipeline. Unit tests catch important bugs quickly. Also, it's relatively easy to maintain a few of them, and they are typically easy to integrate into CI. Unit test tools generally run from the command line with text output, which often arrives in exactly the format the CI tool needs to read and, if not, is not difficult to convert.
After 20 years, some leaders in software development are rethinking the test pyramid approach, with its deep base of unit tests. Start with unit tests, but consider where the bugs come from -- it might be wise to introduce other forms of test tooling earlier in the pipeline.
- Build virtual environments. Before you can run API or GUI tests, the web server to test against needs to exist with the new build. Automating the test environment build makes setup easier and less painful, which leads to people testing more often. It also saves a great deal of manual time -- some teams that automate environment build and setup report payback periods of a few days or weeks.
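A toy sketch of the idea: every build provisions a fresh, disposable environment from scratch. The schema and seed data here are invented; a full pipeline would also deploy the new build and start the web server.

```python
import pathlib
import sqlite3
import tempfile

def build_test_environment() -> pathlib.Path:
    """Create a throwaway environment: a fresh directory with a seeded database."""
    env_dir = pathlib.Path(tempfile.mkdtemp(prefix="testenv-"))
    with sqlite3.connect(env_dir / "app.db") as conn:
        # Invented schema and seed row, standing in for real fixtures.
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO users (name) VALUES ('smoke-test-user')")
    return env_dir
```

Because the environment is rebuilt identically on every run, there is nothing for a human to set up -- and nothing stale left over to debug.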
- For chronic failures, run API and GUI tests. Once the environment exists, consider where the bugs originate. If the login feature frequently breaks, add an API test right in the pipeline. If the issue is complex integration between the server, the web page and the API, consider adding a few lightweight end-to-end tests. The idea is to get quick feedback on chronic blocking conditions and stop wasting time.
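As a sketch, here is a login smoke check run against a stand-in server. In a real pipeline the request would target the freshly built test environment; the `/login` endpoint and the token in the response are assumptions for illustration.

```python
import json
import threading
import urllib.request
from wsgiref.simple_server import make_server

def fake_app(environ, start_response):
    """Stand-in for the deployed build: answers the login endpoint."""
    if environ["PATH_INFO"] == "/login":
        start_response("200 OK", [("Content-Type", "application/json")])
        return [json.dumps({"token": "demo-token"}).encode()]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]

def login_smoke_test(base_url: str) -> bool:
    """Fail fast if the chronically breaking login feature is down."""
    with urllib.request.urlopen(base_url + "/login") as resp:
        return resp.status == 200 and "token" in json.load(resp)

server = make_server("127.0.0.1", 0, fake_app)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
ok = login_smoke_test(f"http://127.0.0.1:{server.server_address[1]}")
server.shutdown()
print("login check:", "pass" if ok else "fail")
```

One such check in the pipeline turns "login is broken again" from a half-day surprise into a two-second build failure.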
- Iterate on depth, measure coverage. Roughly speaking, "We have some automated checks" is better than "We don't have any." The next step is to figure out what is covered, what is not and how deep that testing goes.
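Dedicated tools such as coverage.py do this properly; as a toy illustration using only the standard library, Python's `trace` module can report which lines a check actually executed. The `discount` function is invented, and the point is the gap it reveals: the non-member branch is never exercised.

```python
import trace

def discount(price, is_member):
    if is_member:
        return price * 0.9
    return price  # this branch is never exercised by the call below

tracer = trace.Trace(count=True, trace=False)
tracer.runfunc(discount, 100, True)  # exercises only the member branch
executed = sorted(line for _, line in tracer.results().counts)
print("lines executed:", executed)
```

Comparing executed lines against the source shows what is covered, what is not -- and where to deepen the tests next.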