In his Hexagonal Architecture model of software design, Alistair Cockburn distinguished three layers: the core code, the adapters that let that code work with the outside world, and the external systems the code interacts with.
For example, a tiny snippet of website code might decide if a password is valid. That snippet could be isolated and called on its own. Other pieces of that software might interact with a database, or a REST API or a web browser. We could isolate and run each piece of code individually, but then we don't exercise the software as a system.
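For instance, a password-validity snippet might look like the hypothetical `is_valid_password` below (the name and the rules are invented for illustration, not taken from any real system). It can be called on its own, with no database, API or browser involved:

```python
def is_valid_password(password):
    """Hypothetical rule set: at least 8 characters, with a letter and a digit."""
    if len(password) < 8:
        return False
    has_letter = any(ch.isalpha() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    return has_letter and has_digit

print(is_valid_password("hunter2"))         # → False (too short)
print(is_valid_password("hunter2hunter2"))  # → True
```

Because the function touches nothing outside itself, it can be exercised in isolation; exercising it together with the database and the browser is a different, wider kind of test.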
There are three ways to approach this software testing model:
- Test the isolated snippets individually, as single dots in the core application.
- Test all the way to the adapter, then fake out the adapter using a mock, stub or spy. Or we might test our code calling just a single real external object -- a filesystem, API or database.
- Exercise the entire system as a system, touching a real database or a real web browser.
Tests that are wider in scope are more powerful, but they become expensive to coordinate, create, run and maintain. Higher-level tests also get progressively slower to run. When they fail, it can take time to figure out exactly what went wrong.
Put simply, higher-level tests are more powerful -- and more painful. When Mike Cohn created the test automation pyramid, he suggested teams automate less as tests move up in scope. He put system tests at the top, but he did not define the slope of the pyramid; a team needs to sort some of that out for itself.
Here are some ideas on how to think about the various software testing methods and how they can help.
The smallest snippet of code that can run in isolation could be called a unit. A true unit does not interact with anything else. In modern programming languages, we might call that a subroutine or function or method.
Until tooling like JUnit came along, unit testing was often intermixed with debugging. A programmer could set a breakpoint on a bit of code, set up the variables, run to that breakpoint and step through every line of code. Developers rarely did, choosing instead to test at a higher level. Code that is not unit tested generally has more bugs, and changes to that code introduce even more new bugs.
Unit tests are easy to write, fast to run and relatively easy to maintain. Modern programmers, however, rarely write such completely isolated code. Instead, the code interacts with something else, like a filesystem or the network or a database. To make true unit tests, the programmer mocks or stubs out any such connections. In that case, the unit test might make sure the software sends a certain command to the database or handles the result of a database query in a certain way.
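A minimal sketch of that idea, using Python's standard unittest.mock (the `save_user` function and its SQL are invented for illustration): the "database" is a mock object, and the check verifies only that the code sends the expected command.

```python
from unittest.mock import Mock

def save_user(db, name):
    """Code under test: sends an insert command through a database connection."""
    db.execute("INSERT INTO users (name) VALUES (?)", (name,))

# The unit test: no real database, just a stand-in that records calls.
db = Mock()
save_user(db, "alice")
db.execute.assert_called_once_with("INSERT INTO users (name) VALUES (?)", ("alice",))
```

Nothing here proves the SQL is valid against a real schema; that question belongs to the wider tests below.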
In his book, Working Effectively with Legacy Code, Michael Feathers said a test is not a unit test if:
- It talks to the database.
- It communicates across the network.
- It touches the file system.
- It can't run at the same time as any of your other unit tests.
- You have to adjust your environment (such as edit config files) to run it.
Let's describe tests that connect across those external boundaries as integration tests.
In the hexagonal architecture, an integration test goes from one dot in the middle all the way to an adapter. This is where it connects to a real system, such as calling an API or writing files to disk. It might even exist inside a web browser -- but using static files and not connecting to the internet. Or the test could grab files from the internet and interpret the text directly, and not interact with a web browser.
The point here is to use a real component. This can cause problems. If the system connects to a database and the database changes, then the test could fail -- even if only on a technicality. Consider the case where the test is to query the database for today's transactions. That sort of test requires significant setup on the test database. Another option is to simulate the call using service virtualization.
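Using SQLite's in-memory mode as a stand-in for the real database engine (the transactions table and its columns are invented for illustration), the "today's transactions" case looks roughly like this -- note how much of the test is setup:

```python
import sqlite3
from datetime import date

# Setup: a real (if tiny) database, seeded with known rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, amount REAL, day TEXT)")
today = date.today().isoformat()
conn.execute("INSERT INTO transactions VALUES (1, 19.99, ?)", (today,))
conn.execute("INSERT INTO transactions VALUES (2, 5.00, '2020-01-01')")

# The integration check: the query runs through a real database engine.
rows = conn.execute(
    "SELECT id, amount FROM transactions WHERE day = ?", (today,)
).fetchall()
conn.close()
```

If the schema changes -- a renamed column, say -- this test fails even though the application logic may be fine, which is exactly the "technicality" risk described above.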
Integration tests tend to be relatively fast and cover a wider amount of code than unit tests. They are often simple to debug, and they can run in parallel. In many cases, integration is the sweet spot for automated checks.
System, UI and end-to-end tests
In the hexagonal architecture, end-to-end tests run through the entire system, from the user interface to the database, to order processing and fulfillment. As such, they can exercise a lot of functionality quickly.
End-to-end tests run across the entire user interface (UI). If anything changes at the UI level, a graphical test runs the risk of flagging a false error. The sum of those risks over time creates a new kind of maintenance burden.
Programmers may need to add testability hooks into the application. Adding UI tests can be expensive over time.
One approach to breaking down UI tests is the DOM-to-database technique. Popularized by Titus Fortner of Sauce Labs, DOM-to-database tests run a sliver of functionality. Those tests are quick to write, quick to run and rarely grow out of date.
These tests ensure that the software does what the programmer expects it to, but they cannot handle the case where the programmer misunderstands the requirement. Also, these tests won't help in cases where no one considers how the software should handle certain situations. To reduce that risk, turn to acceptance tests.
Think of acceptance tests as the bare minimum possible for the software to function. Software that passes an acceptance test doesn't necessarily work, but a failed acceptance test definitely means the software does not work.
In Agile software development, a project team defines what acceptance will mean before programmers write code. Having real, working examples is similar to having the quiz questions before you read a textbook. This approach allows programmers to discuss what the software should do, and it prevents entire categories of misunderstanding that lead to errors in the product.
Acceptance tests cross at least two sides of the hexagon. They can often be written in plain language and in a checklist style. Cucumber is a popular open source framework to turn acceptance tests into automated checks. SpecFlow does similar work for .NET software. The trouble here is that teams sometimes create what are essentially system or end-to-end tests with more layers of complexity. The value of acceptance tests is in the conversation. If those discussions aren't happening, automating acceptance will likely limit conversation and shared understanding.
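A rough sketch of what frameworks such as Cucumber do under the hood (the step text, the `step` decorator and the cart functions here are all invented for illustration): plain-language lines are matched to small step functions, which turns the checklist into an executable check.

```python
# Map plain-language steps to small Python functions.
steps = {}

def step(text):
    """Register a function as the implementation of one plain-language step."""
    def register(fn):
        steps[text] = fn
        return fn
    return register

state = {}

@step("Given an empty cart")
def empty_cart():
    state["cart"] = []

@step("When the user adds an item")
def add_item():
    state["cart"].append("item")

@step("Then the cart holds 1 item")
def cart_holds_one():
    assert len(state["cart"]) == 1

scenario = [
    "Given an empty cart",
    "When the user adds an item",
    "Then the cart holds 1 item",
]
for line in scenario:
    steps[line]()  # run each plain-language step in order
```

The scenario text is the artifact the whole team can read and argue about; the step functions are plumbing. If the team never discusses the scenario text, the plumbing is all that remains.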
Simple performance tests can be as easy as reusing the UI tests and timing them. When an operation takes too long, send up an alarm. Load testing generally checks system response as the number of users increases. Like security testing, load testing used to happen at the end of a project. Performance testing is most useful when it happens continuously -- or at least frequently enough to predict costs in the cloud. Run the tool for a day in the cloud with a representative number of daily users and, if your model of time spent in the application is accurate, you'll be able to determine the app's per-day cloud cost.
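The "time it and send up an alarm" idea can be sketched in a few lines (the `timed_check` helper and the budgets are invented for illustration; in practice the `action` would be one of the existing UI tests):

```python
import time

def timed_check(action, budget_seconds):
    """Run `action`, measure wall-clock time and flag if it exceeds the budget."""
    start = time.perf_counter()
    action()
    elapsed = time.perf_counter() - start
    return elapsed, elapsed > budget_seconds  # True means "send up an alarm"

fast = lambda: sum(range(100))       # stand-in for a quick operation
slow = lambda: time.sleep(0.05)      # stand-in for an operation that drags

_, fast_alarm = timed_check(fast, budget_seconds=1.0)
_, slow_alarm = timed_check(slow, budget_seconds=0.01)
```

Budgets should be generous enough to absorb normal environmental noise, or the alarm itself becomes another source of false failures.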
Humans frequently use tools to speed up penetration tests. These tools, such as Nmap, Advanced IP Scanner and Advanced Port Scanner, scan an entire network to find open ports and device signatures. Once you've got a list of IP addresses and open ports and systems, other tools in your test arsenal can try to spot weaknesses. In that sense, penetration testing is highly automated.
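At its core, the port-scanning step those tools perform is just attempting TCP connections. A stripped-down sketch, using only the Python standard library (the `scan_ports` helper is invented for illustration; real scanners such as Nmap are far faster and smarter about signatures):

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            if sock.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                open_ports.append(port)
    return open_ports

# Demo: listen on an ephemeral local port, then confirm the scanner finds it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # the OS picks a free port
listener.listen(1)
test_port = listener.getsockname()[1]

found = scan_ports("127.0.0.1", [test_port])
listener.close()
```

Only scan hosts you own or have written permission to test; unauthorized scanning can violate policy or law.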
Test, but talk first
The ability to design tests, execute them, report, learn and adjust strategy in real time while switching between activities is uniquely human.
A traditional automated test is a test the first time it runs. After that, strictly speaking, that test becomes change detection.
Still, most teams want to do more with test tooling. But where? And what do you do less of in order to do more of something else? At a Scrum cadence, a team can agree to experiment with automating a new type of check, at least for a week or two. Scrum gives teams the tools to create and measure these types of experiments.
Analyze which software testing methods might be practical for your particular projects, form a hypothesis on how to test better, then experiment with it for a sprint or two.