Test pyramids are history

Fairly often I hear references to the test pyramid (Mike Cohn), sometimes in quite a reverential tone. In this blog post I’d like to advocate a less dogmatic approach to test structure. (Readers who don’t know what a test pyramid is might be dropping off here; that’s alright.)

Let’s say you’re developing a web application to manage corporate travel data. One business process will be “the corporation has hired a new saleswoman, and she needs to be registered in the system”. The test pyramid would suggest a handful of e2e tests, several dozen integration tests (checking e.g. that our web application understands updates from SAP), and maybe a hundred or more unit tests.

There are lots of blogs about why it’s good practice to write tests, so I’ll take that as a given. (Readers who don’t write tests might be dropping off here; that’s alright.) One point I find is often not emphasized enough is that the most fundamental purpose of the entire test suite is to ensure your software works. By that I don’t mean the SAP interface, nor the destination caching, nor that pesky VAT calculation for domestic Swiss flights. I mean that when the corporate customer using funnel.travel hires a new saleswoman, the system administrator can hook up her user account, assign it to her designated travel assistant, and set an annual travel expense budget.

At the top of the test pyramid, thus, we will write one single test which covers this entire business process. I see such a test as sitting above e2e; we call these epic tests. This single test provides the confidence that the business process works (and if it is the only test that breaks, one of the first steps is to add a unit or integration test that catches the bug).
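To make the idea of an epic test a bit more concrete, here is a minimal sketch in plain Java. Everything in it is an assumption made for illustration: `TravelSystemSketch` is a hypothetical in-memory stand-in for the application, and its method names are invented, not the funnel.travel API. A real epic test would drive the UI or the public HTTP interface instead.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the application under test.
class TravelSystemSketch {
    private final Set<String> users = new HashSet<>();
    private final Map<String, String> assistantByUser = new HashMap<>();
    private final Map<String, Long> annualBudgetByUser = new HashMap<>();

    void registerUser(String email) {
        users.add(email);
    }

    void assignAssistant(String userEmail, String assistantEmail) {
        requireUser(userEmail);
        requireUser(assistantEmail);
        assistantByUser.put(userEmail, assistantEmail);
    }

    void setAnnualBudget(String userEmail, long amount) {
        requireUser(userEmail);
        annualBudgetByUser.put(userEmail, amount);
    }

    // True only once every step of the business process has happened.
    boolean isFullyOnboarded(String userEmail) {
        return users.contains(userEmail)
                && assistantByUser.containsKey(userEmail)
                && annualBudgetByUser.containsKey(userEmail);
    }

    private void requireUser(String email) {
        if (!users.contains(email)) {
            throw new IllegalStateException("unknown user: " + email);
        }
    }
}

class NewHireEpicSketch {
    // One test, one business process, from start to finish.
    static boolean newSalesHireCanBeOnboarded() {
        TravelSystemSketch system = new TravelSystemSketch();
        system.registerUser("assistant@example.com");
        system.registerUser("new.hire@example.com");
        system.assignAssistant("new.hire@example.com", "assistant@example.com");
        system.setAnnualBudget("new.hire@example.com", 20_000L);
        return system.isFullyOnboarded("new.hire@example.com");
    }

    public static void main(String[] args) {
        System.out.println("epic passed: " + newSalesHireCanBeOnboarded());
    }
}
```

The point of the sketch is the shape of the test, not the stand-in: the assertion is about the outcome of the whole process rather than about any single step.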

Being aware of exceptional cases, we’ll add some more tests at the e2e level to make sure we get proper system behavior for missing input data, duplicate email addresses and such.
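As a rough sketch of what those exceptional-case tests check, assume a hypothetical `RegistrationServiceSketch` that rejects missing input and duplicate email addresses. The class, its exceptions and the case-insensitive comparison are all assumptions for illustration; at the e2e level the same checks would run against the deployed application rather than an in-memory class.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical registration logic, reduced to the two rules under test.
class RegistrationServiceSketch {
    private final Set<String> emails = new HashSet<>();

    void register(String name, String email) {
        if (name == null || name.trim().isEmpty()
                || email == null || email.trim().isEmpty()) {
            throw new IllegalArgumentException("name and email are required");
        }
        // Assumption: addresses are compared case-insensitively.
        String normalized = email.trim().toLowerCase(Locale.ROOT);
        if (!emails.add(normalized)) {
            throw new IllegalStateException("duplicate email: " + email);
        }
    }
}

class RegistrationEdgeCasesSketch {
    static boolean rejectsMissingEmail() {
        try {
            new RegistrationServiceSketch().register("Anna Muster", "");
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    static boolean rejectsDuplicateEmail() {
        RegistrationServiceSketch service = new RegistrationServiceSketch();
        service.register("Anna Muster", "anna@example.com");
        try {
            service.register("Anna M.", "ANNA@example.com");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("missing email rejected: " + rejectsMissingEmail());
        System.out.println("duplicate email rejected: " + rejectsDuplicateEmail());
    }
}
```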

Quite likely, during development we came across a few glitches, or just wanted to quickly ensure that the new and tricky stream filter worked ok even with entries from different time zones. For those things we wrote unit tests.
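A unit test of that kind might look like the sketch below. The filter itself is an assumption (the original code is not shown here): it keeps departures that fall on a given calendar day when viewed from a reference time zone, which is exactly the sort of logic that goes wrong when entries arrive in different zones. `DepartureFilterSketch` and `departingOn` are hypothetical names.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DepartureFilterSketch {
    // Keeps the departures that fall on `day` when converted to `zone`.
    static List<ZonedDateTime> departingOn(List<ZonedDateTime> departures,
                                           LocalDate day, ZoneId zone) {
        return departures.stream()
                .filter(d -> d.withZoneSameInstant(zone).toLocalDate().equals(day))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ZoneId zurich = ZoneId.of("Europe/Zurich");
        List<ZonedDateTime> departures = Arrays.asList(
                // Late evening on 31 May in London is already 1 June in Zurich.
                ZonedDateTime.of(2017, 5, 31, 23, 30, 0, 0, ZoneId.of("Europe/London")),
                // Evening of 1 June in New York is already 2 June in Zurich.
                ZonedDateTime.of(2017, 6, 1, 20, 0, 0, 0, ZoneId.of("America/New_York")),
                ZonedDateTime.of(2017, 6, 1, 9, 0, 0, 0, zurich));
        System.out.println(departingOn(departures, LocalDate.of(2017, 6, 1), zurich));
    }
}
```

Comparing the local dates directly, without converting to a common zone first, would keep the New York entry and drop the London one; a small unit test pins that down cheaply.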

I’d like to point out what we didn’t do:

  • we didn’t count the e2e tests and from that deduce a number of unit tests which must be met
  • we didn’t assume that a set of passing e2e tests, each covering a small step, ensures that an entire business process works

Of the tests mentioned above, the epic test gives us the confidence that a business process works. Further e2e tests (and maybe some integration tests) give us the confidence that our system will handle exceptional cases as designed. Unit tests helped us develop (and document) our components.

It is very likely that writing appropriate tests will result in a lot of unit tests, a smaller number of integration tests, and an even smaller number of e2e tests. But it’s important to focus on the value each test is providing. If you have thousands of unit tests, but they’re infested with mocks creating a test context completely disconnected from real life, those tests don’t add any value. They might even hurt by giving a false sense of confidence while actually only proving that your system can handle data where most properties are null and collections are empty.
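The false-confidence effect can be shown with a small, deliberately contrived sketch. `SpendReportSketch` is hypothetical and contains an intentional bug (it forgets the VAT); the amounts are made up. A check built on an empty fixture stays green, while the same check on realistic data fails.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical trip record; amounts are in cents to avoid rounding issues.
class TripSketch {
    final long fareCents;
    final long vatCents;

    TripSketch(long fareCents, long vatCents) {
        this.fareCents = fareCents;
        this.vatCents = vatCents;
    }
}

class SpendReportSketch {
    // Deliberately buggy for illustration: the VAT is never added.
    static long totalCents(List<TripSketch> trips) {
        return trips.stream().mapToLong(t -> t.fareCents).sum();
    }
}

class FixtureComparisonSketch {
    // Passes despite the bug: with no trips there is nothing to get wrong.
    static boolean emptyFixturePasses() {
        return SpendReportSketch.totalCents(Collections.<TripSketch>emptyList()) == 0L;
    }

    // Expects fare plus VAT for two made-up trips, so the bug surfaces here.
    static boolean realisticFixturePasses() {
        List<TripSketch> trips = Arrays.asList(
                new TripSketch(20_000L, 1_540L),
                new TripSketch(35_000L, 0L));
        return SpendReportSketch.totalCents(trips) == 56_540L;
    }

    public static void main(String[] args) {
        System.out.println("empty fixture passes: " + emptyFixturePasses());
        System.out.println("realistic fixture passes: " + realisticFixturePasses());
    }
}
```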

If you hear that there should be more tests of a certain kind, and the only justification is the test pyramid, there’s a good chance that the project has a vast number of tests and yet defects keep pouring in. It might be time to revise your test strategy. (Some readers might drop off here because they need to attend the “Increase productivity” workshop; that’s alright.)

Side note: I believe that test structures are very different between applications and framework libraries. For the former, the “user” is a person who will click a button and expect a response. The most important test here is one which clicks a button and asserts the response. For the latter, the “user” is another developer calling a public method. I feel that application developers often adopt the best practices of framework developers, with subpar results.

In the process of developing funnel.travel, a corporate post-booking travel management tool, I’m sharing some hopefully useful insights into Angular 4, Spring Boot, jOOQ, or any other technology we’ll be using.
