I have written code without tests that ran in production without defects, and I have written buggy code with TDD (Test-Driven Development). Time to look back at 35 years of coding: when do tests help, when is there something better – and, especially, what are those better things?
Why test?
For me, there are three reasons to test the software I (or we as a team) write – manually or automatically:
- Check expectations – “it works”
- Prevent regressions – “it keeps working (after changes)”
- Drive the implementation – “I know what I have and need, but not yet how to achieve this.”
I like (manual) exploratory testing to check my expectations and to find spots where the code doesn't work correctly. On the other hand, I dislike manually executed test cases to prevent regressions: they are cumbersome and far too expensive to run over and over again.
Automated testing approaches such as unit testing and approval testing are effective for preventing regressions.
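The core idea behind approval testing can be sketched in a few lines. This is a minimal, hypothetical sketch (real libraries such as ApprovalTests offer much more); the `verify` helper and file naming are my own illustration, not an actual API:

```python
from pathlib import Path

def verify(name: str, received: str, approved_dir: Path = Path(".")) -> None:
    """Minimal approval check: compare output against a stored 'approved' file.

    On the first run, no approved file exists yet, so we store the received
    output for a human to review and fail the check.
    """
    approved_file = approved_dir / f"{name}.approved.txt"
    if not approved_file.exists():
        approved_file.write_text(received)  # store for human review
        raise AssertionError(f"No approved output yet; review {approved_file}")
    assert received == approved_file.read_text(), "output differs from approved version"
```

Once the output is approved, any later change in behavior makes the check fail – which is exactly the regression signal we want, without writing detailed assertions by hand.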
TDD is great for driving the implementation of complicated business logic or algorithms.
Property-based testing can help prevent regressions and identify missing pieces in algorithms.
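A hand-rolled sketch of the property-based idea: instead of fixed examples, we assert properties that must hold for many random inputs. The `percentage` function is hypothetical; real libraries such as Hypothesis (Python) or FsCheck (.NET) also shrink failing inputs for you:

```python
import random

def percentage(amount: int, percent: int) -> int:
    # Hypothetical function under test: integer percentage, multiply before divide.
    return amount * percent // 100

random.seed(0)  # deterministic run for illustration
for _ in range(500):
    amount = random.randint(0, 10**9)
    p = random.randint(0, 100)
    # Property 1: the result never exceeds the amount (for p <= 100).
    assert 0 <= percentage(amount, p) <= amount
    # Property 2: more percent never yields a smaller result.
    assert percentage(amount, p) <= percentage(amount, min(p + 1, 100))
```

Properties like these often expose edge cases (zero, boundaries, huge values) that example-based tests miss.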
There are also kinds of tests I dislike:
- E2E (end-to-end) tests: They typically need a complicated setup to run, which makes them brittle.
- BDD / Spec-driven: Too much focus on the specification, often leading to rigid systems (but that would be a whole series of blog posts on its own). I do, however, like the involvement of domain experts alongside developers – but I can get that with TDD as well.
Of course, there are many more kinds of tests. Choose what matches your situation.
Reasons for defects?
To choose the right kinds and number of tests, we need to understand where defects in our software come from. The causes vary by team and tech stack. I try to give an overview here.
Nulls: When your programming language allows nulls, then you have to deal with them. Dealing with nulls in every situation is simply difficult and, therefore, error-prone.
Shared mutable state: We quickly get overwhelmed with the hidden coupling that shared mutable state introduces into our codebase – especially in multi-threading scenarios. This quickly leads to wrong assumptions and, therefore, to defects.
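Even in a single thread, sharing a mutable object creates hidden coupling through aliasing; threads only make it worse. A deterministic sketch (the order-tracking names are hypothetical):

```python
# Two features share one mutable list; each was written assuming it "owns" it.
recent_orders: list[int] = []

def record_order(order_id: int) -> None:
    recent_orders.append(order_id)

def report_and_reset() -> list[int]:
    snapshot = recent_orders   # alias, not a copy!
    recent_orders.clear()      # also empties `snapshot` – same object
    return snapshot

record_order(1)
record_order(2)
assert report_and_reset() == []  # surprising: we expected [1, 2]
```

The author of `report_and_reset` assumed `snapshot` was an independent copy – a wrong assumption that shared mutable state made easy to hold.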
Integration problems (wrong assumptions): A component or library we integrate into our own codebase behaves differently than we expected.
Misunderstanding (not solving the problem): We didn’t fully understand the problem to be solved, and we delivered an incomplete or incorrect solution.
Misbehaving infrastructure: Our software is misbehaving because of a defect in the infrastructure we rely on.
Programming traps: Programming is tricky, and we sometimes get it wrong. For example, never divide first and then multiply (x/100*7); always multiply first, then divide (x*7/100) – otherwise truncation and precision loss get you. Or when summing floating-point numbers, sort them first (ascending), so the small values are not swallowed by the large ones.
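Both traps fit in a few lines of Python (for exact float summation, the standard library also offers `math.fsum`):

```python
# Integer arithmetic: dividing first truncates before the multiplication.
x = 50
assert x // 100 * 7 == 0   # divide first: anything below 100 truncates to 0
assert x * 7 // 100 == 3   # multiply first: 7% of 50 = 3.5, floored to 3

# Float summation: adding small values to a huge one loses them to rounding.
values = [1e16, 1.0, 1.0]
assert sum(values) == 1e16               # both 1.0s vanished
assert sum(sorted(values)) == 1e16 + 2   # small values accumulate first
```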
Wrong equality checks: In runtimes like .NET, everything provides an Equals method, but it does not always behave as we assume. Comparing by reference instead of by deep equality of the fields/properties quickly leads to defects. And more equality checks happen than we are typically aware of – for example, inside HashSets, Dictionaries, etc.
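Python has the same trap: a class without `__eq__` compares by identity, and that identity comparison silently leaks into sets and dict keys. A minimal sketch:

```python
from dataclasses import dataclass

class Point:  # no __eq__: instances compare by reference/identity
    def __init__(self, x: int, y: int):
        self.x, self.y = x, y

assert Point(1, 2) != Point(1, 2)             # surprising reference comparison
assert len({Point(1, 2), Point(1, 2)}) == 2   # same surprise, hidden in a set

@dataclass(frozen=True)  # generates __eq__ and __hash__ from the fields
class Point2:
    x: int
    y: int

assert Point2(1, 2) == Point2(1, 2)
assert len({Point2(1, 2), Point2(1, 2)}) == 1
```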
Ignored expression results: Most programming languages allow ignoring the result of an expression: you can call a method that returns a value and simply discard it. But sometimes we must react to the returned value to prevent defects – for example, when it tells us whether an operation succeeded or failed. Ignoring the “bad” return value likely leads to a defect.
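A classic instance of this in Python: string methods return a new string rather than mutating in place, and nothing stops you from discarding the result.

```python
text = "hello world"
text.replace("world", "there")   # result silently ignored: strings are immutable
assert text == "hello world"     # nothing changed – a quiet defect

text = text.replace("world", "there")  # use the return value
assert text == "hello there"
```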
Implicit casts: Implicit casts can introduce hidden, unsafe type conversions that can silently corrupt data, break type safety, and cause unpredictable runtime errors.
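Even Python, with few implicit casts, converts int to float in mixed arithmetic – and a large integer can silently lose precision on the way (doubles have a 53-bit mantissa):

```python
big = 10**18 + 1           # exact as a Python int
assert big != 10**18

as_float = big + 0.0       # implicit int -> float conversion
assert as_float == 10**18  # the +1 silently disappeared
```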
There are obviously more sources of defects than the ones above.
Level of correctness
We also need to be aware of the level of correctness that we want to achieve, given our context.

On the continuum from “it might run” to having a “proof of correctness”, we typically want to be at the “we are confident” spot – at least in the context of business applications. Certainly not at the “we have 100% test coverage” spot. Proving correctness typically takes a lot of effort and is rarely worth the quality gain.
In our team, we don’t need perfect software. In many cases, it is sufficient to respond quickly, and the impact on our users and our business is minimal. The greater the impact, or the longer it takes to detect and fix a defect, the higher our quality standard needs to be.
So, our quality standard for core business logic code is much higher than for UIs that add simple data. In a later part of this series, I’ll show how slicing reduces the impact radius significantly.
Test Effort
Having (automated) tests brings a lot of effort with it – even when we write good, concise, and refactoring-friendly tests:
- writing the tests (with good error messages)
- running the tests (during development and as part of the build and release pipeline)
- maintaining the test code (due to design changes)
- changing the tests (due to changing business needs)
- investigating false alarms (due to flaky or overly specific tests)
Obviously, tests have their benefits. But we shouldn’t overlook that they also require effort. So, if we could replace some of them with something cheaper or faster – in development and at runtime – that would be beneficial.
The hard things to test
Many things are easy to test: throw some input at a method or function and check the return value, and maybe some side-effects (preferably state-based or, less preferably, interaction-based).
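A state-based check of such an “easy” case might look like this (the cart function is a hypothetical example): we trigger the side effect and then inspect the resulting state, rather than verifying which calls were made.

```python
def add_item(cart: dict[str, int], item: str, qty: int) -> None:
    # Hypothetical function under test: mutates the cart as a side effect.
    cart[item] = cart.get(item, 0) + qty

cart: dict[str, int] = {}
add_item(cart, "apple", 2)
add_item(cart, "apple", 1)
assert cart == {"apple": 3}  # state-based: assert on the resulting state
```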
However, there are things that are inherently hard to test:
- Combinatorics: e.g. calculations or queries with lots of variations (the result depends on subtle dependencies of many input values)
- Multi-threading
- Boundaries of the system (including UI)
- Things with many dependencies (that can’t be meaningfully reduced)
- Stateful code
For these scenarios, finding a way to ensure quality without relying on tests would be especially beneficial.
Next time
In this post, I’ve written about why we write tests, where defects come from, and why it would be nice to have something less expensive to replace some tests.
In later parts, you’ll see several ideas about how to reduce the number of tests needed, thus reducing development effort.
