Not all software needs to be correct, but a large subset does.
Last year, I consulted with an organization that develops and maintains reporting solutions. In a nutshell, they extract data from various line-of-business applications and put it in a reporting-specific data store to create reports for decision makers. For regular readers, this may sound like work bordering on the trivial: Read data, transform it, and write it somewhere else. If you've ever done serious ETL work, you may, however, know that it can be quite difficult to get right.
While reviewing a particular code base, I asked the team how important they found correctness to be. Specifically, I asked: "What happens if a bug makes it into production?" They responded: "If that happens, we fix it."
When I, afterwards, related that exchange to the chief data architect, he was livid: "I just had to explain to the board of directors why [some number] was counted double, and how that could have escaped our attention for months."
Perhaps, to a software developer, reporting sounds like a low-stakes environment, but in reality, it's probably where your code has more impact than in the average line-of-business application. If Excel is the world's most common decision support system, reports (generically) are likely to be the runner-up. This is where your code interfaces with the 'non-technical' decision makers in your organization.
People make business decisions based on reports, implicitly assuming that reports are correct. If you count something double, or conversely accidentally discard data, business decisions will be based on incorrect data. This affects the real world. In this particular case, the data was used to budget the number of people who could be accepted into a particular programme. If you count wrong, you may either turn away too many people, or conversely accept too many, which will impact your ability to deliver your services in the future.
And to be clear: These kinds of errors are difficult to spot. The system isn't crashing or throwing exceptions. It just calculates wrong numbers. It is incorrect.
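To make that failure mode concrete, here is a minimal Python sketch of how a join in a transform step can silently count something twice. The tables, field names, and numbers are all invented for illustration; they are not taken from the system described above.

```python
# Hypothetical source data: two applicants, one of whom has two addresses on file.
applicants = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bo"},
]
addresses = [
    {"applicant_id": 1, "city": "Aarhus"},
    {"applicant_id": 1, "city": "Odense"},  # Ann moved; both rows were kept
    {"applicant_id": 2, "city": "Aalborg"},
]


def join(applicants, addresses):
    """Inner join on applicant id, as a transform step might do it."""
    return [
        {**applicant, **address}
        for applicant in applicants
        for address in addresses
        if address["applicant_id"] == applicant["id"]
    ]


rows = join(applicants, addresses)

# Counting joined rows counts Ann twice. Nothing crashes; the number is just wrong.
naive_count = len(rows)

# Counting distinct applicant ids gives the number the report actually needs.
distinct_count = len({row["id"] for row in rows})
```

Both counts are computed without any error or warning, which is why this kind of defect can survive in production for months.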
Correctness is not all #
While I've done my share of prototyping, I've spent a great deal of my career working with software where correctness was essential, to such a degree that I'd internalized the emphasis on correctness as an axiom. Of course, software has to be correct.
I was, therefore, taken aback when Dan North in a private conversation challenged me. In his experience, non-technical stakeholders often don't realize what they actually want. Of course you can't pin down what is correct if you don't even know what the system is going to do.
It's always valuable to have your assumptions challenged. Dan is right. If you don't know how the system is going to work, insisting that it's correct is going to get you nowhere.
Once you start looking at the world through that lens, there are plenty of examples. This is what drives the whole practice of A/B testing: You may care about some KPIs, but at the outset, you don't know what it will take to get satisfactory results. Most likely, you don't even know what is possible, but only that you wish to maximize or minimize some KPI.
You can approach such problems in epistemologically sound ways, by proposing a falsifiable hypothesis that the KPI of interest is going to change by at least a certain amount if you conduct a particular experiment.
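As a rough sketch of what such a hypothesis could look like in code, assume the KPI is a conversion rate and that the experiment compares a control group with a variant. The function names, the 95% level, and the normal approximation are my assumptions for illustration, not a prescription for how to analyse a real experiment.

```python
import math


def difference_interval(conv_a, n_a, conv_b, n_b, z=1.96):
    """Approximate 95% confidence interval for the difference in conversion
    rates (variant minus control), using the normal approximation."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff - z * se, diff + z * se


def is_falsified(conv_a, n_a, conv_b, n_b, min_lift):
    """Treat the hypothesis 'the variant lifts the rate by at least min_lift'
    as falsified when even the interval's upper bound falls short of min_lift."""
    _, upper = difference_interval(conv_a, n_a, conv_b, n_b)
    return upper < min_lift
```

The point of stating `min_lift` before running the experiment is that the data can then contradict you. A hypothesis that no outcome could refute tells you nothing.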
Seeing the wisdom in Dan's words, I spent a significant period readjusting my view. Indeed, correctness is not all.
Not all software is like big tech #
If you're selling subscriptions or ads, and your main goal is to keep users maximally engaged, correctness does, indeed, seem irrelevant. The goal is no longer to present users with 'correct' content, but rather with content that keeps them on your property.
A similar mechanism applies to market platforms that have something to sell: not only obvious web shops like Amazon, but also marketplaces like Airbnb. The goal of companies is to maximize profits, and as a former economist, I have no moral problem with that. The implication, however, is that if you can make a better profit by showing users something other than what they asked for, you'll do that. There's nothing new in this. Physical stores do that, too, and also did so half a century ago: Perhaps present the customer with what he or she asked for, but also offer alternatives, add-ons, etc.
Which companies specifically qualify as 'big tech' changes over time, but the original FANG quartet all operated in this realm of figuring out things as they went.
For the last decade and a half, such companies have had a significant impact on software-design thinking. Technologists from big-tech companies have been prolific in sharing how they do things. As Hillel Wayne observed, thought leaders are those who share their experiences with the rest of us. Most professional technologists don't.
Even though "move fast and break things" is no longer a motto, the mindset lingers. I regularly listen to a tech podcast where a recurring jest is that testing is only done in production. As the hosts snicker, I grind my teeth.
Because such big-tech messaging is as loud as it is, it's easy to forget that there are plenty of software contexts where correctness is important.
The price is right #
I spent my initial years as a programmer developing 'e-commerce sites' (i.e. web shops) for various companies in Denmark. In one case, a technical customer representative spent days with me, painstakingly going over each line of my C++ component to make sure that it calculated all prices and discounts correctly. This was a business-to-business sales system, and discount policies were complicated. The company had long-standing relationships with its customers, and couldn't risk jeopardising trust by miscalculating discounts.
Working on another code base, for another client, I one day received a call from the customer: "Why are all prices on the site zero?"
Fortunately, it was 'only' on the staging environment, so we managed to fix the issue before the real site started giving away everything for free.
In a third incident, a client had hired a third-party white-hat security company to perform a penetration test of the system. One issue they found was that we were using URL parameters to transport product prices from one page to the next. Before you judge me, this was more than twenty years ago, and I didn't know what I was doing. Most of us didn't. The issue, though, was that since URL parameters indicated prices, anyone could edit the URL to give themselves a nice discount. In fact, we didn't even check whether the number was positive.
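The flaw, and one common way to avoid it, can be sketched in a few lines of Python. The URL, the parameter names, and the `CATALOGUE` dictionary are hypothetical stand-ins; the original system was not written in Python, and this is not its code.

```python
from decimal import Decimal
from urllib.parse import parse_qs, urlparse

# Hypothetical server-side price list; in a real shop this would be a database.
CATALOGUE = {"sku-42": Decimal("199.95")}


def price_from_url(url):
    """The flawed approach: trust whatever price the URL carries."""
    query = parse_qs(urlparse(url).query)
    return Decimal(query["price"][0])


def price_from_catalogue(url):
    """Take only the product id from the URL and look the price up on the server."""
    query = parse_qs(urlparse(url).query)
    return CATALOGUE[query["product"][0]]


# A customer edits the URL and awards themselves a negative price.
tampered = "https://shop.example/basket?product=sku-42&price=-10.00"
```

With the tampered URL, `price_from_url` happily returns a negative amount, while `price_from_catalogue` ignores the `price` parameter altogether and returns the price the merchant actually set.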
If you ask an internet merchant, you'll find that he or she finds it important that prices are correct, even if the site also comes with various A/B-test-driven features for cross- and up-selling.
In general, it turns out that if you're dealing with money, correctness is of the utmost importance. This includes not only prices, but investments, pension funds, interest calculations, and taxes (although the new Danish property valuation system may prove to be a counter-argument).
Software that handles money is far from the only example where correctness is important.
Science #
It should be uncontroversial that software related to empirical sciences (physics, chemistry, astronomy, etc.) needs to be correct to be useful. By extension, this also pertains to applied sciences. If you wish to insert a spacecraft into a Mars orbit, you should make sure to use correct units for calculation.
If you drive a car with a digital operating system, you'd prefer that it doesn't accelerate when you step on the brakes.
Not all examples have to be negative. The Copenhagen Metro is a driverless train system controlled by a fully automated computer system, and so far, it has worked pretty well. My point isn't that computerized transportation systems are doomed to fail. Rather, the point is that correctness is important because people may be hurt if things go wrong.
Medicine #
Is correctness important in medical software? Do I even have to argue the case? Would you like to receive radiation therapy from a machine with a race condition?
If you have a system that calculates medicine doses, does correct unit conversion sound like something that should be a priority?
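One way to give unit conversion that priority is to stop passing bare numbers around. The following Python sketch stores every mass in a single canonical unit, so that a milligram value can't be mistaken for a microgram value. The `Mass` type and its method names are invented for this example, and a real dosing system would need far more than this.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Mass:
    """A mass stored in one canonical unit (micrograms)."""
    micrograms: Fraction

    @staticmethod
    def from_mg(value):
        return Mass(Fraction(value) * 1000)

    @staticmethod
    def from_mcg(value):
        return Mass(Fraction(value))

    def __add__(self, other):
        if not isinstance(other, Mass):
            return NotImplemented  # refuse to add a bare number of unknown unit
        return Mass(self.micrograms + other.micrograms)

    @property
    def mg(self):
        return self.micrograms / 1000


# 5 mg plus 250 micrograms; the conversion happens once, at the boundary.
total = Mass.from_mg(5) + Mass.from_mcg(250)
```

Adding a plain number, as in `Mass.from_mg(5) + 250`, raises a `TypeError` instead of silently guessing the unit.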
Law #
Law is a funny discipline, because normally it is categorised as a social science. Even so, I think it has something in common with formal science, particularly computer science. In a sense, we may think of it as an axiomatic system of rules, although the 'axioms' are ad-hoc creations by a 'legislature', and we call them 'laws'. In a sense, a law is a little like a computer program, where we call the execution environment a 'court'.
This comparison may, perhaps, be too far out. The topic of this essay, anyway, is correctness, not philosophy of science. Can we think of legal computer systems where correctness is of the essence? How about a software-based cadastre? A civil registration system that keeps track of civil status, inheritance relationships, etc.? Prison parole management systems?
Security #
Software security is a part of many systems. If it doesn't work correctly, it's hardly useful. A system should prevent users from reading other users' data, unless they have explicitly been granted access, in which case it may be just as important that access indeed works.
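Both halves of that requirement can be stated in code. This is a deliberately small Python sketch with a hypothetical `Document` type and `read` function; real authorization involves much more, but the two cases are the same.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    owner: str
    body: str
    shared_with: set = field(default_factory=set)  # users explicitly granted access


def read(document, user):
    """Return the body only to the owner or to users with an explicit grant."""
    if user == document.owner or user in document.shared_with:
        return document.body
    raise PermissionError(f"{user} may not read this document")
```

A correct implementation has to pass two kinds of test: one that shows a stranger is denied, and one that shows a user with a grant actually gets through.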
One problem that is especially pertinent with software security is how to prove the absence of a flaw. To some degree, this is a problem that also plagues other aspects of software development, not to mention epistemology in general. Even so, when testing software features, a comprehensive battery of tests can often convince you that a feature works correctly. With security, however, the stakes are different. All it takes is one flaw somewhere, and the wrong person has access to the wrong data.
Security appears to me as an area where black-box testing is insufficient; where careful inspection of code aids in fostering correctness.
State of affairs #
I could keep going. How about military applications? Robots? The point is that there's no lack of software where correctness is an important part. These are systems where it would be irresponsible, and often illegal, to test in production.
Despite the dominance of the move-fast-and-break-things mindset in recent thought leadership, I posit that correctness plays a significant role in a substantial part of the overall software ecosystem. Perhaps it's not in the majority, but neither does it make up a vanishingly small part. I imagine that it presently looks like this, where the green partition represents software where correctness is important:
While the current trends in LLM-generated code indicate an exponential increase in future software, it's not clear to me how correctness will be addressed without human skin in the game. I grant that it looks as though LLMs are good at producing certain kinds of software much faster than human programmers. If life or limb is on the line, however, are we ready to trust systems thus created?
Regardless of the answer, I don't think software where correctness is important is going to disappear. Perhaps its percentage dwindles as LLMs create more software, but that may rather be an effect of having a much bigger cake.
This figure is hardly to scale. In reality, we may easily see an exponential explosion in the amount of software created by LLMs, while the scale of correctness-oriented software may stay comparable to today.
Conclusion #
I expect that people and organizations will attempt to develop correctness-oriented software with LLMs. While I find that ill-advised, I'm certain that this will happen. I also expect that some such systems will turn out to be unreliable, and unfortunately, lives and fortunes will be lost. I hope I'm wrong.