
Blue Sky Programming – The Optimism Trap


Many years ago, before I joined Oracle, I was working on a major modernisation project. We were replacing an existing non-Oracle system with an entirely new Oracle database application written from scratch. Not long after deploying a new version into our test environment, the results came back and a large number of tests had failed.

I sat down with one of the subject matter experts, a long-serving employee who had helped build the original system. As we worked through the failures, he looked at me and said: “The problem here is blue sky programming.” I’d never heard the expression before.

“What do you mean?” I asked, and he replied:

“Developers write software as though the sky is always blue. They assume everything will go right. They assume the environment behaves perfectly, the user follows the intended workflow, and the data always looks the way they expect. They write for the happy path.”

That conversation changed the way I think about software engineering.

The One-Horse Race

The application managed betting on horse races. The subject matter expert pointed to one of the failing tests. “You know that bets are paid on first, second and third place?”

“Yes,” I replied, although I wasn’t really much of a gambler.

“But what happens if there are only two horses in the race?”

Unbeknown to me, some races have only enough runners to pay first and second. Our code had assumed there would always be three placegetters. Then he gave an even stranger example.

“Did you know there are races with only one horse?”

Apparently, several horses are sometimes withdrawn immediately before a race due to injury or illness, leaving a single runner. Racing rules in some regions still required that horse to actually complete the course before bets could be paid. The jockey literally trots the horse around the track to officially finish the race.

Then of course came the obvious question: “Does your program handle that?” It didn’t. In fact, he’d already found the test cases where my application completely failed because it couldn’t cope with missing second and third place finishers. Somewhere in the code I’d implicitly assumed there must always be multiple horses. This is an example of blue sky programming.
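To make the idea concrete, here is a minimal Python sketch of place handling that does not assume three finishers. The function name `placings`, the labels, and the rule that the number of paying places is capped by the number of runners are all assumptions for illustration; real racing rules vary by region and are not taken from the original system.

```python
from typing import Dict, List

PLACE_LABELS = ("first", "second", "third")


def placings(finishers: List[str], num_runners: int) -> Dict[str, str]:
    """Map each paying place to a horse without assuming three finishers.

    Hypothetical rule: a race pays at most three places, and never more
    places than there were runners. A horse that did not complete the
    course is simply absent from `finishers`.
    """
    if num_runners < 1:
        raise ValueError("a race needs at least one runner")
    paying_places = min(num_runners, len(PLACE_LABELS))
    # zip stops at the shorter input, so missing finishers yield fewer places
    return dict(zip(PLACE_LABELS[:paying_places], finishers))


full_field = placings(["Comet", "Blaze", "Dusty", "Echo"], num_runners=4)
two_horses = placings(["Comet", "Blaze"], num_runners=2)
one_horse = placings(["Comet"], num_runners=1)
did_not_finish = placings([], num_runners=1)  # lone runner never completed the course
```

The point of the sketch is that the two-horse, one-horse and no-finisher cases fall out of the same code path instead of needing a crash to reveal them.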

Not Just Edge Cases

You’re probably thinking (like I did) that every application will hit the occasional edge case. I argued that these seemed like incredibly niche situations. He conceded the point, but added: “It’s not the unusual cases that break software most often. The biggest problem is workflow assumptions.”

As developers we assume that users will follow a “natural” sequence of steps to use our applications. We assume they’ll do “A” then “B” then “C”.

We are too optimistic about how we think our programs will be used.

Real users don’t behave like that. They jump around. They go to C first. They change their minds. They enter data in unexpected orders. They press Back. They refresh the page. They open multiple browser tabs. When software assumes everyone follows the “natural” workflow, it often behaves unpredictably the moment someone doesn’t. That’s blue sky programming too.
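One way to avoid relying on the “natural” order is for every step to check its own prerequisites rather than trusting that the earlier steps have run. The sketch below is a hypothetical illustration: the step names, the `PREREQUISITES` table and the `Session` class are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical three-step workflow: each step lists the steps it depends on.
PREREQUISITES: Dict[str, Set[str]] = {"A": set(), "B": {"A"}, "C": {"A", "B"}}


@dataclass
class Session:
    completed: Set[str] = field(default_factory=set)

    def attempt(self, step: str) -> str:
        """Run a step in whatever order the user chooses, failing politely."""
        if step not in PREREQUISITES:
            return f"Unknown step {step!r}."
        missing = sorted(PREREQUISITES[step] - self.completed)
        if missing:
            return f"Please complete step {', '.join(missing)} before step {step}."
        # Adding to a set means a repeated step (Back, refresh) is harmless
        self.completed.add(step)
        return f"Step {step} done."


session = Session()
jumped_ahead = session.attempt("C")   # user goes to C first
first_try = session.attempt("A")
refreshed = session.attempt("A")      # user refreshes the page
session.attempt("B")
finished = session.attempt("C")
```

Nothing here stops a user jumping around. It just means the application has a defined, explainable answer when they do.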

Blue Sky Programming is a Growth Market

I regularly notice this pattern almost everywhere. Recently my son enrolled in an online course before deciding he’d picked the wrong one. There was no Unenrol button. The designers had assumed that once someone clicked Enrol, they’d never change their mind. Instead we had to raise a support request. Days later an administrator manually removed my son from the course. Not because the operation was technically difficult. Simply because nobody imagined a user wanting to undo a perfectly reasonable action.

Another example from the same site: it asked him to select two reasons for taking a course.

Both dropdown lists contained the same reasons, so naturally I told him to select the same value in both to see what would happen. Instead of a friendly validation message, the application crashed with a database duplicate constraint violation. Someone had assumed no user would ever choose the same option twice.
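A check like this belongs in the application, before the data reaches the database constraint. Here is a minimal Python sketch, assuming the two dropdown values arrive as a list of strings; the function name and the messages are hypothetical.

```python
from typing import List, Sequence


def validate_reasons(selected: Sequence[str], required: int = 2) -> List[str]:
    """Return user-facing error messages; an empty list means the input is fine."""
    chosen = [value.strip() for value in selected if value and value.strip()]
    if len(chosen) < required:
        return [f"Please select {required} reasons."]
    if len(set(chosen)) < len(chosen):
        # Caught here so the database constraint is never the first line of defence
        return ["Please choose a different reason in each list."]
    return []


same_twice = validate_reasons(["Career change", "Career change"])
one_blank = validate_reasons(["Career change", ""])
valid = validate_reasons(["Career change", "Personal interest"])
```

The constraint in the database should stay, since it protects the data. The validation exists so that a user never has to see it fire.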

Or consider this online shopping example from my local supermarket chain.

The retailer site asked whether I wanted delivery or would collect the item myself. The instructions said: “If you’re collecting the item, simply continue to the next screen.” Unfortunately, every delivery field on that screen was marked mandatory.
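The fix is to make the mandatory fields depend on the choice the user actually made. A small Python sketch, with hypothetical field names (`method`, `street`, `city`, `postcode`) standing in for whatever the real form used:

```python
from typing import Dict, List

# Hypothetical field names; only required when the user chooses delivery.
DELIVERY_FIELDS = ("street", "city", "postcode")


def missing_fields(form: Dict[str, str]) -> List[str]:
    """List the fields the user still has to fill in, given their choice."""
    method = form.get("method", "")
    if method not in ("delivery", "collect"):
        return ["method"]
    if method == "collect":
        return []  # collecting in store: no delivery field is mandatory
    return [name for name in DELIVERY_FIELDS if not form.get(name, "").strip()]


collecting = missing_fields({"method": "collect"})
partial_delivery = missing_fields({"method": "delivery", "street": "1 High St"})
no_choice = missing_fields({})
```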

I’m certainly not immune to this flaw in programming. Years ago, before cloud authentication and stored credentials were common, I wrote a shell script that prompted users for their Oracle username and password, which then connected to the database to run a utility for them.

I tested a reasonable set of scenarios – empty passwords, special characters etc. – and deployed it with firm confidence that it was robust. Of course a user broke it on the very first day in production. I can’t remember what I had forgotten to check, but my testing clearly hadn’t reflected the real world. Users have an incredible ability to find assumptions you never realised you’d made.

The Minimum Viable Product Trap

I think one reason for the growth of blue sky programming is a misunderstanding of what “Minimum Viable Product” actually means. An MVP doesn’t have to include every feature. But every feature it does include should provide a good user experience. It shouldn’t assume users always make perfect or even wise choices. You can cut corners on the number of features, but not on making the features you do ship robust and pleasant to use.

Avoiding Blue Sky Programming

You’ll never predict every possible failure, but you should still be focussed on designing for graceful failure. For example, suppose your application updates a database row. A blue sky implementation simply performs the update and waits indefinitely if another session happens to already hold a lock on that row. Eventually the application times out and the user sees a meaningless “Request failed” message.

A better design applies a sensible timeout. If the lock isn’t released, the application can explain what’s happening, perhaps along the lines of: “Another user is currently editing this record. Please try again in a few moments.”
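As a rough illustration of the timeout approach, the Python sketch below uses the standard library’s `sqlite3` module as a stand-in. That is an assumption made so the example is runnable anywhere: SQLite locks the whole database file rather than a single row, so this only mimics the row-lock scenario. In Oracle the comparable idea is a bounded wait on the row lock, for example `SELECT ... FOR UPDATE WAIT n`. The table name, function names and wait time are hypothetical.

```python
import os
import sqlite3
import tempfile

FRIENDLY = ("Another user is currently editing this record. "
            "Please try again in a few moments.")


def rename_record(db_path: str, record_id: int, new_name: str,
                  wait_seconds: float = 0.2) -> str:
    """Attempt the update, waiting at most wait_seconds for a competing lock."""
    conn = sqlite3.connect(db_path, timeout=wait_seconds)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("UPDATE records SET name = ? WHERE id = ?",
                         (new_name, record_id))
        return "Saved."
    except sqlite3.OperationalError as exc:
        if "locked" in str(exc).lower():
            return FRIENDLY  # explain the situation instead of "Request failed"
        raise
    finally:
        conn.close()


def demo() -> list:
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")
    setup.execute("INSERT INTO records VALUES (1, 'old')")
    setup.commit()
    setup.close()

    # A second session takes the write lock and holds it
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("BEGIN IMMEDIATE")
    while_locked = rename_record(path, 1, "new")
    holder.execute("ROLLBACK")
    holder.close()

    after_release = rename_record(path, 1, "new")
    return [while_locked, after_release]


results = demo()
```

The update still fails while the other session holds the lock, but it fails quickly and with a message the user can act on.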

The operation still fails, but the experience is dramatically better.

Sometimes the solution is even simpler. With a lot of software nowadays, you upgrade by simply “installing” the new version. I’ve lost track of the number of Google searches I’ve done to verify that a new install will not wipe my existing settings.

Why not display a reassuring message? “If you’re upgrading an existing installation, your settings and configuration will be preserved.” That single sentence removes uncertainty before the user even has a chance to worry.

AI Makes This Even More Important

Ironically, the rise of AI-generated code makes blue sky programming even more relevant. Large language models have been trained primarily on successful examples of software. They’ve seen millions of examples showing how code is supposed to work. They’ve seen far fewer examples of the strange, awkward, real-world situations where software breaks. Humans rarely publish stories about every obscure failure they’ve encountered. We publish the elegant solution. That means AI is exceptionally good at generating the happy path.

It still takes human experience to ask questions like “What if there is only one horse?”

The sky isn’t always blue. Please make sure your software doesn’t assume it is.
