
Why AI Needs User Stories Too

24 July, 2020 - 3 min read

In agile software development, keeping a strong focus on your end-user and their needs is sacred. More than anything else, you want to understand functional requirements, the motivations behind them and the implications they carry - preferably before writing a single line of code. While most software engineers tend to agree with this idea, many have an inherent tendency to stray from it. There's something strangely seductive about coding first and asking questions later, even though the potential negative consequences are well documented. If this habit is concerning in conventional software engineering, it's particularly so in applied machine learning.

Here's the thing: arguably, the main goal of any software project is to deliver meaningful functionality to end-users. That doesn't just mean building something impressive; it means creating something functional. For example, how many stunning web designs have failed because visitors couldn't understand them? How many apps have you stopped using because of friction in the user experience? Academic endeavours aside, machine learning should be no different: the most valuable output of any model is output that means something to an end-user. Yet many ML engineers spend just a fraction of their time thinking about the functional goal of their system, choosing instead to indulge in the latest multi-billion-parameter model released by Google, OpenAI and the like.

So how can you do better? The answer is pretty unoriginal: user stories. Much more than telling you what your model is supposed to do, they can also help you figure out why and how. For example, say we're building a model to recommend the ideal holiday destination. The most straightforward approach seems to be taking a bunch of travel histories and training a model to predict the next destination in each history (or the most highly rated one). However, naively executing this plan will leave us with an entirely sub-optimal experience. Let's try again with some simple user stories:

  • As a traveller, I want to discover new destinations through my recommendations.
  • As a traveller, I want to find options that fit my budget.
  • ...

These user stories are strongly connected to the success criteria of our system. For example, no one turns to an AI engine to find out they should go on holiday to the same destination they have visited five years in a row. And while a five-star resort in Hawaii does indeed look great to almost everyone, it's only a meaningful suggestion if you have the budget for it. User stories, along with success and failure scenarios, can help you define an evaluation task that corresponds to the value of outputs, rather than just loosely correlating with it.
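To make this a little more concrete, here is a minimal sketch of what such an evaluation could look like for the two user stories above. Everything in it is an assumption made for illustration: the `Traveller` and `Destination` types, the `story_scores` function, and the sample costs and budget are hypothetical and not taken from any real system. The idea is simply to score a list of recommendations per user story, rather than only checking whether the model predicted the next trip.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traveller:
    visited: frozenset  # names of destinations already visited (hypothetical field)
    budget: float       # maximum trip cost the traveller can afford (hypothetical field)


@dataclass(frozen=True)
class Destination:
    name: str
    cost: float  # estimated trip cost, in the same currency as the budget


def story_scores(traveller, recommendations):
    """Return the share of recommendations that satisfy each user story."""
    if not recommendations:
        return {"novelty": 0.0, "budget_fit": 0.0, "both": 0.0}
    n = len(recommendations)
    # Story 1: "I want to discover new destinations."
    novel = [d.name not in traveller.visited for d in recommendations]
    # Story 2: "I want to find options that fit my budget."
    affordable = [d.cost <= traveller.budget for d in recommendations]
    return {
        "novelty": sum(novel) / n,
        "budget_fit": sum(affordable) / n,
        "both": sum(a and b for a, b in zip(novel, affordable)) / n,
    }


# Illustrative data only: the costs and budget are made up.
traveller = Traveller(visited=frozenset({"Lisbon"}), budget=1500.0)
recs = [
    Destination("Lisbon", 900.0),          # affordable, but already visited
    Destination("Hawaii resort", 6000.0),  # new, but over budget
    Destination("Ljubljana", 800.0),       # new and affordable
    Destination("Porto", 1000.0),          # new and affordable
]
scores = story_scores(traveller, recs)
print(scores)  # {'novelty': 0.75, 'budget_fit': 0.75, 'both': 0.5}
```

A model that scores well on next-destination accuracy could still score poorly here, which is exactly the kind of gap the user stories are meant to expose. Real success criteria would likely need more nuance than two yes/no checks, but even a rough version like this ties the evaluation to what the traveller actually asked for.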

Often, describing a comprehensive suite of user expectations will come with the realisation that your setup is fundamentally flawed. Your training data might not cover the use case entirely, and even the best possible evaluation setup could still be biased. It's not easy to resolve these issues, and it will often be impossible with the resources at hand. Yet, getting these insights can help you build the right system, and despite the simplicity of that statement, that might just be the hardest thing of all.