The topic of AI bias is getting a lot of airtime at the moment. As it should. There have been a number of embarrassing public blunders and with the influence of AI on real-world decisions spreading, no one wants to see mistakes repeated at larger scale. But why does bias matter so much? How does it get into an AI? What are people, including us at Nuon, doing to avoid it? And, what is AI bias anyway?
Microsoft suffered a public AI embarrassment with Tay in 2016. Tay was an AI chatbot that interacted with people via Twitter and was designed to mimic the language patterns of a 19-year-old American girl. However, Tay had a self-learning function that was, in Microsoft’s words, “attacked by trolls”, who taught it to use offensive and inflammatory language. She didn’t stay online for very long.
Amazon employs a lot of people. In 2014 they decided to use an AI in the selection process. They trained their AI with details of all their previous hires in the hope that it would be able to spot good future hires. In 2015 they realised that the AI had a selection bias against women. In response, they stopped presenting the system with gender-specific data. However, when it continued to exhibit bias, they realised that it was now relying on phrases like “women’s chess club” as a proxy for gender. The system was scrapped.
MIT created a huge collection of more than 80 million labelled images of people and objects to train AI systems. An image of a park might, for example, have labels like: “children”, “play”, “ball”, “grass”. The labelling was done by humans, though, and human bias (for want of a stronger word) crept in, showing up as labels like “whore” attached to images of women. MIT withdrew the dataset.
What is AI bias?
This should be easy. It seems blindingly obvious. The dictionary has it as: noun: inclination or prejudice for or against one person or group, especially in a way considered to be unfair. Still seems obvious. But, scratch the surface and, like so many things, you find a world of complexity.
Psychologists have categorised more than 180 individual cognitive biases. That sounds like a lot, more than you’d expect. Maybe you were thinking more of prejudices? There are fewer of them: sexism, ageism, racism, lookism, classism, and religious, neurological, and linguistic discrimination. That is a more manageable list, but prejudices are only one flavour of bias, and we can’t pick and choose which biases we carry.
Also, that simple-looking dictionary definition hides another world of complexity in one word: “unfair”. For example: is it fair that the same proportion of black and white individuals should get high-risk assessment scores? Or that the same level of risk should result in the same score regardless of race? If the underlying rates of reoffending differ between the two groups, these two notions of fairness are mathematically incompatible: both can’t be true at the same time.
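To see why, here is a toy numeric sketch (the numbers are invented for illustration, not drawn from any real risk-assessment data). Two groups with different base rates are scored so that precision and true-positive rate come out equal, and the false-positive rates are then forced apart:

```python
# Toy illustration of incompatible fairness metrics (hypothetical numbers).

def rates(tp, fp, fn, tn):
    """Return (precision, false-positive rate, true-positive rate)."""
    ppv = tp / (tp + fp)  # of those flagged high-risk, how many reoffend
    fpr = fp / (fp + tn)  # of those who don't reoffend, how many were flagged
    tpr = tp / (tp + fn)  # of those who reoffend, how many were flagged
    return ppv, fpr, tpr

# Group A: 100 people, 50 of whom reoffend (base rate 50%)
ppv_a, fpr_a, tpr_a = rates(tp=40, fp=10, fn=10, tn=40)

# Group B: 100 people, 20 of whom reoffend (base rate 20%)
ppv_b, fpr_b, tpr_b = rates(tp=16, fp=4, fn=4, tn=76)

print(ppv_a, ppv_b)  # 0.8 and 0.8  -- equally "fair" by precision
print(fpr_a, fpr_b)  # 0.2 and 0.05 -- group A's non-reoffenders are flagged 4x as often
```

The scores are equally trustworthy for both groups (a flagged person has an 80% chance of reoffending either way), yet innocent members of group A are wrongly flagged four times as often. Which of those two numbers you equalise is a value judgement, not a technical one.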
How does bias get into AI?
Okay, so, we’ve failed to come up with a concise definition of bias, and this is meant to be a blog post not a book, so let’s put that to one side and consider the next question: How does bias get into an AI? That’s a much easier question to answer: if humans exhibit 180 types of bias, then there’s a pretty good chance that software will too.
Most software developers are human (I count some of them as friends, so I know). Data scientists, analysts, testers, even some DBAs, are also human and are therefore subject to bias. If they all exhibit bias, it’s no surprise that the systems which they create are biased. It would be more surprising if they weren’t.
I used the word software there on purpose, because AI is meant to be better isn’t it? Better because it isn’t hand-coded by a biased human? True, but, AIs typically learn from datasets, and datasets are typically created by humans. MIT’s 80 million image dataset is a perfect example. If you train an AI on data that was labelled by humans, you get an AI that encodes the biases of those humans.
Better labelling is one answer, but “better” is problematic. Labelling 80 million images is almost the definition of “laborious”. This kind of task is often performed using services like Amazon’s Mechanical Turk. If you’re a Mechanical Turk worker you’ll be sent a set of images to manually label. You label them and send them back to the client, and you get paid. Probably.
The client will run some analysis on the results and exclude results from workers who deviate from the norm. Those workers don’t get paid, and generally don’t get told why their work was refused, or have any right to appeal. How would you label images in that setup? You’d probably label them exactly as you think the client wants you to. You would conform to your perception of the client’s desired result and ignore your honest opinions.
What are we doing to fix AI bias?
All of that makes it sound like we’re beaten, that bias in AI is just inevitable and we have to live with it. That’s far from the case though, so what can we do?
Firstly, although it may sound trite: the first step to fixing a problem is realising that you have a problem. Everyone involved in building AI solutions needs to be aware of how their work, their approaches, and the resources they use are distorted by bias.
There are concepts like counterfactual fairness that can help. When you test your AI model, it can be said to be fair in a specific case if it gives the same result in the real world as it would in a counterfactual world where just one field, like gender, is different. Care is needed, of course: as Amazon found with their recruitment AI, proxy data has to be identified too.
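As a minimal sketch, a counterfactual check can be as simple as flipping the protected field and comparing outputs. Everything here is invented for illustration: the `score` model, the field names, and the tolerance are all stand-ins, and a check this naive is exactly what proxy features would defeat:

```python
# Hypothetical counterfactual fairness check: for each test record,
# vary one protected attribute and verify the model's output is unchanged.

def score(applicant):
    # Toy stand-in for a trained model: scores purely on experience.
    return applicant["years_experience"] * 10

def counterfactual_pairs(record, attr, values):
    """Yield copies of `record` with `attr` set to each value in `values`."""
    for v in values:
        c = dict(record)
        c[attr] = v
        yield c

def is_counterfactually_fair(model, record, attr, values, tol=0.0):
    """True if the model's score moves by at most `tol` across counterfactuals."""
    scores = [model(c) for c in counterfactual_pairs(record, attr, values)]
    return max(scores) - min(scores) <= tol

applicant = {"years_experience": 5, "gender": "female"}
print(is_counterfactually_fair(score, applicant, "gender", ["female", "male"]))
```

This toy model passes trivially because it never reads the gender field, which is precisely the trap: a model can ignore the protected attribute directly while still inferring it from proxies, so the counterfactual records need to vary the proxy fields too.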
Nuon’s AI products, at least currently, don’t rely on training datasets so we are freed from the risk of badly labelled data introducing bias. In the markets that we currently operate in, regulation prevents insurers from using data like gender or ethnicity in their rating calculations. So our AI would never see that data. However, neither of these is a get-out-of-jail-free card.
As our algorithm learns in real time by experiment, it has the potential to develop bias through proxy data. Continuous testing using well-conceived counterfactual datasets is some protection. But when those tests detect bias, what then? Rolling back to a previously good model will un-bias the results, but it is likely that the AI will re-learn them.
There is no simple answer to fixing AI bias, so we continue to work on improving. That’s our job.