
Still Racist After All These Datasets: Once Bias Is Baked Into Your AI, It’s Hard to Root Out


Spending on artificial intelligence across all sectors is expected to more than double by 2025, but Nigel Cannings of Intelligent Voice is here to throw some cold water on that hot trend — and bring a bit of well-earned wisdom.

Artificial intelligence has, justifiably, been embraced as a labor-saving productivity tool across all sectors of business, and in compliance it is proving invaluable, improving operational efficiency while reducing expenses. But it’s no secret that AI is only as good as the information you use to train it, and it can easily become subject to bias. 

All AI systems can absorb the influences not only of the people who create them but also of those who collected the data that powers them, leading to flawed systems rife with unintentional biases that skew data sets and degrade performance.

There have even been cases where the unintended inbuilt biases of a business's AI systems have landed it in hot water, raising claims of racism and sexism in recruitment; Amazon is one of the most notable cases. Recent research from the University of Cambridge likens the use of AI in recruitment, for example, to “automated pseudoscience.”

But why does bias in AI really matter? How can AI bias impact compliance? And what can be done to debias AI? 

Why does bias in AI matter?

AI bias matters for a variety of reasons. I’ve already touched upon the potential for brand reputational damage, but when a system’s training data carries the pre-existing biases of the people and processes that generated it, there is a range of other potential fallouts, depending on the sector involved.

There are ethical concerns: One area impacted has been in the medical sector. The Babylon healthcare app was accused of bias when its symptom checker said the most probable cause of a 59-year-old female’s central chest pain was a panic attack. The same app would have referred a male patient with the same symptoms for immediate medical attention with a suspected heart attack. The system relied on historically biased diagnoses that suggested women were prone to hysteria and perpetuated the myth that women do not suffer the same symptoms of heart attacks as men, something I was personally set straight on by my own daughters. Almost any sector can run afoul of ethical considerations, from giving biased career advice to inappropriate marketing. 

There are operational concerns: If the data used to produce or market a product is skewed, your goods will be improperly targeted, which, in turn, impacts sales and revenue. 

There is the potential for customer alienation: When your brand is known for any kind of bias, customers run away in droves. Likewise, if your marketing material is poorly targeted because of AI system bias, your customers will move away, seeking a brand that better understands their wants and needs. 

And there is the matter of compliance: If your risk perception decision-making data comes from a biased source, is your sample group more likely to focus on particular ethnicities, age groups or genders? Will your compliance monitoring home in on specific sectors of your workforce? Will other potential risks go unchecked because they fall into a blind spot?

There’s no arguing against the negative impact of bias in AI, but how can it be addressed?

Why is there bias in AI?

Computers cannot think like humans. There is no intuition in AI. Artificial intelligence is based solely on training. AI decision-making is based upon the information and trends input into the AI system by its human creators; thus, if the data fed into that system is biased, even unintentionally so, the AI decision-making will be biased too. 

Therefore, the problem of removing bias from AI comes down to the quality of the data. “Intelligence” is a very dangerous term to apply in these circumstances. All AI really does is generate an enormous statistical model based on the historical data it is trained on, so its predictions are always based on what has been seen before.

If you train a weather-prediction model on data showing that certain events happen only once every hundred years, you cannot then expect it to predict those events happening every week. It is a trite example, but it demonstrates that these machines are driven by statistics based on history, not by an “intelligence” that can observe, generalize and theorize about unforeseen and potentially unforeseeable events.
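
To make that concrete, here is a deliberately trivial sketch (invented hiring records and a toy frequency “model,” nothing like any production system) of what “statistics based on history” means: the predictor can only echo the most common outcome it has already seen for each group.

    from collections import Counter, defaultdict

    # Invented historical records: (group, outcome). The history is skewed.
    history = [
        ("engineering", "hired"), ("engineering", "hired"), ("engineering", "rejected"),
        ("support", "rejected"), ("support", "rejected"), ("support", "hired"),
    ]

    # "Training" here is nothing more than counting what happened before.
    counts = defaultdict(Counter)
    for group, outcome in history:
        counts[group][outcome] += 1

    def predict(group):
        # The model can only repeat the majority outcome from its history.
        return counts[group].most_common(1)[0][0]

    print(predict("engineering"))  # echoes the historical majority: "hired"
    print(predict("support"))      # echoes the historical majority: "rejected"

However sophisticated the real model, the same principle applies: if the history is skewed, the predictions are skewed.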



How can we remove the bias from AI?

Practically speaking, it is impossible to completely de-bias AI systems, but we can try. We’ve already established that biased training data leads to poor conclusions, which prevents AI systems from being a panacea that removes the human from the loop in the compliance sector. But how can we remove that potential for bias?

AI training begins with labeled data: data that a human being has classified as belonging to a particular category. This approach underpins almost every AI system, whether the training example is an X-ray showing a cancerous growth or the CV of a successful job applicant.
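
As a minimal sketch of what that labeling looks like in practice (using scikit-learn and made-up CV snippets, purely for illustration), note that the model learns nothing beyond the mapping implied by the human-assigned labels:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # Made-up CV snippets, each one labeled by a human reviewer.
    texts = [
        "10 years compliance experience, led audit team",
        "captain of the chess club, no work history",
        "built fraud-monitoring tools for a large bank",
        "career change from retail, short employment gaps",
    ]
    labels = ["shortlist", "reject", "shortlist", "reject"]

    # The classifier simply learns to reproduce the human labelers' choices.
    vectorizer = CountVectorizer().fit(texts)
    model = LogisticRegression().fit(vectorizer.transform(texts), labels)

    print(model.predict(vectorizer.transform(["led a compliance audit team"])))

Whatever prejudices, conscious or not, shaped those labels are now part of the model.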

The system creator or data collector labels each piece of training data for the AI system to learn from. But there is so much potential training data that not all of it can be human-categorized. Instead, “self-supervised” training models are used, whereby the system labels new data using the data it has already processed as a guide. This means that if you start with a biased data set, you create an imbalanced baseline for future “self-supervised” training, and the bias is amplified in the systems on which decisions will be formed.
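
A rough sketch of how that imbalance compounds, again with invented snippets: a model trained on a skewed human-labeled seed set generates “pseudo-labels” for new data, and those machine-made labels are folded back into the training set.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    # A tiny, skewed, human-labeled seed set (invented for illustration).
    seed_texts = ["male engineer, top university", "female teacher, career break"]
    seed_labels = ["shortlist", "reject"]

    # Unlabeled data the system will now label for itself.
    unlabeled = ["female engineer, top university", "male teacher, career break"]

    vectorizer = CountVectorizer().fit(seed_texts + unlabeled)
    model = LogisticRegression().fit(vectorizer.transform(seed_texts), seed_labels)

    # Self-labeling step: the model's own guesses become new "ground truth",
    # so whatever bias was in the seed set is baked into the expanded data.
    pseudo_labels = list(model.predict(vectorizer.transform(unlabeled)))
    model = LogisticRegression().fit(
        vectorizer.transform(seed_texts + unlabeled), seed_labels + pseudo_labels
    )

No new human judgment enters the loop; the original baseline, balanced or not, is simply amplified.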

So, we have to start looking at obvious steps to de-bias our initial training data. Anonymization is important: we might exclude names, for example, which are often gender-specific. Race, addresses, age: all of these can also introduce bias. But have we then gone too far? For some use cases, like medicine, age may be important, as it affects possible health risk factors.

The answer is that we have to look at the decisions we are trying to reach and determine where bias might lurk in the data, so we can work out what we should and should not include.
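
A small pandas sketch of that trade-off, with invented column names: the same table might be stripped of different fields depending on the decision the model is meant to support.

    import pandas as pd

    # Invented applicant records; the columns are illustrative only.
    applicants = pd.DataFrame({
        "name": ["A. Jones", "B. Okafor"],
        "age": [34, 59],
        "postcode": ["SW1A 1AA", "M1 1AE"],
        "years_experience": [8, 20],
        "assessment_score": [72, 81],
    })

    # Fields that can act as direct or proxy markers for protected traits.
    sensitive = ["name", "age", "postcode"]

    # For a recruitment model we might drop all of them...
    recruitment_features = applicants.drop(columns=sensitive)

    # ...but for a medical triage model, age is clinically relevant,
    # so only the fields irrelevant to that decision are removed.
    triage_features = applicants.drop(columns=["name", "postcode"])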

Building explainability into AI

To do this, we have to have tools that enable us to answer why a system is making the decisions that it is. We need the ability to check and double-check the reasoning behind each decision. Transparency is a major concern in AI. And to generate the transparency necessary to build trust in artificial intelligence systems, as well as to remove bias, we need to be able to explain any given decision-making process. And this is where we’re seeing a move into eXplainable AI (XAI). 

XAI models allow users to ask a network why it made a given decision. They reveal the parts of the input data that had the greatest influence on the outcome, explaining why a person or object was classified in a particular way. This matters as AI is rolled out into increasingly sensitive areas. When the UK’s Department for Work and Pensions (DWP) announced it would use machine-learning algorithms to decide whether people should receive Universal Credit, it was met with a wave of criticism because of significant concerns about what this might mean for a claimant’s ability to challenge decisions. While the system has not yet been brought online, there are already questions about the potential for systematic bias and the lack of transparency. XAI models have the potential to resolve these issues.
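
As one illustration of the kind of question XAI tooling tries to answer (this is not how the DWP system works, and permutation importance is only a simple, global technique; per-decision attribution methods go further), here is a scikit-learn sketch on synthetic data that asks which features the model’s decisions depend on most:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance

    # Synthetic stand-in data; in a real system these would be claim or
    # application features, and the names below are invented.
    X, y = make_classification(n_samples=500, n_features=5, random_state=0)
    feature_names = ["income", "age", "dependants", "savings", "postcode_band"]

    model = RandomForestClassifier(random_state=0).fit(X, y)

    # Shuffle each feature in turn and measure how much the model's score
    # drops: the features whose shuffling hurts most are driving decisions.
    result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
    ranked = sorted(zip(feature_names, result.importances_mean), key=lambda p: -p[1])
    for name, score in ranked:
        print(f"{name}: {score:.3f}")

If a proxy for a protected characteristic sits at the top of that ranking, you have both an explanation and a warning.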

This is also a matter of compliance. GDPR, the EU’s data protection regulation, gives people the right to have automated decisions explained to them; a failure to do so is a failure in compliance. And this doesn’t just apply to the decision-making process itself: it extends to communicating the scope of an AI system, what data it uses and how it uses it.

Artificial intelligence is an incredibly exciting field. Its potential is enormous, and its value will only increase as the technology evolves. But bias could prove to be AI’s greatest impediment, preventing it from reaching its full promise. If we don’t put checks in place now to prevent bias in AI and monitor its presence in AI systems, there is a very real threat of literal systemic racism, sexism and every other form of discrimination, creating a scenario that very few of us would ever wish to encounter.


