Kathy Gibson is at Gartner Symposium in Cape Town – One of the best ways to design for the future is to look at the past – and we can learn how to apply artificial intelligence (AI) more effectively by examining high-profile failures.

Some of the things we expect AI to do are actually quite difficult, and don’t necessarily work as planned, says Alys Woodward, senior director analyst at Gartner.

Studying where AI has failed helps us to set the boundaries around where the technology can go, and the use cases in which it can be applied. So even the fails are successes, Woodward says.

One high-profile fail is the Chinese system that uses facial recognition to publicly name and shame jaywalkers – until a well-known businesswoman was incorrectly identified when her picture was picked up on the side of a bus.

Voice assistants can be hacked quite easily. In one instance, Alexa devices around the country ordered dollhouses when a TV anchor mentioned a story about Alexa ordering a dollhouse.

Even IBM Watson Oncology has come up short in the healthcare space. It was meant to synthesise enormous amounts of data to come up with novel insights, but the system gave unsafe, incorrect treatment recommendations to the extent that doctors stopped using it.

The challenge turned out to be that there was too little high-quality, randomised-trial data to feed the system. Because most of the data it was given were hypothetical cases, its recommendations reflected the preferences of the few doctors supplying them.

“Another mistake was to market the system before it was ready,” Woodward says.

In New Zealand, a robot passport checker rejected an Asian man’s application because his “eyes are closed”.

The problem here, says Woodward, is that the system was trained on a homogeneous population and then used on a diverse one.
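That kind of skew only shows up if accuracy is measured per group rather than in aggregate. As a minimal sketch – a hypothetical classifier and made-up evaluation records, not the New Zealand system itself – a per-group check looks something like this:

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: iterable of (group, predicted_label, true_label) tuples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        hits[group] += int(predicted == actual)
    return {group: hits[group] / totals[group] for group in totals}

# Made-up evaluation records: the model does well on the population it was
# trained on (group_a) and much worse on everyone else (group_b).
records = (
    [("group_a", "eyes_open", "eyes_open")] * 95
    + [("group_a", "eyes_closed", "eyes_open")] * 5
    + [("group_b", "eyes_open", "eyes_open")] * 60
    + [("group_b", "eyes_closed", "eyes_open")] * 40
)

print(accuracy_by_group(records))
# {'group_a': 0.95, 'group_b': 0.6} – an overall score of 0.775 would hide the gap
```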

Amazon scrapped its AI recruitment tool because it had a strong gender bias. The system was trained on resumes submitted in the prior 10 years, in which male applicants were highly dominant.

As a result, it preferred applications from men and even penalised resumes that included the word ‘women’s’.

Although the system has been fixed to remove that bias, managers have lost faith in it.

A startup has launched a new system that claims to be able to classify people based on scanning their faces – the qualities or profiles it says it can identify include: high IQ, academic researcher, professional poker player, bingo player, brand promoter, white-collar offender, and terrorist.

Another one claims to be able to detect sexual orientation – four out of five times for men, and three out of four times for women.

“There are parts of the world where it’s OK to be gay, but there are parts of the world where being identified as gay could be dangerous,” Woodward points out.

“This system is creepy, intrusive – and not even accurate.”
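The accuracy claim also falls apart on its own terms once base rates are considered. A back-of-the-envelope sketch, using illustrative numbers rather than figures from the study, shows why an “80% accurate” classifier for a relatively rare attribute still flags mostly the wrong people:

```python
# Illustrative numbers only: assume 5% of a population of 100 000 people have
# the attribute, and the classifier is right 80% of the time in both directions.
population = 100_000
base_rate = 0.05       # assumed prevalence of the attribute
sensitivity = 0.80     # share of true cases the classifier flags
specificity = 0.80     # share of non-cases the classifier correctly clears

true_positives = population * base_rate * sensitivity                # 4 000 people
false_positives = population * (1 - base_rate) * (1 - specificity)   # 19 000 people

precision = true_positives / (true_positives + false_positives)
print(f"Share of flagged people who actually have the attribute: {precision:.0%}")
# -> roughly 17%: more than four out of five flags are wrong
```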

Another creepy and intrusive move was the one by Henn-Na Hotel in Tokyo, which introduced robots as hotel assistants. As front-desk staff they made too many mistakes, and as in-room assistants they misinterpreted instructions – or what they perceived as instructions – from guests.

Today, the robots are still used as porters, but interaction with guests is much more limited.

Microsoft’s Tay is a classic example of a bot that was taught, in just a few hours, to become racist and misogynistic.

Sometimes the data is wrong, and that is why AI fails, Woodward points out. “Your data is in the past, but if the future is going to look different, you have to account for that. And if the data is simply not there, that is a challenge.”

Often the data is fine, but the technology isn’t up to scratch – it’s not ready and not accurate enough.

The usage scenario needs to be right if a project is going to work. This could change over time and in different circumstances.

System integration has to be spot-on to avoid accidental errors; and the environment in which AI is employed needs to be carefully thought through.

The implications of an AI fail include the creation of loopholes that introduce risk, and inaccurate results caused by immature technology, a lack of data and bad training.

Creepy is another bad outcome, although it is difficult to quantify. Some universal “creepy” elements include too much information and pre-crime accusations.

The consequences include bad publicity, legal exposure, loss of faith in your solutions and loss of trust in your practices. In many instances, the hyperscale cloud providers and large social networks have some immunity to reputational damage – much more than other organisations can absorb.

There are many ways to avoid these outcomes though, says Woodward.

Organisations should match the accuracy to the use case, mapping this against the risk of a false positive and how accurate the prediction needs to be.
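In practice that mapping can be made explicit before a model is deployed. A minimal sketch, with hypothetical use cases and thresholds, of the idea that the tolerable false-positive rate is set by the cost of a mistake rather than by whatever the model happens to achieve:

```python
# Hypothetical mapping of use cases to the false-positive rate they can tolerate.
# The use cases and numbers are illustrative; the point is that the bar is set
# by the risk of a wrong prediction, not by the model's headline accuracy.
USE_CASE_REQUIREMENTS = {
    "product_recommendation": (0.20, "a bad suggestion costs almost nothing"),
    "fraud_alert":            (0.05, "false alarms erode customer trust"),
    "medical_triage":         (0.01, "a wrong call can cause real harm"),
}

def fit_for_purpose(use_case: str, measured_fpr: float) -> bool:
    """Return True only if the measured false-positive rate meets the bar for the use case."""
    max_fpr, _rationale = USE_CASE_REQUIREMENTS[use_case]
    return measured_fpr <= max_fpr

print(fit_for_purpose("product_recommendation", 0.12))  # True
print(fit_for_purpose("medical_triage", 0.12))          # False: same model, wrong use case
```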

Also manage awareness and the risk of loopholes, she says.

In this instance, AI automation should not be treated differently to non-AI automation: it should be system-tested against a range of use cases to ensure, wherever possible, that loopholes are not created.

AI ethics should always be employed to avoid creepy AI.

Guidelines for ethics include the need to be fair; explainable and transparent; accountable; secure and safe; and human-centric and socially beneficial.

In terms of fairness, ethics need to determine how to treat people, be aware of discrimination and bias, and ensure there is no secret manipulation.

Explainability and transparency mean that organisations need to be open about the fact that an AI system is involved, and ensure its decisions can be explained. IP and risk management can be protected, but algorithms should be documented.

At the end of the day, the owners of the system are accountable for AI, so it’s important to implement AI governance.

A secure and safe AI system is vital: it needs to respect privacy, monitor the learning process of its models, and ensure proportional use, where the technology is applied appropriately. AI systems must be set up to do no harm, Woodward adds.

AI systems should always be human-centric and socially-beneficial, she says. Studies show that a human should always be in charge, while systems must be seen to benefit society. It goes without saying that they must be lawful – although the law does tend to lag technology, so this should be given regular attention.

“We can’t really set specific rules for AI,” Woodward says. “So you set principles, and rules will evolve over time.”

Scotiabank in Canada, for instance, set five principles for AI. They are:

* Be useful – improve outcomes for customers, society and the bank.

* Be monitored for unacceptable outcomes.

* Be accountable for any mistakes, misuse or unfair results.

* Be safe – the protection of data privacy is paramount.

* Be respectful – as the tech develops, objectivity should adapt without losing sight of core values.

When building applications and products with AI, Woodward makes the following recommendations:

* Align the desired outcome with your transparently stated values.

* Avoid being connected with unethical or creepy AI.

* Set principles to drive direction, then set rules for specific usage.

* Focus on augmenting humans rather than replacing them.

* Raise awareness of inherent biases and deficiencies in the data, and provide guidelines to mitigate them.

* Establish accountability for decisions about the right accuracy and explainability for each use case.

* Expect AI to mature.

* Keep discussing failures – don’t make them taboo.