AI: Guardrails, Hallucinations, Slop
Victor Sample, MT43 News Treasurer
AI (Artificial Intelligence) has been a big topic for over three years. I see 30-40 different AI-based articles every single day. If you do any research or read any articles, you will see references to the terms Guardrails, Hallucinations, and AI Slop. Strange terms for something that is supposed to be so transformative!
What do they actually mean?
We will discuss each of these terms over the next three weeks.
So, first up: Guardrails.
GUARDRAILS:
From Copilot (the Microsoft AI engine): “AI guardrails are safety mechanisms—both technical and policy-based—that ensure artificial intelligence systems behave responsibly, ethically, and within defined boundaries. They’re essential for preventing harmful outputs, such as hallucinations, bias, or security breaches, especially in high-stakes applications like healthcare, finance, and law.”
Within a few months of ChatGPT being made widely available to the public, there were reports of ChatGPT generating incredibly offensive, racist, bigoted diatribes. The AI engines react to prompts, and how you prompt the AI does a lot to determine the answer that is generated. I have no doubt that prompts were used to lead AI down the path of racism and bigotry – but it did generate the offensive text.
I recently read that one of the AI companies was sued for millions of dollars over the suicide of a child. The saved chats showed that the AI engine not only assisted the teenager in deciding how to commit suicide, but actually encouraged the teenager to do it. The evidence was apparently convincing enough that the judge and jury found the AI engine complicit and awarded the parents a huge sum. There are several other lawsuits alleging the same misconduct by AI.
AI has no judgment, no ethics, no morality. It is a software program that uses statistical probability to choose the next word or phrase, and it just keeps doing that iteratively until the generated text is done. The AI engines are not inherently biased or racist, but they can generate terrible things based on probability alone.
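To make that iterative process concrete, here is a minimal sketch in Python. The word probabilities below are entirely made up for illustration; a real AI engine computes them with a neural network over tens of thousands of tokens. But the loop – pick a likely next word, append it, repeat until a stop token – is the same idea.

```python
import random

# Toy stand-in for a language model: given the current word, it
# assigns probabilities to possible next words. All values here
# are invented for illustration.
NEXT_WORD_PROBS = {
    "the":  [("cat", 0.5), ("dog", 0.3), ("end", 0.2)],
    "cat":  [("sat", 0.6), ("ran", 0.3), ("end", 0.1)],
    "dog":  [("ran", 0.5), ("sat", 0.4), ("end", 0.1)],
    "sat":  [("down", 0.7), ("end", 0.3)],
    "ran":  [("away", 0.7), ("end", 0.3)],
    "down": [("end", 1.0)],
    "away": [("end", 1.0)],
}

def generate(start: str, max_words: int = 10) -> str:
    """Repeatedly sample a probable next word until a stop token."""
    words = [start]
    for _ in range(max_words):
        choices = NEXT_WORD_PROBS.get(words[-1], [("end", 1.0)])
        options, weights = zip(*choices)
        nxt = random.choices(options, weights=weights)[0]
        if nxt == "end":   # "end" plays the role of a stop token
            break
        words.append(nxt)
    return " ".join(words)

print(generate("the"))   # e.g. "the cat sat down"
```

Notice that nothing in the loop knows what the words mean. It only knows which words tend to follow which – which is why the output can be fluent and terrible at the same time.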
Guardrails are meant to keep AI from going down unacceptable paths. Some guardrails examine the AI output and suppress anything deemed objectionable; other guardrails are meant to keep you, the user, from submitting prompts that lead to socially unacceptable output.
The text or images are generated by AI; human programmers implement the guardrails.
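Here is a toy sketch of what those two layers might look like. The blocklist, the keyword check, and the wrapper function are all hypothetical; real guardrails use trained safety classifiers and policy engines, not simple keyword matching. But the structure – screen the prompt on the way in, screen the response on the way out – is the same.

```python
# Hypothetical policy list and checks, invented for illustration.
BLOCKED_TOPICS = {"violence", "self-harm"}

def violates_policy(text: str) -> bool:
    """Crude keyword check standing in for a real safety classifier."""
    return any(topic in text.lower() for topic in BLOCKED_TOPICS)

def guarded_generate(prompt: str, model) -> str:
    # Input guardrail: refuse prompts that trip the policy check.
    if violates_policy(prompt):
        return "Sorry, I can't help with that request."
    response = model(prompt)   # the underlying AI engine
    # Output guardrail: suppress responses that trip the same check.
    if violates_policy(response):
        return "Sorry, I can't share that response."
    return response

def fake_model(prompt: str) -> str:
    """Pretend AI engine so the sketch runs on its own."""
    return "Here is a friendly answer about " + prompt

print(guarded_generate("holiday banner ideas", fake_model))  # passes
print(guarded_generate("tips on violence", fake_model))      # refused
```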
But sometimes the guardrails themselves become an issue. I was using AI to generate images for holiday banners on the MT43News website. I asked for an image of Dr. Martin Luther King depicting him making his famous “I have a dream” speech. Copilot told me that it was not allowed to generate that kind of image.
I pointed out that Dr. King was noted for advocating social change through peaceful means, working within our social and legal system. How could Dr. King’s “I have a dream” speech be too controversial to depict? The AI engine then gave me a very sincere reply extolling the virtues of Dr. King and agreeing that he was indeed a role model. But it still was not allowed to generate the image. I think whoever implemented that particular guardrail got a little overzealous.
Trying to implement guardrails is like playing the arcade game “Whack-a-Mole”. It’s a never-ending process that can never be won; try to stop one thing and another will pop up.
Guardrails are very close to censorship. Human programmers are attempting to restrict what can be asked and what answers are acceptable. Hopefully, advances in AI training will eliminate the need for guardrails. Censorship of any kind is worrisome.
Next week we will discuss “AI Slop” – an interesting term for software that is supposed to change the very nature of our society.