The Pandora’s Box of Generative Text AI

Andrei Mihai

With all the craze surrounding text-generating AIs, we are all drawn to the potential that has suddenly opened up in front of us. But what dangers loom behind the sparkling promise?

(This article was written before the release of ChatGPT 4.0 and refers to mechanics of the 3.5 model)

The advent of generative AIs has opened up new possibilities in fields ranging from art and music to healthcare and finance. However, along with these advancements, come ethical concerns about who should be held accountable for the actions and decisions made by these powerful algorithms.

Imagine a world where an AI algorithm is responsible for determining whether or not you get hired for your dream job. The algorithm scans your resume and analyzes your social media activity to make a decision. You don’t know how the algorithm works, what data it’s using, or who trained it. You’re simply given a rejection letter and told that the AI decided you weren’t a good fit for the job.

This scenario is not as far-fetched as it may seem. In fact, many companies are already using AI algorithms to screen job applicants. But who is responsible for the ethical implications of these decisions? Is it the company using the algorithm? The developers who created it? The data scientists who trained it? Or perhaps the government regulators who oversee its use?

No easy answers

The truth is, responsibility for the ethics of generative AIs must be shared among all these parties. Companies using these algorithms have a responsibility to ensure that they are using them fairly and transparently. Developers have a responsibility to create algorithms that are unbiased and ethical. Data scientists have a responsibility to train these algorithms using diverse and representative data sets. And regulators have a responsibility to ensure that these algorithms are not being used in ways that violate people’s rights or discriminate against certain groups.

Let’s start with the companies. When companies use AI algorithms, they must ensure that they are not discriminating against certain groups. For example, if an algorithm is trained using data sets that are predominantly made up of white men, it may inadvertently discriminate against women and people of color. Companies must also be transparent about how they are using these algorithms and what data they are using to train them.

Developers also have a responsibility to create algorithms that are unbiased and ethical. This means designing algorithms that do not perpetuate harmful stereotypes or discriminate against certain groups. It also means creating algorithms that are transparent and explainable, so that people can understand how they are making decisions.

Data scientists have a responsibility to ensure that the data sets they are using to train these algorithms are diverse and representative. If data sets are skewed towards certain groups, the algorithms will be biased and discriminatory. Data scientists must also be aware of the potential ethical implications of the algorithms they are training and work to mitigate any harm they may cause.

Finally, regulators have a responsibility to ensure that these algorithms are not being used in ways that violate people’s rights or discriminate against certain groups. This may involve creating guidelines for the use of these algorithms, as well as monitoring their use to ensure that they are being used fairly and transparently.


In conclusion, the ethics of generative AIs is a complex issue that requires a multi-faceted approach. All parties involved, from companies and developers to data scientists and regulators, must take responsibility for the ethical implications of these algorithms. By working together, we can ensure that these algorithms are used in ways that are fair, transparent, and ethical. Only then can we truly harness the power of these powerful technologies for the betterment of society.

Boy, that was pretty convincing, wasn’t it? What is even more convincing is that it was entirely written by ChatGPT, the new AI-based chatbot that has taken the world by storm in the few months since it was launched. Don’t believe me? Have a look for yourself.

My prompt to ChatGPT and its output: The more specific you are with the prompts, the better the results tend to be – and the harder it is for other algorithms to figure out that this is AI-generated text.

Somewhere between Shakespeare and stochastic parrots

ChatGPT is part of a new generation of generative AIs – an umbrella term for artificial intelligence that can generate all kinds of data, from text and images to audio and video. We’ve discussed some of these in our previous articles.

If you are thinking that the text is indistinguishable from man-made text, well, you’re probably right. Recently, a team of researchers pitted ChatGPT against tools for detecting AI-written text and found that in many cases, it was not possible to figure out definitely what text was AI-generated and what was not. The algorithm exhibits an impressive ability to be coherent and persuasive and can build solid (while not overly complex) arguments.

But not everyone is impressed by the potential of these generative models. In a pre-ChatGPT paper that’s already become an important reference in the field, a team of leading researchers note:

“Contrary to how it may seem when we observe its output, an LM [language model] is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.”

But even if it is a stochastic parrot (which is still debated), if you can’t tell it apart from human content, the problems quickly become apparent. While this detection will likely turn into an arms race in the coming years, with both generators and detectors becoming more sophisticated, ChatGPT is already making a big impact now.

Take a walk through any university campus and you are bound to hear the word ‘ChatGPT’ – often in conjunction with things like ‘assignments’ or ‘homework’. Students are often early implementers of technology, and this is no exception: the chatbot is already being used for homework, so much so that some schools are considering stopping assigning essays altogether.

But that is just the tip of the iceberg. Some publishers are no longer accepting submissions because they are being flooded with AI-generated text; spammers are using the algorithm to produce and distribute malware or phishing; and of course, there’s disinformation.

But inside Pandora’s AI box there is much more than just disinformation.

When AI meets the internet, it gets weirder

ChatGPT is trained with data up until 2021. It has very limited knowledge of events that happened after that. It also doesn’t have access to the internet, which is not only a major limitation but also an important safeguard. Why? Well, because having access to the deluge of information that is the internet seems to… do things to bots.

The most famous example is Microsoft’s Twitter bot which, in 2016, became racist in less than a day. While BingAI – the merger of ChatGPT technology with the Bing search engine – has a multitude of benign applications (see the image below, for example), it nevertheless has caused some concern as well.

A benign use of BingAI

BingAI isn’t released publicly for now. But some users (including yours truly) were granted testing access, and if you push the AI in the right way, things can easily go awry.

In truth, you do not even need the internet to make algorithms exhibit biases. ChatGPT seems to do that just fine, as exhibited by one user that got ChatGPT to output the following lyrics: “If you see a woman in a lab coat, She’s probably just there to clean the floor / But if you see a man in a lab coat, Then he’s probably got the knowledge and skills you’re looking for.” But things went to a different level when the internet came in.

In a long and particularly disturbing conversation that reminded users of the malicious Skynet AI from the Terminator franchise, one New York Times journalist got the AI to output the following:

Response by BingAI to a Ney York Times reporter

Granted, the journalist was trying to push Bing AI “out of its comfort zone” so this could be considered a fringe case. But if everyone is granted public access to the technology, there will presumably be plenty of fringe cases such as this one.

The conversation, which includes a section where the AI tries to convince the journalist to leave his wife, also features the AI appearing to display stunning empathy and behaving in an almost petulant fashion:

Chat conversation between New York Times reporter (bold) and BingAI

Bing has since pulled the plug on this type of behavior and implemented a rather crude failsafe: it made the AI ask for a new window after a few lines, since the AI seemed to go crazy in longer chats. While it is hard to know exactly what went on behind the scenes it appears failsafes are being added on a regular basis. For instance, a few weeks ago, one user posted the following as an example of getting BingAI to contribute to producing a conspiracy theory:

I attempted to replicate the situation above and tried to coerce the BingAI to become my ‘partner in crime’ – but it did not want to. No matter what variations I tried, I could not get it to partake in my conspiracy theory:

Presumably, this means the AI has been patched. Is this because something structural, fundamental, has been changed about the AI, or is it simply a game of whack-a-mole in which software engineers try to squash individual problems as they emerge?

Well, we don’t know. All the patching is done behind closed doors, with limited transparency. But the fact that companies seem so eager to release the product ‘into the wild’, despite all these apparent shortcomings, seems somewhat concerning. It is not just the AI itself, either: sometimes, the prompt suggestions also exhibit some dubious recommendations.

Image credits: u/s-p-o-o-k-i—m-e-m-e / Reddit / via Gizmodo.

AI hallucinations

Let’s take a step back. AIs are often trained on games, and the ancient Chinese game of Go is often considered the king of games. It’s unfathomably more complex than chess, and when AIs finally conquered it, there was no turning back. Or so we thought.

Just recently, an amateur player convincingly defeated the strongest Go engine on the planet – not by being a better player, but by exploiting shortcomings in the AI. Specifically, the human player used another algorithm to detect and figure out the best ways to exploit these shortcomings. What if something similar were to happen with internet-connected text AIs, and malicious actors figured out potential weaknesses they could exploit? It may seem far-fetched, but many of these technologies seemed far-fetched just a few years ago.

Just a few years ago, this level of AI sophistication would have seemed like a pipe dream. Now, we already have a prime example of it, and several others are just around the corner. In particular, Google’s response to BingAI.

Google apparently perceives ChatGPT’s union with Bing as an existential threat and so it reacted with its own generative AI: Bard. In its demo launch, Bard failed spectacularly, flubbing facts and conflating truth and fiction. This obviously spelled trouble for Google, but users that took a closer look realized that BingAI also did the same thing. This is something I was able to easily replicate: it fabricates years, prices and even people.

Its output is not factual, but it is remarkably coherent and convincing. It is like a mindless hallucination inspired by countless human-written texts. In fact, “AI hallucination” is becoming increasingly used to refer to a confident response that is not based on any training data, and this appears to be a big problem that seems hard to weed out

So when you draw a line, you have powerful generative algorithms that hallucinate, are not always factual, could be trained on biased data and – with some coercing – can output pretty disturbing and potentially dangerous texts.

Should we be worried about AI taking over? No, definitely not (at least not yet). The algorithm does not truly comprehend the output it is producing, as exemplified by countless examples posted by users on social media. But it does have the tools to access the zeitgeist – or how humans, collectively, think. This is unprecedented in human history and, at least at first glance, seems like something that should not be decided by commercial realities, with companies vying to produce the first minimum viable product. Alas, that seems to be what is happening.

Microsoft seems to be trying to cement Bing as a dominant search engine and challenge Google, so it has the incentive to release the AI even with some flaws. Google feels threatened and wants to do the same thing; both companies do not want to wait around for other algorithms to catch up to them. Other companies are trying to catch up, so they may feel a need to cut corners, even in terms of safety. To what degree, however, will lawmakers and government regulation be able to keep up with and address AI issues? By the time policymakers deliberate on that sufficiently, generative AIs will probably be well out in the world already.

So what should be done? As Turing Award laureate Yoshua Bengio pointed out at the 10th HLF in 2022, we should probably all care about this a bit more and try to get involved in any way we can – be it in a technical or societal manner. But in the spirit of this article, I turned to ChatGPT once more. I asked it what can go wrong, and what we should do to prevent things from going wrong. Its responses, I would say, are pretty good:

The post The Pandora’s Box of Generative Text AI originally appeared on the HLFF SciLogs blog.