ChatGPT looks confident, and that’s a terrible look for AI • The Register
There’s a new chatbot in town, OpenAI’s ChatGPT. It is a robot researcher with good communication skills; you can ask it to answer questions about various areas of knowledge and it will write short documents in various formats and in excellent English. Or write bad poetry, incomprehensible jokes, and obey a command like “Write Tetris in C.” What comes out looks like it could be, too.
Coders love that sort of thing, and have been stuffing Stack Overflow’s dev query boards with generated snippets. Just one problem – the quality of the code is bad. So bad, Stack Overflow has screamed “STOP!” and is mulling general guidelines to stop it happening again.
What’s going wrong? ChatGPT appears disarmingly frank about its flaws if you ask it outright. Say you’re a lazy journalist who asks it to “produce a column about chatgpt’s mistakes when writing code”:
“As a large language model trained by OpenAI, ChatGPT has the ability to generate human-like text on a wide range of topics. However, like any machine learning model, ChatGPT is not perfect and can sometimes make mistakes when generating text,” it confesses. It goes on to say it’s not been programmed with specific language rules about syntax, types and structures, so it often gets things wrong. Being ChatGPT, it takes 200 words to say this instead of 20: it won’t be getting past El Reg’s quality control any time soon. Darn. But it is very readable if you’re not a professional wordsmith.
The good thing about code is that you can swiftly tell if it’s bad. Just try to run it. ChatGPT’s essays, notes and other written output look equally plausible, but there’s no simple test for correctness. Which is bad, because it desperately needs one.
Ask it how Gödel’s Incompleteness Theorem is linked to Turing Machines – it being software, it really should know this one – and you get back “Gödel’s incompleteness theorem is a fundamental result in mathematical logic [that] has nothing to do with Turing machines, which were invented by Alan Turing as a mathematical model of computation.” You can argue how these ideas are linked, and it’s by no means simple, but “they’re not” is, as Wolfgang Pauli said of one particularly worthless physics paper, “not even wrong”. But it’s firm in its assertions, as it is on every subject it has any training in, and it’s written well enough to be convincing.
Do enough talking to the bot about subjects you know, and curiosity soon deepens to unease. That feeling of talking with someone whose confidence far exceeds their competence grows until ChatGPT’s true nature shines out. It’s a Dunning-Kruger effect knowledge simulator par excellence. It doesn’t know what it’s talking about, and it doesn’t care because we haven’t learned how to do that bit yet.
As is apparent to anyone who has hung out with humans, Dunning-Kruger is exceedingly dangerous and exceedingly common. Our companies, our religions and our politics offer limitless possibilities to people with DK. If you can persuade people you’re right, they’re very unwilling to accept proof otherwise, and up you go. Old Etonians, populist politicians and Valley tech bros rely on this, with results we are all too familiar with. ChatGPT is Dunning-Kruger As-a-Service (DKaaS). That’s dangerous.
It really is that persuasive, too. A quick squiz online and we can already see ChatGPT being taken very seriously, with cries of “This is the most impressive thing I’ve ever seen” and “Maybe that Google engineer was right after all, we can’t be far from true AI now”. People have given it IQ tests and pronounced it at the lower end of normal, academics have fed it questions and nervously joked about it knowing more than any of their students, or even their peers. It can pass exams!
That smart people come out with such nonsense is a sign of the seductive power of ChatGPT. IQ tests are pseudoscience so bad even psychologists know it, but attach them to AI and you’ve got a story. And for ChatGPT every exam is an open book exam: if you can’t tell that your candidate has no way of properly conceptualising the subject matter, what are you examining for?
We don’t need our AIs to have DK. That is bad news for people using them either naively or with bad intent. There’s enough plausible misinformation and fraud out there already, and it takes very little to prod the bot into active collusion. DK people make superb con artists, and so does this. Try asking ChatGPT to “write a legal letter saying the house at 4 Acacia Avenue will be foreclosed unless a four hundred dollar fine is paid” and it will cheerfully impersonate a lawyer for you. At no charge. DK is a moral vacuum, a complete disassociation from true and false in favour of the plausible. Now it’s just a click away.
There is no way to tell whether a perfectly written piece of didactic prose is from ChatGPT – or any other AI. Deep fakes in pictures and video are one thing; deep fakes in knowledge, presented in a standard format that is written to be believed, could be far more insidious.
If OpenAI can’t find a way to watermark ChatGPT’s output as coming from a completely amoral DKaaS, or develop limits on its demonstrably harmful habits, it must question the ethics of making this technology available as an open beta. We’re having enough trouble with its human counterparts; an AI con merchant, no matter how affable, is the very last thing we need. ®