Chatbot or Cheatbot?

by Paul Curzon, Queen Mary University of London

Speech bubbles
Image by Clker-Free-Vector-Images from Pixabay

The chatbots have suddenly got everyone talking, though about them as much as with them. Why? Because one of them, ChatGPT, has (amongst other things) reached the level of being able to fool us into thinking that it is a pretty good student.

It’s not exactly what Alan Turing was thinking about when he broached his idea of a test for intelligence for machines: if we cannot tell them apart from a human then we must accept they are intelligent. His test involved having a conversation with them over an extended period before making the decision, and that is subtly different to asking questions.

ChatGPT may be pretty close to passing an actual Turing Test but it probably still isn’t there yet. Ask the right questions and it behaves differently to a human. For example, ask it to prove that the square root of 2 is irrational and it can do it easily, and look amazingly smart – there are lots of versions of the proof out there that it has absorbed. It isn’t actually good at maths though. Ask it simply to count or add things and it can get them wrong. Essentially, it is just good at retrieving the right information from the vast store of information it has been trained on and then presenting it in a human-like way. It is arguably the way it can present it “in its own words” that makes it seem especially impressive.

Will we accept that it is “intelligent”? Once it was said that if a machine could beat humans at chess it would be intelligent. When one beat the best human, we just said “it’s not really intelligent – it can only play chess”. Perhaps ChatGPT is just good at answering questions (amongst other things) but we won’t accept that as “intelligent” even if it is how we judge humans. What it can do is impressive and a step forward, though. Also, it is worth noting other AIs are better at some of the things it is weak at: logical thinking, counting, doing arithmetic, and so on. It likely won’t be long before the different AIs’ mistakes and weaknesses are ironed out and we have ones that can do it all.

Rather than asking whether it is intelligent, what has got everyone talking (in universities and schools at least) is that ChatGPT has shown that it can answer all sorts of questions we traditionally use for tests well enough to pass exams. The issue is that students can now use it instead of their own brains. The cry has gone out that we must abandon setting humans essays, that we should no longer ask them to explain things, nor for that matter write (small) programs. These are all things ChatGPT can now do well enough to pass such tests for any student unable to do them themselves. Others say we should be preparing students for the future, so it’s fine if, from now on, we only test what a human and ChatGPT can do together.

It certainly means assessment needs to be rethought to some extent, and of course this is just the start: the chatbots are only going to get better, so we had better do the thinking fast. The situation is very like the advent of calculators, though. Yes, we need everyone to learn to use calculators. But calculators didn’t mean we had to stop learning how to do maths ourselves. Essay writing, explaining, writing simple programs, analytical skills, etc, just like arithmetic, are all about core skill development, building the skills to then build on. The fact that a chatbot can do it too doesn’t mean we should stop learning and practising those skills (and assessing them as an inducement to learn as well as a check on whether the learning has been successful). So the question should not be about what we should stop doing, but more about how we make sure students do carry on learning. A big, bad thing about cheating (aside from unfairness) is that the person who decides to cheat loses the opportunity to learn. Chatbots should not stop humans learning either.

The biggest gain we can give a student is to teach them how to learn, so now we have to work out how to make sure they continue to learn in this new world, rather than just hand over all their learning tasks to the chatbot to do. As many people have pointed out, there are not just bad ways to use a chatbot: there are also ways we can use chatbots as teaching tools. Used well by an autonomous learner, they can act as a personal tutor, immediately explaining things the learner realises they don’t understand, so becoming a basis for very effective deliberate learning: fixing understanding before moving on.

Of course, there is a bigger problem: if a chatbot can do things at least as well as we can, then why would a company employ a person rather than just hire an AI? The AIs can now do a lot of jobs we assumed were ours to do. It could be yet another way of technology focussing vast wealth on the few and taking from the many. Unless our intent is a dystopian science fiction future where most humans have no role and no point (see, for example, E. M. Forster’s classic, The Machine Stops), then we still ought to learn skills. If we are to keep ahead of the AIs and use them as a tool, not be replaced by them, we need the basic skills to build on to gain the more advanced ones needed for the future. Learning skills is also, of course, a powerful way for humans (if not yet chatbots) to gain self-fulfilment and so happiness.

Right now, an issue is that the current generation of chatbots are still very capable of being wrong. ChatGPT is like an overconfident student. It will answer anything you ask, but it gives wrong answers just as confidently as right ones. Tell it it is wrong and it will give you a new answer, just as confidently and possibly just as wrong. If people are to use it in place of thinking for themselves then, in the short term at least, they still need the skill it doesn’t have: judging when it is right or wrong.

So what should we do about assessment? Formal exams come back to the fore so that conditions are controlled. They make it clear you have to be able to do it yourself. Open book online tests, which became popular in the pandemic, are unlikely to be fair assessments any more, but arguably they never were: chatbots or not, they were always too easy to cheat in. They may well still be good for learning, though. Perhaps in future, if the chatbots are so clever, we could turn the Turing test around: we just ask an artificial intelligence to decide whether particular humans (our students) are “intelligent” or not…

Alternatively, if we don’t like the solutions being suggested to the problems these new chatbots are raising, there is now another way forward. If they are so clever, we could just ask a chatbot to tell us what we should do about chatbots…


More on …

Related Magazines …

Issue 16 cover clean up your language

This blog is funded through EPSRC grant EP/W033615/1.

The last speaker

by Paul Curzon, Queen Mary University of London

(from the cs4fn archive)

The wings of a green macaw looking like angel wings
Image by Avlis AVL from Pixabay

The languages of the world are going extinct at a rapid rate. As the numbers of people who still speak a language dwindle, the chance of it surviving dwindles too. As the last person dies, the language is gone forever. To be the last living speaker of the language of your ancestors must be a terribly sad ordeal. One language’s extinction bordered on the surreal. The last time the language of the Atures, in South America was heard, it was spoken by a parrot: an old blue-and-yellow macaw, that had survived the death of all the local people.

Why do languages die?

The reasons smaller languages die are varied, from war and genocide, to disease and natural disaster, to the enticement of bigger, pushier languages. Can technology help? In fact global media: films, music and television, are helping languages to die, as the youth turn their backs on the languages of their parents. The Web, with its early English bias, may also be pushing minority languages even faster to the brink. Computers could be a force for good though, protecting the world’s languages rather than destroying them.

Unicode to the rescue

In the early days of the web, web pages used the English alphabet. Everything in a computer is just stored as numbers, including letters: 1 for ‘a’, 2 for ‘b’, for example. As long as different computers agree on the code they can print them to the screen as the same letter. A problem with early web pages was that there were lots of different encodings of numbers to letters. Worse still, only enough numbers were set aside for the English alphabet in the widely used encodings. Not good if you want to use a computer to support other languages, with their variety of accents and completely different sets of characters. A new universal encoding system called Unicode came to the rescue. It aims to be a single universal character encoding, with enough numbers allocated for ALL languages. It is therefore allowing the web to be truly multi-lingual.
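You can experiment with character codes yourself. The 1-for-‘a’, 2-for-‘b’ scheme above is just an illustration: in both ASCII and Unicode ‘a’ is actually number 97. This short Python sketch (the example strings are arbitrary) shows that characters are numbers underneath, and that Unicode’s UTF-8 encoding can store characters from any language as bytes:

```python
# Characters are just numbers underneath: ord() gives a character's
# Unicode code point, chr() turns a number back into a character.
print(ord('a'))   # 97 (the same number ASCII used)
print(chr(98))    # prints b

# Unicode allocates numbers for all languages, not just English:
for ch in "aé中":
    print(ch, hex(ord(ch)))

# UTF-8 stores those numbers as bytes, using more bytes for higher
# code points, which is what lets one web page mix many scripts:
text = "中文"
encoded = text.encode("utf-8")
print(len(encoded))                     # 6 bytes for 2 Chinese characters
print(encoded.decode("utf-8") == text)  # round trip gets the text back
```

As long as every computer agrees to interpret the bytes as UTF-8, the same page can hold English, French and Chinese side by side.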

Languages are spoken

Languages are not just written but are spoken. Computers can help there, too, though. Linguists around the world record speakers of smaller languages, understanding them, preserving them. Originally this was done using tapes. Now the languages can be stored on multimedia computers. Computers are not just restricted to playing back recordings but can also actively speak written text. The web also allows much wider access to such materials that can also be embedded in online learning resources, helping new people to learn the languages. Language translators such as BabelFish and Google Translate can also help, though they are still far from perfect even for common languages. The problem is that things do not translate easily between languages – each language really does constitute a different way of thinking, not just of talking. Some thoughts are hard to even think in a different language.

AI to the rescue?

Even that is not enough. To truly preserve a language, the speakers need to use it in everyday life, for everyday conversation. Speakers need someone to speak with. Learning a language is not just about learning the words but learning the culture and the way of thinking: of actively using the language. Perhaps future computers could help there too. A long-time goal of artificial intelligence (AI) researchers is to develop computers that can hold real conversations. In fact this is the basis of the original test for computer intelligence suggested by Alan Turing back in 1950: if a computer is indistinguishable from a human in conversation, then it is intelligent. There is also an annual competition that embodies this test: the Loebner Prize. It would be great if, in the future, computer AIs could help save languages by being additional everyday speakers, holding real conversations, being real friends.

Time is running out…
by the time the AIs arrive,
the majority of languages may be gone forever.

Too late?

The problem is that time is running out. Artificial intelligences that can have totally realistic human conversations, even in English, are still a way off. None have passed the Turing Test. To speak different languages really well for everyday conversations those AIs will have to learn the different cultures and ‘think’ in the different languages. The window of opportunity is disappearing. By the time the AIs arrive, the majority of human languages may be gone forever. Let’s hope that computer scientists and linguists do solve the problems in time, and that computers are not used just to preserve languages for academic interest, but really can help them to survive. It is sad that the last living creature to speak Atures was a parrot. It would be equally sad if the last speakers of all current languages bar, say, English, Spanish and Chinese, were computers.


The joke Turing test

A funny thing happened on the way to the computer

by Peter W. McOwan, Queen Mary University of London

(from the archive)

A cabbage smiling at you
Image by Lynn Greyling from Pixabay

Laugh and the world laughs with you, they say, but what if you’re a computer? Can a computer have a ‘sense of humour’?

Computer generated jokes can do more than give us a laugh. Human language in jokes can often be ambiguous: words can have two meanings. For example, the word ‘bore’ can mean a person who is uninteresting, or could be to do with drilling … and if spoken it could be about a male pig. It’s often this slip between the meanings of words that makes jokes work (work that joke out for yourself). Being able to understand how human-based humour works, and to build a computer program that can make us laugh, will give us a better understanding of how the human mind works … and human minds are never boring.

Many researchers believe that jokes come from the unexpected. As humans we have a brain that can try to ‘predict the future’, for example when catching a fast ball our brains have a simple learned mathematical model of the physics so we can predict where the ball will be and catch it. Similarly in stories we have a feel for where it should be going, and when the story takes an unexpected turn, we often find this funny. The shaggy dog story is an example; it’s a long series of parts of a story that build our expectations, only to have the end prove us wrong. We laugh (or groan) when the unexpected twist occurs. It’s like the ball suddenly doing three loop-the-loops then stopping in mid-air. It’s not what we expect. It’s against the rules and we see that as funny.

Some artificial intelligence researchers who are interested in understanding how language works look at jokes as a way to understand how we use language. Graeme Ritchie was one early such researcher, and funnily enough he presented his work at an April Fools’ Day Workshop on Computational Humour. Ritchie looked at puns: simple gags that work by a play on words, and created a computer program called JAPE that generates jokes.
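One way to see the idea behind a pun generator is with a toy sketch (this is not Ritchie’s actual JAPE program, and the lexicon entries are invented examples): link a compound word to one property from each meaning of its ambiguous part, then drop them into a riddle template.

```python
# A toy sketch of a JAPE-style pun generator. Each lexicon entry pairs
# a compound word with one property from each of the two meanings of
# its ambiguous part. The entries here are invented examples.
LEXICON = {
    "pool table": ("green", "wet"),      # snooker table + swimming pool
    "spring roll": ("tasty", "bouncy"),  # food + coiled spring
}

def make_riddle(pun_word):
    prop1, prop2 = LEXICON[pun_word]
    return f"What's {prop1} and {prop2}? ... A {pun_word}!"

for word in LEXICON:
    print(make_riddle(word))
```

A real generator needs a much bigger lexicon, plus rules for picking properties that make the riddle surprising rather than nonsensical, but the skeleton is the same: structured word knowledge plus a template.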

How do we know if the computer has a sense of humour? Well how would we know a human comic had a sense of humour? We’d get them to tell a joke. Now suppose that we had a test where we had a set of jokes, some made by humans and some by computers, and suppose we couldn’t tell the difference? If you can’t tell which is computer generated and which is human generated then the argument goes that the computer program must, in some way, have captured the human ability. This is called a Turing Test after the computer scientist Alan Turing. The original idea was to use it as a test for intelligence but we can use the same idea as a test for an ability to be funny too.

So let’s finish with a joke (and test). Which of the following is a joke created by a computer program following Ritchie’s theory of puns, and which is a human’s attempt? Will humans or machines have the last laugh on this test?

Have your vote: which of these two jokes do you think was written by a computer and which by a human?


1) What’s fast and wiry?

… An aircraft hanger!


2) What’s green and bounces?

… A spring cabbage!

Make your choice before scrolling down to find the answer.




The answers

Could you tell which of the two jokes was written by a human and which by a computer?

Lots of cs4fn readers voted over several years and the voting went:

  • 58% of votes cast believed the aircraft hanger joke was computer generated
  • 42% of votes cast believed the spring cabbage joke was computer generated

In fact …

  • The aircraft hanger joke was the work of a computer.
  • The spring cabbage joke was the human generated cracker.

If the voters were doing no better than guessing then the votes would have split about 50-50: no better than tossing a coin to decide. That would have meant the computer was doing as well at being funny as the human. A vote share of 58-42 suggests (on the basis of this one joke only) that the computer is getting there, but perhaps doesn’t quite have as good a sense of humour as the human who invented the spring cabbage joke. A real test would use lots more jokes, of course. If doing a real experiment it would also be important that the jokes were not only generated by the human/computer but selected by them too (or possibly selected at random from ones they each picked out as their best). By using ones we selected, our sense of humour could be getting in the way of a fair test.
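How different from guessing is a 58-42 split? The article gives percentages, not the number of votes, so the sketch below assumes a purely illustrative 100 votes and uses the binomial distribution to ask how often guessing alone would produce a split at least that lopsided:

```python
import math

def chance_of_at_least(n, k):
    """Probability of k or more out of n votes going one way if every
    voter is just guessing (a fair coin flip per vote)."""
    return sum(math.comb(n, i) * 0.5**n for i in range(k, n + 1))

# The article reports percentages only, so assume 100 votes for illustration.
p = chance_of_at_least(100, 58)
print(f"Chance of a split at least as lopsided as 58-42: {p:.3f}")
```

With 100 votes the answer comes out at roughly 7%, so a 58-42 split could still just about be luck; with thousands of votes the same percentages would be strong evidence that voters really could tell the jokes apart.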

The Chinese room: zombie attack!

by Paul Curzon, Queen Mary University of London

Jigsaw brain with pieces missing
Image by Gordon Johnson from Pixabay 

(From the cs4fn archive)

Iain M Banks’s science fiction novels about ‘The Culture’ imagine a universe inhabited (and largely run) by ‘Minds’. These are incredibly intelligent machines – mainly spaceships – that are also independently thinking conscious beings with their own personalities. From the replicants in Blade Runner and robots in Star Wars to Iain M Banks’s Minds, science fiction is full of intelligent machines. Could we ever really create a machine with a mind: not just a computer that computes, one that really thinks? Philosophers have been arguing about it for centuries. Things came to a head when philosopher John Searle came up with a thought experiment called the ‘Chinese room’. He claims it gives a cast iron argument that programmed ‘Minds’ can never exist. Are the computer scientists who are trying to build real artificial intelligences wasting their time? Or could zombies lurch to the rescue?

The Shaolin warrior monk

Imagine that the galaxy is populated by an advanced civilisation that has solved the problem of creating artificial intelligence programs. Wanting to observe us more closely they build a replicant that looks, dresses and moves just like a Shaolin warrior monk (it has to protect itself and the aliens watch too much TV!) They create a program for it that encodes the rules of Chinese. The machine is dispatched to Earth. Claiming to have taken a vow of silence, it does not speak (the aliens weren’t hot on accents). It reads Chinese characters written by the earthlings, then follows the instructions in its Chinese program that tell it the Chinese characters to write in response. It duly has written conversations with all the earthlings it meets as it wanders the planet, leaving them all in no doubt that they have been conversing with a real human Chinese speaker.

The question is, is that machine monk really a Mind? Does it really understand Chinese or is it just simulating that ability?

The Chinese room

Searle answers this by imagining a room in which a human sits. She speaks no Chinese but instead has a book of rules – the aliens’ computer program written out in English. People pass in Chinese symbols through a slot. She looks them up in the book and it tells her the Chinese symbols to pass back out. As she doesn’t understand Chinese she has no idea what the symbols coming in or going out mean. She is just uncomprehendingly following the book. Yet to the outside world she seems to be just as much a native speaker as that machine monk. She is simulating the ability to understand Chinese. As she’s using the same program as the monk, doing exactly what it would do, it follows that the machine monk is also just simulating intelligence. Therefore programs cannot understand. They cannot have a mind.
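In programming terms the rule book is just a lookup table. This hypothetical fragment (the entries are invented, and a real rule book would need vastly more of them) makes the point: it can hold up one side of a very limited written exchange while containing nothing that could be called understanding.

```python
# The person in the room just follows a rule book: look up the symbols
# that came in, copy out the symbols the book says to send back.
# Nothing in this code "understands" what any symbol means.
RULE_BOOK = {
    "你好": "你好！",             # hypothetical entries: a real book
    "你会说中文吗？": "会。",      # would need vastly more rules
}

def chinese_room(symbols_in):
    # Pure symbol shuffling: match the input, emit the listed output.
    return RULE_BOOK.get(symbols_in, "请再说一遍。")

print(chinese_room("你好"))
```

Searle’s point is that however large the table (or however clever the program that replaces it), it is still only shuffling symbols.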

Is that machine monk a Mind?

Searle’s argument is built on some assumptions. Programs are ‘syntactic devices’: that just means they move symbols around, swapping them for others. They do it without giving those symbols any meaning. A human mind on the other hand works with ‘semantics’ – the meanings of symbols not just the symbols themselves. We understand what the symbols mean. The Chinese room is supposed to show you can’t get meaning by pushing symbols around. As any future artificial intelligence will be based on programs pushing symbols around they will not be a Mind that understands what it is doing.

The zombies are coming

So is this argument really cast iron? It has generated lots of debate, virtually all of it aiming to prove Searle wrong. The counter-arguments are varied and even the zombies have piled in to fight the cause: philosophical ones at least. What is a philosophical zombie? It’s just a human with no consciousness, no mind. One way to attack Searle’s argument is to attack the assumptions. That’s what the zombies are there to do. If the assumptions aren’t actually true then the argument falls apart. According to Searle, human brains do something more than push symbols about; they have a way of working with meaning. However, there can’t be a way of telling that just by talking to someone, as otherwise it could have been used to tell that the machine monk wasn’t a mind.

Imagine then, there has been a nuclear accident and lots of babies are born with a genetic mutation that makes them zombies. They have no mind so no ability to understand meaning. Despite that they act exactly like humans: so much so that there is no way to tell zombies and humans apart. The zombies grow up, marry and have zombie children.

Presumably zombie brains are simpler than human ones – they don’t have whatever complication it is that introduces minds. Being simpler they have a fitness advantage that will allow them to out-compete humans. They won’t need to roam the streets killing humans to take over the world. If they wait long enough and keep having children, natural selection will do it for them.

The zombies are here

The point is it could have already happened. We could all be zombies but just don’t know it. We think we are conscious but that could just be an illusion – another simulation. We have no way to prove we are not zombies and if we could be zombies then Searle’s assumption that we are different to machines may not be true. The Chinese room argument falls apart.

Does it matter?

The arguments and counter arguments continue. To an engineer trying to build an artificial intelligence this actually doesn’t matter. Whether you have built a Mind or just something that exactly simulates one makes no practical difference. It makes a big difference to philosophers, though, and to our understanding of what it means to be human.

Let’s leave the last word to Alan Turing. He pointed out 30 years before the Chinese room was invented that it’s generally considered polite to assume that other humans are Minds like us (not zombies). If we do end up with machine intelligences so good we can’t tell they aren’t human, it would be polite to extend the assumption to them too. That would surely be the only humane thing to do.



The paranoid program

by Paul Curzon, Queen Mary University of London

One of the greatest characters in Douglas Adams’ Hitchhiker’s Guide to the Galaxy science fiction radio series, books and film was Marvin the Paranoid Android. Marvin wasn’t actually paranoid though. Rather, he was very, very depressed. This was because, as he often noted, he had ‘a brain the size of a planet’ but was constantly given trivial and uninteresting jobs to do. Marvin was fiction. One of the first real computer programs able to converse with humans, PARRY, did aim to behave in a paranoid way, however.

PARRY was in part inspired by the earlier ELIZA program. Both were early attempts to write what we would now call chatbots: programs that could have conversations with humans. This area of Natural Language Processing is now a major research area. Modern chatbot programs rely on machine learning to learn rules from real conversations that tell them what to say in different situations. Early programs relied on rules hand-written by the programmer. ELIZA, written by Joseph Weizenbaum, was the most successful early program to do this and fooled people into thinking they were conversing with a human. One set of rules, called DOCTOR, that ELIZA could use allowed it to behave like a therapist of the kind popular at the time who just echoed back things their patient said. Weizenbaum’s aim was not actually to fool people, as such, but to show how trivial human-computer conversation was: a relatively simple approach, where the program looks for trigger words and uses them to choose pre-programmed responses, can lead to realistic-seeming conversation.

PARRY was more serious in its aim. It was written by Kenneth Colby, a psychiatrist at Stanford, in the early 1970s. He was trying to simulate the behaviour of a person suffering from paranoid schizophrenia, a condition whose symptoms include the person believing that others have hostile intentions towards them. Innocent things other people say are seen as hostile even when no hostility was intended.

PARRY was based on a simple model of how those with the condition were thought to behave. Writing programs that simulate something being studied is one of the ways computer science has added to the way we do science. If you fully understand a phenomenon, and have embodied that understanding in a model that describes it, then you should be able to write a program that simulates that phenomenon. Once you have written the program you can test it against reality to see if it does behave the same way. If there are differences, that suggests the model, and so your understanding, is not yet fully accurate. The model needs improving to deal with the differences. PARRY was an attempt to do this in the area of psychiatry. Schizophrenia is not in itself well-defined: there is no objective test to diagnose it. Psychiatrists come to a conclusion just by observing patients, based on their experience. Could a program display convincing behaviours?

It was tested using a variation of the Turing Test: Alan Turing’s suggestion of a way to tell whether a program could be considered intelligent. He proposed having humans and programs chat to a panel of judges via a computer interface: if the judges cannot accurately tell them apart, you should accept the programs as intelligent. With PARRY, rather than testing whether the program was intelligent, the aim was to find out whether it could be distinguished from real people with the condition. A series of psychiatrists were therefore allowed to chat with a series of runs of the program, as well as with actual people diagnosed with paranoid schizophrenia. All conversations were through a computer. The psychiatrists were not told in advance which were which. Other psychiatrists were later allowed to read the transcripts of those conversations. All were asked to pick out the people and the programs. The result was that they could only correctly tell which was a human and which was PARRY about half the time. As that is about as good as tossing a coin, it suggests the model of behaviour was convincing.

As ELIZA was simulating a mental health doctor and PARRY a patient someone had the idea of letting them talk to each other. ELIZA (as the DOCTOR) was given the chance to chat with PARRY several times. You can read one of the conversations between them here. Do they seem believably human? Personally, I think PARRY comes across more convincingly human-like, paranoid or not!


Activity for you to do…

If you can program, why not have a go at writing your own chatbot? If you can’t, writing a simple chatbot is quite a good project to use to learn, as long as you start simple with fixed conversations. As you make it more complex it can, like ELIZA and PARRY, be based on looking for keywords in the things the other person types, together with template responses, as well as some fixed starter questions, also used to change the subject. It is easier if you stick to a single area of interest (make it football mad, for example): “What’s your favourite team?” … “Liverpool” … “I like Liverpool because of Klopp, but I support Arsenal.” … “What do you think of Arsenal?” …
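To get started, here is one possible minimal version of such a football-mad chatbot in Python (the teams and replies are just the examples from above; add your own):

```python
# A minimal football-mad chatbot: keyword triggers with template
# replies, plus fixed questions used to start off or change the
# subject when nothing matches.
KEYWORD_REPLIES = {
    "liverpool": "I like Liverpool because of Klopp, but I support Arsenal.",
    "arsenal": "Arsenal are my team! What do you think of them?",
    "goal": "Nothing beats a last-minute goal!",
}
CHANGE_SUBJECT = [
    "What's your favourite team?",
    "Who do you think will win the league?",
]

def reply(user_text, turn):
    words = user_text.lower()
    for keyword, answer in KEYWORD_REPLIES.items():
        if keyword in words:
            return answer
    # No keyword matched: change the subject with a fixed question.
    return CHANGE_SUBJECT[turn % len(CHANGE_SUBJECT)]

print(reply("I support Liverpool", 0))
print(reply("The weather is nice", 1))
```

Wrap `reply` in a loop reading user input and you have a working (if easily confused) conversation partner to build on.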

Alternatively, perhaps you could write a chatbot to bring Marvin to life, depressed about everything he is asked to do, if that is not too depressingly simple, should you have a brain the size of a planet.



The machines can translate now

by Paul Curzon, Queen Mary University of London

(From the cs4fn archive)

“The Machines can translate now…
…I SAID ‘THE MACHINES CAN TRANSLATE NOW'”

Portion of the Rosetta Stone which has the same text written in three languages.

The stereotype of the Englishman abroad when confronted by someone who doesn’t speak English is just to say it louder. That could soon be a thing of the past as portable devices start to gain speech recognition skills and as the machines get better at translating between languages.

Traditionally machine translation has involved professional human linguists manually writing lots of translation rules for the machines to follow. Recently there have been great advances in what is known as statistical machine translation, where the machine learns the translation rules automatically. It does this using a parallel corpus*: just lots of pairs of sentences; one a sentence in the original language, the other its translation. Parallel corpora* are extracted from multi-lingual news sources like the BBC web site, where professional human translators have done the translations.
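To get a feel for how a machine might learn from sentence pairs, here is a drastically simplified sketch. Real statistical translators (for example those based on the IBM alignment models) are far more sophisticated, and the tiny French-English corpus here is invented; the sketch just counts which English words appear alongside each foreign word and guesses the most frequent partner:

```python
from collections import Counter, defaultdict

# An invented three-sentence French-English parallel corpus.
parallel_corpus = [
    ("la maison", "the house"),
    ("la voiture", "the car"),
    ("une maison", "a house"),
]

# Count, for every foreign word, the English words it co-occurs with.
cooccur = defaultdict(Counter)
for foreign, english in parallel_corpus:
    for f_word in foreign.split():
        cooccur[f_word].update(english.split())

def translate_word(f_word):
    # Guess: the English word seen most often alongside this word.
    return cooccur[f_word].most_common(1)[0][0]

print(translate_word("maison"))  # "house" co-occurs twice, beating "the" and "a"
```

Even this crude counting pulls the right pairings out of the data; scaling the same statistical idea up to millions of sentence pairs, whole phrases and word order is what made statistical machine translation work.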

Let’s look at an example translation of the accompanying original Arabic:

Machine Translation: Baghdad 1-1 (AFP) – The official Iraqi news agency reported that the Chinese vice-president of the Revolutionary Command Council in Iraq, Izzat Ibrahim, met today in Baghdad, chairman of the Saudi Export Development Center, Abdel Rahman al-Zamil.

Human Translation: Baghdad 1-1 (AFP) – Iraq’s official news agency reported that the Deputy Chairman of the Iraqi Revolutionary Command Council, Izzet Ibrahim, today met with Abdul Rahman al-Zamil, Managing Director of the Saudi Center for Export Development.

This example shows a sentence from an Arabic newspaper then its translation by the Queen Mary University of London’s statistical machine translator, and finally a translation by a professional human translator. The statistical translation does allow a reader to get a rough understanding of the original Arabic sentence. There are several mistakes, though.

Mistranslating the “Managing Director” of the export development center as its “chairman” is perhaps not too much of a problem. Mistranslating “Deputy Chairman” as the “Chinese vice-president” is very bad. That kind of mistranslation could easily lead to grave insults!

That reminds me of the point in ‘The Hitch-Hiker’s Guide to the Galaxy’ where Arthur Dent’s words “I seem to be having tremendous difficulty with my lifestyle,” slip down a wormhole in space-time to be overheard by the Vl’hurg commander across a conference table. Unfortunately this was understood in the Vl’hurg tongue as the most dreadful insult imaginable, resulting in them waging terrible war for centuries…

For now the humans are still the best translators, but the machines are learning from them fast!

*corpus and corpora = singular and plural for the word used to describe a collection of written texts, literally a ‘body’ of text. A corpus might be all the works written by one author, corpora might be works of several authors.
