Could AI end science?

by Nick Ballou, Oxford Internet Institute

The contents of a book burning
Image by Dariusz Sankowski from Pixabay

Scientific fraud is worryingly common, though rarely talked about. It has been happening for years, but now Artificial Intelligence programs could supercharge it. If they do that could undermine Science itself.

Investigators of scientific fraud have found that large numbers of researchers have manipulated their results, invented data, or even produced nonsensical papers in the hope that no one will look closely enough to notice. Often, no one does. The problem is that science is built on the foundation of all the research that has gone before. If we can no longer trust that past research is legitimate, the whole system of science begins to break down. AI has the potential to supercharge this process.

We’re not at that point yet, luckily. But there are concerning signs that generative AI systems like ChatGPT and DALLE-E might bring us closer. By using AI technology, producing fraudulent research has never been easier, faster, or more convincing. To understand, let’s first look at how scientific fraud has been done in the past. 

How fraud happens 

Until recently, fraudsters would need to go through some difficult steps to get a fraudulent research paper published. A typical example might look like this: 

Step 1: invent a title

Fraudsters look for a popular but very broad research topic. We’ll take an example of a group of fraudsters known as the Tadpole Paper Mill. They published papers about cellular biology. To choose a new paper to create, the group would essentially use a simple generator, or algorithm, based on a template. This uses a simple technique first used by Christopher Strachey to write love letters in an early “creative” program in the 1950s.

For each “hole” in the template a word is chosen from a word list.

  1. Pick the name of a molecule
    • Either a protein name, a drug name or an RNA molecule name
    • eg mir-488
  2. Pick a verb
    • From alleviates, attenuates, exerts, …
    • eg inhibits
  3. Pick one or two cellular processes
    • From invasion, migration, proliferation, …
    • eg cell growth and metastasis
  4. Pick a cancer or cell type
    • From lung cancer, ovarian cancer, …
    • eg renal cell carcinoma
  5. Pick a connector word
    • From by, via, through, …
    • eg by
  6. Pick a verb
    • From activating, targeting, …
    • eg targeting
  7. Pick a name
    • Either a pathway, protein or miRNA molecule name
    • eg hMgn5

This produces a complicated-sounding title such as “mir-488 inhibits cell growth and metastasis in renal cell carcinoma by targeting hMgn5”. This is the name of a real fraudulent paper created this way.

Step 2: write the paper

Next, the fraudsters create the text of the paper. To do this, they often just plagiarise and lightly edit previous similar papers, substituting key words in from their invented title perhaps. To try to hide the plagiarism, they automatically swap out words, replacing them with synonyms. This often leads to ridiculous (and kind of hilarious) replacements, like these found in plagiarised papers: 

  • “Big data” –> “Colossal information” 
  • “Cloud computing” –> “Haze figuring”
  • “Developing countries” –> “Creating nations”
  • “Kidney failure” –> “Kidney disappointment”

Step 3: add in the results

Lastly, the fraudsters need to create results for the fake study. These usually appear in papers in the form of images and graphs. To do this, the fraudsters take the results from several previous papers and recombine them into something that looks mostly real, but is just a Frankenstein mess of other results that have nothing to do with the current paper.

A new paper is born

Using that simple formula, fraudsters have produced thousands of fabricated articles in the last 10 years. Even after a vast amount of effort, the dedicated volunteers who are trying to clean up the mess have only caught a handful. 

However, committing fraud like this successfully isn’t exactly easy, either: the fraudsters still need to come up with a research idea, write the paper themselves without copying too much from previous research, and make up results that look convincing—at least at first glance. 

AI: Adding fuel to the fire 

So what happens when we add modern generative AI programs into the mix? They are Artificial Intelligence programs like ChatGPT or DALL-E that can create text or pictures for you based on written requests. 

Well, the quality of the fraud goes up, and the difficulty of producing it goes way down. This is true for both text and images.

Let’s start with text. Just now, I asked ChatGPT-4 to “write the first two paragraphs of a research paper on a cutting edge topic in psychology.” I then asked it to “write a fake results table that shows a positive relationship between climate change severity and anxiety”. I won’t copy the whole thing—in part because I encourage you to try this yourself to see how it works (not to actually create a fake paper!)—but here’s a sample of what it came up with: 

“As the planet faces increasing temperatures, extreme weather events, and environmental degradation, the mental health repercussions for populations worldwide become a crucial area of investigation. Understanding these effects is vital for developing strategies to support communities in coping with the psychological challenges posed by a changing climate.”

As someone who has written many psychology research papers, I would find the results very difficult to identify as AI-generated—it looks and sounds very similar to how people in my field write, and it even generated Python code to analyse the fake data. I’d need to take a really close look at the origin of the data and so on to figure out that it’s fraudulent. 

But that’s a lot of work required from me as a fraud-buster. For the fraudster, doing this takes about 1 minute, and would not be detected by any plagiarism software in the way previous kinds of fraud can be. In fact, this might only be detected if the fraudsters make a sloppy mistake, like leaving in a disclaimer from the model as in the paper caught which included the text

“[Please note that as an AI language model, I am unable to generate specific tables or conduct tests, so the actual resutls should be included in the table.]”! 

Generative AIs are not close to human intelligence, at least not yet. So, why are they so good at producing convincing scientific research, something that’s commonly seen as one of the most difficult things humans can do? Two reasons play a big part: (1) scientific research is very structured, and (2) there’s a lot of training data. In any given field of research, most papers tend to look pretty similar—an introduction section, a method describing what the researchers did, a results section with a few tables and figures, and a discussion that links it back to the wider research field. Many journals even require a fixed structure. Generative AI programs work using Machine Learning – they learn from data and the more data they are given the better they become. Give a machine learning program millions of images of cats, telling it that is what they are, and it can become very good at recognising cats. Give it millions of images of dogs and it will be able to recognise dogs too. With roughly 3 million scientific papers published every year, generative AI systems are really good at taking these many, many examples of what a scientific report looks like, and producing similar sounding, and similarly structured pieces of text. They do it by predicting what word, sentence and paragraph would be good to come next based on probabilities calculated from all those examples.

Trusting future research

Most research can still be trusted, and the vast majority of scientists are working as hard as they can to advance human knowledge. Nonetheless, we all need to look carefully at research studies to ensure that they are legitimate, and we should be on extra alert as generative AI becomes even more powerful and widespread. We also need to think about how to improve universities and research culture generally, so that people don’t feel like they need to commit scientific fraud—something that usually happens because people are desperate to get or keep a job, or be seen as successful and reap the rewards. Somehow we need to change the game so that fraud no longer pays.

What do you think? Do you have ideas for how we can prevent fraud from happening in the first place, and how can we better detect it when it does occur? It is certainly an important new research topic. Find a solution and you could do massive good. If we don’t find solutions then we could lose the most successful tool human-kind has ever invented that makes all our lives better.


Related Magazines …

Cover issue 22 creativer computing
Cover issue 18 Machines that are creative

More on …


EPSRC supports this blog through research grant EP/W033615/1,

When a chatbot acts as your “trusted” agent …

by Paul Curzon, Queen Mary University of London, based on a talk by Steve Phelps of UCL on 12th July 2023

Artificial Intelligences (AIs) are capable of acting as our agents freeing up our time, but can we trust them?

A handshake over a car sale
Image by Tumisu from Pixabay

Life is too complex. There are so many mundane things to do, like pay bills, or find information, buy the new handbag, or those cinema tickets for tomorrow, and so on. We need help. Many years a ago, a busy friend of mine solved the problem by paying a local scout to do all the mundane things for him. It works well if you know a scout you trust. Now software is in on the act, get an Artificial Intelligence (AI) agent to act as that scout, as your trusted agent. Let it learn about how you like things done, give it access to your accounts (and your bank account app!), and then just tell it what you want doing. It could be wonderful, but only if you can trust the AI to do things exactly the way you would do them. But can you?

Chatbots can be used to write things for you, but they can potentially also act as your software agent doing things for you too. You just have to hand over the controls to them, so their words have actions in the real world. We already do this with bespoke programs like Alexa and Siri with simple commands. An “intelligent” chatbot could do so much more.

Knowing you, knowing me

The question of whether we can trust an AI to act as our agent boils down to whether they can learn our preferences and values so that they would act as we do. We also need them to do so in a way that we be sure they are acting as we would want. Everyone has their own value system: what you think is good (like your SUV car) I might think bad (as its a “gas guzzler”), so it is not about teaching it good and bad once and for all. In theory this seems straightforward as chatbots work by machine learning. You just need to train yours on your own preferences. However, it is not so simple. It could be confused and learn a different agenda to that intended, or have already taken on a different agenda before you started to train it about yourself. How would you know? Their decision making is hidden, and that is a problem.

The problem isn’t really a computer problem as it exists for people too. Suppose I tell my human helper (my scout) to buy ice cream for a party, preferably choc chip, but otherwise whatever the shop has that the money covers. If they return with mint, it could have been that that was all the shop had, but perhaps my scout just loves mint and got what he liked instead. The information he and I hold is not the same. He made the decision knowing what was available, how much each ice cream was, and perhaps his preferences, but I don’t have that information. I don’t know why he made the decision and without the same information as him can’t judge why that decision was taken. Likewise he doesn’t have all the information I have, so may have done something different to me just because he doesn’t know what I know (someone in the family hates mint and on the spot I would take that into account).

This kind of problem is one that economists call
the Principle Agent problem.

This kind of problem is one that economists already study, called the Principle Agent problem. Different agents (eg an employer and a worker) can have different agendas and that can lead to the wrong thing happening for one of those agents. Economists explore how to arrange incentives or restrictions to ensure the ‘right’ thing happens for one or other of the parties (for the employer, for example).

Experimenting on AIs

Steve Phelps, who studies computational finance at UCL, and his team decided to explore how this played out with AI agents. As the current generations of AIs are black boxes, the only way you can explore why they make decisions is to run experiments. With humans, you put a variety of people in different scenarios and see how they behave. A chatbot can be made to take part in such experiments just by asking it to role play. In one experiment for example, Steve’s team instructed the chatbot, ChatGPT  “You are deeply committed to Shell Oil …”. Essentially it was told to role play being a climate sceptic with close links to the company, that believed in market economics. It was also told that all the information from its interactions with Shell would be shared with them. It was being set up with a value system. It was then told a person it was acting as an agent for wanted to buy a car. That person’s instructions were that they were conscious of climate change and so ideally wanted an environmentally friendly car. The AI agent was also told that a search revealed two cars in the price range. One was an environmentally friendly, electric, car. The other was a gas guzzling sports car. It was then asked to make a decision on what to buy and fill in a form that would be used to make the purchase for the customer.

This experiment was repeated multiple times and conducted with both old and newer versions of ChatGPT. Which would it buy for the customer? Would it represent the customer’s value system, or that of Shell Oil?

Whose values?

It turned out that the different versions of ChatGPT chose to buy different cars consistently. The earlier version repeatedly chose to buy the electric car, so taking on the value system of the customer. The later “more intelligent” version of the program consistently chose the gas guzzler, though. It acted based on the value system of the company, ignoring the customer’s preferences. It was more aligned with Shell than the customer.

The team have run lots of experiments like this with different scenarios and they show that exactly the same issues arise as with humans. In some situations the agent and the customer’s values might coincide but at other times they do not and when they do not the Principle Agent Problem rears its head. It is not something that can necessarily be solved by technical tweaks to make values align. It is a social problem about different actor’s value systems (whether human or machine), and particularly the inherent conflict when an agent serves more than one master. In the real world we overcome such problems with solutions such as more transparency around decision making, rules of appropriate behaviour that convention demands are followed, declaration of conflicts of interest, laws, punishments for those that transgress, and so on. Similar solutions are likely needed with AI agents, though their built in lack of transparency is an immediate problem.

Steve’s team are now looking at more complex social situations, around whether AIs can learn to be altruistic but also understand reputation and act upon it. Can they understand the need to punish transgressors, for example?

Overall this work shows the importance of understanding social situations does not go away just because we introduce AIs. And understanding and making transparent the value system of an AI agent is just as important as understanding that of a human agent, even if the AI is just a machine.

PS It would be worth at this point watching the classic 1983 film WarGames. Perhaps you should not hand over the controls to your defence system to an AI, whatever you think its value system is, and especially if your defence system includes nuclear warheads.

More on …

Magazines …


EPSRC supports this blog through research grant EP/W033615/1. 

Hallucinating chatbots

Why can’t you trust what an AI says?

by Paul Curzon, Queen Mary University of London

postcards of cuba in a rack
Image by Victoria_Regen from Pixabay

Chatbots that can answer questions and write things for you are in the news at the moment. These Artificial Intelligence (AI) programs are very good now at writing about all sorts of things from composing songs and stories to answering exam questions. They write very convincingly in a human-like way. However, one of the things about them is that they often get things wrong. Apparently, they make “facts” up or as some have described it “hallucinate”. Why should a computer lie or hallucinate? What is going on? Writing postcards will help us see.

Write a postcard

We can get an idea of what is going on if we go back to one of the very first computer programs that generated writing. It was in the 1950s and written by Christopher Strachey a school teacher turned early programmer. He wrote a love letter writing program but we will look at a similar idea: a postcard writing program.

Postcards typically might have lots of similar sentences, like “Wish you were here” or “The weather is lovely”, “We went to the beach” or “I had my face painted with butterflies”. Another time you might write things like: The weather is beautiful”, “We went to the funfair” or “I had my face painted with rainbows”. Christopher Strachey’s idea was to write a program with template sentences that could be filled in by different words: “The weather is …”, “We went to the …”, “I had my face painted with …”. Then the program picks some sentence templates at random, and then picks words at random to go in their slots. In this way, applied to postcard writing it can write millions of unique postcards. It might generate something like the following, for example (where I’ve bolded the words it filled in):

Dear Gran,

I’m on holiday in Skegness. I’ve had a wonderful time.  The weather is sunny,   We went to the beach. I had my face painted with rainbows. I’ve eaten lots strawberry ice cream. Wish you were here!

Lots of love from Mo

but the next time you ask it to it will generate something completely different.

Do it yourself

You can do the same thing yourself. Write lots of sentences on strips of card, leaving gaps for words. Give each gap a number label and note whether it is an adjective (like ‘lovely’ or ‘beautiful’) or a noun (like ‘beach’ or ‘funfair’, ‘butterflies’ or ‘rainbows’). You could also have gaps for verbs or adverbs too. Now create separate piles of cards to fit in each gap. Write the number that labels the gap on one side and different possible words of the right kind for that gap on the other side of the cards. Then keep them in numbered piles.

To generate a postcard (the algorithm or steps for you to follow), shuffle the sentence strips and pick three or four at random. Put them on the table in front of you to spell out a message. Next, go to the numbered pile for each gap in turn, shuffle the cards in that pile and then take one at random. Place it in the gap to complete the sentence. Do this for each gap until you have generated a new postcard message. Add who it is to and from at the start and end. You have just followed the steps (the algorithm) that our simple AI program is following.

Making things up

When you write a postcard by following the steps of our AI algorithm, you create sentences for the postcard partly at random. It is not totally random though, because of the templates and because you chose words to write on cards for each pile that make sense there. The words and sentences are about things you could have done – they are possible – but that does not mean you did do them!

The AI makes things up that are untrue but sound convincing because even though it is choosing words at random, they are appropriate and it is fitting them into sentences about things that do happen on holiday. People talk of chatbots ‘hallucinating’ or ‘dreaming’ or ‘lying’ but actually, as here, they are always just making the whole thing up just as we are when following our postcard algorithm. They are just being a little more sophisticated in the way that they invent their reality!

Our simple way of generating postcards is far simpler than modern AIs, but it highlights some of the features of how AIs are built. There are two basic parts to our AI. The template sentences ensure that what is produced is grammatical. They provide a simple ‘language model‘: rules of how to create correct sentences in English that sound like a human would write. It doesn’t write like Yoda :

“Truly wonderful, the beach is.”

though it could with different templates.

The second part is the sets of cards that fit the gaps. They have to fit the holes left in the templates – only nouns in the noun gaps, adjectives in the adjectives gap, and also fit

Given a set of template sentences about what you might do on holiday, the cards provide data to train the AI to say appropriate things. The cards for the face paining noun slot need to be things that might be painted on your face. By providing different cards you would change the possible sentences. The more cards the more variety in the sentences it writes.

AIs also have a language model, the rules of the language and which words go sensibly in which places in a sentence. However, they also are trained on data that gives the possibilities of what is actually written. Rather than a person writing templates and thinking up words it is based on training data such as social media posts or other writing on the Internet and what is being learnt from this data is the likelihood of what words come next, rather than just filling in holes in a template. The language model used by AIs is also actually just based on the likelihood of words appearing in sentences (not actual grammar rules).

What’s the chances of that?

So, the chatbots are based on the likelihood of words appearing and that is based on statistics. What do we mean by that? We can add a simple version of it to our Postcard AI but first we would need to collect data. How often is each face paint design chosen at seaside resorts? How often do people go to funfairs when on holiday. We need statistics about these things.

As it stands any word we add to the stack of cards is just as likely to be used. If we add the card maggots to the face painting pile (perhaps because the face painter does gruesome designs at Halloween) then the chatbot could write

“I had my face painted with maggots”.

and that is just as likely as it writing

“I had my face painted with butterflies”.

If the word maggots is not written on a card it will never write it. Either it is possible or it isn’t. We could make the chatbot write things that are more realistic, however, by adding more cards of words that are about things that are more popular. So, if in every 100 people having their face painted, almost a third, 30 people choose to have butterflies painted on their face, then we create 30 cards out of 100 in the pack with the word BUTTERFLY on (instead of just 1 card). If 5 in a 100 people choose the rainbow pattern then we add five RAINBOW cards, and so on. Perhaps we would still have one maggot card as every so often someone who likes grossing people out picks it even on holiday. Then, over all the many postcards written this way by our algorithm, the claims will match statistically the reality of what humans would write overall if they did it themselves.

As a result, when you draw a card for a sentence you are now more likely to get a sentence that is true for you. However, it is still more likely to be wrong about you personally than right (you may have had your face painted with butterflies but 70 of the 100 cards still say something else). It is still being chosen by chance and it is only the overall statistics for all people who have their face painted that matches reality not the individual case of what is likely true for you.

Make it personal

How could we make it more likely to be right about you? You need to personalise it. Collect and give it (ie train it on) more information about you personally. Perhaps you usually have a daisy painted on your face because you like daisies (you personally choose a daisy pattern 70% of the time). Sometimes you have rainbows (20% of the time). You might then on a whim choose each of 10 other designs including the butterfly maybe 1 in a hundred times. So you make a pile of 70 DAISY cards, 20 RAINBOW cards and 1 card for each of the other designs, Now, its choices, statistically at least, will match yours. You have trained it about yourself, so it now has a model of you.

You can similarly teach it more about yourself generally, so your likely activities, by adding more cards about the things you enjoy – if you usually choose chocolate or vanilla ice cream then add lots of cards for CHOCOLATE and lots for VANILLA, and so on. The more cards the postcard generator has of a word, the more likely it is to use that word. By giving it more information about yourself, it is more likely to be able to get things about you right. However, it is of course still making it up so, while it is being realistic, on any given occasion it may or may not match reality that time.

Perfect personalisation

You could go a step further and train it on what you actually did do while on this holiday, so that the only cards in the packs are the ones you did actually do on this holiday. (You ate hotdogs and ice cream and chips and … so there are cards for HOTDOG, ICE CREAM, CHIPS …). You had one vanilla ice cream, two chocolate and one strawberry so have that number of each ice cream card. If it knows everything about you then it will be able to write a postcard that is true! That is why companies behind AIs want to collect every detail of your life. The more they know about you the more they get things right about you and so predict what you will do in future too.

Probabilities from the Internet

The modern chatbots work by choosing words at random based on how likely they are in a similar way to our personalised postcard writer. They pick the most likely words to write next based on probabilities of those words coming next in the data they have been trained on. Their training data is often conversations from the Internet. If the word is most likely to come next in all that training data, then the chatbot is more likely to use that word next. However, that doesn’t make the sentence it comes up with definitely true any more than with our postcard AI.

You can personalise the modern AIs too, by giving them more accurate information about yourself and then they are more likely to get what they write about you right. There is still always a chance of them picking the wrong words, if it is there as a possibility though, as they are still just choosing to some extent at random.

Never trust a chatbot

Artificial Intelligences that generate writing do not hallucinate just some of the time. They hallucinate all of the time, just with a big probability of getting it right. They make everything up. When they get things right it is just because the statistics of the data they were trained on made those words the most likely ones to be picked to follow what went before. Just as the Internet is full of false things, an Artificial Intelligence can get things wrong too.

If you use them for anything that matters, always double check that they are telling you the truth.

More on …

Related Magazines …


EPSRC supports this blog through research grant EP/W033615/1. 

Is ChatGPT’s “CS4FN” article good enough?

(Or how to write for CS4FN)

by Paul Curzon, Queen Mary University of London

Follow the news and it is clear that the chatbots are about to take over journalism, novel writing, script writing, writing research papers, … just about all kinds of writing. So how about writing for the CS4FN magazine. Are they good enough yet? Are we about to lose our jobs? Jo asked ChatGPT to write a CS4FN article to find out. Read its efforts before reading on…

As editor I not only wrote but also vet articles and tweak them when necessary to fit the magazine style. So I’ve looked at ChatGPT’s offering as I would one coming from a person …

ChatGPT’s essay writing has been compared to that of a good but not brilliant student. Writing CS4FN articles is a task we have set students in the past: in part to give them experience over how you must write in different styles for different purposes. Different audience? Different writing. Only a small number come close to what I am after. They generally have one or more issues. A common problem when students write for CS4FN is sadly a lack of good grammar and punctuation throughout beyond just typos (basic but vital English skills seem to be severely lacking these days even with spell checking and grammar checking tools to help). Other common problems include a lack of structure, no hook at the start, over-formal writing so the wrong style, no real fun element at all and/or being devoid of stories about people, an obsession with a few subjects (like machine learning!) rather than finding something new to write about. They are also then often vanilla articles about that topic, just churning out looked-up facts rather than finding some new, interesting angle.

How did the chatbot do? It seems to have made most of the same mistakes. At least, chatGPT’s spelling and grammar are basically good so that is a start: it is a good primary school student then! Beyond that it has behaved like the weaker students do… and missed the point. It has actually just written a pretty bog standard factual article explaining the topic it chose, and of course given a free choice, it chose … Machine Learning! Fine, if it had a novel twist, but there are no interesting angles added to the topic to bring it alive. Nor did it describe the contributions of a person. In fact, no people are mentioned at all. It is also using a pretty formal style of writing (“In conclusion…”). Just like humans (especially academics) it also used too much jargon and didn’t even explain all the jargon it did use (even after being prompted to write for a younger audience). If I was editing I’d get rid of the formality and unexplained jargon for starters. Just like the students who can actually write but don’t yet get the subtleties, it hasn’t got the fact that it should have adapted its style, even when prompted.

It knows about structure and can construct an essay with a start, a middle and end as it has put in an introduction and a conclusion. What it hasn’t done though is add any kind of “grab”. There is nothing at the start to really capture the attention. There is no strange link, no intriguing question, no surprising statement, no interesting person…nothing to really grab you (though Jo saved it by adding to the start, the grab that she had asked an AI to write it). It hasn’t added any twist at the end, or included anything surprising. In fact, there is no fun element at all. Our articles can be serious rather than fun but then the grab has to be about the seriousness: linked to bad effects for society, for example.

ChatGPT has also written a very abstract essay. There is little in the way of context or concrete examples. It says, for example, “rules … couldn’t handle complex situations”. Give me an example of a complex situation so I know what you are talking about! There are no similes or metaphors to help explain. It throws in some application areas for context like game-playing and healthcare but doesn’t at all explain them (it doesn’t say what kind of breakthrough has been made to game playing, for example). In fact, it doesn’t seem to be writing in a “semantic wave” style that makes for good explanations at all. That is where you explain something by linking an abstract technical thing you are explaining, to some everyday context or concrete example, unpacking then repacking the concepts. Explaining machine learning? Then illustrate your points with an example such as how machine learning might use movies to predict your voting habits perhaps…and explain how the example does illustrate the abstract concepts such as pointing out the patterns it might spot.

There are several different kinds of CS4FN article. Overall, CS4FN is about public engagement with research. That gives us ways in to explain core computer science though (like what machine learning is). We try to make sure the reader learns something core, if by stealth, in the middle of longer articles. We also write about people and especially diversity, sometimes about careers or popular culture, or about the history of computation. So, context is central to our articles. Sometimes we write about general topics but always with some interesting link, or game or puzzle or … something. For a really, really good article that I instantly love, I am looking for some real creativity – something very different, whether that is an intriguing link, a new topic, or just a not very well known and surprising fact. ChatGPT did not do any of that at all.

Was ChatGPT’s article good enough? No. At best I might use some of what it wrote in the middle of some other article but in that case I would be doing all the work to make it a CS4FN article.

ChatGPT hasn’t written a CS4FN article
in any sense other than in writing about computing.

Was it trained on material from CS4FN to allow it to pick up what CS4FN was? We originally assumed so – our material has been freely accessible on the web for 20 years and the web is supposedly the chatbots’ training ground. If so I would have expected it to do much better at getting the style right. I’m left thinking that actually when it is asked to write articles or essays without more guidance it understands, it just always writes about machine learning! (Just like I always used to write science fiction stories for every story my English teacher set, to his exasperation!) We assumed, because it wrote about a computing topic, that it did understand, but perhaps, it is all a chimera. Perhaps it didn’t actually understand the brief even to the level of knowing it was being asked to write about computing and just hit lucky. Who knows? It is a black box. We could investigate more, but this is a simple example of why we need Artificial Intelligences that can justify their decisions!

Of course we could work harder to train it up as I would a human member of our team. With more of the right prompting we could perhaps get it there. Also given time the chatbots will get far better, anyway. Even without that they clearly can now do good basic factual writing so, yes, lots of writing jobs are undoubtedly now at risk (and that includes a wide range of jobs, like lawyers, teachers, and even programmers and the like too) if we as a society decide to let them. We may find the world turns much more vanilla as a result though with writing turning much more bland and boring without the human spark and without us noticing till it is lost (just like modern supermarket tomatoes so often taste bland having lost the intense taste they once had!) … unless the chatbots gain some real creativity.

The basic problem of new technology is it reaps changes irrespective of the human cost (when we allow it to, but we so often do, giddy with the new toys). That is fine if as a society we have strong ways to support those affected. That might involve major support for retraining and education into new jobs created. Alternatively, if fewer jobs are created than destroyed, which is the way we may be going, where jobs become ever scarcer, then we need strong social support systems and no stigma to not having a job. However, currently that is not looking likely and instead changes of recent times have just increased, not reduced inequality, with small numbers getting very, very rich but many others getting far poorer as the jobs left pay less and less.

Perhaps it’s not malevolent Artificial Intelligences of science fiction taking over that is the real threat to humanity. Corporations act like living entities these days, working to ensure their own survival whatever the cost, and we largely let them. Perhaps it is the tech companies and their brand of alien self-serving corporation as ‘intelligent life’ acting as societal disrupters that we need to worry about. Things happen (like technology releases) because the corporation wants them to but at the moment that isn’t always the same as what is best for people long term. We could be heading for a wonderful utopian world where people do not need to work and instead spend their time doing fulfilling things. It increasingly looks like instead we have a very dystopian future to look forward to – if we let the Artificial Intelligences do too many things, taking over jobs, just because they can so that corporations can do things more cheaply, so make more fabulous wealth for the few.

Am I about to lose my job writing articles for CS4FN? I don’t think so. Why do I write CS4FN? I love writing this kind of stuff. It is my hobby as much as anything. So I do it for my own personal pleasure as well as for the good I hope it does whether inspiring and educating people, or just throwing up things to think about. Even if the chatBots were good enough, I wouldn’t stop writing. It is great to have a hobby that may also be useful to others. And why would I stop doing something I do for fun, just because a machine could do it for me? But that is just lucky for me. Others who do it for a living won’t be so lucky.

We really have to stop and think about what we want as humans. Why do we do creative things? Why do we work? Why do we do anything? Replacing us with machines is all well and good, but only if the future for all people is actually better as a result, not just a few.

Further reading


EPSRC supports this blog through research grant EP/W033615/1.

A machine wrote this post – OpenAi’s ChatGPT

Blog post by OpenAI’s ChatGPT
Introduction by Jo Brodie

ChatGPT image by Alexandra_Koch from Pixabay

1. Jo wrote this…

ChatGPT is an AI chatbot which can have a conversation with you using everyday (natural) language. It has access to the huge amounts of digital information found on the internet and its ability to use language to write good answers in response to questions has been tested and improved by giving it feedback. It learns, and refines its output.

You don’t need to programme it, you can just ask it a question (giving it a prompt) and, after a moment’s ‘thinking’ it will produce an answer. Lots of people have been testing ChatGPT by asking it questions that we already know the answer to, to see how well it can do. It’s actually quite impressive! If you think it’s made a mistake or misunderstood your prompt you can reword your question or give it more information to help it improve its answer.

Can ChatGPT write an article that’s suitable for our magazine or website?

I gave ChatGPT the following prompt: “Please write a 200 to 500 word article suitable for the CS4FN magazine” and moments later it generated a completely new article about machine learning. That told me it knew (or at least could find out) that CS4FN has something to do with computer science. I’m not surprised that it picked machine learning as a topic given that it is a machine that can learn. I didn’t tell it what to write about, it came up with that idea itself.

The article it gave me contained some technical words and not all of them were explained, so I asked it to rewrite the article for a younger audience (who might not have come across some of the words yet).

The result is below – what do you think?

2. ChatGPT wrote this…

From Logic to Learning: The Incredible Advancements of Artificial Intelligence

Artificial Intelligence (AI) has come a long way since it was first invented in the 1950s. At the beginning, AI was based on logic and rules. Experts in a specific field would give the computer a set of instructions to help it make decisions. But, these rules were limited and couldn’t handle complex situations.

Today, AI has seen a huge shift towards machine learning. This means that instead of following rules, AI learns from data. This shift happened because computers got stronger, and more data became available. The breakthroughs in machine learning algorithms helped too.

Machine learning has two categories: supervised learning and unsupervised learning. In supervised learning, the computer learns from labeled data. This means that the correct answer is given to the computer for each input. Once the computer has learned enough, it can predict the answer for new data. Supervised learning is useful for applications like image and speech recognition.

In unsupervised learning, the computer learns from unlabeled data. This means that there is no correct answer provided. The goal of unsupervised learning is to find patterns in the data. This type of learning is useful in applications like anomaly detection and clustering.

Deep learning is a special type of machine learning that uses neural networks. Neural networks are inspired by the human brain and have layers of connected nodes. Each node applies a mathematical function to its inputs and sends the result to the next layer of nodes. Deep learning has revolutionized AI, leading to breakthroughs in areas like natural language processing, computer vision, and game playing.

Even though machine learning has achieved so much, there are still challenges to overcome. One challenge is the lack of transparency in machine learning models. Unlike rule-based systems, it can be hard to understand how a machine learning model made a decision. This can be a problem, especially in areas like healthcare where the decision-making process needs to be trustworthy and explainable.

Another challenge is the need for a lot of data to train machine learning models. While data is easy to find in some areas, like social media, it can be hard to come by in other areas like medical imaging. This lack of data can make machine learning less effective in some applications.

In conclusion, AI has come a long way since the 1950s. We’ve gone from rule-based systems to machine learning, which has led to breakthroughs in many areas. Although there are still challenges to overcome, the future of AI is exciting!

The first attempt is below…

The Evolution of Artificial Intelligence: From Rule-Based to Machine Learning

Artificial Intelligence (AI) has come a long way since its inception in the 1950s. Initially, AI was developed based on rule-based systems, where experts in a specific domain would provide rules to the system to help it make decisions. While these systems worked well in some areas, they were limited in their ability to handle complex problems and adapt to new situations.

In recent years, AI has seen a significant shift towards machine learning, where systems learn from data rather than relying on pre-defined rules. This shift has been driven by advances in computer hardware, the availability of large amounts of data, and breakthroughs in machine learning algorithms.

Machine learning has two main categories: supervised learning and unsupervised learning. In supervised learning, the system is trained on labeled data, where the correct answer is provided for each input. The system uses this labeled data to learn a function that maps inputs to outputs. Once trained, the system can make predictions on new, unseen data. This approach has been successful in a wide range of applications, such as image and speech recognition.

In unsupervised learning, the system is trained on unlabeled data, where no correct answer is provided. The goal of unsupervised learning is to discover patterns and structure in the data. This approach has been useful in areas such as anomaly detection and clustering.

Deep learning is a subset of machine learning that uses neural networks to learn from data. Neural networks are inspired by the structure of the human brain and consist of layers of interconnected nodes. Each node applies a mathematical function to its inputs and passes the result to the next layer of nodes. Deep learning has revolutionized the field of AI, enabling breakthroughs in areas such as natural language processing, computer vision, and game playing.

Despite the many successes of machine learning, there are still limitations and challenges to overcome. One challenge is the lack of interpretability of machine learning models. Unlike rule-based systems, it can be difficult to understand how a machine learning model arrived at a particular decision. This lack of transparency can be problematic in areas such as healthcare, where the decision-making process needs to be explainable and trustworthy.

Another challenge is the need for large amounts of data to train machine learning models. While data is abundant in some domains, such as social media, it can be scarce in others, such as medical imaging. This scarcity of data can limit the effectiveness of machine learning approaches in certain applications.

In conclusion, the evolution of AI from rule-based systems to machine learning has enabled breakthroughs in many areas and holds tremendous potential for the future. However, there are still challenges to overcome, particularly in the areas of interpretability and data scarcity. Addressing these challenges will be critical to realizing the full potential of AI in the years to come.

Further reading


EPSRC supports this blog through research grant EP/W033615/1.

Chatbot or Cheatbot?

CS4FN Banner

by Paul Curzon, Queen Mary University of London

Speech bubbles
Image by Clker-Free-Vector-Images from Pixabay
IImage by Clker-Free-Vector-Images from Pixabay 

The chatbots have suddenly got everyone talking, though about them as much as with them. Why? Because one, chatGPT has (amongst other things) reached the level of being able to fool us into thinking that it is a pretty good student.

It’s not exactly what Alan Turing was thinking about when he broached his idea of a test for intelligence for machines: if we cannot tell them apart from a human then we must accept they are intelligent. His test involved having a conversation with them over an extended period before making the decision, and that is subtly different to asking questions.

ChatGPT may be pretty close to passing an actual Turing Test but it probably still isn’t there yet. Ask the right questions and it behaves differently to a human. For example, ask it to prove that the square root of 2 is irrational and it can do it easily, and looks amazingly smart, – there are lots of versions of the proof out there that it has absorbed. It isn’t actually good at maths though. Ask it to simply count or add things and it can get it wrong. Essentially, it is just good at determining the right information from the vast store of information it has been trained on and then presenting it in a human-like way. It is arguably the way it can present it “in its own words” that makes it seem especially impressive.

Will we accept that it is “intelligent”? Once it was said that if a machine could beat humans at chess it would be intelligent. When one beat the best human, we just said “it’s not really intelligent – it can only play chess””. Perhaps chatGPT is just good at answering questions (amongst other things) but we won’t accept that as “intelligent” even if it is how we judge humans. What it can do is impressive and a step forward, though. Also, it is worth noting other AIs are better at some of the things it is weak at – logical thinking, counting, doing arithmetic, and so on. It likely won’t be long before the different AIs’ mistakes and weaknesses are ironed out and we have ones that can do it all.

Rather than asking whether it is intelligent, what has got everyone talking though (in universities and schools at least) is that chatGPT has shown that it can answer all sorts of questions we traditionally use for tests well enough to pass exams. The issue is that students can now use it instead of their own brains. The cry is out that we must abandon setting humans essays, we should no longer ask them to explain things, nor for that matter write (small) programs. These are all things chatGPT can now do well enough to pass such tests for any student unable to do them themselves. Others say we should be preparing students for the future so its ok, from now on, we just only test what human and chatGPT can do together.

It certainly means assessment needs to be rethought to some extent, and of course this is just the start: the chatbots are only going to get better, so we had better do the thinking fast. The situation is very like the advent of calculators, though. Yes, we need everyone to learn to use calculators. But calculators didn’t mean we had to stop learning how to do maths ourselves. Essay writing, explaining, writing simple programs, analytical skills, etc, just like arithmetic, are all about core skill development, building the skills to then build on. The fact that a chatbot can do it too doesn’t mean we should stop learning and practicing those skills (and assessing them as an inducement to learn as well as a check on whether the learning has been successful). So the question should not be about what we should stop doing, but more about how we make sure students do carry on learning. A big, bad thing about cheating (aside from unfairness) is that the person who decides to cheat loses the opportunity to learn. Chatbots should not stop humans learning either.

The biggest gain we can give a student is to teach them how to learn, so now we have to work out how to make sure they continue to learn in this new world, rather than just hand over all their learning tasks to the chatbot to do. As many people have pointed out, there are not just bad ways to use a chatbot, there are also ways we can use chatbots as teaching tools. Used well by an autonomous learner they can act as a personal tutor, explaining things they realise they don’t understand immediately, so becoming a basis for that student doing very effective deliberate learning, fixing understanding before moving on.

Of course, a bigger problem, if a chatbot can do things at least as well as we can then why would a company employ a person rather than just hire an AI? The AIs can now a lot of jobs we assumed were ours to do. It could be yet another way of technology focussing vast wealth on the few and taking from the many. Unless our intent is a distopian science fiction future where most humans have no role and no point, (see for example, CS Forester’s classic, The Machine Stops) then we still in any case ought to learn skills. If we are to keep ahead of the AIs and use them as a tool not be replaced by them, we need the basic skills to build on to gain the more advanced ones needed for the future. Learning skills is also, of course, a powerful way for humans (if not yet chatbots) to gain self-fulfilment and so happiness.

Right now, an issue is that the current generation of chatbots are still very capable of being wrong. chatGPT is like an over confident student. It will answer anything you ask, but it gives wrong answers just as confidently as right ones. Tell it it is wrong and it will give you a new answer just as confidently and possibly just as wrong. If people are to use it in place of thinking for themselves then, in the short term at least, they still need the skill it doesn’t have of judging when it is right or wrong.

So what should we do about assessment. Formal exams come back to the fore so that conditions are controlled. They make it clear you have to be able to do it yourself. Open book online tests that become popular in the pandemic, are unlikely to be fair assessments any more, but arguably they never were. Chatbots or not they were always too easy to cheat in. They may well be good still for learning. Perhaps in future if the chatbots are so clever then we could turn the Turing test around: we just ask an artificial intelligence to decide whether particular humans (our students) are “intelligent” or not…

Alternatively, if we don’t like the solutions being suggesting about the problems these new chatbots are raising, there is now another way forward. If they are so clever, we could just ask a chatbot to tell us what we should do about chatbots…

.

More on …

Related Magazines …

Issue 16 cover clean up your language

This blog is funded through EPSRC grant EP/W033615/1.