Testing AIs in Minecraft

by Paul Curzon, Queen Mary University of London

A complex Minecraft world with a lake, grasslands, mountains and a building — Image by allinonemovie from Pixabay

What makes a good environment for child AI learning development? Possibly the same as for human child learning development: Minecraft.

Lego is one of the best games to play for impactful learning development for children. The word Lego is based on the words Play and Well in Danish. In the virtual world, Minecraft has of course taken up the mantle. A large part of why they are wonderful games is because they are open-ended and flexible. There are infinite possibilities over what you can build and do. They therefore help encourage not just focussing on something limited to learn as many other games do, but support open-ended creativity and so educational development. Given how positive it can be for children, it shouldn’t be surprising that Minecraft is now being used to help AIs develop too.

Games have long been used to train and test Artificial Intelligence programs. Early programs were developed to play and ultimately beat humans at specific games like Checkers, Chess and then later Go. That mastered they started to learn to play individual arcade games as a way to extend their abilities. A key part of our intelligence is flexibility though, we can learn new games. Aiming to copy this, the AIs were trained to follow suit and so became more flexible, and showed they could learn to play multiple arcade games well.

This is still missing a vital part of our flexibility though. The thing about all these games is that the whole game experience is designed to be part of the game and so the task the player has to complete. Everything is there for a reason. It is all an integral part of the game. There are no pieces at all in a chess game that are just there to look nice and will never, ever play a part in winning or losing. Likewise all the rules matter. When problem solving in real life, though, most of the world, whether objects, the way things behave or whatever, is not there explicitly to help you solve the problem. It is not even there just to be a designed distractor. The real world also doesn’t have just a few distractors, it has lots and lots. Looking round my living room, for example, there are thousands of objects, but only one will help me turn on the tv.

AIs that are trained on games may, therefore, just become good at working in such unreal environments. They may need to be told what matters and what to ignore to solve problems. Real problems are much more messy, so put them in the real world, or even a more realistic virtual world, to problem solve and they may turn out to be not very clever at all. Tests of their skills that are based on such tasks may not really test them at all.

Researchers at the University of Witwatersrand in South Africa decided to tackle this issue, but using yet another game: Minecraft. Because Minecraft is an open-ended virtual world, tackling challenges created in it will involve working in a world that is much more than just about the problem itself. The Witwatersrand team’s resulting MinePlanner system is a collection of 45 challenges, some easy, some harder. They include gathering tasks (like finding and gathering wood) and building tasks (like building a log cabin), as well as tasks that include combinations of these things. Each comes in three versions. In the easy version nothing is irrelevant. The medium version contains a variety of extraneous things that are not at all useful to the task. The hard version is in a full Minecraft world where there are thousands of objects that might be used.

To tackle these challenges an AI (or human) needs to solve not just the complex problem set, but also work out for themselves what in the Minecraft world is relevant to the task they are trying to perform and what isn’t. What matters and what doesn’t?

The team hope that by setting such tests they will help encourage researchers to develop more flexible intelligences, taking us closer to having real artificial intelligence. The problems are proposed as a benchmark for others to test their AIs against. The Witwatersrand team have already put existing state-of-the-art AI planning systems to the test. They weren’t actually that great at solving the problems and even the best could not complete the harder tasks.

So it is back to school for the AIs but hopefully now they will get a much better, flexible and fun education playing games like Minecraft. Let’s just hope the robots get to play with Lego too, so they don’t get left behind educationally.

Magazines …

Issue 29 – Diversity

Front cover of CS4FN issue 29 - Diversity in Computing

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

Computers that read emotions

by Matthew Purver, Queen Mary University of London

Yellow balls with different expressions — Emoticons image by Gino Crescoli from Pixabay

One of the ways that computers could be more like humans – and maybe pass the Turing test – is by responding to emotion. But how could a computer learn to read human emotions out of words? Matthew Purver of Queen Mary University of London tells us how.

Have you ever thought about why you add emoticons to your text messages – symbols like 🙂 and :-@? Why do we do this with some messages but not with others? And why do we use different words, symbols and abbreviations in texts, Twitter messages, Facebook status updates and formal writing?

In face-to-face conversation, we get a lot of information from the way someone sounds, their facial expressions, and their gestures. In particular, this is the way we convey much of our emotional information – how happy or annoyed we’re feeling about what we’re saying. But when we’re sending a written message, these audio-visual cues are lost – so we have to think of other ways to convey the same information. The ways we choose to do this depend on the space we have available, and on what we think other people will understand. If we’re writing a book or an article, with lots of space and time available, we can use extra words to fully describe our point of view. But if we’re writing an SMS message when we’re short of time and the phone keypad takes time to use, or if we’re writing on Twitter and only have 140 characters of space, then we need to think of other conventions. Humans are very good at this – we can invent and understand new symbols, words or abbreviations quite easily. If you hadn’t seen the 😀 symbol before, you can probably guess what it means – especially if you know something about the person texting you, and what you’re talking about.

But computers are terrible at this. They’re generally bad at guessing new things, and they’re bad at understanding the way we naturally express ourselves. So if computers need to understand what people are writing to each other in short messages like on Twitter or Facebook, we have a problem. But this is something researchers would really like to do: for example, researchers in France, Germany and Ireland have all found that Twitter opinions can help predict election results, sometimes better than standard exit polls – and if we could accurately understand whether people are feeling happy or angry about a candidate when they tweet about them, we’d have a powerful tool for understanding popular opinion. Similarly we could automatically find out whether people liked a new product when it was launched; and some research even suggests you could even predict the stock market. But how do we teach computers to understand emotional content, and learn to adapt to the new ways we express it?

One answer might be in a class of techniques called semi-supervised learning. By taking some example messages in which the authors have made the emotional content very clear (using emoticons, or specific conventions like Twitter’s #fail or abbreviations like LOL), we can give ourselves a foundation to build on. A computer can learn the words and phrases that seem to be associated with these clear emotions, so it understands this limited set of messages. Then, by allowing it to find new data with the same words and phrases, it can learn new examples for itself. Eventually, it can learn new symbols or phrases if it sees them together with emotional patterns it already knows enough times to be confident, and then we’re on our way towards an emotionally aware computer. However, we’re still a fair way off getting it right all the time, every time.

Related Magazine …

Issue 14 – The genius who gave us the future

EPSRC supports this blog through research grant EP/W033615/1.

Neurodiversity and what it takes to be a good programmer

by Paul Curzon, Queen Mary University of London

A multicoloured jigsaw pattern ribbon — Image by Oberholster Venita from Pixabay

People often suggest neurodiverse people make good computer scientists. For example, one of the most famous autistic people, Temple Grandin, an academic at Colorado State University and animal welfre expert, has suggested programming is one of the jobs autistic people are potentially naturally good at (along with other computer science linked jobs) and that “Half the people in Silicon Valley probably have autism.”.. So what makes a good computer scientist? And why might people suggest neurodiverse people are good at it?

What makes a good programmer? Is it knowledge, skills or is it the type of person you are? It is actually all three though it’s important to realise that all three can be improved. No one is born a computer scientist. You may not have the knowledge, the skills or be the right kind of person now, but you can improve them all.

To be a good programmer, you need to develop specialist knowledge such as knowing what the available language constructs are, knowing what those constructs do, knowing when to use them over others, and so on. You also need to develop particular technical skills like an ability to decompose problems into sub-problems, to formulate solutions in a particular restricted notation (the programming language), to generalise solutions, and so on. However, you also need to become the right kind of person to do well as a programmer.

Thinking about what kind of person makes a good programmer, to help my students work on them and so become better programers, I made a list of the attributes I associate with good student programmers. My list includes: attention to detail, an ability to think clearly and logically, being creative, having good spatial visualising skills, being a hard worker, being resilient to things going wrong so determined, being organised, being able to meet deadlines, enjoying problem solving, being good at pattern matching, thinking analytically and being open to learning new and different ways of doing things.

More recently, when taking part in a workshop about neurodiversity I was struck by a similar list we were given. Part of the idea behind ‘neurodiversity’ is that everyone is different and everyone has strengths and weaknesses. If you think of ‘disability’ you tend to think of apparent weaknesses. Those ‘weaknesses’ are often there because the world we have created has turned them into weaknesses. For example, being in a wheelchair makes it hard to travel because we have built a world full of steps, kerbs and cobbles, doors that are hard to manipulate, high counters and so on. If we were to remove all those obstacles, a wheelchair would not have to reduce your ability to get around. Thinking about neurodiversity, the suggestion is to think about the strengths that come with it too, not just the difficulties you might encounter because of the way we’ve made the world.

The list of strengths of neurodiverse people given at the workshop were: attention to detail, focussed interest, problem-solving, creative, visualising, pattern recognition. Looking further you find both those positives reinforced and new positives. For example, one support website gives the positives of being an autistic person as: attention to detail, deep focus, observation skills, ability to absorb and retain facts, visual skills, expertise, a methodological approach, taking novel approaches, creativity, tenacity and resilience, accepting of difference and integrity. Thinking logically is also often picked out as a trait that neurodiverse people are often good at. The similarity of these lists to my list of what kind of person my students should aim to turn themselves into is very clear. Autistic people can start with a very solid basis to build on. If my list is right, then their personal positives may help neurodiverse people to quickly become good programmers.

Here are a few of those positives others have picked out that neurodiverse people may have and how they relate to programming:

Attention to detail: This is important in programming because a program is all about detail, both in the syntax (not missing brackets or semicolons) but more importantly not missing situations that could occur so cases the program must cover. A program must deal with every possibility that might arise, not just some. The way it deals with them also matters in the detail. Poor programs might just announce a mistake was made and shut down. A good program will explain the mistake and give the user a way to correct it for example. Detail like that matters. Attention to detail is also important in debugging as bugs are just details gone wrong.

Resilience and determination: Programming is like being on an emotional roller coaster. Getting a program right is full of highs and lows. You think it is working and then the last test you run shows a deep flaw. Back to the drawing board. As a novice learning it is even worse. The learning curve is steep as programming is a complex skill. That means there are lots of lows and seemingly insurmountable highs. At the start it can seem impossible to get a program to even compile never mind run. You have to keep going. You have to be determined. You have to be resilient to take all the knocks.

Focussed interest. Writing a program takes time and you have to focus. Stop and come back later and it will be so much harder to continue and to avoid making mistakes. Decomposition is a way to break the overall task into smaller subtasks, so methods to code, and that helps, once you have the skill. However, even then being able to maintain your focus to finish each method, so each subtask, makes the programming job much easier.

Pattern recognition: Human expertise in anything ultimately comes down to pattern matching new situations against old. It is the way our brains work. Expert chess players pattern match situations to tell them what to do, and so do firefighters in a burning building. So do expert programmers. Initially programming is about learning the meaning of programming constructs and how to use them, problem solving every step of the way. That is why the learning curve is so steep. As you gain experience though it becomes more about pattern matching: realising what a particular program needs at this point and how it is similar to something you have seen before then matching it to one of many standard template solutions. Then you just whip out the template and adapt it to fit. Spot something is essentially a search task and you whip out a search algorithm to solve it. Need to process a 2 dimensional array – you just need the rectangular for loop solution. Once you can do that kind of pattern matching, programming becomes much, much simpler.

Creativity and doing things in novel ways: Writing a program is an act of creation, so just like arts and crafts involves creativity. Just writing a program is one kind of creativity, coming up with an idea for a program or spotting a need no one else has noticed so you can write a program that fills that need requires great creativity of a slightly different kind. So does coming up with a novel solution once you have a novel problem. Developing new algorithms is all about thinking up a novel way of solving a problem and that of course takes creativity. Designing interfaces that are aesthetically pleasing but make a task easier to do takes creativity. If you can think about a problem in a different way to everyone else, then likely you will come up with different solutions no one else thought of.

Problem solving and analytical minds: Programming is problem solving on steroids. Being able to think analytically is an important part of problem solving and is especially powerful if combined with creativity (see above). You need to be able to analyse a problem, come up with creative solutions and be able to analyse what is the best way of solving it from those creative solutions. Being analytical helps with solid testing too.

Visual thinking: research suggests those with good visual, spatial thinking skills make good programmers. The reasons are not clear, but good programs are all about clear structure, so it may be that the ability to easily see the structure of programs and manipulate them in your head is part of it. That is part of the idea of block-based programming languages like Scratch and why they are used as a way into programming for young children. The structure of the program is made visual. Some paradigms of programming are also naturally more visual. In particular object-oriented programming sees programs as objects that send messages to each other and that is something that can naturally be visualised. As programs become bigger that ability to still visualise the separate parts and how they work as a whole is a big advantage.

A methodological approach: Novice programmers just tinker and hack programs together. Expert programmers design them. Many people never seem to get beyond the hacking stage, struggling with the idea of following a method to design first, yet it is vital if you are to engineer serious programs that work. That doesn’t mean that programming is just following methods, tinkering can be part of the problem solving and coming up with creative ideas, but it should be used within rigorous methodology not instead of it. More time is also spent by good programming teams testing programs as writing them, and that takes rigorous methods to do well too. Software engineering is all about following rigorous methods precisely because it is the only way to develop real programs that work and are fit for purpose. Vast amounts of software written is never used because it is useless. Rigorous methods give a way to avoid many of the problems.

Logical thinking: Being able to think clearly and logically is core to programming. It combines some of the things above like attention to detail, and thinking clearly and methodologically. Writing programs is essentially the practice of applied logic as logic underpins the semantics (ie meaning) of programming languages and so programs. You have to think logically when writing programs, when testing them and when debugging them. Computers are machines built from logic and you need to think in part like a computer to get it to do what you want. A key part of good programming is doing code walkthroughs – as a team stepping through what a program does line by line. That requires clear logical thinking to think in teh steps the computer performs.

I could go on, but will leave it there. The positives that neurodiverse people might have are very strongly positives for becoming a good programmer. That is why some of the best students I’ve had the privilege to teach have been neurodiverse students.

Different people, neurodiverse or otherwise, will start with different positives, different weaknesses. People start in different places for different reasons, some ahead, some behind. I liked doing puzzles as a child, so spent my childhood devouring logic and algorithmic puzzles. That meant when I first tried to learn to program, I found it fairly easy. I had built important skills and knowledge and had become a good logical thinker and problem solver just for fun. I learnt to program for fun. That meant it was if I’d started way beyond the starting line in the race to become a programmer. Many neurodiverse people do the same, if for different reasons.

Other skills I’ve needed as a computer scientist I have had to work hard on and develop strategies to overcome. I am a shy introvert. However, I need to both network and give presentations as a computer scientist (and ultimately now I give weekly lectures to hundreds at a time as an academic computer scientist). For that I had to practice, learn theory about good presentation and perhaps most importantly, given how paralysing my shyness was, devise a strategy to overcome that natural weakness. I did find a strategy – I developed an act. I have a fake extrovert persona that I act out. I act being that other person in these situations where I can’t otherwise cope, so it is not me you see giving presentations and lectures but my fake persona. Weaknesses can be overcome, even if they mean you start far behind the starting line. Of course, some weaknesses and ways we’ve built the world mean we may not be able to overcome the problems, and not everyone wants to be a computer scientist anyway so has the desire to. What matters is finding the future that matches your positives and interests, and where you can overcome the weaknesses where you set your mind to it.

Programming is not about born talent though (nothing is). We all have strengths and weaknesses and we can all become better by practicing and finding strategies that work for us, building upon our strengths and working on our weaknesses, especially when we have the help of a great teacher (or have help to change the way the world works so the weaknesses vanish).

My list above gives some of the key personal characteristics you need to work on improving (however good at them you are or are not right now) If you do want to be a good programmer. Anyone can become better at programming than they are if they have that desire. What matters is that you want to learn, are willing to put the practice in, can develop strategies to overcome your initial weaknesses, and you don’t give up. Neurodiverse people often have a head start on those personal attributes for becoming good at new things too.

Magazines …

Issue 29 – Diversity

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

The top 10 bugs

by Paul Curzon, Queen Mary University of London

(updated from the archive)

Bugs are everywhere, but why not learn from the mistakes of others. Here are some common bugs with examples of how they led to it all going terribly wrong.

The bugs of others show how solid testing, code walkthroughs, formal reasoning and other methods for preventing bugs really matter. In the examples below the consequences were massive, but none of these bugs should have made it to final systems, whatever the consequences.

Here then is my personal countdown of the top 10 bugs to learn from.

BUG 10: Divide by Zero

USS Yorktown (from wikipedia) — USS Yorktown, Public Domain via wikimedia

The USS Yorktown was used as a testbed for a new generation of “smart” ship. On 21 September 1997, its propulsion system failed leaving it “dead in the water” for 3 hours. It tried to divide by zero after a crew member input a 0 where no 0 should have been, crashing every computer on the ship’s network.

Moral: Input validation matters a lot and there are some checks you should know should be done as standard.

BUG 9: Arithmetic Overflow

Boeing 787 Dreamliner
From Wikipedia Author
pjs2005 from Hampshire, UK — From Wikipedia Author pjs2005 from Hampshire, UK CC BY-SA 2.0

Keep adding to an integer variable and you run out of bits. Suddenly you have a small number not the bigger one expected. This was one bug in the Therac-25 radiation therapy machine that killed patients. The Boeing 787 Dreamliner had the same problem. Fly for more than 248 days and it would switch off.

Moral: Always have checks for overflow and underflow, and if it might matter then you need something better than a fixed but-length number.

BUG 8: Timing Problems

Three telephone handsets on pavement. — Image by Alexa from Pixabay

AT&T lost $60 million the day the phones died (all of them). It was a result of changing a few lines of working code. Things happened too fast for the program. The telephone switches reset but were told they needed to reset again before they’d finished, … and so on.

Moral: even small code changes need thorough testing…and timing matters but needs extra special care.

BUG 6.99999989: Wrong numbers in a lookup table

Intel Pentium chip — Konstantin Lanzet – CPU Collection Konstantin Lanzet CC BY-SA 3.0

Intel’s Pentium chip turned out not to be able to divide properly. It was due to a wrong entry in a lookup table. Intel set aside $475 million to cover replacing the flawed processors. Some chips were turned in to key rings.

Moral: Data needs to be thoroughly checked not just instructions.

BUG 6: Wrong units

Artist's impression of the Mars Climate orbiter from wikipedia — Mars Climate Orbiter. Public Domain via wikimedia.

The Mars Climate Orbiter spent 10 months getting to Mars …where it promptly disintegrated. It passed too close to the planet’s atmosphere. The programmers assumed numbers were in pound-force seconds when they were actually in newton-seconds.

Moral: Clear documentation (including comments) really does matter.

BUG 5: Non-terminating loop

Spinning pizza of death. Image Public Domain via wikimedia

The spinning pizza of death is common. Your computer claims to be working hard, and puts up a progress symbol like a spinning wheel…forever. There are lots of ways that this happens. The simple version is that the program has entered a loop in a way that means the test to continue is never false. This took on a greater spin in the Fujitsu-UK Post Office Horizon system where bugs triggered the biggest ever miscarriage of justice. Hundreds of post masters were accused of stealing money because Horizon said they had. One of many bugs was that mid-transaction, the Horizon terminal could freeze. However, while in this infinite loop hitting any key duplicated the transaction, subtracting money again for every key press, not just the once for the stamp being bought.

Moral: Clear loop structure matters – where loops only exit via of one clear (and reasoned correct) condition.

BUG 4: Storing a big number in a small space

Arianespace’s Ariane 5 rocket Photo Credit: (NASA/Chris Gunn) from wikipedia. Creative Commons Attribution 2.0 Generic

The first Ariane 5 rocket exploded at a cost of $500 million 40 seconds after lift-off. Despite $7 billion spent on the rocket, the program stored a 64 bit floating point number in to a variable that could only hold a 16 bit integer.

Moral: Strong type systems are there to help not hinder. Use languages without them at your peril.

BUG 3: Memory Leak

Memory leaks (forgetting to free up space when done with) are responsible for many computer problems. The Firefox browser had one. It was infamous because Firefox (implausibly) claimed their program had no memory leaks.

Moral: If your language doesn’t use a garbage collector then you need to be extra careful about memory management.

BUG 2: Null pointer errors

**Photograph by Rama, Wikimedia Commons, Cc-by-sa-2.0-fr**

Tony Hoare who invented the null pointer (a pointer that points nowhere) called it his “billion-dollar” mistake because programmers struggle to cope with it. Null pointer bugs crash computers, give hackers ways in and generally cause chaos.

Moral: Avoid null pointers and do not rely on just remembering to check for them.

BUG 1: Buffer overflow

Photo **Go Card USA**, Creative Commons Attribution-Share Alike 2.0 Generic

The Morris Worm, an early Internet worm, came close to shutting down the Internet. It used a buffer overflow bug in network software to move from computer to computer shutting them down. Data was stored in a buffer but store too much and it would just be placed in the next avaialbale location in memory so could overwrite the program with new code.

Moral: Array-like data structures need extra care. Always triple check bounds are correct and that the code ensures overflows cannot happen

BUG 0: Buffer overflow

Lift buttons 0, 1 2, 3 — Image by Coombesy from Pixabay

Arrays in many languages start from position 0. This means the last position is one less than the length of the array. Get it wrong… as every novice (and expert) programmer does at some point … and you run off the end. Oddly (in Europe), we have no problem in a lift pressing 1 to go to the second floor up. In other situations, count from 0 and you do one too many things. Did I say 10 bugs…OOPS!

Moral: Think hard about every loop counter in your program.

The number 1 moral from all of this is that thorough testing matters a lot. Just trying a program a few times is not enough. In fact thorough testing is not enough either. You also need code walkthroughs, clear logical thinking about your code, formal reasoning tools where they exist, strong type systems, good commenting, and more. Most of all programmers need to understand their code will NOT be correct and they must put lots of effort into finding bugs if crucial ones are not to be missed. knowing the kinds of mistakes everyone makes, and being extra vigilant about them, is a good start.

And that is the end of my top 10 bugs…until the next arrogant, over-confident programmer causes the next catastrophe.

Magazines …

Front cover of CS4FN issue 17 - Machines Making Medicine Safer

Do something computationally funny for money

It is Red nose day in the UK the day of raising money for the comic relief charity by buying and wearing red noses and generally doing silly things for money.

Red noses are not just for red nose day though and if you ‘ve been supporting it every year, you possibly now have a lot of red noses like we do. What can you do with lots of red noses? Well one possibility is to count in red nose binary as a family or group of friends. (Order your red noses (a family pack has 4 or a school pack 25) from comic relief or make a donation to the charity there.)

Red nose binary

Let’s suppose you are a family of four. All stand in a line holding your red noses (you may want to set up a camera to film this). How many numbers can 4 red noses represent? See if you can work it out first. Then start counting:

No one wearing a red nose is 0,
the rightmost end person puts theirs on for 1,
they take it off and the next person puts theirs on for 2,
the first person puts theirs back on for 3,
the first two people take their noses off and the third person puts theirs on for 4
and so on…

The pattern we are following is the first (rightmost end) person changes their nose every time we count. The second person has the nose off for 2 then on for the next 2 counts. The third person changes theirs every fourth count (nose off for 4 then on for 4) and the last person changes theirs every eighth count (off for 8, on for 8). That gives a unique nose pattern every step of the way until eventually all the noses are off again and you have counted all the way from 0 to 15. This is exactly the pattern of binary that computers use (except they use 1s and 0s rather than wear red noses).

What is the biggest number you get to before you are back at 0? It is 15. Here is what the red nose binary pattern looks like.

The binary sequence in faces wearing red noses — Image by CS4FN

Try and count in red nose binary like this putting on and taking off red noses as fast as you can, following the pattern without making mistakes!

The numbers we have put at the top of each column are how much a red nose is worth in that column. You could write the number of the column on that person’s red nose to make this obvious. In our normal decimal way of counting, digits in each column are worth 10 times as much (1s 10s 100s, 1000s, etc) Here we are doing the same but with 2s (1s 2s 4s 8s etc). You can work out what a number represents just by adding that column number in if there is a red nose there. You ignore it if there is no red nose. So for example 13 is made up of an 8s red nose + a 4s red nose + a 1s red nose. 8 + 4 + 1 = 13.

13 in red nose binary with the 8, the 4 and the 1 red nose all worn. — Image by CS4FN

Add one more person (perhaps the dog if they are a friendly dog willing to put up with this sort of thing) with a red nose (now worth 16) to the line and how many more numbers does that now mean you can count up to? Its not just one more. You can now go through the whole original sequence twice once with the dog having no red nose, once with them having a red nose. So you can now count all the way from 0 to 31. Each time you add a new person (or pet*, though goldfish don’t tend to like it) with a red nose, you double the number you can count up to.

There is lots more you can do once you can count in red nose binary. Do red nose binary addition with three lines of friends with red noses, representing two numbers to add and compute the answer on the third line perhaps… for that you need to learn how to carry a red nose from one person to the next! Or play the game of Nim using red nose binary to work out your moves (it is the sneaky way mathematicians and computer scientists use to work out how to always win). You can even build a working computer (a Turing Machine) out of people wearing red noses…but perhaps we will save that for next year.

What else can you think of to do with red nose binary?

Paul Curzon, Queen Mary University of London

*Always make sure your pet (or other family member) has given written consent before you put a red nose on them for ethical counting.

Magazines …

Issue 29 – Diversity

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

Calculating Pi for Pi Day

There are several Pi Day’s (14 March: 3.14; 22 July: 22/7) so we should look at how on earth you compute a number like Pi (3.1.4159….). It has an infinite number of digits containing no repeating pattern so you can never tie it down exactly. One of my favourite ways for calculating pi was first devised by the Indian mathematician Mādhava of Sangamagrāma 600 years ago. He worked out an algorithm for working out Pi based on the maths of infinite series that he had also worked out.

Pi symbol as a sculpture against a blue sky with digits written across it — Image by Naji Habib from Pixabay

Pi is one of the most useful numbers in all of maths. In school you come across it when working out the area or circumference of a circle, but it crops up all over the place including in practical computer science situations. Digital music, for example, relies on it deep down. Remember that the next time you stream your favourite music!

So how, 600 years ago did Mādhava manage to work out a much more accurate version of Pi than anyone before him? He had worked out that certain sequences of infinite numbers wouldn’t get bigger and bigger but would just get closer and closer to some specific number. In particular, he worked out one such sequence linked to pi.

π / 4 = 1 – 1/3 + 1/5 – 1/7 + 1/9 – …

Writing this a slightly different way it gives us a way of calculating pi itself

π = 4 – 4/3 + 4/5 – 4/7 + 4/9 – …

With an infinite number of terms, this gives an accurate value for pi. We can’t add an infinite number of numbers together though. Instead, we can use it to get a good answer. To get an approximation to pi we just follow an algorithm where we gradually add / subtract the next term. Each new calculation then gives us a better estimate of what pi is.

So to start with we just take the first term which says

π = 4 (very approximately)

That isn’t very good as it doesn’t get any digits right! Pi is closer to 3 than to 4. So its not looking hopeful! That doesn’t matter though as it is just a starting point. When we subtract the next term it gets a bit better

π = 4 – 4/3 = 2.6666…

Hmm. Now we have overshot the other way. However, we are closer to the real value of pi than we were. So don’t lose heart, keep going and add the next term

π = 4 – 4/3 + 4/5 = 3.46666…

And another term …

π = 4 – 4/3 + 4/5 – 4/7 = 2.895 …

And another term …

π = 4 – 4/3 + 4/5 – 4/7 + 4/9 = 3.339…

and so on.

The important thing to notice is that after each term included we get a more accurate answer, and we can keep adding terms for as long as we are happy to do the calculations. Mādhava (or his followers) obviously liked doing calculations so kept going until he had worked out pi accurate to 10 decimal places (3.1415926535…) : a new world record at the time beating the previous best of 6 decimal places by a Chinese astronomer Zhao Youqin using a different algorithm, That record had been set 80 years earlier but was smashed by 4 decimal places. This new record lasted for another 96 years. In doing these calculations Mādhava was acting as a ‘computer’ in the original meaning of the word: a human following an algorithm to do computation.

His algorithm is what computer scientists call an iterative algorithm. This kind of algorithm is used quite a lot by computer scientists as it gives a general way of getting a good enough (if not perfect) answer to a problem that otherwise is hard (or impossible) to get a perfect answer to in a reasonable time. You start with a good guess and then gradually refine the answer until you are happy that it is accurate enough. These algorithms can be straightforward to code as it is just running a loop doing calculations that refine the answer. Mādhava was happy with 10 decimal places of accuracy but he could have kept going. The trouble is this is a very slow algorithm. As we saw with the first few iterations above, it takes a long time even to home in on the first digit being 3! Every new digit took a lot of extra work to get right. When calculating machines and then computers were invented it became easier to use slow algorithms like this, but even with a faster computer it is still better to have a faster algorithm. Now far faster algorithms have been invented and the world record at the time of writing gives pi accurate to 105,000,000,000,000 decimal places!

Mādhava would have needed to really like doing calculations (and have discovered the secret to eternal life) to have calculated pi that accurately. 600 years ago his world record for pi was still an amazing achievement.

– Paul Curzon, Queen Mary University of London

Related Magazines …

Issue 29 – Diversity

Could AI end science?

by Nick Ballou, Oxford Internet Institute

The contents of a book burning — Image by Dariusz Sankowski from Pixabay

Scientific fraud is worryingly common, though rarely talked about. It has been happening for years, but now Artificial Intelligence programs could supercharge it. If they do that could undermine Science itself.

Investigators of scientific fraud have found that large numbers of researchers have manipulated their results, invented data, or even produced nonsensical papers in the hope that no one will look closely enough to notice. Often, no one does. The problem is that science is built on the foundation of all the research that has gone before. If we can no longer trust that past research is legitimate, the whole system of science begins to break down. AI has the potential to supercharge this process.

We’re not at that point yet, luckily. But there are concerning signs that generative AI systems like ChatGPT and DALLE-E might bring us closer. By using AI technology, producing fraudulent research has never been easier, faster, or more convincing. To understand, let’s first look at how scientific fraud has been done in the past.

How fraud happens

Until recently, fraudsters would need to go through some difficult steps to get a fraudulent research paper published. A typical example might look like this:

Step 1: invent a title

Fraudsters look for a popular but very broad research topic. We’ll take an example of a group of fraudsters known as the Tadpole Paper Mill. They published papers about cellular biology. To choose a new paper to create, the group would essentially use a simple generator, or algorithm, based on a template. This uses a simple technique first used by Christopher Strachey to write love letters in an early “creative” program in the 1950s.

For each “hole” in the template a word is chosen from a word list.

Pick the name of a molecule
- Either a protein name, a drug name or an RNA molecule name
- eg mir-488
Pick a verb
- From alleviates, attenuates, exerts, …
- eg inhibits
Pick one or two cellular processes
- From invasion, migration, proliferation, …
- eg cell growth and metastasis
Pick a cancer or cell type
- From lung cancer, ovarian cancer, …
- eg renal cell carcinoma
Pick a connector word
- From by, via, through, …
- eg by
Pick a verb
- From activating, targeting, …
- eg targeting
Pick a name
- Either a pathway, protein or miRNA molecule name
- eg hMgn5

This produces a complicated-sounding title such as “mir-488 inhibits cell growth and metastasis in renal cell carcinoma by targeting hMgn5”. This is the name of a real fraudulent paper created this way.

Step 2: write the paper

Next, the fraudsters create the text of the paper. To do this, they often just plagiarise and lightly edit previous similar papers, substituting key words in from their invented title perhaps. To try to hide the plagiarism, they automatically swap out words, replacing them with synonyms. This often leads to ridiculous (and kind of hilarious) replacements, like these found in plagiarised papers:

“Big data” –> “Colossal information”
“Cloud computing” –> “Haze figuring”
“Developing countries” –> “Creating nations”
“Kidney failure” –> “Kidney disappointment”

Step 3: add in the results

Lastly, the fraudsters need to create results for the fake study. These usually appear in papers in the form of images and graphs. To do this, the fraudsters take the results from several previous papers and recombine them into something that looks mostly real, but is just a Frankenstein mess of other results that have nothing to do with the current paper.

A new paper is born

Using that simple formula, fraudsters have produced thousands of fabricated articles in the last 10 years. Even after a vast amount of effort, the dedicated volunteers who are trying to clean up the mess have only caught a handful.

However, committing fraud like this successfully isn’t exactly easy, either: the fraudsters still need to come up with a research idea, write the paper themselves without copying too much from previous research, and make up results that look convincing—at least at first glance.

AI: Adding fuel to the fire

So what happens when we add modern generative AI programs into the mix? They are Artificial Intelligence programs like ChatGPT or DALL-E that can create text or pictures for you based on written requests.

Well, the quality of the fraud goes up, and the difficulty of producing it goes way down. This is true for both text and images.

Let’s start with text. Just now, I asked ChatGPT-4 to “write the first two paragraphs of a research paper on a cutting edge topic in psychology.” I then asked it to “write a fake results table that shows a positive relationship between climate change severity and anxiety”. I won’t copy the whole thing—in part because I encourage you to try this yourself to see how it works (not to actually create a fake paper!)—but here’s a sample of what it came up with:

“As the planet faces increasing temperatures, extreme weather events, and environmental degradation, the mental health repercussions for populations worldwide become a crucial area of investigation. Understanding these effects is vital for developing strategies to support communities in coping with the psychological challenges posed by a changing climate.”

As someone who has written many psychology research papers, I would find the results very difficult to identify as AI-generated—it looks and sounds very similar to how people in my field write, and it even generated Python code to analyse the fake data. I’d need to take a really close look at the origin of the data and so on to figure out that it’s fraudulent.

But that’s a lot of work required from me as a fraud-buster. For the fraudster, doing this takes about 1 minute, and would not be detected by any plagiarism software in the way previous kinds of fraud can be. In fact, this might only be detected if the fraudsters make a sloppy mistake, like leaving in a disclaimer from the model as in the paper caught which included the text

“[Please note that as an AI language model, I am unable to generate specific tables or conduct tests, so the actual resutls should be included in the table.]”!

Generative AIs are not close to human intelligence, at least not yet. So, why are they so good at producing convincing scientific research, something that’s commonly seen as one of the most difficult things humans can do? Two reasons play a big part: (1) scientific research is very structured, and (2) there’s a lot of training data. In any given field of research, most papers tend to look pretty similar—an introduction section, a method describing what the researchers did, a results section with a few tables and figures, and a discussion that links it back to the wider research field. Many journals even require a fixed structure. Generative AI programs work using Machine Learning – they learn from data and the more data they are given the better they become. Give a machine learning program millions of images of cats, telling it that is what they are, and it can become very good at recognising cats. Give it millions of images of dogs and it will be able to recognise dogs too. With roughly 3 million scientific papers published every year, generative AI systems are really good at taking these many, many examples of what a scientific report looks like, and producing similar sounding, and similarly structured pieces of text. They do it by predicting what word, sentence and paragraph would be good to come next based on probabilities calculated from all those examples.

Trusting future research

Most research can still be trusted, and the vast majority of scientists are working as hard as they can to advance human knowledge. Nonetheless, we all need to look carefully at research studies to ensure that they are legitimate, and we should be on extra alert as generative AI becomes even more powerful and widespread. We also need to think about how to improve universities and research culture generally, so that people don’t feel like they need to commit scientific fraud—something that usually happens because people are desperate to get or keep a job, or be seen as successful and reap the rewards. Somehow we need to change the game so that fraud no longer pays.

What do you think? Do you have ideas for how we can prevent fraud from happening in the first place, and how can we better detect it when it does occur? It is certainly an important new research topic. Find a solution and you could do massive good. If we don’t find solutions then we could lose the most successful tool human-kind has ever invented that makes all our lives better.

Related Magazines …

Cover issue 18 Machines that are creative

Find your own time zone

The theme for British Science Week 2024 is Time so here we’re going back in time to our archives to bring you this article about… time. Below are the instructions to find out your own personal time zone but be careful if you’re sharing your results with others, remember that your longitude (if combined with your latitude) can give away your location.

Andy Broomfield has given us the secret to figuring out your own personal time zone based on your longitude! Now you can figure out your time zone right down to the second, just like his gadget did.

Step one: find your longitude

First you need find out the longitude of the place you’re at. Longitude is the measure of where you are on the globe in an east-west direction (the north-south measurement is called latitude).

The best resource to do this is Google Earth, which will give you a very accurate longitude reading in degrees, minutes and seconds. Just find your location in Google Earth, and when you hover your mouse over it, the latitude and longitude are in the bottom right corner of the window.

There are alternatives to Google Earth online, but they tend to only work for one country rather than the whole world. If you can’t use Google Earth, try an internet search for finding longitude in your country.

If you’ve got a GPS system (e.g. on your phone), you can get it to tell you your longitude as well.

Step two: find your time zone

We’ll be finding your time relative to Greenwich Mean Time (GMT or UTC), the base for timekeeping all over the world. If your Longitude is west of 0° you’ll be behind GMT, and if it’s east then you’ll be ahead of it.

Longitude is usually measured in degrees, minutes and seconds. Here’s how longitude converts into your personal time zone:
• 15 degrees of longitude = 1 hour difference; 1 degree longitude = 4 minutes difference.
• 15 minutes of longitude = 1 minute difference; 1 minute of longitude = 4 seconds difference.
• 15 seconds of longitude = 1 second difference, 1 second of longitude = 0.066(recurring) seconds difference.

The best way to find your personal time zone is to convert the whole thing into seconds of longitude, then into seconds of time. Do this by adding together:

(degrees x 3600) + (minutes x 60) + (seconds)

You’ll get a big number – that’s your seconds in longitude. Then if you divide that big number by 15, that’s how many seconds your personal time zone is different from GMT. Once you’ve got that, you can convert it back into hours, minutes and seconds.

An example

Let’s find the personal time zone for the President of the United States. The White House is at 77° 2′ 11.7″ West, so converting this all to seconds of longitude gives:

(degrees x 3600) + (minutes x 60) + (seconds)
= (77 x 3600) + (2 x 60) + (11.7)
= (277,200) + (120) + (11.7)
= 277,331.7

Now we find the time zone difference in seconds of time:

277,331.7 / 15 = 18,488.78 seconds

This means that the President is 18,488.78 seconds behind GMT. Next it’s the slightly fiddly business of expanding those seconds back into hours, minutes and seconds. Because time is based on units of 60 rather than 10, dividing hours and minutes into decimals doesn’t tell you much. You’ll have to use whole numbers and figure out the remainders. Here’s how.

If you divide 18,488.78 by 3600 (the number of seconds in an hour), you’ll find out how many hours can fit in all of those seconds. The answer is 5, with some left over. 5 hours is 18,000 seconds (because 5 x 3600 = 18,000), so now you’re left with 488.78 seconds to deal with. Divide 488.78 by the number of seconds in a minute (60), and you get 8, plus some left over. 8 x 60 is 480, so you’ve got 8.78 seconds still left.

That means that the president’s personal time zone at the White House is 5 hours, 8 minutes and 8.78 seconds behind GMT.

If you’re using decimal longitude

Longitude is usually measured in degrees, minutes and seconds, but sometimes, like if you use a GPS receiver, you might get a measurement that just lists your longitude in degrees with a decimal. For example, the CS4FN office is located at 0.042 degrees west.

Figuring out your time zone with a decimal is simpler than with degrees, minutes and seconds. It’s just one calculation! Just take your decimal longitude and divide it by 0.004167.

So the local time at the CS4FN office is:

(longitude) / 0.004167
= (0.042) / 0.004167
= 10.079 seconds behind GMT

The only problem with this simple calculation is that it’s not as accurate as the one above for degrees, minutes and seconds. Plus, if you get a large number of seconds you’ll still have to do the last step from the method above, where you convert seconds back into hours and minutes.

Now you’ve got your own personal time zone!

Paul Curzon, Queen Mary University of London

The Social Machine of Maths

Two heads with a shared thought bubble of numbers — Thinking maths image by Gerd Altmann from Pixabay

In school we learn about the maths that others have invented: results that great mathematicians like Euclid, Pythagoras, Newton or Leibniz worked out. We follow algorithms for getting results they devised. Ada Lovelace was actually taught by one of the great mathematicians, Augustus De Morgan, who invented important laws, ‘De Morgan’s laws’ that are a fundamental basis for the logical reasoning computer scientists now use. Real maths is about discovering new results of course not just using old ones, and the way that is done is changing.

We tend to think of maths as something done by individual geniuses: an isolated creative activity, to produce a proof that other mathematicians then check. Perhaps the greatest such feat of recent years was Andrew WIles’ proof of Fermat’s Last Theorem. It was a proof that had evaded the best mathematicians for hundreds of years. Wiles locked himself away for 7 years to finally come up with a proof. Mathematics is now at a remarkable turning point. Computer science is changing the way maths is done. New technology is radically extending the power and limits of individuals. “Crowdsourcing” pulls together diverse experts to solve problems; computers that manipulate symbols can tackle huge routine calculations; and computers, using programs designed to verify hardware, check proofs that are just too long and complicated for any human to understand. Yet these techniques are currently used in stand-alone fashion, lacking integration with each other or with human creativity or fallibility.

‘Social machines’ are a whole new paradigm for viewing a combination of people and computers as a single problem-solving entity. The idea was identified by Tim Berners-Lee, inventor of the world-wide web. A project led by Ursula Martin at the University of Oxford explored how to make this a reality, creating a mathematics social machine – a combination of people, computers, and archives to create and apply mathematics. The idea is to change the way people do mathematics, so transforming the reach, pace, and impact of mathematics research. The first step involves social science rather than maths or computing though – studying what working mathematicians really do when working on new maths, and how they work together when doing crowdsourced maths. Once that is understood it will then be possible to develop tools to help them work as part of such a social machine.

The world changing mathematics results of the future may be made by social machines rather than solo geniuses. Team work, with both humans and computers is the future.

– Ursula Martin, University of Oxford
and Paul Curzon, Queen Mary University of London

Related Magazine …

CS4FN Issue 20 – Ada Lovelace: the computer scientist without a computer

Related Magazine …

CS4FN Issue 28 – Cunning Computational Contraptions

The history of computational devices: automata, core rope memory (used by NASA in the Moon landings), Charles Babbage’s Analytical Engine (never built) and Difference Engine made of cog wheels and levers, mercury delay lines, standardising the size of machine parts, Mary Coombs and the Lyons tea shop computer, computers made of marbles, i-Ching and binary, Ada Lovelace and music, a computer made of custard, a way of sorting wood samples with index cards and how to work out your own programming origin story.

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

Software for Justice

A fingerprint benig scanned electronically — Digital fingerprint image by ar130405 from Pixabay

A jury is given misleading information in court by an expert witness. An innocent person goes to prison as a result. This shouldn’t happen, but unfortunately it does and more often than you might hope. It’s not because the experts or lawyers are trying to mislead but because of some tricky mathematics. Fortunately, a team of computer scientists at Queen Mary, University of London are leading the way in fixing the problem.

The Queen Mary team, led by Professor Norman Fenton, is trying to ensure that forensic evidence involving probability and statistics can be presented without making errors, even when the evidence is incredibly complex. Their solution is based on specialist software they have developed.

Many cases in courts rely on evidence like DNA and fibre matching for proof. When police investigators find traces of this kind of evidence from the crime scene they try to link it to a suspect. But there is a lot of misunderstanding about what it means to find a match. Surprisingly, a DNA match between, say, a trace of blood found at the scene and blood taken from a suspect does not mean that the trace must have come from the suspect.

Forensic experts talk about a ‘random match probability’. It is just the probability that the suspect’s DNA matches the trace if it did not actually come from him or her. Even a one-in-a-billion random match probability does not prove it was the suspect’s trace. Worse, the random match probability an expert witness might give is often either wrong or misleading. This can be because it fails to take account of potential cross-contamination, which happens when samples of evidence accidentally get mixed together, or even when officers leave traces of their own DNA from handling the evidence. It can also be wrong due to mistakes in the way the evidence was collected or tested. Other problems arise if family members aren’t explicitly ruled out, as that makes the random match probability much higher. When the forensic match is from fibre or glass, the random match probabilities are even more uncertain.

The potential to get the probabilities wrong isn’t restricted to errors in the match statistics, either. Suppose the match probability is one in ten thousand. When the experts or lawyers present this evidence they often say things like: “The probability that the trace came from anybody other than the defendant is one in ten thousand.” That statement sounds OK but it isn’t true.

The problem is called the prosecutor fallacy. You can’t actually conclude anything about the probability that the trace belonged to the defendant unless you know something about the number of potential suspects. Suppose this is the only evidence against the defendant and that the crime happened on an island where the defendant was one of a million adults who could have committed the crime. Then the random match probability of one in ten thousand actually means that about one hundred of those million adults match the trace. So the probability of innocence is ninety-nine out of a hundred! That’s very different from the one in ten thousand probability implied by the statement given in court.

Norman Fenton’s work is based around a theorem, called Bayes’ theorem, which gives the correct way to calculate these kinds of probabilities. The theorem is over 250 years old but it is widely misunderstood and, in all but the simplest cases is very difficult to calculate properly. Most cases include many pieces of related evidence – including evidence about the accuracy of the testing processes. To keep everything straight, experts need to build a model called a Bayesian network. It’s like a graph that maps out different possibilities and the chances that they are true. You can imagine that in almost any court case, this gets complicated awfully quickly. It is only in the last 20 years that researchers have discovered ways to perform the calculations for Bayesian networks, and written software to help them. What Norman and his team have done is develop methods specifically for modelling legal evidence as Bayesian networks in ways that are understandable by lawyers and expert witnesses.

Norman and his colleague Martin Neil have provided expert evidence (for lawyers) using these methods in several high-profile cases. Their methods help lawyers to determine the true value of any piece of evidence – individually or in combination. They also help show how to present probabilistic arguments properly.

Unfortunately, although scientists accept that Bayes’ theorem is the only viable method for reasoning about probabilistic evidence, it’s not often used in court, and is even a little controversial. Norman is leading an international group to help bring Bayes’ theorem a little more love from lawyers, judges and forensic scientists. Although changes in legal practice happen very slowly (lawyers still wear powdered wigs, after all), hopefully in the future the difficult job of judging evidence will be made easier and fairer with the help of Bayes’ theorem.

If that happens, then thanks to some 250 year-old maths combined with some very modern computer science, fewer innocent people will end up in jail. Given the innocent person in the dock could one day be you, you will probably agree that’s a good thing.

Paul Curzon, Queen Mary University of London (originally published in 2011)

Related Magazine …

Issue 14 – The genius who gave us the future

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

	The art of animatron… on I’m (not) a little …
	The Decline and Fall… on Victorian volunteers needed…
	Gatsby Benchmarks fo… on Finding work experience, or a…
	BBC Sound Effects li… on AMPER: AI helping future you r…
	PEEECS Bulletin Augu… on Involving disabled people in t…

More on …

Magazines …

Related Magazine …

More on …

Magazines …

BUG 10: Divide by Zero

BUG 9: Arithmetic Overflow

BUG 8: Timing Problems

BUG 6.99999989: Wrong numbers in a lookup table

BUG 6: Wrong units

BUG 5: Non-terminating loop

BUG 4: Storing a big number in a small space

BUG 3: Memory Leak

BUG 2: Null pointer errors

BUG 1: Buffer overflow

BUG 0: Buffer overflow

Magazines …

More on …

Red nose binary

More on computers and comedy

Magazines …

More on

Related Magazines …

More on …

How fraud happens

Step 1: invent a title

Step 2: write the paper

Step 3: add in the results

A new paper is born

AI: Adding fuel to the fire

Trusting future research

Related Magazines …

More on …

Step one: find your longitude

Step two: find your time zone

An example

If you’re using decimal longitude

More on …

Related Magazine …

Related Magazine …

More on … justice

Related Magazine …