Join the crowd with swarm intelligence

A shoal of fish crammed into a cylindrical tank. — Image by Paul Sloane from Pixabay

Next time you are in a large crowd, look around you: all those people moving together, and mostly not bumping into each other. How does it happen? Flocks of birds and schools of fish are also examples of this ‘swarm intelligence’. Computer Scientists have been inspired by this behaviour to come up with new solutions to really difficult problems.

Swarming behaviour requires the individuals (birds, fish or people) to have a set of rules about how to interact with the individuals nearest to them. These so-called ‘local’ rules are all that’s needed to give rise to the overall or ‘global’ behaviour of the swarm. We adjust our individual behaviour according to our current state but also the current state of those around us. If I want to turn left then I do it slowly so that others in the crowd can be aware of it and also start to turn. I know what I am doing and I know what they are doing. This is how information makes its way from the edges of the crowd to the centre and vice versa.

A swarm is born

The way a crowd or swarm interacts can be explained with simple maths. This maths is a way of approximating the complex psychological behaviour of all the individuals on the basis of local and global rules. Back in 1995 James Kennedy, a research psychologist, and Computer Scientist Russ Eberhart, having been inspired by the study of bird flocking behaviour by biologist Frank Heppner, realised that this swarm intelligence could be used to solve difficult computer problems. The result was a technique called Particle Swarm Optimisation (PSO).

Travel broadens the mind

An optimisation problem is one where you have a range of options to choose from and you want to know the best solution. One of the classic problems is called ‘The travelling salesperson’ problem. You work for a company and have to deliver packages to say 12 towns. You look at the map. There are many different routes to take, but which is the one that will let you visit all 12 towns using the least petrol? Here the choices are the order in which you visit the towns, and the constraint is that you want to do the least driving. You could have a guess, select the towns in a random order and work out how far you’d have to travel to visit them in this order. You then try another route through all 12 and see if it takes less mileage, and so on. Phew! It could take a long time to work out all the possible routes for 12 towns to see which was best. Now imagine your company grows and you have to deliver to 120 towns or 1200 towns. You would spend all your time with the maps trying to come up with the cheapest solutions. There must be a better way? Well actually simple as this problem seems it’s an example of a set of computational problems known as NP-Complete problems and it’s not easy to solve! You need some guidance to help you through and that’s where swarm optimisation come in. It’s an example of a so-called metaheuristic algorithm: a sort of ‘general rule of thumb’ to help solve these types of problem. It won’t always work unless you have infinite time but it’s better than just trying random solutions. So how does swarm optimisation work here?

State space: the final frontier

First we need to turn the problem into something called a state space. You probably use state spaces all the time but just don’t know it. Think about yourself. What are the characteristic you would use to tell people about your current state: age, height, weight and so on. You can identify yourself by a list of say 3 numbers – one for age, one for height, one for weight. It’s not everything about you of course but it does define your state to some extent. Your numbers will be different to other people’s numbers, if you take all the numbers for your friends you would have a state space. It would be a 3-dimensional space with axes: age, height and weight, and each person would be a point in that space at some coordinate (X, Y, Z).

So state spaces are just a way of representing the possible variations in some problem. With the 12 towns problem, however, you couldn’t draw this space: it would be 12 dimensional! It would have one axis for each town, with the position on the axis an indication of where in the route it was. Every point in that space would be a different route through the 12 towns, so each point in the space would have coordinates (x1, x2, x3, … x11, x12). For each point there would also be a mileage associated with the route, and the task is to find the coordinate point (route) with the lowest mileage.

Where no particle has swarmed before

Enter swarm optimisation. We create a set of ‘particles’ that will be like birds or fish, and will fly and swarm through our state space. Each particle starts with a random starting location in the state space and calculates the mileages involved for the coordinates (route) they are at. The particles remember (store) this coordinate and also the value of the mileage at that position. Each particle therefore has it’s own known ‘local’ best value (where the lowest mileage was) but can compare this with other neighbouring particles to see if they have found any even better solutions. The particles then move onwards randomly in a direction that tends to move them towards their own local best and the best value found by their neighbours. The particles ‘fly’ around the state space in a swarm homing in on even better solutions until they all converge on the best they can find: the answer.

It may be that somewhere in a part of the space where no particle has been there is an even better solution, perhaps even the best solution possible. Our swarm will have missed it! That’s why this algorithm is a heuristic, a best guess to a tough problem. We can always add some more particles at the start to fill more of the state space and reduce the chance of missing a good solution, but we cant ever be 100% sure.

Swarm optimisation has been applied to a whole range of tough problems in computing, electronic engineering, medicinal chemistry and economics. All it needs is for you to create an appropriate state space for your problem and let the particles fly and swarm to a good solution.

It’s yet another example of clever computing based on behaviours found in the natural world. So next time you’re in a crowd, look around and appreciate all that collective interacting swarm intelligence…but make sure you remember to watch where you are stepping.

by Peter W. McOwan, Queen Mary University of London. From the archive.

Related Magazines …

Our Books …

The Power of Computational Thinking:
- Games Magic and Puzzles to help you become a computational thinker
Conjuring with Computation
- Learn the basics of computer science through magic tricks

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

The Teleporting Robot

What is an algorithm? It is just a set of instructions that if followed precisely and in the given order, guarantees some result. The concept is important to computer scientists because all computers can do is follow instructions, but they do so perfectly. Computers do not understand what they are doing so can’t do anything but follow their instructions. That means whatever happens the instructions must always work. We can see what we mean by an algorithm by comparing it to the idea of a self-working trick from conjuring.

If you follow the steps of a self-working trick you will have the magic effect even if you have no idea how it works. Below is a a demonstration of a self-working magic jigsaw trick (you can download it from here https://conjuringwithcomputation.wordpress.com/resources/ print it, cut out the pieces and do it yourself, following the instructions below).

The teleporting robot jigsaw — Image by CS4FN

The step of the trick are.

1) Count the robots (ignore the green monsters and the robot dog)….There are 17.
2) Swap the top two pieces on the left with those on the right lining the jigsaw back up
3) Count the robots ….There are 16. One has disappeared.

Magically a robot has disappeared! Which one disappears and where did it go? Was it swallowed by a green monster, did it teleport away?

How did that happen anyway?

The teleporting robot jigsaw trick in action: a robot appears and disappears. — Image by CS4FN

By following the steps you can make the trick work…even if you haven’t worked out how it works, a robot still disappears. You do not need to understand, you just need to be able to follow instructions. It is a self-working trick. Follow the steps of the trick exactly and the robot disappears. It is just an algorithm. Self-working tricks are just algorithms for doing magic. When you follow the steps of the trick you are acting like a computer, blindly following the instructions in its program!

Paul Curzon, Queen Mary University of London

Linked Book …

Conjuring with Computation

Magazines …

Issue 17 – Machines Making Medicine Safer

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

Cooking up computer style

Spaghetti in a jar looking down from above so looks like a ball of spikes surrounded by ingredients below — Pasta image by Gentina Bintang Danurendra from Pixabay

Using clever computer vision techniques it’s now possible for your ingredients to tell you how they should be cooked in a kitchen. The system uses cameras and projectors to first recognise the ingredients on the chopping board, for example the size, shape and species of fish you are using. Then the system projects a cutting line on the fish to show you how to prepare it, and a speech bubble telling you how long it should be cooked for and suggesting ways it can be served. In the future these cooking support systems could take some of the strain from mealtimes. At least it will help to make us all better cooks, and perhaps with an added pinch of artificial intelligence we can all become more like Jamie Oliver.

Jo Brodie, Queen Mary University of London

Do something computationally funny for money

It is Red nose day in the UK the day of raising money for the comic relief charity by buying and wearing red noses and generally doing silly things for money.

Red noses are not just for red nose day though and if you ‘ve been supporting it every year, you possibly now have a lot of red noses like we do. What can you do with lots of red noses? Well one possibility is to count in red nose binary as a family or group of friends. (Order your red noses (a family pack has 4 or a school pack 25) from comic relief or make a donation to the charity there.)

Red nose binary

Let’s suppose you are a family of four. All stand in a line holding your red noses (you may want to set up a camera to film this). How many numbers can 4 red noses represent? See if you can work it out first. Then start counting:

No one wearing a red nose is 0,
the rightmost end person puts theirs on for 1,
they take it off and the next person puts theirs on for 2,
the first person puts theirs back on for 3,
the first two people take their noses off and the third person puts theirs on for 4
and so on…

The pattern we are following is the first (rightmost end) person changes their nose every time we count. The second person has the nose off for 2 then on for the next 2 counts. The third person changes theirs every fourth count (nose off for 4 then on for 4) and the last person changes theirs every eighth count (off for 8, on for 8). That gives a unique nose pattern every step of the way until eventually all the noses are off again and you have counted all the way from 0 to 15. This is exactly the pattern of binary that computers use (except they use 1s and 0s rather than wear red noses).

What is the biggest number you get to before you are back at 0? It is 15. Here is what the red nose binary pattern looks like.

The binary sequence in faces wearing red noses — Image by CS4FN

Try and count in red nose binary like this putting on and taking off red noses as fast as you can, following the pattern without making mistakes!

The numbers we have put at the top of each column are how much a red nose is worth in that column. You could write the number of the column on that person’s red nose to make this obvious. In our normal decimal way of counting, digits in each column are worth 10 times as much (1s 10s 100s, 1000s, etc) Here we are doing the same but with 2s (1s 2s 4s 8s etc). You can work out what a number represents just by adding that column number in if there is a red nose there. You ignore it if there is no red nose. So for example 13 is made up of an 8s red nose + a 4s red nose + a 1s red nose. 8 + 4 + 1 = 13.

13 in red nose binary with the 8, the 4 and the 1 red nose all worn. — Image by CS4FN

Add one more person (perhaps the dog if they are a friendly dog willing to put up with this sort of thing) with a red nose (now worth 16) to the line and how many more numbers does that now mean you can count up to? Its not just one more. You can now go through the whole original sequence twice once with the dog having no red nose, once with them having a red nose. So you can now count all the way from 0 to 31. Each time you add a new person (or pet*, though goldfish don’t tend to like it) with a red nose, you double the number you can count up to.

There is lots more you can do once you can count in red nose binary. Do red nose binary addition with three lines of friends with red noses, representing two numbers to add and compute the answer on the third line perhaps… for that you need to learn how to carry a red nose from one person to the next! Or play the game of Nim using red nose binary to work out your moves (it is the sneaky way mathematicians and computer scientists use to work out how to always win). You can even build a working computer (a Turing Machine) out of people wearing red noses…but perhaps we will save that for next year.

What else can you think of to do with red nose binary?

Paul Curzon, Queen Mary University of London

*Always make sure your pet (or other family member) has given written consent before you put a red nose on them for ethical counting.

Magazines …

Issue 29 – Diversity

Front cover of CS4FN issue 29 – Diversity in Computing

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

Calculating Pi for Pi Day

There are several Pi Day’s (14 March: 3.14; 22 July: 22/7) so we should look at how on earth you compute a number like Pi (3.1.4159….). It has an infinite number of digits containing no repeating pattern so you can never tie it down exactly. One of my favourite ways for calculating pi was first devised by the Indian mathematician Mādhava of Sangamagrāma 600 years ago. He worked out an algorithm for working out Pi based on the maths of infinite series that he had also worked out.

Pi symbol as a sculpture against a blue sky with digits written across it — Image by Naji Habib from Pixabay

Pi is one of the most useful numbers in all of maths. In school you come across it when working out the area or circumference of a circle, but it crops up all over the place including in practical computer science situations. Digital music, for example, relies on it deep down. Remember that the next time you stream your favourite music!

So how, 600 years ago did Mādhava manage to work out a much more accurate version of Pi than anyone before him? He had worked out that certain sequences of infinite numbers wouldn’t get bigger and bigger but would just get closer and closer to some specific number. In particular, he worked out one such sequence linked to pi.

π / 4 = 1 – 1/3 + 1/5 – 1/7 + 1/9 – …

Writing this a slightly different way it gives us a way of calculating pi itself

π = 4 – 4/3 + 4/5 – 4/7 + 4/9 – …

With an infinite number of terms, this gives an accurate value for pi. We can’t add an infinite number of numbers together though. Instead, we can use it to get a good answer. To get an approximation to pi we just follow an algorithm where we gradually add / subtract the next term. Each new calculation then gives us a better estimate of what pi is.

So to start with we just take the first term which says

π = 4 (very approximately)

That isn’t very good as it doesn’t get any digits right! Pi is closer to 3 than to 4. So its not looking hopeful! That doesn’t matter though as it is just a starting point. When we subtract the next term it gets a bit better

π = 4 – 4/3 = 2.6666…

Hmm. Now we have overshot the other way. However, we are closer to the real value of pi than we were. So don’t lose heart, keep going and add the next term

π = 4 – 4/3 + 4/5 = 3.46666…

And another term …

π = 4 – 4/3 + 4/5 – 4/7 = 2.895 …

And another term …

π = 4 – 4/3 + 4/5 – 4/7 + 4/9 = 3.339…

and so on.

The important thing to notice is that after each term included we get a more accurate answer, and we can keep adding terms for as long as we are happy to do the calculations. Mādhava (or his followers) obviously liked doing calculations so kept going until he had worked out pi accurate to 10 decimal places (3.1415926535…) : a new world record at the time beating the previous best of 6 decimal places by a Chinese astronomer Zhao Youqin using a different algorithm, That record had been set 80 years earlier but was smashed by 4 decimal places. This new record lasted for another 96 years. In doing these calculations Mādhava was acting as a ‘computer’ in the original meaning of the word: a human following an algorithm to do computation.

His algorithm is what computer scientists call an iterative algorithm. This kind of algorithm is used quite a lot by computer scientists as it gives a general way of getting a good enough (if not perfect) answer to a problem that otherwise is hard (or impossible) to get a perfect answer to in a reasonable time. You start with a good guess and then gradually refine the answer until you are happy that it is accurate enough. These algorithms can be straightforward to code as it is just running a loop doing calculations that refine the answer. Mādhava was happy with 10 decimal places of accuracy but he could have kept going. The trouble is this is a very slow algorithm. As we saw with the first few iterations above, it takes a long time even to home in on the first digit being 3! Every new digit took a lot of extra work to get right. When calculating machines and then computers were invented it became easier to use slow algorithms like this, but even with a faster computer it is still better to have a faster algorithm. Now far faster algorithms have been invented and the world record at the time of writing gives pi accurate to 105,000,000,000,000 decimal places!

Mādhava would have needed to really like doing calculations (and have discovered the secret to eternal life) to have calculated pi that accurately. 600 years ago his world record for pi was still an amazing achievement.

– Paul Curzon, Queen Mary University of London

Related Magazines …

Issue 29 – Diversity

Exploring mazes, inventing algorithms (part I)

by Paul Curzon, Queen Mary University of London

A maze with mouse searching for cheese. — Image by CS4FN

Computer science research in part involves inventing new algorithms or improving new ones. But what does that mean. Let’s explore some mazes to explore algorithms.

What does computer science research involve? It is very varied: from interviewing people to find out what the real problems that need solving in their lives or jobs are; to running experiments to find out what works and what doesn’t; to writing programs to solve problems.

Improving algorithms

A core part of much research is coming up with new and better algorithms that solve particular problems. The kind of algorithm could be anything from a new more secure cryptographic protocol, or a better way to rank the results of a search engine, to a new more effective machine learning algorithm that is less likely to make things up, or perhaps can better explain how it came to its conclusions.

What does it mean to come up with a better algorithm though? Once a problem is solved, isn’t it solved? Let’s explore a simple problem to see. Let’s explore mazes. Solve the simple maze puzzle above before you go on. Find a route that gets the mouse to the cheese.

Wandering around mazes, finding algorithms

If you’ve ever been in a hedge maze in the garden of some stately home, or a corn field maze, the chances are you just dived in and wandered rather aimlessly. Perhaps you tried to remember which way you went at each junction, to avoid going down the same dead-ends more than once. How about solving the paper version of a maze puzzle like the one above? Now perhaps you looked ahead to spot dead-ends to avoid tracing wrong paths with your pencil.

Probably what you are doing is at least a little random. You could, in theory at least, end up going back over the same paths, never taking the right one and and never getting to the middle. Could we come up with an algorithm that guarantees to solve mazes? To be an algorithm it would need to guarantee you ended up finding a path to the centre of the maze if you followed the steps of the algorithm precisely. It should also work for any maze, or at least all mazes of a particular kind. Ideally, the algorithm gives you a path that can then be followed by anyone without them having to run the algorithm themselves. They can just follow the path generated by the algorithm for that maze.

Wall-following

In fact lots of maze algorithms have been invented. Perhaps the one most people have heard of, if they know of any maze algorithm, is called Wall-following. It is very simple to do, You just pick a wall at the entrance either to the left or right and then follow it, If in a garden maze, keep your hand on the hedge as you walk round. If doing a paper puzzle, draw the path sticking to the chosen wall. Try it on the following simple maze.

A simple maze with mouse and cheese. — Image by CS4FN

Simply connected

The wall-following algorithm will guarantee to get you to the centre of the maze, and back out again too, but only as long as the maze is what is called simply connected. That just means the maze is constructed from a single hedge (or one unbroken drawn line) not a series of unconnected hedges. If you look at both examples above you will see I created them by just drawing a single wiggly line.

If a maze is simply connected then it cannot have looping paths, so no going round in circles for ever. It will also only have one entrance/exit. That shows the first aspect of inventing algorithms that is important. They often only work for some situations, not all. You must be sure you know what situations they do and don’t work.

Often the earliest algorithms invented to solve a problem are like wall-following: they only work for simple situations. Other people then come along and find new algorithms that can cover more problems (here more mazes). Can you tweak the wall-following maze algorithm to work even if there are multiple exits from the maze, for example? As it stands our algorithm could just take you from the entrance straight out of another exit without exploring much of the maze at all! See the end for one simple way to tweak the algorithm. What if there are paths that take you round in circles? Can you come up with an algorithm to deal with that?

Some times the improvements invented just involve tweaking an existing algorithm as with dealing with multiple exits in a maze. Some times a whole new algorithm is needed.

Faster, higher, stronger?

Even for a simple constrained version of the problem, like simply connected mazes, people can invent better algorithms. What does better mean for a maze? Well one way you might have a better algorithm is if it is faster in coming up with a solution. Another is that the solution it comes up with is faster. For a maze that means a shorter (ideally the shortest) path to the centre. Wall following may get you in to the centre (and out again) but you probably will have discovered a very long path that takes you in and out of lots of dead-ends needlessly. You do find a path to the centre, but it may be a very long path. Can you come up with an algorithm that finds shorter paths?

We will explore an algorithm that does next.

More to come…

Some solutions

The result of wall following on our simple maze

A route for the mouse to follow that takes it to the cheese. — Image by CS4FN

One way to deal with multiple exits

To deal with a maze that has multiple exits, so multiple breaks in the outer wall, tweak the wall-following algorithm as follows. First mark the exit you use to enter the maze, so you know when you return to it. If you come to any other exit then pretend there is a gate there and keep following the wall as though it were unbroken and there were no exit.

Magazines …

Issue 29 – Diversity

Subscribe to be notified whenever we publish a new post to the CS4FN blog.

This blog is funded by EPSRC on research agreement EP/W033615/1.

Hallucinating chatbots

Why can’t you trust what an AI says?

by Paul Curzon, Queen Mary University of London

postcards of cuba in a rack — Image by Sunrise from Pixabay

Chatbots that can answer questions and write things for you are in the news at the moment. These Artificial Intelligence (AI) programs are very good now at writing about all sorts of things from composing songs and stories to answering exam questions. They write very convincingly in a human-like way. However, one of the things about them is that they often get things wrong. Apparently, they make “facts” up or as some have described it “hallucinate”. Why should a computer lie or hallucinate? What is going on? Writing postcards will help us see.

Write a postcard

We can get an idea of what is going on if we go back to one of the very first computer programs that generated writing. It was in the 1950s and written by Christopher Strachey a school teacher turned early programmer. He wrote a love letter writing program but we will look at a similar idea: a postcard writing program.

Postcards typically might have lots of similar sentences, like “Wish you were here” or “The weather is lovely”, “We went to the beach” or “I had my face painted with butterflies”. Another time you might write things like: The weather is beautiful”, “We went to the funfair” or “I had my face painted with rainbows”. Christopher Strachey’s idea was to write a program with template sentences that could be filled in by different words: “The weather is …”, “We went to the …”, “I had my face painted with …”. Then the program picks some sentence templates at random, and then picks words at random to go in their slots. In this way, applied to postcard writing it can write millions of unique postcards. It might generate something like the following, for example (where I’ve bolded the words it filled in):

Dear Gran,

I’m on holiday in Skegness. I’ve had a wonderful time. The weather is sunny, We went to the beach. I had my face painted with rainbows. I’ve eaten lots strawberry ice cream. Wish you were here!

Lots of love from Mo

but the next time you ask it to it will generate something completely different.

Do it yourself

You can do the same thing yourself. Write lots of sentences on strips of card, leaving gaps for words. Give each gap a number label and note whether it is an adjective (like ‘lovely’ or ‘beautiful’) or a noun (like ‘beach’ or ‘funfair’, ‘butterflies’ or ‘rainbows’). You could also have gaps for verbs or adverbs too. Now create separate piles of cards to fit in each gap. Write the number that labels the gap on one side and different possible words of the right kind for that gap on the other side of the cards. Then keep them in numbered piles.

To generate a postcard (the algorithm or steps for you to follow), shuffle the sentence strips and pick three or four at random. Put them on the table in front of you to spell out a message. Next, go to the numbered pile for each gap in turn, shuffle the cards in that pile and then take one at random. Place it in the gap to complete the sentence. Do this for each gap until you have generated a new postcard message. Add who it is to and from at the start and end. You have just followed the steps (the algorithm) that our simple AI program is following.

Making things up

When you write a postcard by following the steps of our AI algorithm, you create sentences for the postcard partly at random. It is not totally random though, because of the templates and because you chose words to write on cards for each pile that make sense there. The words and sentences are about things you could have done – they are possible – but that does not mean you did do them!

The AI makes things up that are untrue but sound convincing because even though it is choosing words at random, they are appropriate and it is fitting them into sentences about things that do happen on holiday. People talk of chatbots ‘hallucinating’ or ‘dreaming’ or ‘lying’ but actually, as here, they are always just making the whole thing up just as we are when following our postcard algorithm. They are just being a little more sophisticated in the way that they invent their reality!

Our simple way of generating postcards is far simpler than modern AIs, but it highlights some of the features of how AIs are built. There are two basic parts to our AI. The template sentences ensure that what is produced is grammatical. They provide a simple ‘language model‘: rules of how to create correct sentences in English that sound like a human would write. It doesn’t write like Yoda :

“Truly wonderful, the beach is.”

though it could with different templates.

The second part is the sets of cards that fit the gaps. They have to fit the holes left in the templates – only nouns in the noun gaps, adjectives in the adjectives gap, and also fit

Given a set of template sentences about what you might do on holiday, the cards provide data to train the AI to say appropriate things. The cards for the face paining noun slot need to be things that might be painted on your face. By providing different cards you would change the possible sentences. The more cards the more variety in the sentences it writes.

AIs also have a language model, the rules of the language and which words go sensibly in which places in a sentence. However, they also are trained on data that gives the possibilities of what is actually written. Rather than a person writing templates and thinking up words it is based on training data such as social media posts or other writing on the Internet and what is being learnt from this data is the likelihood of what words come next, rather than just filling in holes in a template. The language model used by AIs is also actually just based on the likelihood of words appearing in sentences (not actual grammar rules).

What’s the chances of that?

So, the chatbots are based on the likelihood of words appearing and that is based on statistics. What do we mean by that? We can add a simple version of it to our Postcard AI but first we would need to collect data. How often is each face paint design chosen at seaside resorts? How often do people go to funfairs when on holiday. We need statistics about these things.

As it stands any word we add to the stack of cards is just as likely to be used. If we add the card maggots to the face painting pile (perhaps because the face painter does gruesome designs at Halloween) then the chatbot could write

“I had my face painted with maggots”.

and that is just as likely as it writing

“I had my face painted with butterflies”.

If the word maggots is not written on a card it will never write it. Either it is possible or it isn’t. We could make the chatbot write things that are more realistic, however, by adding more cards of words that are about things that are more popular. So, if in every 100 people having their face painted, almost a third, 30 people choose to have butterflies painted on their face, then we create 30 cards out of 100 in the pack with the word BUTTERFLY on (instead of just 1 card). If 5 in a 100 people choose the rainbow pattern then we add five RAINBOW cards, and so on. Perhaps we would still have one maggot card as every so often someone who likes grossing people out picks it even on holiday. Then, over all the many postcards written this way by our algorithm, the claims will match statistically the reality of what humans would write overall if they did it themselves.

As a result, when you draw a card for a sentence you are now more likely to get a sentence that is true for you. However, it is still more likely to be wrong about you personally than right (you may have had your face painted with butterflies but 70 of the 100 cards still say something else). It is still being chosen by chance and it is only the overall statistics for all people who have their face painted that matches reality not the individual case of what is likely true for you.

Make it personal

How could we make it more likely to be right about you? You need to personalise it. Collect and give it (ie train it on) more information about you personally. Perhaps you usually have a daisy painted on your face because you like daisies (you personally choose a daisy pattern 70% of the time). Sometimes you have rainbows (20% of the time). You might then on a whim choose each of 10 other designs including the butterfly maybe 1 in a hundred times. So you make a pile of 70 DAISY cards, 20 RAINBOW cards and 1 card for each of the other designs, Now, its choices, statistically at least, will match yours. You have trained it about yourself, so it now has a model of you.

You can similarly teach it more about yourself generally, so your likely activities, by adding more cards about the things you enjoy – if you usually choose chocolate or vanilla ice cream then add lots of cards for CHOCOLATE and lots for VANILLA, and so on. The more cards the postcard generator has of a word, the more likely it is to use that word. By giving it more information about yourself, it is more likely to be able to get things about you right. However, it is of course still making it up so, while it is being realistic, on any given occasion it may or may not match reality that time.

Perfect personalisation

You could go a step further and train it on what you actually did do while on this holiday, so that the only cards in the packs are the ones you did actually do on this holiday. (You ate hotdogs and ice cream and chips and … so there are cards for HOTDOG, ICE CREAM, CHIPS …). You had one vanilla ice cream, two chocolate and one strawberry so have that number of each ice cream card. If it knows everything about you then it will be able to write a postcard that is true! That is why companies behind AIs want to collect every detail of your life. The more they know about you the more they get things right about you and so predict what you will do in future too.

Probabilities from the Internet

The modern chatbots work by choosing words at random based on how likely they are in a similar way to our personalised postcard writer. They pick the most likely words to write next based on probabilities of those words coming next in the data they have been trained on. Their training data is often conversations from the Internet. If the word is most likely to come next in all that training data, then the chatbot is more likely to use that word next. However, that doesn’t make the sentence it comes up with definitely true any more than with our postcard AI.

You can personalise the modern AIs too, by giving them more accurate information about yourself and then they are more likely to get what they write about you right. There is still always a chance of them picking the wrong words, if it is there as a possibility though, as they are still just choosing to some extent at random.

Never trust a chatbot

Artificial Intelligences that generate writing do not hallucinate just some of the time. They hallucinate all of the time, just with a big probability of getting it right. They make everything up. When they get things right it is just because the statistics of the data they were trained on made those words the most likely ones to be picked to follow what went before. Just as the Internet is full of false things, an Artificial Intelligence can get things wrong too.

If you use them for anything that matters, always double check that they are telling you the truth.

Related Magazines …

Issue 29 – Diversity

EPSRC supports this blog through research grant EP/W033615/1.

Playing the weighting game

by Paul Curzon, Queen Mary University of London

In the spotlight - bright lights dazzling in circles and stars — *Image by Gerd Altmann from Pixabay*

Imagine having a reality TV show where yet again Simon Cowell is looking for talent. This time it’s talent with a difference though, not stars to entertain us but ones with the raw ability to help find webpages. Yes, this time the budding stars are all words. Word Idol is here!

The format is simple. Each week Simon’s aim is to find talented words to create a new group: a group with star quality, a group with meaning. Like any talent competition, there are thousands of entries. Every word in every webpage out there wants to take part. They all have to be judged, but what do the specialist judges look for?

OK, we’re getting carried away. Simon Cowell may not be interested but there is big money in the idea. It’s a talent show that is happening all the time. The aim is to judge the words in each new webpage as it appears so that search engines can find it if ever someone goes looking. The real star of this show isn’t Simon Cowell but a Cambridge professor, Karen Spärck Jones. She came up with the way to judge words.

Karen worked out that to do this kind of judging a computer needs a thesaurus: a book of words. It just lists groups of words that mean the same thing. A computer, Karen realised, could use one to understand what words mean.

There is big money in the idea!

The fact that there are so many ways to say the same thing in human languages, makes it really hard for a computer to understand what we write. That is where a thesaurus comes in. If you ask a computer to search for web pages about whales, for example, it helps to know that, a page that talks about orcas is about whales too. Worse still, most words have more than one meaning, a fact that keeps crossword lovers in business.

Take the following example: “Leona is the new big star of the music business.”

The word ‘star’ here obviously means a celebrity, but how do you know? It could also mean a sun or a shape. The fact that it’s with the word ‘music’ helps you to work out which meaning is right even if you have no idea who or what Leona is. As Karen realised, a computer can also work out the intended meanings of words by the other words used with them. A thesaurus tells it what the critical groupings are, but what Karen wanted was a way a computer could work the thesaurus out for itself and now she had a way.

Her early approach was to write a program that takes lots and lots of documents and make lists of the words that keep appearing close together. If ‘music’ appears with ‘star’ lots then that is a new meaning. After building up a big collection of such lists of linked words, the program can then use it to decide which pages are talking about the same thing and so which ones to suggest when a search is done. So Karen had found the first way to judge whether a word has the right ‘talent’ to go in a group. The more often words appear together the higher the score or ‘weighting’ they should be given. Simple!

The only trouble is it doesn’t really work. That is where Karen’s big insight came. She realised that if two words appear together in a lot of different documents then, surprisingly perhaps, putting them together in a group isn’t actually that useful for finding documents! Do a search and they will just tell you that lots of web pages match. What you really want is to be told of the few web pages that contain the meaning you are looking for, not lots and lots that don’t.

The important word groupings are actually only in a small number of web pages. That suggests they give a very focused meaning. Word groups like that help you narrow down the search. So Karen now had a better way to judge word talent. Give high marks for pairs that do appear together but in as few web pages as possible. Rather than a talent show, it is more like a giant game of the quiz show Pointless where you win if you pick the words few other people did.

That idea was the big breakthrough and led to what is now called IDF weighting. It is the way to judge words, and is so good that it’s now used by pretty much every search engine out there. Playing the IDF weighting game may not make great TV but thanks to Karen it really does make for great web.

Related Magazines …

EPSRC supports this blog through research grant EP/W033615/1.

How does Santa do it?

Fast yuletide algorithms to visit all those chimneys in time

Lots of Santas in a line — Image by Thomas Ulrich from Pixabay

How does Santa do it? How does he visit all those children, all those chimneys, in just one night? My theory is he combines a special Scandinavian super-power with some computational wizardry.

There are about 2 billion children in the world and Santa visits them all. Clearly he has magic (flying reindeer remember) to help him do it but what kind of magic (beyond the reindeer)? And is it all about magic? Some have suggested he stops time, or moves through other dimensions, others that he just travels at an amazingly fast speed (Speedy Gonzales or The Flash style). Perhaps though he uses computer science too (though by that I don’t mean computer technology, just the power of computation).

The problem can be thought of as a computational one. The task is to visit, let’s say a billion homes (assuming an average of 2 children per household), as fast as possible. The standard solution assumes Santa visits them one at a time in order. This is what is called a linear algorithm and linear algorithms are slow. If there are n pieces of data to process (here, chimneys to descend) then we write this as having efficiency O(n). This way of writing about efficiency is called Big-O notation. O(n) just means as n increases the amount of work increases proportionately. Double the number of children and you double the workload for Santa. Currently the population doubles every 60 or 70 years or so, so clearly Santa needs to think in this way or he will eventually fail keep up, whatever magic he uses.

Perhaps, Santa uses teams of Elves as in the film Arthur Christmas, so that at each location he can deliver say presents to 1000 homes at once (though then it is the 1000 Elf helpers doing the delivering not Santa which goes against all current wisdom that Santa does it himself). It would speed things up apparently enormously to 1000 times faster. However, in computational terms that barely makes a difference. It is still a linear order of efficiency: it is still O(n) as the work still goes up proportionately with n. Double the population and Santa is in trouble still as his one night workload doubles too. O(2n) and O(1000n) both simplify to mean exactly the same as O(n). Computationally it makes little difference, and if their algorithms are to solve big problems computer scientists have to think in terms of dealing with data doubling, doubling and doubling again, just like Santa has had to over the centuries.

Divide and Conquer problem solving

When a computer scientist has a problem like this to solve, one of the first tools to reach for is called Divide and Conquer problem solving. It is a way of inventing lightening fast algorithms, that barely increase in work needed as the size of the problem doubles. The secret is to find a way to convert the problem into one that is half the size of the original, but (and this is key) that is otherwise exactly the same problem. If it is the same problem (just smaller) then that means you can solve those resulting smaller problems in the same way. You keep splitting the problem until the problems are so small they are trivial. That turns out to be a massively fast way to get a job done. It does not have to be computers doing the divide and conquer: I’ve used the approach for sorting piles of hundreds and hundreds of exam scripts into sorted order quickly, for example.

My theory is that divide and conquer is what Santa does, though it requires a particular superhero power too to work in his context, but then he is magical, so why not. How do I think it works? I think Santa is capable of duplicating himself. There is a precedent for this in the superhero world. Norse god Loki is able to copy himself to get out of scrapes, and since Santa is from the same part of the world it seems likely he could have a similar power.

If he copied himself twice then one of him could do the Northern Hemisphere and the other the Southern hemisphere. The problem has been split into an identical problem (delivering presents to lots of children) but that is half the size for each Santa (each has only half the world so half as many children to cover). That would allow him to cover the world twice as fast. However that is really no different to getting a couple of Elves to do the work. It is still O(n) in terms of the efficiency the work is done. As the population doubles he quickly ends up back in the same situation as before: too much work for each Santa. Likewise if he made a fixed number of 1000 copies of himself it would be similar to having 1000 Elves doing the deliveries. The work still increases proportional to the number of deliveries. Double the population and you still double the time it takes.

Double Santa and double again (and keep doubling)

So Santa needs to do better than that if he is to keep up with the population explosion. But divide and conquer doesn’t say halve the problem once, it says solve the new problem in the same way. So each new Santa has to copy themselves too! As they are identical copies to the original surely they can do that as easily as the first one could. Those new Santas have to do the same, and so on. They all split again and again until each has a simple problem to solve that they can just do. That might be having a single village to cover, or perhaps a single house. At that point the copying can stop and the job of delivering presents actually done. Each drops down a chimney and leaves the presents. (Now you can see how he manages to eat all those mince pies too!)

An important thing to remember is that that is not the end of it. The world is now full of Santas. Before the night is over and the job done, each Santa has to merge back with the one they split from, recursively all the way back to the original Santa. Otherwise come Christmas Day we wouldn’t be able to move for Santas. Better leave 30 minutes for that at the end!

Does this make a big difference? Well, yes (as long as all the copying can be done quickly and there is an organised way to split up the world). It makes a massive difference. The key is in thinking about how often the Santas double in number, so how often the problem is halved in size.

We start with 1 Santa who duplicates to 2, but now both can duplicate to 4, then to 8, 16, and after only 5 splittings there are already 32 Santas, then 64, 128, 256, 512 Santas, and after only 10 splittings we have over a 1000 Santas (1024 to be precise). As we saw that isn’t enough so they keep splitting. Following the same pattern, after 20 splittings we have over a million Santas to do the job. After only 30 rounds of splittings we have a billion Santas, so each can deal with a single family: a trivial problem for each.

So if a Santa can duplicate himself (along with the sleigh and reindeer) in a minute or so (Loki does it in a fraction of a second so probably this is a massive over-estimate and Santa can too), we have enough Santas to do the job in about half an hour, leaving each plenty of time to do the delivery to their destination. The splitting can also be done on the way so each Santa travels only as far as needed. Importantly this splitting process is NOT linear. It is O(log2 n) rather than O(n) and log2 n is massively smaller than n for large n. It means if we double the population of households to visit due to population explosion, the number of rounds of splitting does not need to double, the Santas just have to do one more round of splitting to cover it. The calculation log2 n (the logarithm to base 2 of n) is just a mathematicians way of saying how many times you can halve the number n before you get to 1 (or equivalently how many times you double from 1 before you get up to n). 1024 can be halved 10 times so (log2 1024) is 10. A billion can be halved about 30 times so (log2 1 billion) is about 30. Instead of a billion pieces of work we do only 30 for the splitting. Double the chimneys to 2 billion and you need only one more for a total of 31 splittings.

In computer terms divide and conquer algorithms involve methods (ie functions / procedures) calling themselves multiple times. Each call of the method, works on eg half the problem. So a method to sort data might first divide the data in half. One half is passed to one new call (copy) of the same method to sort in the same way, the other half is passed to the other call (copy). They do the same calling more copies to work on half of their part of the data, until eventually each has only one piece of data to sort (which is trivial). Work then has to be done merging the sorted halves back into sorted wholes. A billion pieces of data are sorted in only 30 rounds of recursive splitting. Double to 2 billion pieces of data and you need just 1 more round of splitting to get the sorting done.

Living in a simulation

If this mechanism for Santa to do deliveries all still seems improbable then consider that for all we know the reality of our universe may actually be a simulation (Matrix-like) in some other-dimensional computer. If so we are each just software in that simulation, each of us a method executing to make decisions about what we do in our virtual world. If that is the nature of reality, then Santa is also just a (special yuletide) software routine, and his duplicating feat is just a method calling itself recursively (as with the sort algorithm). Then the whole Christmas delivery done this way is just a simple divide and conquer algorithm running in a computer…

Given the other ways suggested for Santa to do his Christmas miracle seem even more improbable, that suggests to me that the existence of Santa provides strong evidence that we are all just software in a simulation. Not that that would make our reality, or Christmas, any less special.

Paul Curzon, Queen Mary University of London

	Music AI Kriss Kross… on The day the music didn’t …
	Music AI Kriss Kross… on Separate your stems
	Musical Algorithms… on You’ll be Bach! – create…
	The art of animatron… on I’m (not) a little …
	The Decline and Fall… on Victorian volunteers needed…

A swarm is born

Travel broadens the mind

State space: the final frontier

Where no particle has swarmed before

More on …

Related Magazines …

Our Books …

Linked Book …

Magazines …

More on …

Red nose binary

More on computers and comedy

Magazines …

More on

Related Magazines …

More on …

Improving algorithms

Wandering around mazes, finding algorithms

Wall-following

Simply connected

Faster, higher, stronger?

Some solutions

The result of wall following on our simple maze

One way to deal with multiple exits

More on …

Magazines …

Why can’t you trust what an AI says?

Write a postcard

Do it yourself

Making things up

What’s the chances of that?

Make it personal

Perfect personalisation

Probabilities from the Internet

Never trust a chatbot

More on …

Related Magazines …

More on …

Related Magazines …

Fast yuletide algorithms to visit all those chimneys in time

Divide and Conquer problem solving

Double Santa and double again (and keep doubling)

Living in a simulation

More on …