Lego computer science: representing numbers using position

Numbers represented with different sized common blocks

Continuing a series of blogs on what to do with all that lego scattered over the floor: learn some computer science…how do we represent numbers and how is it related to the representation Charles Babbage used in his design for a Victorian steam-powered computer?

We’ve seen there are lots of ways that human societies have represented numbers and that there are many ways we could represent numbers even just using lego. Computers store numbers using a different representation again called binary. Before we get to that though we need to understand how we represent bigger numbers ourselves and why it is so useful.

Numbers represented as colours.

Our number system was invented in India somewhere before the 4th century. It then spread, including to the west, via muslim scholars in Persia by the 9th century, so is called the Hindu-Arabic numeral system. Its most famous advocate was Muḥammad ibn Mūsā al-Khwārizmī. The word algorithm comes from the latin version of his name because of his book on algorithms for doing arithmetic with Hindu-arabic numbers.

The really clever thing about it is the core idea that a digit can have a different value depending on its position. In the number 555, for example, the digit 5 is representing the number five hundred, the number fifty and the number five. Those three numbers are added together to give the actual number being represented. Digit in the ‘ones’ column keep their value, those in the ‘tens’ column are ten times bigger, those in the ‘hundreds column a hundred times bigger than the digit, and so on. This was revolutionary differing from most previous systems where a different symbol was used for bigger number, and each symbol always meant the same thing. For example, in Roman numerals X is used to mean 10 and always means 10 wherever it occurs in a number. This kind of positional system wasn’t totally unique as the Babylonians had used a less sophisticated version and Archimedes also came up with a similar idea, those these systems didn’t get used elsewhere.

In the lego representations of numbers we have seen so far, to represent big numbers we would need ever more coloured blocks, or ever more different kinds of brick or ever bigger piles of bricks, to give a representation of those bigger numbers. It just doesn’t scale. However, this idea of position-valued numbers can be applied whatever the representation of digits used, not just with digits 0 to 9. So we can use the place number system to represent ever bigger numbers using our different versions of the way digits could be represented in lego. We only need symbols for the different digits, not for every number, of for every bigger numbers.

For example, if we have ten different colours of bricks to represent the 10 digits of our decimal system, we can build any number by just placing them in the right position, placing coloured bricks on a base piece.

The number 2301 represented in coloured blocks where black represents 0, red represents 1, blue represents 2 and where yellow represents 3

Numbers could be variable sized or fixed size. If as above we have a base plate, and so storage space, for four digits then we can’t represent larger numbers than 9999. This is what happens with the way computers store numbers. A fixed amount of space is allocated for each number in the computer’s memory, and if a number needs more digits then we get an “overflow error” as it can’t be stored. Rockets worth millions of pounds have exploded on take-off in the past because a programmer made the mistake of trying to store numbers too big for the space allocated for them. If we want bigger numbers, we need a representation (and algorithms) that extend the size of the number if we run out of space. In lego that means our algorithm for dealing with numbers would have to include extending the grey base plate by adding a new piece when needed (and removing it when no longer needed). That then would allow us to add new digits.

Unlike when we write numbers, where we write just as many digits as we need, with fixed-sized numbers like this, we need to add zeros on the end to fill the space. There is no such thing as an empty piece of storage in a computer. Something is always there! So the number 123 is actually stored as 0123 in a fixed 4-digit representation like our lego one.

The number 321 represented in coloured blocks where space is allocated for 4 digits as 0321: black represents 0, red represents 1, blue represents 2 and where yellow represents 3

Charles Babbage made use of this idea when inventing his Victorian machines for doing computation: had they been built would have been the first computers. Driven by steam power his difference engine and analytical engine were to have digits represented by wheels with the numbers 0-9 written round the edge, linked to the positions of cog-like teeth that turned them.

Wheels were to be stacked on top of each other to represent larger numbers in a vertical rather than horizontal position system. The equivalent lego version to Babbage’s would therefore not have blocks on a base plate but blocks stacked on top of each other.

The number 321 represented vertically in coloured blocks where space is allocated for 4 digits as 0321: black represents 0, red represents 1, blue represents 2 and where yellow represents 3

In Babbage’s machines different numbers were represented by their own column of wheels. He envisioned the analytical engine to have a room sized data store full of such columns of wheels.

Numbers stored as columns of wheels on the replica of Babbage’s Difference Engine at the Science Museum London. Carsten Ullrich: CC-BY-SA-2.5. From wikimedia.

So Babbage’s idea was just to use our decimal system with digits represented with wheels. Modern computers instead use binary … bit that is for next time.

This post was funded by UKRI, through grant EP/K040251/2 held by Professor Ursula Martin, and forms part of a broader project on the development and impact of computing.


Lego Computer Science

Part 1: Lego Computer Science: pixel pictures

Part 2: Lego Computer Science: compression algorithms

Part 3: Lego Computer Science: representing numbers

Part 4: Lego Computer Science: representing numbers using position

Lego computer science: representing numbers

Continuing a series of blogs on what to do with all that lego scattered over the floor: learn some computer science…what does number representation mean?

We’ve seen some different ways to represent images and how ultimately they can be represented as numbers but how about numbers themselves. We talk as though computers can store numbers as numbers but even they are represented in terms of simpler things in computers.

Lego numbers

But first what do we mean by a number and a representation of a number? If I told you to make the numbers 0 to 9 in lego (go on have a go) you may well make something like this…

But those symbols 0, 1, 2, … are just that. They are symbols representing numbers not the numbers themselves. They are arbitrary choices. Different cultures past and present use different symbols to mean the same thing. For example, the ancient Egyptian way of writing the number 1000 was a hieroglyph of a water lily. (Perhaps you can make that in lego!)

The ancient Egyptian way to write 1000 was a hieroglyph of a waterlily

What really are numbers? What is the symbol 2 standing for? It represents the abstract idea of twoness ie any collection, group or pile of two things: 2 pieces of lego, 2 ducks, 2 sprouts, … and what is twoness? … it is oneness with one more thing added to the pile. So if you want to get closer to the actual numbers then a closer representation using lego might be a single brick, two bricks, three bricks, … put together in any way you like.

Numbers represented by that number of lego bricks

Another way would to use different sizes of bricks for them. Use a lego brick with a single stud for 1, a 2-stud brick for two and so on (combining bricks where you don’t have a single piece with the right number of studs). In these versions 0 is the absence of anything just like the real zero.

Lego bricks representing numbers based on the number of studs showing.

Once we do it in bricks it is just another representation though – a symbol of the actual thing. You can actually use any symbols as long as you decide the meaning in advance, there doesn’t actually have to be any element of twoness in the symbol for two. What other ways can you think of representing numbers 0 to 9 in lego? Make them…

A more abstract set of symbols would be to use different coloured bricks – red for 1, blue for 2 and so on. Now 0 can have a direct symbol like a black brick. Now as long as it is the right colour any brick would do. Any sized red brick can still mean 1 (if we want it to). Notice we are now doing the opposite of what we did with images. Instead of representing a colour with a number, we are representing a number with a colour.

Numbers represented as colours.

Here is a different representation. A one stud brick means 1, a 2-stud brick means 2, a square 4 stud brick means 3, a rectangular 6 stud brick means 4 and so on. As long as we agreed that is what they mean it is fine. Whatever representation we choose it is just a convention that we have to then be consistent about and agree with others.

Numbers represented by increasing sized blocks

What has this to do with computing? Well if we are going to write algorithms to work with numbers, we need a way to store and so represent numbers. More fundamentally though, computation (and so at its core computer science) really is all about symbol manipulation. That is what computational devices (like computers) do. They just manipulate symbols using algorithms. We will see this more clearly when we get to creating a simple computer (a Turing Machine) out of lego (but that is for later).

We interpret the symbols in the inputs of computers and the symbols in the outputs with meanings and as a result they tell us things we wanted to know. So if we key the symbols 12+13= into a calculator or computer and it gives us back 25, what has happened is just that it has followed some rules (an algorithm for addition) that manipulated those input symbols and made it spew out the output symbols. It has no idea what they mean as it is just blindly following its rules about how to manipulate symbols. We also could have used absolutely any symbols for the numbers and operators as long as they were the ones the computer was programmed to manipulate. We are the ones that add the intelligence and give those symbols meanings of numbers and addition and the result of doing an addition.

This is why representations are important – we need to choose a representation for things that makes the symbol manipulation we intend to do easy. We already saw this with images. If we want to send a large image to someone else then a representation of images like run-length encoding that shrinks the amount of data is a good idea.

When designing computers we need to provide them with a representation of numbers so they can manipulate those numbers. We have seen that there are lots of representations we could choose for numbers and any in theory would do, but when we choose a representation of numbers for use to do computation, we want to pick one that makes the operations we are interested in doing easy. Charles Babbage for example chose to use cog-like wheels turned to particular positions to represent numbers as he had worked out how to create a mechanism to do calculation with them. But that is something for another time…


This post was funded by UKRI, through grant EP/K040251/2 held by Professor Ursula Martin, and forms part of a broader project on the development and impact of computing.


Lego Computer Science

Part 1: Lego Computer Science: pixel pictures

Part 2: Lego Computer Science: compression algorithms

Part 3: Lego Computer Science: representing numbers

Lego computer science: compression algorithms

Continuing a series of blogs on what to do with all that lego scattered over the floor: learn some computer science…

A giraffe as a pixel image.
Colour look-up table
Black 0
Blue 1
Yellow 2
Green 3
Brown 4

We saw in the last post how images are stored as pixels – the equivalent of square or round lego blocks of different colours laid out in a grid like a mosaic. By giving each colour a number and drawing out a gird of numbers we give ourself a map to recreate the picture from. Turning that grid of numbers into a list (and knowing the size of the rectangle that is the image) we can store the image as a file of numbers, and send it to someone else to recreate.

Of course, we didn’t really need that grid of numbers at all as it is the list we really need. A different (possibly quicker) way to create the list of numbers is work through the picture a brick at a time, row by row and find a brick of the same colour. Then make a long line of those bricks matching the ones in the lego image, keeping them in the same order as in the image. That long line of bricks is a different representation of the image as a list instead of as a grid. As long as we keep the bricks in order we can regenerate the image. By writing down the number of the colour of each brick we can turn the list of bricks into another representation – the list of numbers. Again the original lego image can be recreated from the numbers.

The image as a list of bricks and numbers
Colour look-up table: Black 0: Blue 1: Yellow 2: Green 3: Brown 4

The trouble with this is for any decent size image it is a long list of numbers – made very obvious by the very long line of lego bricks now covering your living room floor. There is an easy thing to do to make them take less space. Often you will see that there is a run of the same coloured lego bricks in the line. So when putting them out, stack adjacent bricks of the same colour together in a pile, only starting a new pile if the bricks change colour. If eventually we get to more bricks of the original colour, they start their own new pile. This allows the line of bricks to take up far less space on the floor. (We have essentially compressed our image – made it take less storage space, at least here less floor space).

Now when we create the list of numbers (so we can share the image, or pack all the lego away but still be able to recreate the image), we count how many bricks are in each pile. We can then write out a list to represent the numbers something like 7 blue, 1 green, … Of course we can replace the colours by numbers that represent them too using our key that gives a number to each colour (as above).

If we are using 1 to mean blue and the line of bricks starts with a pile of seven black bricks then write down a pair of numbers 7 1 to mean “a pile of seven blue bricks”. If this is followed by 1 green bricks with 3 being used for green then we next write down 1 3, to mean a pile of 1 green bricks and so on. As long as there are lots of runs of bricks (pixels) of the same colour then this will use far less numbers to store than the original:

7 1 1 3 6 1 2 3 1 1 1 2 3 1 2 3 2 2 3 1 2 3 …

We have compressed our image file and it will now be much quicker to send to a friend. The picture can still be rebuilt though as we have not lost any information at all in doing this (it is called a lossless data compression algorithm). The actual algorithm we have been following is called run-length encoding.

Of course, for some images, it may take more not less numbers if the picture changes colour nearly every brick (as in the middle of our giraffe picture). However, as long as there are large patches of similar colours then it will do better.

There are always tweaks you can do to algorithms that may improve the algorithm in some circumstances. For example in the above we jumped back to the start of the row when we got to the end. An alternative would be to snake down the image, working along the adjacent rows in opposite directions. That could improve run-length encoding for some images because patches of colour are likely the same as the row below, so this may allow us to continue some runs. Perhaps you can come up with other ways to make a better image compression algorithm

Run-length encoding is a very simple compression algorithm but it shows how the same information can be stored using a different representation in a way that takes up less space (so can be shared more quickly) – and that is what compression is all about. Other more complex compression algorithms use this algorithm as one element of the full algorithm.

Activities

Make this picture in lego (or colouring in on squared paper or in a spreadsheet if you don’t have the lego). Then convert it to a representation consisting of a line of piles of bricks and then create the compressed numbered list.

An image of a camel to compress: Colour look-up table: Black 0: Blue 1: Yellow 2: Green 3: Brown 4

Make your own lego images, encode and compress them and send the list of numbers to a friend to recreate.


Find more about Lego Art at lego.com.

Find more pixel puzzles (no lego needed, just coloured pens or spreadsheets) at https://teachinglondoncomputing.org/pixel-puzzles/


This post was funded by UKRI, through grant EP/K040251/2 held by Professor Ursula Martin, and forms part of a broader project on the development and impact of computing.

Lego computer science: pixel pictures

by Paul Curzon, Queen Mary University of London

It is now after Christmas. You are stuffed full of turkey, and the floor is covered with lego. It must be time to get back to having some computer science fun, but could the lego help? As we will see you can explore digital images, cryptography, steganography, data compression, models of computing, machine learning and more with lego (and all without getting an expensive robot set which is the more obvious way to learn computer science with lego though you do need lots of lego). Actually you could also do it all with other things that were in your stocking like a bead necklace making set and probably with all that chocolate, too.

First we are going to look at understanding digital images using lego (or beads or …)

Raster images

Digital images come in two types: raster (or bitmap) images and vector images. They are different kinds of image representation. Lego is good for experimenting with the former through pixel puzzles. The idea is to make mosaic-like pictures out of a grid of small coloured lego. Lego have recently introduced a whole line of sets called Lego Art should you want to buy rather amazing versions of this idea, and you can buy an “Art Project” set that gives you all the bits you need to make your own raster images. You can (in theory at least) make it from bits and pieces of normal lego too. You do need quite a lot though.

Raster images are the basic kind of digital image as used by digital cameras. A digital image is split into a regular grid of small squares, called pixels. Each pixel is a different colour.

To do it yourself with normal lego you need, for starters, to collect lots of the small circle or square pieces of different colours. You then need a base to put them on. Either use a flat plate piece if you have one or make a square base of lego pieces that is 16 by 16. Then, filling the base completely with coloured pieces to make a mosaic-like picture. That is all a digital image really is at heart. Each piece of lego is a pixel. Computer images just have very tiny pieces, so tiny that they all merge together.

Here is one of our designs of a ladybird.

A pixel image of a ladybird

The more small squares you have to make the picture, the higher the resolution of the image With only 16 x 16 pixels we have a low resolution image. If you only have enough lego for an 8×8 picture then you have lower resolution images. If you are lucky enough to have a vast supply of lego then you will be able to make higher resolution, so more accurate looking images.

Lego-by-numbers

Computers do not actually store colours (or lego for that matter). Everything is just numbers. So the image is stored in the computer as a grid of numbers. It is only when the image is displayed it is converted to actual colours. How does that work. Well you first of all need a key that maps colours to numbers: 0 for black, 1 for red and so on. The number of colours you have is called the colour depth – the more numbers and linked colours in your key, the higher the colour depth. So the more different coloured lego pieces you were able to collect the larger your colour depth can be. Then you write the numbers out on squared paper with each number corresponding to the colour at that point in your picture. Below is a version for our ladybird…

The number version of our ladybird picture

Now if you know this is a 16×16 picture then you can write it out (so store it) as just a list of numbers, listed one row after another instead: [5,5,4,4,…5,5,0,4,…4,4,7,2] rather than bothering with squared paper. To be really clear you could even make the first two numbers the size of the grid: [16,16,5,5,4,4,…5,5,0,4,…4,4,7,2]

That along with the key is enough to recreate the picture which has to be either agreed in advance or sent as part of the list of numbers.

You can store that list of numbers and then rebuild the picture anytime you wish. That is all computers are doing when they store images where the file storing the numbers is called an image file.

A computer display (or camera display or digital tv for that matter) is just doing the equivalent of building a lego picture from the list of numbers every time it displays an image, or changes an old one for something new. Computers are very fast at doing this and the speed they do so is called the frame rate – how many new pictures or frames they can show every second. If a computer has a frame rate of 50 frames per second, then it as though it can do the equivalent of make a new lego image from scratch 50 times every second! Of course it is a bit easier for a computer as it is just sending instructions to a display to change the colour shown in each pixels position rather than actually putting coloured lego bricks in place.

Sharing Images

Better still you can give that list of numbers to a friend and they will be able to rebuild the picture from their own lego (assuming they have enough lego of the right colours of course). Having shared your list of numbers, you have just done the equivalent of sending an image over the internet from one computer to another. That is all that is happening when images are shared, one computer sends the list of numbers to another computer, allowing it to recreate a copy of the original. You of course still have your original, so have not given up any lego.

So lego can help you understand simple raster computer images, but there is lots more you can learn about computer science with simple lego bricks as we will see…


Find more about Lego Art at lego.com.

Find more pixel puzzles (no lego needed, just coloured pens or spreadsheets) at https://teachinglondoncomputing.org/pixel-puzzles/


This post was funded by UKRI, through grant EP/K040251/2 held by Professor Ursula Martin, and forms part of a broader project on the development and impact of computing.

Tantrix: P=NP?

You can find computer science everywhere – even in popular Solitaire games and puzzles. Most people have played games like Tetris, Battleships, Mastermind, Tantrix and Minesweeper at some point. In fact, all these games have a link to one of the deepest, fundamental problems remaining in Computer Science. They are all linked to a famous equation that is to do with the ultimate limitations of computers.

A 5 tile Tantrix puzzle to solve

The sciences have many iconic equations that represent something fundamental about the world. The most famous is of course Einstein’s E=mc², which even non-scientists have heard of. The most famous equation in computer science is ‘P=NP’. The only trouble is no one has yet proved whether it is true or not! There is even a million dollar prize up for grabs for anyone who does, not to mention great fame!

P=NP boils down to the difference between checking if someone’s answer to a puzzle is correct, as against having to come up with the answer in the first place.

You are so NP!

Computer Scientists call problems where it is easy to check answers ‘NP problems’. Ones where it’s also easy to come up with solutions are ‘P problems’. So P=NP, if it were true, would just mean that all problems that are easy to check are also easy to solve.

Let’s take Tantrix rotation puzzles to see what it is all about. Tantrix is a popular domino-type game using coloured hexagonal pieces like the ones in the image. The idea of a Tantrix rotation puzzle is that you place some tiles randomly on the table in a connected pattern. You are then not allowed to move the position of any piece. All you can do is rotate them on the spot. The problem is to rotate the pieces so that all the coloured lines match where tiles meet – red to red lines, blue to blue lines and so on.

Have a go at the Tantrix rotation puzzle above before you read on.

Easy to check?

A solution is here if you want to check it. In fact you can quickly check any claimed rotation puzzle ‘solution’. All you do (the checking ‘algorithm’) is look at each tile in turn and check each of its edges does match the edge of the tile it touches, if any. If you find a tile edge that doesn’t match then the ‘solution’ isn’t a solution after all. How long would that take to check? With say a 10 piece puzzle there are 10 pieces to check each with 6 edges so in total that is 10×6 = 60 things to check. That wouldn’t take too long. Even with 100 pieces it would be only 600 things to check – 10 minutes if you could check an edge a second. So Tantrix rotation puzzles are NP puzzles – they can be checked quickly.

But can you solve it?

The question is: “Are rotation puzzles P puzzles too?” Can they always be solved quickly if you or a computer were clever enough? You may have found that puzzle easy to solve. It is much harder to come up with a quick way that is guaranteed to solve any rotation puzzle I give you. One way would be to methodically work through every combination of tile rotations to see if it worked. That would take a long time though.

There are 6 positions for the first piece (ways to rotate it), but for each of those 6 positions the second piece could be in 6 positions too … and so on for each other piece. Altogether for a 10-tile puzzle there are 6x6x6x6x6x6x6x6x6x6 (i.e., over 60 million) positions to check looking for a solution (and we might have to check them all). If you could check one position a second, it would take you around 700 days non-stop (no eating or sleeping). That is just for a 10-tile puzzle…now for a 100-tile puzzle – I’ll leave you to work out how long that would take.

It is not what computer scientists call “quick”.

How clever do you have to be?

If P=NP is true it would mean there is a quick way of solving all Tantrix rotation puzzles out there, if only someone were clever enough to think of it. If P=NP is not true then it might just not be possible however clever you are. Trouble is no-one knows if it’s true or not…

Tantrix: Solve one, solve them all

We’ve seen how Tantrix rotation puzzles can show what we mean by the question “Is P=NP?” It turns out Tantrix rotation puzzles are also something called ‘NP-complete’. That means it is one of a special kind of problem.

NP-complete problems are the really hard ones. Some are also really important to be able to solve quickly. For example, suppose you are a taxi driver and wanted your SatNav to be able to quickly work out the fastest circular route that went via a whole series of places you wanted to visit (in any order), getting you back home. Simple as it sounds, it is an NP-complete problem. It can’t be done quickly even by a fast computer. The best that can be done is to come up with a good answer not a perfect one – using what’s known as a ‘heuristic’. There are lots of ways of coming up with such good answers – using Swarm Intelligence is one way, for example. The point is none can guarantee the best answer every time.

NP-completeness is important because if you could come up with a quick algorithm for solving one NP-complete problem, computer scientists know how to convert such an algorithm into one that solves all the other NP problems…P would equal NP…Trouble is no one has so far done it…

Paul Curzon, Queen Mary University of London, 2007

Play Tantrix online at the official site.

Emoticons and Emotions

Emoticons are a simple and easily understandable way to express emotions in writing using letters and punctuation without any special pictures, but why might Japanese emoticons be better than western ones? And can we really trust expressions to tell us about emotions anyway?

African woman smiling 
Image by Tri Le from Pixabay

The trouble with early online message board messages, email and text messages was that it was always more difficult to express subtleties, including intended emotions, than if talking to someone face to face. Jokes were often assumed to be serious and flame wars were the result. So when in 1982 Carnegie Mellon Professor Scott Fahlman suggested the use of the smiley : – ) to indicate a joke in message board messages, a step forward in global peace was probably made. He also suggested that since posts more often than not seemed to be intended as jokes then a sad face : – ( would be more useful to explicitly indicate anything that wasn’t a joke.

He wasn’t actually the first to use punctuation characters to indicate emotions though. The earliest apparently recorded use is in a poem in 1648 by Robert Herrick, an English poet in his poem “To Fortune”.

Tumble me down, and I will sit
Upon my ruins, (smiling yet:)

Whether this was intentional or not is disputed, as punctuation wasn’t consistently used then. Perhaps the poet intended it, perhaps it was just a coincidentall printing error, or perhaps it was a joke inserted by the printers. Either way it is certainly an appropriate use (why not write your own emoticon poem!)

You might think that everyone uses the same emoticons you are familiar with but different cultures use them in different ways. Westerners follow Fahlman’s suggestion putting them on their side. In Japan by contrast they sit the right way up and crucially the emotion is all in the eyes not the mouth which is represented by an underscore. In this style, happiness can be given by (^_^) and T or ; as an indication of crying, can be used for sadness: (T_T) or (;_;). In South Korea, the Korean alphabet is used so a different character set of letter are available (though their symbols are the right way up as with the Japanese version).

Automatically understanding people’s emotions is an important area of research, called sentiment analysis, whether analysing text, faces or other aspects that can be captured. It is amongst other things important for marketeers and advertisers to work out whether people like their products or what issues matter most to people in elections, so it is big business. Anyone who truly cracks it will be rich.

So in reality is the western version or the Eastern version more accurate: are emotions better detected in the shape of the mouth or the eyes? With a smile at least, it turns out that the eyes really give away whether someone is happy or not, not the mouth. When people put on a fake smile their mouth does curve just as with a natural smile. The difference between fake and genuine smiles that really shows if the person is happy is in the eyes. A genuine smile is called a Duchenne smile after Duchenne de Boulogne who in 1862 showed that when people find something actually funny the smile affects the muscles in their eyes. It causes a tell-tale crow’s foot pattern in the skin at the sides of the eyes. Some people can fake a Duchenne too though, so even that is not totally reliable.

As emoticons hint, because emotions are indicated in the eyes as much as in the mouth, sentiment analysis of emotions based on faces needs to focus on the whole face, not just the mouth. However, all may not be what it seems as other research shows that most of the time people do not actually smile at all when genuinely happy. Just like emoticons facial expressions are just a way we tell other people what we want them to think our emotions are, not necessarily our actual emotions. Expressions are not a window into our souls, but a pragmatic way to communicate important information. They probably evolved for the same reason emoticons were invented, to avoid pointless fights. Researchers trying to create software that works out what we really feel, may have their work cut out if their life’s work is to make them genuinely happy.

     ( O . O )
         0

– Paul Curzon, Queen Mary University of London, Summer 2021

Knitters and Coders: separated at birth?

People often say that computers are all around us, but you could still escape your phone and iPod and go out to the park, far away from the nearest circuit board if you wanted to. It’s a lot more difficult to get away from the clutches of computation itself though. For one thing, you’d have to leave your clothes at home. Queen Mary Electronic Engineer Karen Shoop tells us about the code hidden in knitting, and what might happen when computers learn to read it.

Boy with wool hat and jumper on snowy day

If you’re wearing something knitted look closely at it (if it’s a sunny day then put this article away till it gets colder). Notice how the two sides don’t look the same: some parts look like a raised ‘v’ and others like a wave pattern. These are made by the two knitting stitches: knit and purl. With knit you stick the needle through and then behind the knitting; with purl you stick the needle in the other direction, starting behind the knitting and then pointing at the knitter. Expert knitters know that there’s more to knitting than just these two stitches, but we’ll stick to knit and purl. As these stitches are combined, the wool is transformed from a series of waves or ‘v’s into a range of patterns: stretchy stripes (ribs), raised speckles (moss), knots and ropes (cable). It all depends on the number of purls and knits, how they are placed next to each other and how often things are repeated.

Knitters get very proficient at reading knitting patterns, which are just varying combinations of k (knits) and p (purls). So the simplest pattern of all, knitting a square, would look something like:

’30k (30 knit stitches), finish the line, then repeat this 20 times’.

A rib would look like: ‘5k, 5p, then repeat this [a certain number of times], then repeat the line [another number of times]’

To a computer scientist or electronic engineer all this looks rather like computer code or, to be precise, like the way of describing a pattern as a computer program.

How your jumper is like coding

So look again at your knitted hat/jumper/cardi and follow the pattern, seeing how it changes horizontally and vertically. Just as knitters give instructions for this in their knitting pattern, coders do the same when writing computer programs. Specifically programmers use things called regular expressions. They are just a standard way to describe patterns. For example a regular expression might be used to describe what an email address should look like (specifying rules such as that it has one ‘@’ character in the middle of other characters, no full-stops/periods immediately before the @ and so on), what a phone number looks like (digits/numbers, no letters, possibly brackets or hyphens) and now what a knitting pattern looks like (lots of ks and ps). Regular expressions use a special notation to precisely describe what must be included, what might possibly be included, what cannot be, and how many times things should be repeated. If you were going to teach a computer how to read knitting patterns, a regular expression would be just what you need.

Knitting a regular expression

Let’s look at how to write a knitting pattern as a regular expression. Let’s take moss or seed stitch as our example. It repeats a “knit one purl one” pattern for one line. The next line then repeats a “purl one knit one” pattern, so that every knit stitch has a purl beneath it and vice versa. These two lines are repeated for as long as is necessary. How might we write that both concisely and precisely so there is no room for doubt?

In knitting notation (assuming an even number of stitches) it looks like: Row 1: *k1, p1; rep from * Rows 2: *p1, k1; rep from * or Row 1: (K1, P1) rep to end Row 2: (P1, K1) rep to end Repeat these 2 rows for length desired.

piles of woollen clothes

All this is fine … if it’s being read by a human, but to write experimental knitting software the knitting notation we have to use a notation a computer can easily follow: regular expressions fit the bill. Computers do not understand the words we used in our explanation above: words like ‘row’, ‘repeat’, ‘rep’, ‘to’, ‘from’, ‘end’, ‘length’ and ‘desired’, for example. We could either write a program that makes sense of what it all means for the computer, or we could just write knitting patterns for computers in a language they can already do something with: regular expressions. If we wanted to convert from human knitting patterns to regular expressions we would then write a program called a compiler (see Smart translation) that just did the translation.

In a regular expression to give a series of actions we just name them. So kp is the regular expression for one knit stitch followed immediately by one purl. The knitting pattern would then say repeat or rep. In a regular expression we group actions that need to be repeated inside curved brackets, resulting in (kp). To say how many times we need to repeat, curly brackets are used, so kp repeated 10 times looks like this: (kp){10}.

Since the word ‘row’ is not a standard coding word we then use a special character, written, \n, to indicate that a new line (=row) has to start. The full regular expression for the row is then (kp){10}\n. Since the first line was made of kps the following line must be pks, or (pk){10}\n

These two lines have to be repeated a certain number of times themselves, say 20, so they are in turn wrapped up in yet more brackets, producing: ((kp){10}\n(pk){10}\n){20}.

If we want to provide a more general pattern, not fixing the number of kps in a row or the number of rows, the 10 and 20 can be replaced with what are called variables – x and y. They can each stand for any number, so the final regular expression is:

((kp){x}\n(pk){x}\n){y}

How would you describe a rib as a regular expression (remember, that’s the pattern that looks like stretchy stripes)? The regular expression would be ((kp){x}\n){y}.

Regular expressions end up saying exactly the same thing as the standard knitting patterns, but more precisely so that they cannot be misunderstood. Describing knitting patterns in computer code is only the start, though. We can use this to write code that makes new patterns, to find established ones or to alter patterns, like you’d need to do if you were using thicker wool, for example. An undergraduate student at Queen Mary, Hailun Li, who likes knitting, used her knowledge to write an experimental knitting application that lets users enter their own combination of ps and ks and find out what their pattern looks like. She took her hobby and saw how it related to computing.

Look at your woolly jumper again…it’s full of computation!

– Karen Shoop, Queen Mary University of London, Summer 2014

Die another Day? Or How Madonna crashed the Internet

A lone mike under bright stage lights

From the cs4fn archive …

When pop star Madonna took to the stage at Brixton Academy in 2001 for a rare appearance she made Internet history and caused more that a little Internet misery. Her concert performance was webcast; that is it was broadcast real time over the Internet. A record-breaking audience of 9 million tuned in, and that’s where the trouble started…

The Internet’s early career

The Internet started its career as a way of sending text messages between military bases. What was important was that the message got through, even if parts of the network were damaged say, during times of war. The vision was to build a communications system that could not fail; even if individual computers did, the Internet would never crash. The text messages were split up into tiny packets of information and each of these was sent with an address and their position in the message over the wire. Going via a series of computer links it reached its destination a bit like someone sending a car home bit by bit through the post and then rebuilding it. Because it’s split up the different bits can go by different routes.

Express yourself (but be polite please)

To send all these bits of information a set of protocols (ways of communicating between the computers making up the Internet) were devised. When passing on a packet of information the sending machine first asks the receiving machine if it is both there and ready. If it replies yes then the packet is sent. Then, being a polite protocol, the sender asks the receiver if the packets all arrived safely. This way, with the right address, the packets can find the best way to go from A to B. If on the way some of the links in the chain are damaged and don’t reply, the messages can be sent by a different route. Similarly if some of the packets gets lost in transit between links and need to be resent, or packets are delayed in being sent because they have to go by a round about route, the protocol can work round it. It’s just a matter of time before all the packets arrive at the final destination and can be put back in order. With text the time taken to get there doesn’t really matter that much.

The Internet gets into the groove

The problem with live pop videos, like a Madonna concert, is that it’s no use if the last part of the song arrives first, or you have to wait half an hour for the middle chorus to turn up, or the last word in a sentence vanishes. It needs to all arrive in real time. After all, that is how it’s being sung. So to make web casting work there needs to be something different, a new way of sending the packets. It needs to be fast and it needs to deal with lots more packets as video images carry a gigantic amount of data. The solution is to add something new to the Internet, called an overlay network. This sits on top of the normal wiring but behaves very differently.

The Internet turns rock and roll rebel

So the new real time transmission protocol gets a bit rock and roll, and stops being quite so polite. It takes the packets and throws them quickly onto the Internet. If the receiver catches them, fine. If it doesn’t, then so what? The sender is too busy to check like in the old days. It has to keep up with the music! If the packets are kept small, an odd one lost won’t be missed. This overlay network called the Mbone, lets people tune into the transmissions like a TV station. All these packages are being thrown around and if you want to you can join in and pick them up.

Crazy for you

Like dozens of cars

all racing to get through

a tunnel there were traffic jams.

It was Internet gridlock.

The Madonna webcast was one of the first real tests of this new type of approach. She had millions of eager fans, but it was early days for the technology. Most people watching had slow dial-up modems rather than broadband. Also the number of computers making up the links in the Internet were small and of limited power. As more and more people tuned in to watch, more and more packets needed to be sent and more and more of the links started to clog up. Like dozens of cars all racing to get through a tunnel there were traffic jams. Packets that couldn’t get through tried to find other routes to their destination … which also ended up blocked. If they did finally arrive they couldn’t get through onto the viewers PC as the connection was slow, and if they did, very many were too late to be of any use. It was Internet gridlock.

Who’s that girl?

Viewers suffered as the pictures and sound cut in and out. Pictures froze then jumped. Packets arrived well after their use by date, meaning earlier images had been shown missing bits and looking fuzzy. You couldn’t even recognise Madonna on stage. Some researchers found that packets had, for example, passed over seven different networks to reach a PC in a hotel just four miles away. The packets had taken the scenic route round the world, and arrived too late for the party. It wasn’t only the Madonna fans who suffered. The broadcast made use of the underlying wiring of the Internet and it had filled up with millions of frantic Madonna packets. Anyone else trying to use the Internet at the time discovered that it had virtually ground to a halt and was useless. Madonna’s fans had effectively crashed the Internet!

Webcasts in Vogue

Today’s webcasts have moved on tremendously using the lessons learned from the early days of the Madonna Internet crash. Today video is very much a part of the Internet’s day-to-day duties: the speed of the computer links of the Internet and their processing power has increased massively; more homes have broadband so the packets can get to your PC faster; satellite uplinks now allow the network to identify where the traffic jams are and route the data up and over them; extra links are put into the Internet to switch on at busy times; there are now techniques to unnoticeably compress videos down to small numbers of packets, and intelligent algorithms have been developed to reroute data effectively round blocks. We can also now combine the information flowing to the viewers with information coming back from them so allowing interactive webcasts. With the advent of digital television this service is now in our homes and not just on our PC’s.

Living in a material world

It’s because of thousands of scientists working on new and improved technology and software that we can now watch as the housemate’s antics stream live from the Big Brother house, vote from our armchair for our favourite talent show contestant or ‘press red’ and listen to the director’s commentary as we watch our favourite TV show. Like water and electricity the Internet is now an accepted part of our lives. However, as we come up with even more popular TV shows and concerts, strive to improve the quality of sound and pictures, more people upgrade to broadband and more and more video information floods the Internet … will the Internet Die another Day?

Peter W. McOwan and Paul Curzon, Queen Mary University of London, 2006

Read more about women in computing in the cs4fn special issue “The Woman are Here”.

The Emoji Crystal Ball

Fairground fortune tellers claim to be able to tell a lot about you by staring into a crystal ball. They could tell far more about you (that wasn’t made up) by staring at your public social media profile. Even your use of emojis alone gives away something of who you are.

Reflective ball with dots of lights
Image by Hier und jetzt endet leider meine Reise auf Pixabay  from Pixabay

Walid Magdy’s research team at Edinburgh University are interested in how much people unknowingly give away about themselves when they use social media. They have found that it’s possible to work out an awful lot about you from your social media activity. One of their experiments involved exploring emojis. About a fifth of posts on Twitter include emojis, so they wondered if anything could be predicted about people, ignoring what they wrote and just looking at the emojis they used in their tweets. They found that the way people use emojis in twitter posts alone gives away whether they are male or female and their ethnic background.

They started by taking a large number of tweets known to be written by either men or women and stripped out the words, leaving only the emojis they used. They then counted how often each group used different emojis. The differences in use of each emoji seemed to be revealing as there was clearly a different pattern of use overall of each emoji by men and by women. Men and women each use some emojis much more than others.

Next, they used emoji data for some of the people to train a machine learning system (creating what is known as a classifier). The classifier was given all the emojis used by a person and told which were by men and which by women. It built up a detailed pattern of what a man’s emoji profile was like and similarly what a woman’s was like. 

Given a new set of tweets from a single person the classifier could then try to predict man or woman based on whether that profile was closer to the male pattern or closer to the female pattern of emoji use. Walid’s team found their emoji classifier’s predictions were right about 80% of the time – essentially it was as accurate as doing a similar thing based on the words they wrote. When they tried a similar experiment with ethnicity (was the person black, white or of another ethnicity) the predictions were even more accurate getting it right 84% of the time.

A lot can be worked out about you from apparently innocuous information that is publicly available as a result of your social media use. Even emojis give away something of who you are 😦

Paul Curzon, Queen Mary University of London, Spring 2021

– Based on a talk given by Walid Magdy at QMUL, May 2021.

Back (page) to health

Improvements in technology and decision making are transforming the way we look after our health. Here are some more interesting ideas to keep people alive and well.

Woman wearing VR headset looking at the sky.
Image by Pexels from Pixabay

The future is in your poo

You’ve heard of telling a person’s future from reading their tea leaves. Scientists believe an effective way of seeing a town’s future may be in the poo. By looking for infection in the waste at sewerage works it’s possible to get fast and accurate local knowledge of where infection rates are high and where low to feed into decision making tools.

Health advice: Stay in the toilet, Stay safe. Help the NHS.

Virtually breaking quarantine

The game, World of Warcraft, a multi-user dungeon game, helped virologists understand how people might behave in pandemics. The game’s developers released a plague that could be passed between avatars. The game’s contaminated area was quarantined. Rather than dying out, the virus escaped – because people broke into the quarantined areas to gawk, then left taking the virus with them.

Health advice: Your avatar should obey quarantine rules too!

The missing bullet holes

To stay healthy in a war, avoid being hit by a bullet. In World War II, many aircraft returned badly damaged. Abraham Wald studied them to decide where better armour was needed. There were more bullet holes in the fuselage than the engines. Where would you add the armour? Abraham added it where there were no bullet holes. He reasoned that the lack of holes in places like engines on returning planes meant that being hit there brought the plane down. Being hit elsewhere did not kill the pilots as those planes made it home!

Health advice: Dodge bullets by making good decisions …

Cybersick of virtual reality

The AI can detect puke-inducing movement and automatically correct the image.

A problem with virtual reality is that wearing a headset can be so immersive that it makes some people actually sick. This happens if you move about when watching a 3D video that was shot from a single place. Artificial intelligence software has come to the rescue, detecting puke-inducing movement and automatically correcting the image.

Health advice: If no bucket, always keep an AI handy.

Shining light on cancers

Cancer treatments like chemotherapy and radiotherapy make patients ill. Some drugs make cancer sensitive to light, allowing tumours to be killed by painlessly shining light on them instead. Sadly, that’s not easy when cancers are inside the body. A new Japanese solution is an LED chip, based on the technology used by contactless payment cards to provide power from a distance. Surgeons place it under the skin and leave it there. They glue it in place using a sticky protein from the feet of mussels. It shines low-intensity green light on the cancer, shrinking it.

Health advice: Stick a chip to your tumour

Smart sometimes means no gadgets

Being smart about health doesn’t have to be high-tech or even involve drugs. Exercise, for example, can be as effective helping with depression as taking medicine. Being out in nature can help too, so sometimes it’s worth leaving the gadgets behind and just going for a walk to enjoy the beauty of nature.

Health advice: Walk weekly in the woods

Paul Curzon, Queen Mary University of London, Spring 2021

Download Issue 27 of the cs4fn magazine on Smart Health here.

This post and issue 27 of the cs4fn magazine have been funded by EPSRC as part of the PAMBAYESIAN project.