Lego computer science: Gray code

Continuing a series of blogs on what to do with all that lego scattered over the floor: learn some computer science…how might we represent numbers using only two symbols?

We build numbers out of 10 different symbols: our digits 0-9. Charles Babbage’s Victorian computer design represented numbers using the same decimal system (see Part 4: Lego Computer Science: representing numbers using position). That was probably an obvious choice for Babbage, but as we have already seen, there are lots of different ways numbers could be represented.

Modern computers use a different representation. The reason is they are based on a technology of electrical signals that are either there or not, switches that are on or off. Those two choices are used as two symbols to represent data. It is as though all data will be built of two lego coloured blocks: red and blue, say.

A naive way of using two different symbols (red and blue blocks) to represent numbers.

How might that then be done? There are still lots of ways that could be chosen.

Count the red blocks

One really obvious way would be to pick one of the two colours of brick (say red) to mean 1. To represent a number like 2, you would then use 2 blocks of that colour, filling the other spaces allocated for the number with the other colour. So if you were representing numbers with storage space for three blocks, two of them would be red and one would be blue for the number 2. All would be red for the number 3.

This is actually just a variation of unary, which we saw earlier, but with a fixed amount of storage. It isn’t a very good representation: you need lots of storage space to represent large numbers because it does not use all possible combinations of the two symbols. In the above example, 3 places are available on the lego base to put the blocks we are using and we have been able to represent only 4 different numbers (0 to 3). However, information theory tells us we should be able to store up to 8 different numbers (2 × 2 × 2) in that space, given two symbols and the right representation.
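To make the comparison concrete, here is a small Python sketch. The function names, the letters R and B for red and blue, and the choice to put the red blocks on the left are all my own assumptions for illustration.

```python
from itertools import product

def count_the_reds(number, places=3):
    """Represent a number by that many red blocks, padded out with blue."""
    if not 0 <= number <= places:
        raise ValueError("number does not fit in the storage")
    # Assumption: reds go on the left, blues fill the remaining places.
    return "R" * number + "B" * (places - number)

def all_patterns(places=3):
    """Every possible row of red and blue blocks of the given length."""
    return ["".join(p) for p in product("BR", repeat=places)]

# The counting scheme only ever uses 4 of the rows...
print([count_the_reds(n) for n in range(4)])
# ...but 8 different rows can be built from three blocks.
print(len(all_patterns()))
```

With three places the counting scheme reaches only 0 to 3, while the full list of patterns has room for 8 different numbers.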

A random code for numbers

How do we use all 8 possibilities? Just list every possible pattern of red and blue blocks and allocate a different number to each pattern. Here is one random way of doing it.

A code for numbers chosen at random

Having a random allocation of patterns to numbers isn’t a very good representation though as it doesn’t even let us count easily. There is no natural order. There is no simple way to know what comes next other than learning the sequence. It also doesn’t easily expand to larger numbers. A good representation is one that makes the operations we are trying to do easy. This doesn’t.
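As a rough sketch of what a random allocation looks like in code, the Python below shuffles the 8 patterns and hands them out to the numbers 0 to 7. The function name, the R and B letters and the seed are assumptions of mine, and the result will not match the particular code in the picture.

```python
import random
from itertools import product

def random_code(places=3, seed=None):
    """Allocate the numbers 0, 1, 2, ... to block patterns in a shuffled order."""
    patterns = ["".join(p) for p in product("BR", repeat=places)]
    rng = random.Random(seed)
    rng.shuffle(patterns)
    # patterns[n] is the row of blocks that stands for the number n.
    return patterns

code = random_code(seed=1)
for number, pattern in enumerate(code):
    print(number, pattern)
```

Notice that the only way to find out what comes after a given pattern is to look it up in the table: nothing about the pattern itself tells you.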

Gray Code

Before we get to the actual binary representation computers use, another representation of numbers has been used in the past that isn’t just random. Called Gray code, it is a good example of choosing a representation to make a specific task easier. In particular, it is really good if you want to create an electronic gadget that counts through a sequence.

Also called a reflected binary code, Gray code is a sequence where you change only one bit (so the colour of one lego block) at a time as you move to the next number.
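The name "reflected" hints at one standard way to build such a sequence: take the sequence you have, put a 0 in front of each pattern, then add a mirror image of the sequence with a 1 in front. Here is a minimal Python sketch of that construction. The function names are mine, and I write patterns as strings of 0s and 1s rather than colours, so which colour plays which role is left open.

```python
def gray_code(places):
    """Build a reflected Gray code sequence as strings of 0s and 1s."""
    codes = [""]
    for _ in range(places):
        # Keep the current list with a 0 in front, then add its mirror
        # image (the list reversed) with a 1 in front.
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes

def blocks_changed(a, b):
    """How many positions differ between two patterns."""
    return sum(x != y for x, y in zip(a, b))

sequence = gray_code(3)
print(sequence)
# ['000', '001', '011', '010', '110', '111', '101', '100']

# Each step in the sequence changes exactly one position.
print(all(blocks_changed(a, b) == 1 for a, b in zip(sequence, sequence[1:])))
```

The last pattern also differs from the first in only one position, so a counter using this sequence can wrap round to the start without changing more than one block.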

If you are creating an electronic circuit to count, perhaps as an actual counter or just to step through different states of a device (eg cycling through different modes like stopwatch, countdown timer, normal watch), then numbers would essentially be represented by electronic switches being on or off. A difficulty with this is that it is highly unlikely that two switches would change at exactly the same time. If you have a representation like our random one above, or actual binary, to move between some numbers you have to change lots of digits.

You can see the problem with lego. For example, to move from 0 to 1 in our sequence above you have to change all three lego blocks for new ones of the other colour. Similarly, to go from 1 to 2 you need to change two blocks. Now, if you swap one block from the number first and then the other, there is a point in time when you actually have a different (so wrong) number! To change the number 1 to 2, for example, we must swap the first and third bricks. Suppose we swap the first brick first and then the third brick. For a short time we are actually holding the number 3. Only when we change the last brick do we get to the real next number 2. We have actually counted 1, 3, 2, not 1, 2 as we wanted to. We have briefly been in the wrong state, which could trigger the electronics to do things associated with that state we do not want (like display the wrong number in a counter).

Mistaken counting using our random representation. To get from 1 to 2 we need to swap the first and third brick. If we change the first brick first, there is a brief time when our number has become three, before the third brick is changed. We have counted 1, 3, 2 by mistake.
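The same glitch can be reproduced in a few lines of Python. Since the random code lives only in the picture, this sketch borrows the ordinary binary patterns for 1 and 2 (001 and 010) as a stand-in: that is my assumption, as are the function name and the choice to change positions from left to right.

```python
def steps_between(start, target):
    """Change one differing position at a time, left to right,
    recording every pattern we pass through on the way."""
    current = list(start)
    seen = [start]
    for i, wanted in enumerate(target):
        if current[i] != wanted:
            current[i] = wanted
            seen.append("".join(current))
    return seen

# Binary 1 is 001 and binary 2 is 010: two positions must change.
path = steps_between("001", "010")
print(path)                        # ['001', '011', '010']
print([int(p, 2) for p in path])   # [1, 3, 2]

# When only one position differs there is no in-between pattern at all.
print(steps_between("001", "011"))  # ['001', '011']
```

Whenever two or more positions have to change, some wrong pattern appears in between, whichever order the changes happen in.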

Just as it is hard to swap several blocks at precisely the same time, electronic switches do not switch at exactly the same time, meaning that our gadget could end up doing the wrong thing, because it briefly jumps to the wrong state. This led to the idea of having a representation that used a sequence of numbers where only one bit of the number needs to be changed to get to the next number.

A Gray code in lego bricks. To move from one number in the sequence to the next, you only need to change one lego brick.

There are lots of ways to do this and the version above is the one introduced by physicist Frank Gray. Gray codes of this kind have been used in all sorts of situations: a Gray code sequence was used to represent characters in Émile Baudot’s telegraph communication system, for example. More recently they have been used to make it easier to correct errors in streams of data in digital TV.

Computers do not need to worry about this timing problem of when things change as they use clocks to determine when values are valid. Data is only read when the tick of the clock signal says it is safe to. This is slower, but gives time for all the digital switches to settle into their final state before the values are read, meaning faulty intermediate values are ignored. That means computers are free to use other representations of numbers and in particular use a binary system equivalent to our decimal system. That is important as while Gray code is good for counting, and stepping through states, amongst other things, it is not very convenient for doing more complicated arithmetic.
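One way to see why arithmetic is awkward: a common approach to adding two Gray coded numbers is to convert them to ordinary numbers, add, and convert back. The Python sketch below assumes the reflected Gray code and uses the well-known exclusive-or conversion. The function names are mine, and Python's own integers stand in for the binary numbers a computer would use.

```python
def to_gray(n):
    """Reflected Gray code pattern (as an integer) for the number n."""
    return n ^ (n >> 1)

def from_gray(g):
    """Recover the ordinary number from its Gray code pattern."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def add_gray(a, b):
    """Add two Gray coded values by decoding, adding and re-encoding."""
    return to_gray(from_gray(a) + from_gray(b))

# The first eight patterns, written out as three bits each.
print([format(to_gray(n), "03b") for n in range(8)])
# ['000', '001', '011', '010', '110', '111', '101', '100']

print(from_gray(add_gray(to_gray(2), to_gray(3))))  # 5
```

Counting up by one is a single block change in Gray code, but even a simple sum here takes two conversions around the actual addition.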


This post was funded by UKRI, through grant EP/K040251/2 held by Professor Ursula Martin, and forms part of a broader project on the development and impact of computing.



Lego Computer Science

Part 1: Lego Computer Science: pixel pictures

Part 2: Lego Computer Science: compression algorithms

Part 3: Lego Computer Science: representing numbers

Part 4: Lego Computer Science: representing numbers using position

Part 5: Lego Computer Science: Gray code

Lego computer science: representing numbers using position

Numbers represented with different sized common blocks

Continuing a series of blogs on what to do with all that lego scattered over the floor: learn some computer science…how do we represent numbers and how is it related to the representation Charles Babbage used in his design for a Victorian steam-powered computer?

We’ve seen there are lots of ways that human societies have represented numbers and that there are many ways we could represent numbers even just using lego. Computers store numbers using a different representation again called binary. Before we get to that though we need to understand how we represent bigger numbers ourselves and why it is so useful.

Numbers represented as colours.

Our number system was invented in India sometime before the 4th century. It then spread, including to the west, via Muslim scholars in Persia by the 9th century, so is called the Hindu-Arabic numeral system. Its most famous advocate was Muḥammad ibn Mūsā al-Khwārizmī. The word algorithm comes from the Latin version of his name because of his book on algorithms for doing arithmetic with Hindu-Arabic numbers.

The really clever thing about it is the core idea that a digit can have a different value depending on its position. In the number 555, for example, the digit 5 is representing the number five hundred, the number fifty and the number five. Those three numbers are added together to give the actual number being represented. Digits in the ‘ones’ column keep their value, those in the ‘tens’ column are worth ten times the digit, those in the ‘hundreds’ column a hundred times the digit, and so on. This was revolutionary, differing from most previous systems where a different symbol was used for each bigger number, and each symbol always meant the same thing. For example, in Roman numerals X is used to mean 10 and always means 10 wherever it occurs in a number. This kind of positional system wasn’t totally unique as the Babylonians had used a less sophisticated version and Archimedes also came up with a similar idea, though these systems didn’t get used elsewhere.
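The 555 example can be worked through in a short Python sketch. The function name is my own choice for illustration.

```python
def place_values(digits, base=10):
    """Given a string of digits like '555', return what each digit
    contributes once its position is taken into account."""
    n = len(digits)
    return [int(d) * base ** (n - 1 - i) for i, d in enumerate(digits)]

print(place_values("555"))       # [500, 50, 5]
print(sum(place_values("555")))  # 555
```

The same digit 5 contributes three different amounts, and adding the contributions together gives back the number.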

In the lego representations of numbers we have seen so far, to represent big numbers we would need ever more coloured blocks, or ever more different kinds of brick or ever bigger piles of bricks, to give a representation of those bigger numbers. It just doesn’t scale. However, this idea of position-valued numbers can be applied whatever the representation of digits used, not just with digits 0 to 9. So we can use the place number system to represent ever bigger numbers using our different versions of the way digits could be represented in lego. We only need symbols for the different digits, not for every number, or for every bigger number.

For example, if we have ten different colours of bricks to represent the 10 digits of our decimal system, we can build any number by just placing them in the right position, placing coloured bricks on a base piece.

The number 2301 represented in coloured blocks where black represents 0, red represents 1, blue represents 2 and where yellow represents 3
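Reading such a row of bricks back as a number can be sketched in Python. The colour table below uses only the four colours named in the caption, and the names in the code are hypothetical, chosen just for this example.

```python
# Only the four colours from the caption; a full set would need ten.
COLOUR_TO_DIGIT = {"black": 0, "red": 1, "blue": 2, "yellow": 3}

def bricks_to_number(bricks):
    """Read a row of coloured bricks, left to right, as a decimal number."""
    value = 0
    for colour in bricks:
        # Moving one place right makes everything so far worth ten times more.
        value = value * 10 + COLOUR_TO_DIGIT[colour]
    return value

print(bricks_to_number(["blue", "yellow", "black", "red"]))  # 2301
```

Swapping the colour table for a different set of bricks changes how digits look, but the positional rule in the loop stays exactly the same.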

Numbers could be variable sized or fixed size. If as above we have a base plate, and so storage space, for four digits then we can’t represent larger numbers than 9999. This is what happens with the way computers store numbers. A fixed amount of space is allocated for each number in the computer’s memory, and if a number needs more digits then we get an “overflow error” as it can’t be stored. Rockets worth millions of pounds have exploded on take-off in the past because a programmer made the mistake of trying to store numbers too big for the space allocated for them. If we want bigger numbers, we need a representation (and algorithms) that extend the size of the number if we run out of space. In lego that means our algorithm for dealing with numbers would have to include extending the grey base plate by adding a new piece when needed (and removing it when no longer needed). That then would allow us to add new digits.

Unlike when we write numbers, where we write just as many digits as we need, with fixed-sized numbers like this, we need to add zeros at the front to fill the space. There is no such thing as an empty piece of storage in a computer. Something is always there! So the number 123 is actually stored as 0123 in a fixed 4-digit representation like our lego one.
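Here is a minimal Python sketch of a fixed 4-digit store like the lego base plate, with zeros filling the unused places and an error when a number is too big. The function name is mine, and raising Python's built-in OverflowError is simply my stand-in for the overflow error described earlier.

```python
def store(number, places=4):
    """Write a number into a fixed number of decimal places."""
    if number < 0 or number >= 10 ** places:
        raise OverflowError(f"{number} does not fit in {places} digits")
    # zfill pads the front with zeros so every place holds a digit.
    return str(number).zfill(places)

print(store(123))   # 0123
print(store(9999))  # 9999

try:
    store(10000)
except OverflowError as err:
    print("overflow:", err)
```

A representation that can grow would, instead of raising the error, add an extra place, which is the equivalent of clipping on another piece of base plate.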

The number 321 represented in coloured blocks where space is allocated for 4 digits as 0321: black represents 0, red represents 1, blue represents 2 and where yellow represents 3

Charles Babbage made use of this idea when inventing his Victorian machines for doing computation: had they been built, they would have been the first computers. Driven by steam power, his difference engine and analytical engine were to have digits represented by wheels with the numbers 0-9 written round the edge, linked to the positions of cog-like teeth that turned them.

Wheels were to be stacked on top of each other to represent larger numbers in a vertical rather than horizontal position system. The equivalent lego version to Babbage’s would therefore not have blocks on a base plate but blocks stacked on top of each other.

The number 321 represented vertically in coloured blocks where space is allocated for 4 digits as 0321: black represents 0, red represents 1, blue represents 2 and where yellow represents 3

In Babbage’s machines different numbers were represented by their own column of wheels. He envisioned the analytical engine to have a room sized data store full of such columns of wheels.

Numbers stored as columns of wheels on the replica of Babbage’s Difference Engine at the Science Museum London. Carsten Ullrich: CC-BY-SA-2.5. From wikimedia.

So Babbage’s idea was just to use our decimal system with digits represented with wheels. Modern computers instead use binary … but that is for next time.
