Making sense of squishiness – 3D modelling the natural world

by Paul Curzon, Queen Mary University of London

Look out the window at the human-made world. It’s full of hard, geometric shapes – our buildings, the roads, our cars. They are made of solid things like tarmac, brick and metal that are designed to be rigid and stay that way. The natural world is nothing like that though. Things bend, stretch and squish in response to the forces around them. That provides a whole bunch of fascinating problems for computer scientists like Lourdes Agapito of Queen Mary, University of London to solve.

Computer scientists interested in creating 3-dimensional models of the world have so far mainly concentrated on modelling the hard things. Why? Because they are easier! You can see the results in computer-animated films like Toy Story, and the 3D worlds like Second Life your avatar inhabits. Even the soft things tend to be rigid.

Lourdes works in this general area creating 3D computer models, but she wants to solve the problems of creating them automatically just from the flat images in videos and is specifically interested in things that deform – the squishy things.

Look out the window and watch the world go by. As you watch a woman walk past you have no problem knowing that you are looking at the same person as you were a second ago – even if she becomes partially hidden as she walks behind the post box and turns to post a letter. The sun goes behind a cloud and the scene is suddenly darker. It starts to rain and she opens an umbrella. You can still recognise her as the same object. Your brain is pulling some amazing tricks to make this seem so mundane. Essentially it is creating a model of the world – identifying all the 3-dimensional objects that you see and tracking them over time. If we can do it, why can’t a computer?

Unlike hard surfaces, deformable ones don’t look the same from one still to the next. You don’t just have to worry about changes in lighting, objects being partially hidden, and objects looking different from a different angle. The object itself will be a different shape from one still to the next. That makes it far harder to work out which bits of one image are actually the same as the ones in the next. Lourdes has taken on a seriously hard problem.

Existing vision systems that create 3D objects have made things easier for themselves by using existing models. If a computer already has a model of a cube to compare what it sees with, then spotting a cube in the image stream is much easier than working it out from scratch. That doesn’t really generalise to deformable objects though because they vary too much. Another approach, used by the film industry, is to put highly visible markers on objects so that those markers can be tracked. That doesn’t help if you just want to point a camera out the window at whatever passes by though.

Software from Lourdes’ team creates a model of the human face as it deforms. A looping gif of a man’s face making different expressions next to a cartoon version which copies him. Red dots on his features are mapped to red dots on the cartoon face.

Lourdes’ aim is to be able to point a camera at a deformable object and have a computer vision system create a 3D model simply by analysing the images. No markers, no existing models of what might be there, not even previous films to train it with, just the video itself. So far her team have created a system that can do this in some situations, such as with faces as a person changes their expression. Their next goal is to make their system work for a whole person as they are filmed doing arbitrary things.

It’s the technical challenge that inspires Lourdes the most, though once the problems of deformable objects are solved there are applications of course. One immediately obvious area is in operating theatres. Keyhole surgery is now very common. It involves a surgeon operating remotely, seeing what they are doing by looking at flat video images from a fibre optic probe inside the body of the person being operated on. The image is flat but the inside of the person that the surgeon is trying to make cuts in is 3-dimensional. It would be far less error prone if what the surgeon was looking at was an accurate 3D model built from the video feed rather than just a flat picture. Of course the inside of your body is made of exactly the kind of squishy deformable surfaces that Lourdes is interested in. Get the computer science right and technologies like this will save lives.

At the same time as tackling seriously hard if squishy computer science problems, Lourdes is also a mother of three. A major reason she can fit it all in, as she points out, is that she has a very supportive partner who shares in the childcare. Without him it would be impossible to balance all the work involved in leading a top European research team. It’s also important to get away from work sometimes. Running regularly helps Lourdes cope with the pressures and as we write she is about to run her first half marathon.

Lourdes may or may not be the person who turns her team’s solutions into the applications that in the future save lives in operating theatres, spot suspicious behaviour in CCTV footage or allow film-makers to quickly animate the actions of actors. Whoever does create the applications, we still need people like Lourdes who are just excited about solving the fundamental problems in the first place.


This article was originally published on the CS4FN website in ~2011. You can read more about Women in Computing here.


This blog is funded through EPSRC grant EP/W033615/1.

Recognising (and addressing) bias in facial recognition tech – the Gender Shades Audit #BlackHistoryMonth ^JB

The five shades used for skin tone emojis

Some people have a neurological condition called face blindness (also known as ‘prosopagnosia’) which means that they are unable to recognise people, even those they know well – this can include their own face in the mirror! They only know who someone is once that person starts to speak, and until then they can’t be sure who it is. They can certainly detect faces, but they might struggle to classify them in terms of gender or ethnicity. Most people, though, have an exceptionally good ability to detect and recognise faces, so good in fact that we even detect faces when they’re not actually there – this is called pareidolia – perhaps you see a surprised face in this picture of USB sockets below.

A unit containing four sockets, 2 USB and 2 for a microphone and speakers.
Happy, though surprised, sockets

What if facial recognition technology isn’t as good at recognising faces as it has sometimes been claimed to be? If the technology is being used in the criminal justice system, and gets the identification wrong, this can cause serious problems for people (see Robert Williams’ story in “Facing up to the problems of recognising faces”).

In 2018 Joy Buolamwini and Timnit Gebru shared the results of research they’d done, testing three different commercial facial recognition systems. They found that these systems were much more likely to wrongly classify darker-skinned female faces compared to lighter- or darker-skinned male faces. In other words, the systems were not reliable.

“The findings raise questions about how today’s neural networks, which … (look for) patterns in huge data sets, are trained and evaluated.”

Study finds gender and skin-type bias in commercial artificial-intelligence systems
(11 February 2018) MIT News

The Gender Shades Audit

Facial recognition systems are trained to detect, classify and even recognise faces using a bank of photographs of people. Joy and Timnit examined two banks of images used to train facial recognition systems and found that around 80 per cent of the photos used were of people with lighter coloured skin. 

If the photographs aren’t fairly balanced in terms of having a range of people of different gender and ethnicity then the resulting technologies will inherit that bias too. Effectively the systems here were being trained to recognise light-skinned people.

The Pilot Parliaments Benchmark

They decided to create their own set of images and wanted to ensure that these covered a wide range of skin tones and had an equal mix of men and women (‘gender parity’). They did this by selecting photographs of members of various parliaments around the world which are known to have a reasonably equal mix of men and women, and selected parliaments from countries with predominantly darker skinned people (Rwanda, Senegal and South Africa) and from countries with predominantly lighter-skinned people (Iceland, Finland and Sweden). 

They labelled all the photos according to gender (they did have to make some assumptions based on name and appearance if pronouns weren’t available) and used the Fitzpatrick scale (see Different shades, below) to classify skin tones. The result was a set of photographs labelled as dark male, dark female, light male, light female with a roughly equal mix across all four categories – this time, 53 per cent of the people were light-skinned (male and female).

A composite image showing the range of skin tone classifications with the Fitzpatrick scale on top and the skin tone emojis below.

Different shades

The Fitzpatrick skin tone scale (top) is used by dermatologists (skin specialists) as a way of classifying how someone’s skin responds to ultraviolet light. There are six points on the scale with 1 being the lightest skin and 6 being the darkest. People whose skin tone has a lower Fitzpatrick score are more likely to burn in the sun and not tan, and are also at greater risk of melanoma (skin cancer). People with higher scores have darker skin which is less likely to burn and they have a lower risk of skin cancer. 

Below it is a variation of the Fitzpatrick scale, with five points, which is used to create the skin tone emojis that you’ll find on most messaging apps in addition to the ‘default’ yellow. 

Testing three face recognition systems

Joy and Timnit tested the three commercial face recognition systems against their new database of photographs – a fair test of a wide range of faces that a recognition system might come across – and this is where they found that the systems were less able to correctly identify particular groups of people. The systems were very good at spotting lighter-skinned men and darker-skinned men, but were less able to correctly identify darker-skinned women, and women overall.
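To see what an audit like this involves, here is a minimal sketch in Python of measuring accuracy separately for each group rather than as one overall score. The records below are invented purely for illustration (they are not results from the Gender Shades study), and the function name accuracy_by_group is our own.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return the fraction of correct predictions for each group.

    Each record is (group, true_label, predicted_label).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, true_label, predicted_label in records:
        total[group] += 1
        if predicted_label == true_label:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}

# Made-up toy records, NOT real audit data.
toy_records = [
    ("lighter male", "male", "male"),
    ("lighter male", "male", "male"),
    ("darker female", "female", "male"),
    ("darker female", "female", "female"),
]

scores = accuracy_by_group(toy_records)
print(scores)  # {'lighter male': 1.0, 'darker female': 0.5}
```

A single overall accuracy for these toy records would be 75 per cent, which hides the fact that one group is served much worse than the other. Splitting the score by group is what makes that kind of gap visible.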

These tools, trained on sets of data that had a bias built into them, inherited those biases and this affected how well they worked. Joy and Timnit published the results of their research and it was picked up and discussed in the news as people began to realise the extent of the problem, and what this might mean for the ways in which facial recognition tech is used. 

“An audit of commercial facial-analysis tools found that dark-skinned faces are misclassified at a much higher rate than are faces from any other group. Four years on, the study is shaping research, regulation and commercial practices.”

The unseen Black faces of AI algorithms (19 October 2022) Nature

There is some good news though. The three companies made changes to improve their facial recognition technology systems and several US cities have already banned the use of this tech in criminal investigations, and more cities are calling for it too. People around the world are becoming more aware of the limitations of this type of technology and the harms to which it may be (perhaps unintentionally) put and are calling for better regulation of these systems.

Further reading

Study finds gender and skin-type bias in commercial artificial-intelligence systems (11 February 2018) MIT News
Facial recognition software is biased towards white men, researcher finds (11 February 2018) The Verge
Go read this special Nature issue on racism in science (21 October 2022) The Verge

More technical articles

Joy Buolamwini and Timnit Gebru (2018) Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, Proceedings of Machine Learning Research 81:1-15.
The unseen Black faces of AI algorithms (19 October 2022) Nature News & Views


See more in ‘Celebrating Diversity in Computing’

We have free posters to download and some information about the different people who’ve helped make modern computing what it is today.

Screenshot showing the vibrant blue posters on the left and the muted sepia-toned posters on the right

Or click here: Celebrating diversity in computing



Lego computer science: compression algorithms

Continuing a series of blogs on what to do with all that lego scattered over the floor: learn some computer science…

A giraffe as a pixel image.
Colour look-up table
Black 0
Blue 1
Yellow 2
Green 3
Brown 4

We saw in the last post how images are stored as pixels – the equivalent of square or round lego blocks of different colours laid out in a grid like a mosaic. By giving each colour a number and drawing out a grid of numbers we give ourselves a map to recreate the picture from. Turning that grid of numbers into a list (and knowing the size of the rectangle that is the image) we can store the image as a file of numbers, and send it to someone else to recreate.

Of course, we didn’t really need that grid of numbers at all as it is the list we really need. A different (possibly quicker) way to create the list of numbers is to work through the picture a brick at a time, row by row, and for each one find a spare brick of the same colour. Then make a long line of those bricks matching the ones in the lego image, keeping them in the same order as in the image. That long line of bricks is a different representation of the image as a list instead of as a grid. As long as we keep the bricks in order we can regenerate the image. By writing down the number of the colour of each brick we can turn the list of bricks into another representation – the list of numbers. Again the original lego image can be recreated from the numbers.

The image as a list of bricks and numbers
Colour look-up table: Black 0: Blue 1: Yellow 2: Green 3: Brown 4

The trouble with this is for any decent size image it is a long list of numbers – made very obvious by the very long line of lego bricks now covering your living room floor. There is an easy thing to do to make them take less space. Often you will see that there is a run of the same coloured lego bricks in the line. So when putting them out, stack adjacent bricks of the same colour together in a pile, only starting a new pile if the bricks change colour. If eventually we get to more bricks of the original colour, they start their own new pile. This allows the line of bricks to take up far less space on the floor. (We have essentially compressed our image – made it take less storage space, at least here less floor space).

Now when we create the list of numbers (so we can share the image, or pack all the lego away but still be able to recreate the image), we count how many bricks are in each pile. We can then write out a list of the piles, something like 7 blue, 1 green, … Of course we can replace the colours by numbers that represent them too using our key that gives a number to each colour (as above).

If we are using 1 to mean blue and the line of bricks starts with a pile of seven blue bricks then write down a pair of numbers 7 1 to mean “a pile of seven blue bricks”. If this is followed by 1 green brick, with 3 being used for green, then we next write down 1 3, to mean a pile of 1 green brick, and so on. As long as there are lots of runs of bricks (pixels) of the same colour then this will use far fewer numbers to store than the original:

7 1 1 3 6 1 2 3 1 1 1 2 3 1 2 3 2 2 3 1 2 3 …

We have compressed our image file and it will now be much quicker to send to a friend. The picture can still be rebuilt though as we have not lost any information at all in doing this (it is called a lossless data compression algorithm). The actual algorithm we have been following is called run-length encoding.
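The pile-counting steps above can be sketched in a few lines of Python. This is only a minimal illustration of run-length encoding, not how any particular image format does it, and the function names rle_encode and rle_decode are our own. The line of bricks used is just the start of the example above (seven blue, one green, six blue, two green).

```python
def rle_encode(pixels):
    """Turn a list of colour numbers into a flat list: count, colour, count, colour, ..."""
    encoded = []
    i = 0
    while i < len(pixels):
        colour = pixels[i]
        run = 1
        # Keep extending the pile while the next brick is the same colour.
        while i + run < len(pixels) and pixels[i + run] == colour:
            run += 1
        encoded.extend([run, colour])
        i += run
    return encoded

def rle_decode(encoded):
    """Rebuild the original list of colour numbers from count, colour pairs."""
    pixels = []
    for count, colour in zip(encoded[0::2], encoded[1::2]):
        pixels.extend([colour] * count)
    return pixels

# 1 = blue, 3 = green, as in the colour look-up table.
line_of_bricks = [1] * 7 + [3] + [1] * 6 + [3] * 2
compressed = rle_encode(line_of_bricks)
print(compressed)                                # [7, 1, 1, 3, 6, 1, 2, 3]
print(rle_decode(compressed) == line_of_bricks)  # True
```

Sixteen bricks have become eight numbers, and decoding gives back exactly the same line of bricks, which is what makes the compression lossless.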

Of course, for some images it may take more numbers, not fewer, if the picture changes colour nearly every brick (as in the middle of our giraffe picture). However, as long as there are large patches of the same colour then it will do better.
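A quick way to convince yourself of this is to count the numbers needed for two made-up lines of 16 bricks, one with large patches and one that changes colour at every brick. The sketch below uses Python's itertools.groupby to find the runs, and the two example lines are invented for illustration.

```python
from itertools import groupby

def rle_encode(pixels):
    """Run-length encode a list of colour numbers as count, colour, count, colour, ..."""
    encoded = []
    for colour, run in groupby(pixels):
        encoded.extend([len(list(run)), colour])
    return encoded

patchy = [1] * 12 + [3] * 4   # two large patches, 16 bricks
speckled = [2, 4] * 8         # colour changes at every brick, 16 bricks

patchy_size = len(rle_encode(patchy))
speckled_size = len(rle_encode(speckled))
print(patchy_size)    # 4
print(speckled_size)  # 32
```

The patchy line shrinks from 16 numbers to 4, but the speckled line doubles to 32 because every single brick needs both a count and a colour.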

There are always tweaks you can make to algorithms that may improve them in some circumstances. For example, in the above we jumped back to the start of the next row when we got to the end of a row. An alternative would be to snake down the image, working along adjacent rows in opposite directions. That could improve run-length encoding for some images because the colour at the end of one row is likely to be the same as the colour directly below it, so this may allow us to continue some runs. Perhaps you can come up with other ways to make a better image compression algorithm.
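Here is a small Python sketch comparing the two orders on a made-up 4 by 4 picture. The grid and the function names are our own, chosen only to show the idea; the snake order helps on this picture but will not help on every image.

```python
from itertools import groupby

def count_runs(pixels):
    """Count how many piles (runs of the same colour) a line of bricks needs."""
    return sum(1 for _ in groupby(pixels))

def row_by_row(grid):
    """Read every row left to right, jumping back to the start each time."""
    return [pixel for row in grid for pixel in row]

def snake(grid):
    """Read even rows left to right and odd rows right to left."""
    line = []
    for index, row in enumerate(grid):
        line.extend(row if index % 2 == 0 else reversed(row))
    return line

# Invented example: 1 = blue, 3 = green.
grid = [
    [1, 1, 1, 3],
    [1, 1, 3, 3],
    [1, 3, 3, 3],
    [3, 3, 3, 3],
]

print(count_runs(row_by_row(grid)))  # 6
print(count_runs(snake(grid)))       # 4
```

Fewer runs means fewer pairs of numbers to store. Whoever rebuilds the picture has to know that the snake order was used, so that they lay the bricks back down in the same zigzag.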

Run-length encoding is a very simple compression algorithm but it shows how the same information can be stored using a different representation in a way that takes up less space (so can be shared more quickly) – and that is what compression is all about. Other more complex compression algorithms use this algorithm as one element of the full algorithm.

Activities

Make this picture in lego (or colouring in on squared paper or in a spreadsheet if you don’t have the lego). Then convert it to a representation consisting of a line of piles of bricks and then create the compressed numbered list.

An image of a camel to compress: Colour look-up table: Black 0: Blue 1: Yellow 2: Green 3: Brown 4

Make your own lego images, encode and compress them and send the list of numbers to a friend to recreate.


Find more about Lego Art at lego.com.

Find more pixel puzzles (no lego needed, just coloured pens or spreadsheets) at https://teachinglondoncomputing.org/pixel-puzzles/


This post was funded by UKRI, through grant EP/K040251/2 held by Professor Ursula Martin, and forms part of a broader project on the development and impact of computing.

Lego Computer Science

Part of a series featuring pixel puzzles,
compression algorithms, number representation,
gray code, binary and computation.



Part 1: Lego Computer Science: pixel pictures

Part 2: Lego Computer Science: compression algorithms

Part 3: Lego Computer Science: representing numbers

Part 4: Lego Computer Science: representing numbers using position

Part 5: Lego Computer Science: Gray code

Part 6: Lego Computer Science: Binary

Part 7: Lego Computer Science: What is computation (simple cellular automata)?

Lego computer science: pixel pictures

by Paul Curzon, Queen Mary University of London

It is now after Christmas. You are stuffed full of turkey, and the floor is covered with lego. It must be time to get back to having some computer science fun, but could the lego help? As we will see, you can explore digital images, cryptography, steganography, data compression, models of computing, machine learning and more with lego. You can do it all without getting an expensive robot set, which is the more obvious way to learn computer science with lego, though you do need lots of lego. Actually you could also do it all with other things that were in your stocking like a bead necklace making set and probably with all that chocolate, too.

First we are going to look at understanding digital images using lego (or beads or …)

Raster images

Digital images come in two types: raster (or bitmap) images and vector images. They are different kinds of image representation. Lego is good for experimenting with the former through pixel puzzles. The idea is to make mosaic-like pictures out of a grid of small coloured lego. Lego have recently introduced a whole line of sets called Lego Art should you want to buy rather amazing versions of this idea, and you can buy an “Art Project” set that gives you all the bits you need to make your own raster images. You can (in theory at least) make it from bits and pieces of normal lego too. You do need quite a lot though.

Raster images are the basic kind of digital image as used by digital cameras. A digital image is split into a regular grid of small squares, called pixels. Each pixel is a different colour.

To do it yourself with normal lego you need, for starters, to collect lots of the small circle or square pieces of different colours. You then need a base to put them on. Either use a flat plate piece if you have one or make a square base of lego pieces that is 16 by 16. Then fill the base completely with coloured pieces to make a mosaic-like picture. That is all a digital image really is at heart. Each piece of lego is a pixel. Computer images just have very tiny pieces, so tiny that they all merge together.

Here is one of our designs of a ladybird.

A pixel image of a ladybird

The more small squares you have to make the picture, the higher the resolution of the image. With only 16×16 pixels we have a low resolution image. If you only have enough lego for an 8×8 picture then you have an even lower resolution image. If you are lucky enough to have a vast supply of lego then you will be able to make higher resolution, so more accurate looking, images.

Lego-by-numbers

Computers do not actually store colours (or lego for that matter). Everything is just numbers. So the image is stored in the computer as a grid of numbers. It is only when the image is displayed that it is converted to actual colours. How does that work? Well, you first of all need a key that maps colours to numbers: 0 for black, 1 for red and so on. The number of colours you have is called the colour depth – the more numbers and linked colours in your key, the higher the colour depth. So the more different coloured lego pieces you were able to collect, the larger your colour depth can be. Then you write the numbers out on squared paper with each number corresponding to the colour at that point in your picture. Below is a version for our ladybird…
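In a program, a key like this is just a look-up table. The Python sketch below uses 0 for black and 1 for red as in the text; the third colour (white as 2) and the tiny 3 by 2 picture are our own assumptions added to make the example complete.

```python
# Key mapping colour names to numbers. Black and red follow the text;
# white = 2 is an assumption for this example.
KEY = {"black": 0, "red": 1, "white": 2}

# The number of colours in the key, as "colour depth" is used in this post.
colour_depth = len(KEY)

# A tiny invented picture written as colour names, one list per row.
picture = [
    ["red", "black", "red"],
    ["white", "red", "white"],
]

# Swap every colour name for its number to get the grid of numbers.
numbers = [[KEY[colour] for colour in row] for row in picture]
print(numbers)  # [[1, 0, 1], [2, 1, 2]]

# Reverse the key to turn numbers back into colours when displaying.
NAMES = {number: colour for colour, number in KEY.items()}
rebuilt = [[NAMES[number] for number in row] for row in numbers]
print(rebuilt == picture)  # True
```

The same key has to be used in both directions, otherwise the picture comes back in the wrong colours.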

The number version of our ladybird picture

Now if you know this is a 16×16 picture then you can write it out (so store it) as just a list of numbers, listed one row after another instead: [5,5,4,4,…5,5,0,4,…4,4,7,2] rather than bothering with squared paper. To be really clear you could even make the first two numbers the size of the grid: [16,16,5,5,4,4,…5,5,0,4,…4,4,7,2]
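As a rough sketch, turning a grid into a list with the size at the front, and back again, might look like this in Python. We assume here that the width comes first and the height second (the text does not say which way round), and the small 3 by 2 grid is made up rather than taken from the ladybird.

```python
def grid_to_list(grid):
    """Flatten a grid row after row, with the width and height put first."""
    height = len(grid)
    width = len(grid[0])
    numbers = [width, height]
    for row in grid:
        numbers.extend(row)
    return numbers

def list_to_grid(numbers):
    """Rebuild the grid by cutting the list back into rows of the right width."""
    width, height = numbers[0], numbers[1]
    pixels = numbers[2:]
    return [pixels[row * width:(row + 1) * width] for row in range(height)]

tiny_grid = [
    [5, 5, 4],
    [0, 4, 4],
]

stored = grid_to_list(tiny_grid)
print(stored)                             # [3, 2, 5, 5, 4, 0, 4, 4]
print(list_to_grid(stored) == tiny_grid)  # True
```

Without the two size numbers the receiver could not tell whether six pixels meant a 3 by 2 picture, a 2 by 3 one, or a single row of six.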

That list, along with the key, is enough to recreate the picture. The key has to be either agreed in advance or sent as part of the list of numbers.

You can store that list of numbers and then rebuild the picture anytime you wish. That is all computers are doing when they store images. The file storing the numbers is called an image file.

A computer display (or camera display or digital tv for that matter) is just doing the equivalent of building a lego picture from the list of numbers every time it displays an image, or changes an old one for something new. Computers are very fast at doing this and the speed at which they do so is called the frame rate – how many new pictures or frames they can show every second. If a computer has a frame rate of 50 frames per second, then it is as though it can make a new lego image from scratch 50 times every second! Of course it is a bit easier for a computer as it is just sending instructions to a display to change the colour shown in each pixel’s position rather than actually putting coloured lego bricks in place.
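To get a feel for the numbers, here is the arithmetic for our 16×16 picture at 50 frames per second, written as a few lines of Python. It assumes, for simplicity, that every pixel is set again for every frame.

```python
width, height = 16, 16
frames_per_second = 50

# Bricks (pixels) to place for one complete picture.
pixels_per_frame = width * height

# Bricks to place every second if the whole picture is rebuilt each frame.
pixel_updates_per_second = pixels_per_frame * frames_per_second

print(pixels_per_frame)          # 256
print(pixel_updates_per_second)  # 12800
```

That is 12,800 bricks placed every second, and real displays have far more pixels than a 16×16 lego mosaic.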

Sharing Images

Better still, you can give that list of numbers to a friend and they will be able to rebuild the picture from their own lego (assuming they have enough lego of the right colours of course). Having shared your list of numbers, you have just done the equivalent of sending an image over the internet from one computer to another. That is all that is happening when images are shared: one computer sends the list of numbers to another computer, allowing it to recreate a copy of the original. You of course still have your original, so have not given up any lego.

So lego can help you understand simple raster computer images, but there is lots more you can learn about computer science with simple lego bricks as we will see…


