How far can you hear? Modelling distant birdsong.


by Dan Stowell, Queen Mary University of London

Sunrise blackbird image by No-longer-here from Pixabay

How do we know how many birds there are out there: in the countryside, and in the city? Usually, it’s because people have been sent out to count the birds – by sight but especially by sound. Often you can hear a bird singing even when it’s hidden from sight, so listening can be a much more effective way of counting.

In the UK, volunteers have been out counting birds for decades, co-ordinated by organisations such as the British Trust for Ornithology (BTO). But pretty quickly they came up against a problem: you can’t always detect every bird around you, even if you’re an expert at it. Birds get harder to detect the further away they are. To come up with good numbers, the BTO estimates what fraction of the birds you are likely to miss, according to how far away you are, and uses that to improve the estimate from the volunteer surveys.

But, Alison Johnston and others at the BTO noticed that it’s even more complicated than that: you can hear some types of bird very clearly over a long distance, while other birds make a sound that disappears into the background easily. If a pigeon is cooing in the forest, maybe you can’t hear it beyond a few metres. Whereas the twit-twoo of an owl might carry much further. So they measured how likely it is that one of their volunteers will hear each species, at different distances.

They created mathematical models that take these factors into account. Implemented as programs, the models can then adjust the reports coming in from the volunteers doing the counting. This is how volunteers and computers are combined in ‘citizen science’ work, which gathers observations from people all around the country. Sightings and numbers are collected, but the raw numbers themselves don’t give you the correct picture – they need to be adjusted using mathematical models that help fill in the gaps.
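The adjustment idea can be sketched in a few lines of code. This is only an illustration, not the BTO’s actual model: it uses a half-normal ‘detection function’ (a standard choice in distance-sampling statistics), and the distances and `effective_radius_m` values below are made up.

```python
import math

def detection_probability(distance_m, effective_radius_m):
    """Half-normal detection function: detection is certain up close
    and falls off smoothly with distance. effective_radius_m is larger
    for loud, far-carrying species (an owl) than quiet ones (a pigeon)."""
    return math.exp(-(distance_m ** 2) / (2 * effective_radius_m ** 2))

def adjusted_count(raw_count, distance_m, effective_radius_m):
    """Scale a raw count up to allow for the birds the volunteer missed."""
    return raw_count / detection_probability(distance_m, effective_radius_m)

# Hypothetical survey: 5 pigeons heard at 50 m, species audible to ~40 m.
estimate = adjusted_count(5, 50, 40)   # roughly 11 birds really present
```

The same raw count of a far-carrying species would be scaled up much less, which is exactly the species-by-species correction the BTO researchers were after.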


You can perfect your own recognition of British birdsong with the audio clips here.


This article was first published on the original CS4FN site and a copy is available on page 14 of Issue 21 (Computing Sounds Wild) of the CS4FN magazine. You can download a free PDF copy below along with all of our free material from our downloads site.




EPSRC supports this blog through research grant EP/W033615/1.

Threads & Yarns – textiles and electronics

by Paul Curzon, Queen Mary University of London, from June 2011

At first sight, nothing could be more different than textiles and electronics. Put opposites together, though, and you can maybe even bring historical yarns to life. That’s what Queen Mary’s G.Hack team helped do. They are an all-woman group of electronic engineering and computer science research students, and they helped build an interactive art installation combining textiles and personal stories about health.

In June 2011 the G.Hack team was asked by Jo Morrison and Rebecca Hoyes from Central Saint Martins College of Art and Design to help make their ‘Threads & Yarns‘ artwork interactive. It was commissioned by the Wellcome Trust as a part of their 75th Anniversary celebrations. They wanted to present personal accounts about the changes that have taken place in health and well-being over the 75 years since they were founded.

A flower from a Threads and Yarns event Photo credit: Jo Morrison

3D models in motion

by Paul Curzon, Queen Mary University of London
based on a 2016 talk by Lourdes Agapito

The cave paintings in Lascaux, France are early examples of human culture from 15,000 BC. There are images of running animals and even primitive stop motion sequences – a single animal painted over and over as it moves. Even then, humans were intrigued with the idea of capturing the world in motion! Computer scientist Lourdes Agapito is also captivated by moving images. She is investigating whether it’s possible to create algorithms that allow machines to make sense of the moving world around them just like we do. Over the last 10 years her team have shown, rather spectacularly, that the answer is yes.

People have been working on this problem for years, not least because the techniques are behind the amazing realism of CGI characters in blockbuster movies. When we see the world, somehow our brain turns all that information about colour and intensity of light hitting our eyes into a scene we make sense of – we can pick out different objects and tell which are in front and which behind, for example. In the 1950s psychophysics* researcher Gunnar Johansson showed how our brain does this. He dressed people in black with lightbulbs fastened around their bodies. He then filmed them walking, cycling, doing press-ups, climbing a ladder, all in the dark … with only the lightbulbs visible. He found that people watching the films could still tell exactly what they were seeing, despite the limited information. They could even tell apart two people dancing together, including who was in front and who behind. This showed that we can reconstruct 3D objects from even the most limited of 2D information when it involves motion. We can keep track of a knee, and see it as the same point as it moves around. It also shows that we use lots of ‘prior’ information – knowledge of how the world works – to fill in the gaps.

A motion capture system, image credit: T-tus at English Wikipedia

Shortcuts

Film-makers already create 3D versions of actors, but they use shortcuts. The first shortcut makes it easier to track specific points on an actor over time. You fix highly visible stickers (equivalent to Johansson’s light bulbs) all over the actor. These give the algorithms clear points to track. This is a bit of a pain for the actors, though. It also could never be used to make sense of random YouTube or CCTV footage, or whatever a robot is looking at.

The second shortcut is to surround the action with cameras so it’s seen from lots of angles. That makes it easier to track motion in 3D space, by linking up the points. Again this is fine for a movie set, but in other situations it’s impractical.

A third shortcut is to create a computer model of an object in advance. If you are going to be filming an elephant, then hand-create a 3D model of a generic elephant first, giving the algorithms something to match. Need to track a banana? Then create a model of a banana instead. This is fine when you have time to create models for anything you might want your computer to spot.

It is all possible for big budget film studios, if a bit inconvenient, but it’s totally impractical anywhere else.

No Shortcuts

Lourdes took on a bigger challenge than the film industry. She decided to do it without the shortcuts: to create moving 3D models from single cameras, applied to any traditional 2D footage, with no pre-placed stickers or fixed models created in advance.

When she started, a dozen or so years ago, making any progress looked incredibly difficult. Now she has largely solved the problem. Her team’s algorithms are even close to doing it all in real time, making sense of the world as it happens, just like us. They are able to make really accurate models, down to details like the subtle movements of a person’s face as they talk and change expression.

There are several secrets to their success, but Johansson’s revelation that we rely on prior knowledge is key. One of the first breakthroughs was to come up with ways that individual points in the scene like the tip of a person’s nose could be tracked from one frame of video to the next. Doing this well relies on making good use of prior information about the world. For example, points on a surface are usually well-behaved in that they move together. That can be used to guess where a point might be in the next frame, given where others are.
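That ‘points move together’ prior can be sketched very simply. This is a toy illustration, not Agapito’s actual algorithm: it guesses where a point will be in the next frame by applying the average displacement of its already-tracked neighbours, on the assumption that they lie on the same surface.

```python
def predict_next_position(point, neighbours_prev, neighbours_next):
    """Predict a point's position in the next frame using the prior that
    points on the same surface move together: apply the average
    displacement of its already-tracked neighbours."""
    n = len(neighbours_prev)
    dx = sum(nx - px for (px, _), (nx, _) in zip(neighbours_prev, neighbours_next)) / n
    dy = sum(ny - py for (_, py), (_, ny) in zip(neighbours_prev, neighbours_next)) / n
    return (point[0] + dx, point[1] + dy)

# Three neighbours all drifted 2 pixels right and 1 pixel down...
prev = [(10, 10), (12, 11), (11, 13)]
nxt = [(12, 11), (14, 12), (13, 14)]
guess = predict_next_position((11, 11), prev, nxt)  # ...so expect (13, 12)
```

A real tracker would use this guess only as a starting point, then refine it by matching the image around the predicted position.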

The next challenge was to reconstruct all the pixels rather than just a few easy-to-identify points like the tip of a nose. This takes more processing power but can be done by lots of processors working on different parts of the problem. Key to this was to take account of the smoothness of objects. Essentially a virtual fine 3D mesh is stuck over the object – like a mask over a face – and the mesh is tracked. You can then even stick new stuff on top of the mesh so they move together – adding a moustache, or painting the face with a flag, for example, in a way that changes naturally in the video as the face moves.

Once this could all be done, if slowly, the challenge was to increase the speed and accuracy. Using the right prior information was again what mattered. For example, rather than assuming points have constant brightness, it helped to take account of the fact that brightness changes, especially on flexible things like mouths. Another innovation was to separate the effect of colour from that of light and shade.

There is lots more to do, but already the moving 3D models created from YouTube videos are very realistic, and being processed almost as they happen. This opens up amazing opportunities for robots; augmented reality that mixes reality with the virtual world; games; telemedicine; security applications; and lots more. It’s all been done a little at a time: taking an impossible-seeming problem and, instead of tackling it all at once, solving simpler versions. All the small improvements, combined with using the right information about how the world works, have built over the years into something really special.

*psychophysics is the “subfield of psychology devoted to the study of physical stimuli and their interaction with sensory systems.”


This article was first published on the original CS4FN website and a copy appears on pages 14 and 15 in “The women are (still) here”, the 23rd issue of the CS4FN magazine. You can download a free PDF copy by clicking on the magazine’s cover below, along with all of our free material.

Another article on 3D research is Making sense of squishiness – 3D modelling the natural world (21 November 2022).





Frequency Analysis for Fun

by Paul Curzon, Queen Mary University of London

Frequency analysis, a technique beloved by spies for centuries – and one that led to the execution of at least one Queen – also played a part in the development of the game Scrabble, over a hundred million copies of which have been sold worldwide.

Frequency analysis was invented by Al-Kindi, a 9th-century Arab Muslim scholar, as a way of cracking codes. He originally described it in his “A Manuscript on Deciphering Cryptographic Messages”. Frequency analysis just involves taking a large amount of normal text written in the language of interest and counting how often each letter appears. For example, in English the letter E is the most common. With simple kinds of cyphers that is enough information to crack them: just count the frequency of the letters in the code you want to crack. Now large numbers of everyday people do frequency analysis just for fun, solving Cross Reference puzzles.
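Counting letter frequencies takes only a few lines of code in a modern language. Here is a minimal sketch in Python; the sample sentence is just for illustration, since real frequency analysis needs a much larger body of text before the counts settle towards the familiar English pattern.

```python
from collections import Counter

def letter_frequencies(text):
    """Count how often each letter appears, ignoring case and punctuation."""
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    return {letter: n / len(letters) for letter, n in counts.most_common()}

# A toy sample -- far too small for reliable codebreaking, but it shows the idea.
freqs = letter_frequencies("The quick brown fox jumps over the lazy dog, again and again.")
most_common = max(freqs, key=freqs.get)
```

To crack a simple substitution cypher you would run the same count over the coded message, then pair its most frequent symbols with the most frequent letters of the language.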

The link between frequency analysis and puzzles goes back earlier. When the British were looking for potential code breakers to staff their secret code-breaking establishment at Bletchley Park in World War II, they needed people with frequency analysis and problem-solving skills. They found them by setting up crossword competitions and offering the fastest solvers jobs at Bletchley: possibly the earliest talent competition with career-changing prizes!

Earlier still, in the 1930s, architect Alfred Mosher Butts hit on the idea of a new game that combined crosswords and anagrams, which were both popular at the time. The result was Scrabble. When designing the game, however, he had a problem: he needed to decide how many of each letter the game should have, and also how to assign the scores. He turned to frequency analysis of the front page of the New York Times for the answers. He broke the pattern of his frequency analysis in one place though, including fewer letter Ss (one of the most common letters in English) than there should be, so that plurals didn’t make the game too easy.
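Butts worked by hand, but the spirit of his method can be sketched like this. The frequencies and tile numbers below are invented for illustration and are not the real Scrabble distribution.

```python
def allocate_tiles(freqs, total_tiles=100):
    """Give each letter a share of the tile pool proportional to its
    frequency in sample text (illustrative, not the real Scrabble counts)."""
    tiles = {letter: max(1, round(f * total_tiles)) for letter, f in freqs.items()}
    # Butts' hand-tweak: cap S so plurals don't make the game too easy.
    if tiles.get('s', 0) > 4:
        tiles['s'] = 4
    return tiles

# Hypothetical frequencies from some sample of newspaper text.
sample_freqs = {'e': 0.13, 's': 0.09, 'a': 0.08, 'q': 0.001}
tiles = allocate_tiles(sample_freqs)   # common letters get many tiles, rare ones get one
```

Scoring works the other way round: the rarer the letter in the frequency count, the more points it is worth.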

Sherlock Holmes, of course, was a master of frequency analysis, as described in the 1903 story “The Adventure of the Dancing Men”. Sir Arthur Conan Doyle wasn’t the first author to use it as a plot device, though. Edgar Allan Poe had based a short story called “The Gold-Bug” around frequency analysis in 1843. It was Poe who originally popularised frequency analysis with the general public rather than just with spymasters. Poe had discovered how popular the topic was after setting a challenge in a magazine for readers to send in cyphers that he would then crack, giving the impression at the time that he had near-supernatural powers. How it was done was then described in detail in “The Gold-Bug”.


This article was first published on the original CS4FN website. We have lots of free magazines and magic booklets that you can download.

Frequency analysis also appears in: The dark history of algorithms.

For little kids we have some fun free kriss-kross puzzles – they’re like crosswords but you’re given the words and you have to fit them into the crossword shape. You need to think like a computer scientist and use logical thinking, pattern matching and computational thinking to complete them. (For even younger kids these can also be used as a way of practising spelling, phonics and writing out words).

Example of one of our kriss-kross puzzles, for younger readers


Keeping secrets on the Internet – encryption keeps your data safe

By Ben Stephenson, University of Calgary

How do modern codes keep your data safe online? Ben Stephenson of the University of Calgary explains

When Alan Turing was breaking codes, the world was a pretty dangerous place. Turing’s work helped uncover secrets about air raids, submarine locations and desert attacks. Daily life might be safer now, but there are still threats out there. You’ve probably heard about the dangers that lurk online – scams, identity theft, viruses and malware, among many others. Shady characters want to know your secrets, and we need ways of keeping them safe and secure to make the Internet work. How is it possible that a network with so many threats can also be used to securely communicate a credit card number, allowing you to buy everything from songs to holidays online?

The relay race on the Internet

When data travels over the Internet it is passed from computer to computer, much like a baton is passed from runner to runner in a relay race. In a relay race, you know who the other runners will be. The runners train together as a team, and they trust each other. On the Internet, you really don’t know much about the computers that will be handling your data. Some may be owned by companies that you trust, but others may be owned by companies you have never heard of. Would you trust your credit card number to a company that you didn’t even know existed?

The way we solve this problem is by using encryption to disguise the data with a code. Encrypting data makes it meaningless to others, so it is safe to transfer the data over the Internet. You can think of it as though each message is locked in a chest with a combination lock. If you don’t have the combination you can’t read the message. While any computer between us and the merchant can still view or copy what we send, they won’t be able to gain access to our credit card number because it is hidden by the encryption. But the company receiving the data still needs to decrypt it – open the lock. How can we give them a way to do it without risking the whole secret? If we have to send them the code a spy might intercept it and take a copy.

Keys that work one way only

The solution to our problem is to use a relatively new encryption technique known as public key cryptography. (It’s actually about 40 years old, but as the history of encryption goes back thousands of years, a technique that’s only as old as Victoria Beckham counts as new!) With this technique the code used to encrypt the message (lock the chest) is not able to decrypt it (unlock it). Similarly, the key used to decrypt the message is not able to encrypt it. This may sound a little bit odd. Most of the time when we think about locking a physical object like a door, we use the same key to lock it that we will use to unlock it later. Encryption techniques have also followed this pattern for centuries, with the same key used to encrypt and decrypt the data. However, we don’t always use the same key for encrypting (locking) and decrypting (unlocking) doors. Some doors can be locked by simply closing them, and then they are later unlocked with a key, access card, or numeric code. Trying to shut the door a second time won’t open it, and similarly, using the key or access code a second time won’t shut it. With our chest, the person we want to communicate with can send us a lock only they know the code for. We can encrypt by snapping the lock shut, but we don’t know the code to open it. Only the person who sent it can do that.

We can use a similar concept to secure electronic communications. Anyone that wants to communicate something securely creates two keys. The keys will be selected so that one can only be used for encryption (the lock), and the other can only be used for decryption (the code that opens it). The encryption key will be made publicly available – anyone that asks for it can have one of our locks. However, the decryption key will remain private, which means we don’t tell anyone the code to our lock. We will have our own public encryption key and private decryption key, and the merchant will have their own set of keys too. We use one of their locks, not ours, to send a message to them.

Turning a code into real stuff

So how do we use this technique to buy stuff? Let’s say you want to buy a book. You begin by requesting the merchant’s encryption key. The merchant is happy to give it to you since the encryption key isn’t a secret. Once you have it, you use it to encrypt your credit card number. Then you send the encrypted version of your credit card number to the merchant. Other computers listening in might know the merchant’s public encryption key, but this key won’t help them decrypt your credit card number. To do that they would need the private decryption key, which is only known to the merchant. Once your encrypted credit card number arrives at the merchant, they use the private key to decrypt it, and then charge you for the goods that you are purchasing. The merchant can then securely send a confirmation back to you by encrypting it with your public encryption key. A few days later your book turns up in the post.
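The mathematics behind the locks is, in the best-known public key scheme (RSA), just modular arithmetic. Here is the standard textbook example with famously tiny numbers; real keys are hundreds of digits long and messages get extra padding, so this is a demonstration of the one-way idea, not secure code.

```python
# Textbook RSA with tiny, insecure numbers, just to show the one-way idea.
p, q = 61, 53
n = p * q                      # 3233, shared by both keys
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public encryption exponent (the open padlock)
d = pow(e, -1, phi)            # private decryption exponent (the combination)

message = 65                   # a number standing in for part of a card number
ciphertext = pow(message, e, n)     # anyone can lock with the public key...
recovered = pow(ciphertext, d, n)   # ...only the key holder can unlock
```

Eavesdroppers see `n`, `e` and `ciphertext`, but recovering `d` means factoring `n` – easy for 3233, practically impossible for the enormous `n` used in real keys.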

This encryption technique is used many millions of times every day. You have probably used it yourself without knowing it – it is built into web browsers. You may not imagine that there are huts full of codebreakers out there, like Alan Turing seventy years ago, trying to crack the codes in your browser. But hackers do try to break in. Keeping your browsing secure is a constant battle, and vulnerabilities have to be patched up quickly once they’re discovered. You might not have to worry about air raids, but codes still play a big role behind the scenes in your daily life.


Here’s another article from the author, Ben Stephenson: 100,000 frames – quick draw: how computers help animators create.

The ‘Keeping secrets on the internet‘ article was first published on the original CS4FN website and there’s a copy on pages 4-5 of the Alan Turing issue (#14) of the CS4FN magazine. You can download a free PDF of the magazine below, along with all of our other free material at our downloads site.





Composing from Compression

by Geraint Wiggins, Queen Mary University of London

Computers compress files to save space. But compression also lets them create music!

Recoloured Cranium head abstract image by Gordon Johnson from Pixabay

Music is special. It’s one of the things, like language, that makes us human, separating us from animals. It’s also special as art, because it doesn’t exist as an object in the world – it depends on human memory. “But what about CDs? They’re objects in the world”, you might say and you’d be right, but the CD is not the music. The CD contains data files of numbers. Those numbers are translated by electronics into the movements in a loudspeaker, to create sound waves. Even the sound waves aren’t music! They only become music when a human hears them, because understanding music is about noticing repetition, variation and development in its structure. That’s why songs have verses and choruses: so we can find a starting point to understand their structure. In fact, we’re so good at understanding musical structure, we don’t even notice we’re doing it. What’s more, music affects us emotionally: we get excited (using the same chemicals that get us excited when we’re in love or ready to flee danger) when we hear the anthem section of a trance track, or recognise the big theme returning at the end of a symphony.

Surprisingly, brains seem to understand musical structure in a way that’s like the algorithms computer scientists use to compress data. It’s better to store data compressed than uncompressed, because it takes less storage space. We think that’s why brains do it too.

Even more surprisingly, brains also seem to be able to learn the best way to store compressed music data. Computers use bits as their basic storage unit, but we can make groups of bits work like other things (numbers, words, pictures, angry birds…); brains seem to do something similar. For example, pitch (high vs. low notes) in sequence is an important part of music: we build melodies by lining up notes of different pitch one after the other. As we learn to hear music (starting before birth, and continuing throughout life), we learn to remember pitch in ever more efficient ways, giving our compression algorithms better and better chances to compress well. And so we remember music better.

Our team use compression algorithms to understand how music works in the human mind. We have discovered that, when our programs compress music, they can sometimes predict musical structures, even if neither they nor a human have “heard” them before. To compress something, you find large sections of repeated data and replace each with a label saying “this is one of those”. It’s like labelling a book with its title: if you’ve read Lord of the Rings, when I say the title you know what I mean without me telling the story. If we do this to the internal structure of music, there are little repetitions everywhere, and the order that they appear is what makes up the music’s structure.

If we compress music, but then decompress it in a different way, we can get a new piece of music in a similar style or genre. We have evidence that human composers do that too!
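As a toy sketch of that idea (nothing like the team’s real algorithms, and with an invented melody): compress a melody by labelling a repeated motif, then ‘decompress’ using a different motif to get a variation with the same overall structure.

```python
def compress(melody, motif, label):
    """Replace every occurrence of a repeated motif with a short label."""
    return melody.replace(motif, label)

def decompress(compressed, label, motif):
    """Expand each label back into a motif. Expand with a *different*
    motif and you get a new piece with the same overall structure."""
    return compressed.replace(label, motif)

tune = "CDEG CDEG FFED CDEG"
packed = compress(tune, "CDEG", "#1")      # the structure, repeats labelled
original = decompress(packed, "#1", "CDEG")  # the original tune back
variation = decompress(packed, "#1", "CEGA")  # same structure, new notes
```

Real musical structure is nested and approximate rather than exact string repeats, which is why the team’s compression algorithms are far more sophisticated – but the compress-then-decompress-differently trick is the same.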

What our programs are doing is learning to create new music. There’s a long way to go before they produce music you’ll want to dance to – but we’re getting there!


This article was first published on the original CS4FN website and a copy can be found on page 12 in Issue 18 of the CS4FN magazine: Machines that are creative. You can download a free PDF copy below, along with all of our other free magazines and booklets at our downloads site.


