Engineering a cloak of invisibility: manipulating light with metamaterials

by Akram Alomainy and Paul Curzon, QMUL

You pull a cloak around you and disappear! Reality or science fiction? Harry Potter’s invisibility cloak is surely Hogwarts’ magic that science can’t match. Even in Harry Potter’s world it takes powerful magic and complicated spells to make it work. Turns out even that kind of magic can be done with a combination of materials science and computer science. Professor Susumu Tachi of the University of Tokyo has developed a cloak made of thousands of tiny beads. Cameras video what is behind you and a computer system then projects the appropriate image onto the front of the cloak. The beads are made of a special material called retro-reflectrum. It is vital to give the image a natural feel – normal screens give too flat a look, losing the impression of seeing through the person. Now you see me, now you don’t at the flick of a switch.

But could an invisibility cloak, without tiny screens on it, ever be a reality? It sounds impossible especially if you understand how light behaves. It bounces off the things around us, travelling in straight lines. You see them when that reflected light eventually reaches your eyes. I can see the red toy over there because red light bounced from it to me. For it to be invisible, no light from it must reach my eyes, while at the same time light from everything else around should. How could that be possible? Akram Alomainy of Queen Mary, University of London, tells us more.

Well maybe things aren’t quite that simple…halls of mirrors, rainbows, polar bears and desert mirages all suggest some odd things can happen with light! They show that manipulating light is possible and that we may even be able to bend it in a way that alters the way things look – even humans.

Light fantastic

Have you ever wondered how the hall of mirrors in a fun fair distorts your reflection? Some make us look short and fat while others make us tall and slim! It’s all about controlling the behaviour of light. The light rays still travel in straight lines, but the mirrors deceive the eye. The light seems to arrive from a different place to reality because the mirrors are curved, not flat, making the light bounce at odd angles.

A rainbow is something we see that isn’t really there. Rainbows occur because white light doesn’t actually exist: it is just coloured light all mixed up. When it hits a surface it separates back into individual colours. The colour of an object you see depends on which colours pass through or get reflected, and which get absorbed. The light is white when it hits the raindrops, but then comes out as the whole spectrum of colours. The colours head off at slightly different angles, which is why they appear in the different rainbow positions.

What about polar bears? Did you know that they have black skins and semi-transparent hair? You see them as white because of the way the hollow hairs reflect sunlight.

So what does this have to do with invisibility? Well, it suggests that with light all is not as it seems. Perhaps we can manipulate it to do anything we want.

Water! Water!

Now for the clincher – mirages! They show that invisibility cloaks ought to be a possibility. Light from the sun travels in a straight line through the sky. That means we see everything as it is. Except not quite. In places like deserts where the temperature is very high at noon, apparently weird things happen to the light. The difference in temperature, and thus in density, between the higher air layers and the layers closer to the ground can be quite large. That difference makes light coming from the sky change direction as it passes through each layer. It bends rather than just travelling in a straight line to us. It is that image of the sky that looks like the pool of water – the mirage. Our brains assume the light travelled in a straight line, so they misinterpret its location. Now, to make something invisible we just need to make light bend round it. That invisibility cloak is a possibility if we can just engineer what mirages do – bend light!
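To see how layers of air can bend light, here is a minimal Python sketch. It assumes the air can be treated as a stack of flat layers and applies Snell's law (n1 × sin θ1 = n2 × sin θ2) at each boundary. The refractive index values and the function name are illustrative assumptions, not measurements or anything from the research described here.

```python
import math

def bend_through_layers(angle_deg, indices):
    """Follow a ray down through flat layers of air.

    angle_deg is measured from the vertical. indices lists the
    refractive index of each layer, top layer first. Returns the ray's
    angle in each layer; None marks the point where the ray can no
    longer pass into the next layer and turns back upwards.
    """
    angles = [angle_deg]
    theta = math.radians(angle_deg)
    for n1, n2 in zip(indices, indices[1:]):
        sine = n1 * math.sin(theta) / n2  # Snell's law at this boundary
        if sine >= 1.0:
            angles.append(None)  # total reflection: the ray heads back up
            break
        theta = math.asin(sine)
        angles.append(math.degrees(theta))
    return angles

# Made-up values: cooler, denser air on top; hotter, thinner air near the sand.
AIR_LAYERS = [1.00030, 1.00028, 1.00026, 1.00024, 1.00022]

print(bend_through_layers(89.0, AIR_LAYERS))  # ray flattens layer by layer
print(bend_through_layers(89.5, AIR_LAYERS))  # a shallower ray turns back up
```

A ray that turns back up reaches your eye from below, so your brain places a patch of sky on the ground: the 'pool of water'.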

Nano-machines

That is the basic idea and it is an area of science called ‘transformation optics’ that makes it possible. The science tells us about the properties that each point of an object must have to make light waves travel in any particular way we wish through it. To make it happen engineers must then create special materials with those properties. These materials are known as metamaterials. Their properties are controlled using electromagnetism, which is where the electronic engineers come in! You can think of them as being made of vast numbers of tiny electrical machines built into big human-scale structures. Each tiny machine is able to control how light passes through it, even bending light in a way no natural material could. If the machines are small enough – ‘nanotechnology’ as small as the wavelength of light – and their properties can be controlled really precisely to match the science’s prediction, then we can make light passing through them do anything we want. For invisibility, the aim is to control those properties so the light bends as it passes through a metamaterial cloak. If the light comes out the other side of the cloak unchanged and travelling in the same direction as it entered, while avoiding objects in the middle, then those objects will be invisible.

Now you see it…

Simple cloaking devices that work this way have already been created but they are still very limited. One of the major challenges is the range of light they can work with. At the moment it’s possible to make a cloak that bends a single colour frequency, but not all light. As Yang Hao, a professor working in this area at Queen Mary, notes: “The obstacle engineers face is the complex manufacturing techniques needed to build devices that can bend light across the whole visible light spectrum. However, with the progress being made in nanotechnologies this could become a possibility in the near future”.

Perhaps we should leave the last word to J.K. Rowling: “A suspicious object like that, it was clearly full of Dark Magic.” So while we should appreciate the significance of such an invention we should perhaps be careful about the negative consequences!



EPSRC supports this blog through research grant EP/W033615/1.


Solving problems you care about

by Patricia Charlton and Stefan Poslad, Queen Mary University of London

The best technology helps people solve real problems. To be a creative innovator you need not only to be able to create a solution that works but also to spot a need in the first place and be able to come up with creative solutions. Over the summer a group of sixth formers on internships at Queen Mary had a go at doing this. Ultimately their aim was to build something from a programmable gadget such as a BBC micro:bit or Raspberry Pi. They therefore had to learn about the different possible gadgets they could use, how to program them and how to control the on-board sensors available. They were then given the design challenge of creating a device to solve a community problem.

Hearing the bus is here

Tai Kirby wanted to help visually impaired people. He knew that it’s hard for someone with poor sight to tell when a bus is arriving. In busy cities like London this problem is even worse as buses for different destinations often arrive at once. His solution was a prototype that announces when a specific bus is arriving, letting the person know which was which. He wrote it in Python and it used a Raspberry Pi linked to Bluetooth Low Energy devices.

The fun spell

Filsan Hassan decided to find a fun way to help young kids learn to spell. She created a gadget that associated different sounds with different letters of the alphabet, turning spelling words into a fun, musical experience. It needed two micro:bits and a screen communicating with each other using a radio link. One micro:bit controlled the screen while the other ran the main program that allowed children to choose a word, play a linked game and spell the word using a scrolling alphabet program she created. A big problem was how to make sure the combination of gadgets had a stable power supply. This needed a special circuit to get enough power to the screen without frying the micro:bit and sadly we lost some micro:bits along the way: all part of the fun!

Remote robot

Jesus Esquivel Roman developed a small remote-controlled robot using a buggy kit. There are lots of applications for this kind of thing, from games to mine-clearing robots. The big challenge he had to overcome was how to do the navigation using a compass sensor. The problem was that the batteries and motor interfered with the calibration of the compass. He also designed a mechanism that used the accelerometer of a second micro:bit allowing the vehicle to be controlled by tilting the remote control.

Memory for patterns

Finally, Venet Kukran was interested in helping people improve their memory and thinking skills. He invented a pattern memory game using a BBC micro:bit, implemented in MicroPython. The game generates patterns that the player has to match and then replicate to score points. The program generates new patterns each time so every game is different. The more you play, the more complex the patterns you have to remember become.
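The core logic of a pattern memory game like this can be sketched in a few lines of ordinary Python. This is not Venet's program: the button names, the scoring and the way the pattern grows with each level are all assumptions made for illustration, and it doesn't use any micro:bit features.

```python
import random

BUTTONS = ["A", "B"]  # assumption: patterns are built from two button presses

def make_pattern(level, rng=random):
    """Make a fresh random pattern; higher levels give longer patterns."""
    length = level + 2
    return [rng.choice(BUTTONS) for _ in range(length)]

def check_attempt(pattern, attempt):
    """The player scores only if they replicate the whole pattern in order."""
    return list(attempt) == list(pattern)

def play_round(level, get_attempt, rng=random):
    """Play one round and return (next_level, points_scored).

    get_attempt is any function that is shown the pattern and returns
    the player's reply, for example by reading button presses.
    """
    pattern = make_pattern(level, rng)
    if check_attempt(pattern, get_attempt(pattern)):
        return level + 1, len(pattern)  # harder next time
    return level, 0                     # try the same level again

# A 'perfect player' who always remembers the pattern moves up a level.
print(play_round(1, lambda pattern: list(pattern)))
```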

As they found, you have to be very creative to be an innovator, both to spot real issues that need a solution and to overcome the problems you are bound to encounter in your solutions.



Stretching your keyboard – getting more out of QWERTY

by Jo Brodie, Queen Mary University of London

A smartphone’s on-screen keyboard layout, called QWERTY after the first six letters on the top line. Image by CS4FN after smartphone QWERTY keyboards.

If you’ve ever sent a text on a phone or written an essay on a computer you’ve most likely come across the ‘QWERTY’ keyboard layout. It looks like this on a smartphone.

This layout has been around in one form or another since the 1870s and was first used in old mechanical typewriters where pressing a letter on the keyboard caused a hinged metal arm with that same letter embossed at the end to swing into place, thwacking a ribbon coated with ink, to make an impression on the paper. It was quite loud!

The QWERTY keyboard isn’t just used by English speakers but can easily be used by anyone whose language is based on the same A,B,C Latin alphabet (so French, Spanish, German etc). All the letters that an English-speaker needs are right there in front of them on the keyboard and with QWERTY… WYSIWYG (What You See Is What You Get). There’s a one-to-one mapping of key to letter: if you tap the A key you get a letter A appearing on screen, click the M key and an M appears. (To get a lowercase letter you just tap the key but to make it uppercase you need to tap two keys: the up arrow (‘shift’) key plus the letter.)

A French or Spanish speaking person could also buy an adapted keyboard that includes letters like É and Ñ, or they can just use a combination of keys to make those letters appear on screen (see Key Combinations below). But what about writers of other languages which don’t use the Latin alphabet? The QWERTY keyboard, by itself, isn’t much use for them so it potentially excludes a huge number of people from using it.

In the English language the letter A never alters its shape depending on which letter goes before or comes after it. (There are 39 lower case letter ‘a’s and 3 upper case ‘A’s in this paragraph and, apart from the difference in case, they all look exactly the same.) That’s not the case for other languages such as Arabic or Hindi where letters can change shape depending on the adjacent letters. With some languages the letters might even change vertical position, instead of being all on the same line as in English.

Early attempts to make writing in other languages easier assumed that non-English alphabets could be adapted to fit into the dominant QWERTY keyboard, with letters that are used less frequently being ignored and other letters being simplified to suit. That isn’t very satisfactory and speakers of other languages were concerned that their own language might become simplified or standardised to fit in with Western technology, a form of ‘digital colonialism’.

But in the 1940s other solutions emerged. The design for one Chinese typewriter avoided QWERTY’s ‘one key equals one letter’ (which couldn’t work for languages like Chinese or Japanese which use thousands of characters – impossible to fit onto one keyboard, see picture at the end!).

Rather than using the keys to print one letter, the user typed a key to begin the process of finding a character. A range of options would be displayed and the user would select another key from among them, with the options narrowing until they arrived at the character they wanted. Luckily this early ‘retrieval system’ of typing actually only took a few keystrokes to bring up the right character, otherwise it would have taken ages.

This is a way of using a keyboard to type words rather than letters, saving time by only displaying possible options. It’s also an early example of ‘autocomplete’ now used on many devices to speed things up by displaying the most likely word for the user to tap, which saves them typing it.

For example in English the letter Q is generally* always followed by the letter U to produce words like QUAIL, QUICK or QUOTE. There are only a handful of letters that can follow QU – the letter Z wouldn’t be any use but most of the vowels would be. You might be shown A, E, I or O and if you selected A then you’ve further restricted what the word could be (QUACK, QUARTZ, QUARTET etc).
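The narrowing-down idea in the QU example can be sketched in Python. The word list below is a tiny made-up sample (a real system would use a full dictionary), and the function names are just for illustration.

```python
# A tiny sample word list, chosen only to illustrate the idea.
WORDS = ["QUACK", "QUAIL", "QUARTET", "QUARTZ", "QUEEN", "QUICK", "QUOTE"]

def candidates(prefix, words=WORDS):
    """All the words that are still possible after typing prefix."""
    return [word for word in words if word.startswith(prefix)]

def next_letters(prefix, words=WORDS):
    """The only letters worth offering as the next choice."""
    return sorted({word[len(prefix)]
                   for word in candidates(prefix, words)
                   if len(word) > len(prefix)})

print(next_letters("Q"))   # only U can follow Q in this list
print(next_letters("QU"))  # the vowels that can come next
print(candidates("QUA"))   # choosing A narrows the options further
```

Each choice throws away every word that no longer fits, so the list of options shrinks quickly.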

In fact one modern typing system, designed for typists with physical disabilities, also uses this concept of ‘retrieval’, relying on a combination of letter frequency (how often a letter is used in the English language) and probabilistic predictions (about how likely a particular letter is to come next in an English word). Dasher is a computer program that lets someone write text without using a keyboard: instead a mouse, joystick, touchscreen or gaze-tracker (a device that tracks the person’s eye position) can be used.

Letters are presented on-screen in alphabetic order from top to bottom on the right hand side (lowercase first, then upper case) and punctuation marks. The user ‘drives’ through the word by first pushing the cursor towards the first letter, then the next possible set of letters appear to choose from, and so on until each word is completed. You can see it in action in this video on the Dasher Interface.

Key combinations

The use of software to expand the usefulness of QWERTY keyboards is now commonplace with programs pre-installed onto devices which run in the background. These IMEs or Input Method Editors can convert a set of keystrokes into a character that’s not available on the keyboard itself. For example, while I can type SHIFT+8 to display the asterisk (*) symbol that sits on the 8 key there’s no degree symbol (as in 30°C) on my keyboard. On a Windows computer I can create it using the numeric keypad on the right of some keyboards, holding down the ALT key while typing the sequence 0176. While I’m typing the numbers nothing appears but once I complete the sequence and release the ALT key the ° appears on the screen.

English language keyboard image by john forcier from Pixabay highlighted by CS4FN, showing the numeric keypad highlighted in yellow with the two Alt keys and the ‘num lock’ key highlighted in pink. Num lock (‘numeric lock’) needs to be switched on for the keypad to work, then use the Alt key plus a combination of numbers on the numeric keypad to produce a range of additional ‘alt code’ characters.
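The ALT+0176 trick is really a lookup from a number to a character. Here is a minimal Python sketch of that lookup. It assumes a Western European Windows setup, where alt codes typed with a leading zero are read from the Windows-1252 character table. The function name is made up for illustration.

```python
def alt_code_to_char(digits):
    """Turn the digits typed while holding ALT into a character.

    Assumes the Windows-1252 table (Western European Windows).
    A few positions in that table are unused and will raise an error.
    """
    code = int(digits)  # "0176" becomes the number 176
    if not 0 <= code <= 255:
        raise ValueError("alt codes in this sketch run from 0 to 255")
    return bytes([code]).decode("cp1252")

print("30" + alt_code_to_char("0176") + "C")  # a temperature with a degree sign
print(alt_code_to_char("0233"), alt_code_to_char("0209"))  # accented letters
```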

When Japanese speakers type they use the main ‘ABC’ letters on the keyboard, but the principle is the same – a combination of keys produces a sequence of letters that the IME converts to the correct character. Or perhaps they could use Google Japan’s April Fool solution from 2010, which surrounded the user in half a dozen massive keyboards with hundreds of keys a little like sitting on a massive drum kit!

*QWERTY is a ‘word’ which starts with a Q that’s not followed by a U of course…

More on …

The ‘retrieval system’ of typing mentioned above, which lets the user get to the word or characters more quickly, is similar to the general problem solving strategy called ‘Divide and Conquer’. You can read more about that and other search algorithms in our free booklet ‘Searching to Speak’ (PDF) which explores how the design of an algorithm could allow someone with locked-in syndrome to communicate. Locked-in syndrome is a condition resulting from a stroke where a person is totally paralysed. They can see, hear and think but cannot speak. How could a person with Locked-in syndrome write a book? How might they do it if they knew some computational thinking?
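One simple divide-and-conquer approach is to halve the alphabet with each yes/no question. The Python sketch below assumes the person can answer the question 'Is your letter before this one?' (for example with a blink). It illustrates the general strategy and is not necessarily the exact algorithm described in the booklet.

```python
import string

def find_letter(is_before, alphabet=string.ascii_uppercase):
    """Find a letter using only yes/no answers.

    is_before(letter) answers: 'Is your letter before this one?'
    Returns the letter found and how many questions were asked.
    """
    low, high = 0, len(alphabet)  # the letter is somewhere in alphabet[low:high]
    questions = 0
    while high - low > 1:
        middle = (low + high) // 2
        questions += 1
        if is_before(alphabet[middle]):
            high = middle  # throw away the second half
        else:
            low = middle   # throw away the first half
    return alphabet[low], questions

# Someone thinking of the letter S answers each question truthfully.
print(find_letter(lambda letter: "S" < letter))
```

Because half of the remaining letters are ruled out each time, five questions are always enough for 26 letters, compared with up to 25 if you asked about each letter in turn.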



A machine wrote this post – OpenAI’s ChatGPT

Robot emerging from a laptop screen
ChatGPT image AI Generated by Alexandra_Koch from Pixabay

Blog post by OpenAI’s ChatGPT. Introduction by Jo Brodie

1. Jo wrote this…

ChatGPT is an AI chatbot which can have a conversation with you using everyday (natural) language. It has access to the huge amounts of digital information found on the internet and its ability to use language to write good answers in response to questions has been tested and improved by giving it feedback. It learns, and refines its output.

You don’t need to program it: you can just ask it a question (giving it a prompt) and, after a moment’s ‘thinking’, it will produce an answer. Lots of people have been testing ChatGPT by asking it questions that we already know the answer to, to see how well it can do. It’s actually quite impressive! If you think it’s made a mistake or misunderstood your prompt you can reword your question or give it more information to help it improve its answer.

Can ChatGPT write an article that’s suitable for our magazine or website?

I gave ChatGPT the following prompt: “Please write a 200 to 500 word article suitable for the CS4FN magazine” and moments later it generated a completely new article about machine learning. That told me it knew (or at least could find out) that CS4FN has something to do with computer science. I’m not surprised that it picked machine learning as a topic given that it is a machine that can learn. I didn’t tell it what to write about; it came up with that idea itself.

The article it gave me contained some technical words and not all of them were explained, so I asked it to rewrite the article for a younger audience (who might not have come across some of the words yet).

The result is below – what do you think?

2. ChatGPT wrote this…

From Logic to Learning: The Incredible Advancements of Artificial Intelligence

Artificial Intelligence (AI) has come a long way since it was first invented in the 1950s. At the beginning, AI was based on logic and rules. Experts in a specific field would give the computer a set of instructions to help it make decisions. But, these rules were limited and couldn’t handle complex situations.

Today, AI has seen a huge shift towards machine learning. This means that instead of following rules, AI learns from data. This shift happened because computers got stronger, and more data became available. The breakthroughs in machine learning algorithms helped too.

Machine learning has two categories: supervised learning and unsupervised learning. In supervised learning, the computer learns from labeled data. This means that the correct answer is given to the computer for each input. Once the computer has learned enough, it can predict the answer for new data. Supervised learning is useful for applications like image and speech recognition.

In unsupervised learning, the computer learns from unlabeled data. This means that there is no correct answer provided. The goal of unsupervised learning is to find patterns in the data. This type of learning is useful in applications like anomaly detection and clustering.

Deep learning is a special type of machine learning that uses neural networks. Neural networks are inspired by the human brain and have layers of connected nodes. Each node applies a mathematical function to its inputs and sends the result to the next layer of nodes. Deep learning has revolutionized AI, leading to breakthroughs in areas like natural language processing, computer vision, and game playing.

Even though machine learning has achieved so much, there are still challenges to overcome. One challenge is the lack of transparency in machine learning models. Unlike rule-based systems, it can be hard to understand how a machine learning model made a decision. This can be a problem, especially in areas like healthcare where the decision-making process needs to be trustworthy and explainable.

Another challenge is the need for a lot of data to train machine learning models. While data is easy to find in some areas, like social media, it can be hard to come by in other areas like medical imaging. This lack of data can make machine learning less effective in some applications.

In conclusion, AI has come a long way since the 1950s. We’ve gone from rule-based systems to machine learning, which has led to breakthroughs in many areas. Although there are still challenges to overcome, the future of AI is exciting!

The first attempt is below…

The Evolution of Artificial Intelligence: From Rule-Based to Machine Learning

Artificial Intelligence (AI) has come a long way since its inception in the 1950s. Initially, AI was developed based on rule-based systems, where experts in a specific domain would provide rules to the system to help it make decisions. While these systems worked well in some areas, they were limited in their ability to handle complex problems and adapt to new situations.

In recent years, AI has seen a significant shift towards machine learning, where systems learn from data rather than relying on pre-defined rules. This shift has been driven by advances in computer hardware, the availability of large amounts of data, and breakthroughs in machine learning algorithms.

Machine learning has two main categories: supervised learning and unsupervised learning. In supervised learning, the system is trained on labeled data, where the correct answer is provided for each input. The system uses this labeled data to learn a function that maps inputs to outputs. Once trained, the system can make predictions on new, unseen data. This approach has been successful in a wide range of applications, such as image and speech recognition.

In unsupervised learning, the system is trained on unlabeled data, where no correct answer is provided. The goal of unsupervised learning is to discover patterns and structure in the data. This approach has been useful in areas such as anomaly detection and clustering.

Deep learning is a subset of machine learning that uses neural networks to learn from data. Neural networks are inspired by the structure of the human brain and consist of layers of interconnected nodes. Each node applies a mathematical function to its inputs and passes the result to the next layer of nodes. Deep learning has revolutionized the field of AI, enabling breakthroughs in areas such as natural language processing, computer vision, and game playing.

Despite the many successes of machine learning, there are still limitations and challenges to overcome. One challenge is the lack of interpretability of machine learning models. Unlike rule-based systems, it can be difficult to understand how a machine learning model arrived at a particular decision. This lack of transparency can be problematic in areas such as healthcare, where the decision-making process needs to be explainable and trustworthy.

Another challenge is the need for large amounts of data to train machine learning models. While data is abundant in some domains, such as social media, it can be scarce in others, such as medical imaging. This scarcity of data can limit the effectiveness of machine learning approaches in certain applications.

In conclusion, the evolution of AI from rule-based systems to machine learning has enabled breakthroughs in many areas and holds tremendous potential for the future. However, there are still challenges to overcome, particularly in the areas of interpretability and data scarcity. Addressing these challenges will be critical to realizing the full potential of AI in the years to come.
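As an aside on the two generated articles above: both say that each node in a neural network 'applies a mathematical function to its inputs' and passes the result on to the next layer. A minimal Python sketch of that idea follows. The weights, the bias and the choice of the sigmoid function are assumptions picked for illustration. In a real network those numbers would be learned from data.

```python
import math

def node(inputs, weights, bias):
    """One node: a weighted sum of the inputs, squashed to between 0 and 1."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # the sigmoid function

def layer(inputs, weight_rows, biases):
    """A layer is several nodes that all look at the same inputs."""
    return [node(inputs, weights, bias)
            for weights, bias in zip(weight_rows, biases)]

# Two inputs feed a layer of three nodes, whose outputs feed one final node.
hidden = layer([0.5, -1.0],
               [[1.0, 2.0], [-1.0, 0.5], [0.0, 1.0]],
               [0.0, 0.1, -0.2])
print(node(hidden, [1.0, -1.0, 0.5], 0.0))
```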


Celebrating Jean Bartik: 1940s programmer

Two of the ENIAC programmers preparing the computer for Demonstration Day in February 1946. “U.S. Army Photo” from the archives of the ARL Technical Library. Left: Betty Jennings (later Bartik), right: Frances Bilas (Spence) – Image via Wikipedia. Public Domain.

Jean Bartik (born Betty Jean Jennings) was one of six women who programmed “ENIAC” (the Electronic Numerical Integrator and Computer), one of the earliest electronic programmable computers. The work she and her colleagues did in the 1940s had a huge impact on computer science, but their contribution went largely unrecognised for 40 years.

Jean Bartik – born 27 December 1924; died on this day, 23 March 2011

Born in Missouri, USA, in December 1924 to a family of teachers, Betty (as she was then known) showed promise in Mathematics, graduating from her high school in the summer of 1941 aged 16 with the highest marks in maths ever seen at her school. She began her degree in Maths and English at her local teachers’ college (which is now Northwest Missouri State University) but everything changed dramatically a few months in when the US became involved in the Second World War. The men (teachers and students) were called up for war service leaving a dwindling department and her studies were paused, resuming only in 1943 when retired professors were brought in to teach; she graduated in January 1945, the only person in her year to graduate in Maths.

Although her family encouraged her to become a local maths teacher she decided to seek more distant adventures. The University of Pennsylvania in Philadelphia (~1,000 miles away) had put out a call for people with maths skills to help with the war effort; she applied and was accepted. Along with over 80 other women she was employed to calculate, using advanced maths including differential calculus equations, accurate trajectories of bullets and bombs (ballistics) for the military. She and her colleagues were ‘human computers’ (people who did calculations before the word meant what it does today) creating range tables, columns of information that told the US army where they should point their guns to be sure of hitting their targets. This was complex work that had to take account of weather conditions as well as more obvious things like distance and size of the gun barrel.

Even with 80-100 women working on every possible combination of gun size and angle it still took over a week to generate one data table, so the US Army was obviously keen to speed things up as much as possible. They had previously given funding in 1943 to John Mauchly (a physicist) and John Presper Eckert (an electrical engineer) to build a programmable electronic calculator – ENIAC – which would automate the calculations and give them a huge speed advantage. By 1945 the enormous new machine, which took up a room (as computers tended to do in those days), consisted of around 18,000 vacuum tubes, weighed 30 tonnes and was held together with several million soldered joints. It would be programmed with punched cards with holes punched at different positions in each card allowing a current to pass (or not pass, if no hole present) through a particular set of cables connected through a plugboard (like old-fashioned telephone exchanges).

From the now 100 women working as human computers in the department, six were selected to become the machine’s operators – a role that was exceptional. There were no manuals available and ‘programming’, as we know it today, didn’t yet exist – it was much more physical. Not only did the ‘ENIAC six’ have to correctly wire each cable, they also had to fully understand the machine’s underlying blueprints and electronic circuits to make it work as expected. Repairs could involve crawling into the machine to fix a broken wire or vacuum tube.

World War 2 actually ended in September 1945 before ENIAC was brought into full service, but being programmable (which meant rewiring the cables) it would soon be put to other uses. Jean really enjoyed her time working on ENIAC and said later that she’d “never since been in as exciting an environment. We knew we were pushing back frontiers” but she was working at a time when men’s jobs and achievements were given more credit than women’s.

In February 1946 ENIAC was unveiled to the press with its (male) inventors demonstrating its impressive calculating speeds and how much time could be saved compared with people performing the calculations with mechanical desk calculators. While Jean and some of the other women were in attendance (and appear in press photographs of the time) the women were not introduced, their work wasn’t celebrated, they were not always correctly identified in the photographs and were not even invited to the celebratory dinner after the event – as Jean said in a later interview (see the second video (YouTube) below) “We were sort of horrified!”.

In December 1946 she married William Bartik (an engineer) and over the next few years was instrumental in the programming and development of other early computers. She also taught others how to program them (an early computer science teacher!). She often worked with her husband too, following him to different cities for work. However, her husband took on a new role in 1951 and the company’s policy was that wives were not allowed to work in the same place. Frustrated, Jean left computing for a while and also took a career break to raise her family.

In the late 1960s she returned to the field of computer science and for several years she blended her background in Maths and English, writing technical reports on the newer ‘minicomputers’ (still quite large compared to modern computers but you could fit more of them in a room). However, the company she worked for was sold off and she was made redundant in 1985 at the age of 60. She couldn’t find another job in the industry, which she put down to age discrimination, and she spent her remaining career working in real estate (selling property or land). She died, aged 86, on 23 March 2011.

Jean’s contribution to computer science remained largely unknown to the wider world until 1986 when Kathy Kleiman (an author, law professor and programmer) decided to find out who the women in the press photographs were and rediscovered the pioneering work of the ENIAC six.

The ENIAC six women were Kathleen McNulty Mauchly Antonelli, Jean Jennings Bartik, Frances (Betty) Snyder Holberton, Marlyn Wescoff Meltzer, Frances Bilas Spence, and Ruth Lichterman Teitelbaum.

Jo Brodie, Queen Mary University of London.

Watch…

EPSRC supports this blog through research grant EP/W033615/1.

What’s that bird? Ask your phone – birdsong-recognition apps

Could your smartphone automatically tell you what species of bird is singing outside your window? If so how?

Mobile phones contain microphones to pick up your voice. That means they should be able to pick up the sound of birds singing too, right? And maybe even decide which bird is which?

Smartphone apps exist that promise to do just this. They record a sound, analyse it, and tell you which species of bird they think it is most likely to be. But a smartphone doesn’t have the sophisticated brain that we have, evolved over millions of years to understand the world around us. A smartphone has to be programmed by someone to do everything it does. So if you had to program an app to recognise bird sounds, how would you do it? There are two very different ways computer scientists have devised to do this kind of decision making and they are used by researchers for all sorts of applications from diagnosing medical problems to recognising suspicious behaviour in CCTV images. Both ways are used by bird song recognition apps that you can already buy.

The sound of the European robin (Erithacus rubecula), better known as robin redbreast. Recorded by Vladimir Yu. Arkhipov, Arkhivov, CC BY-SA 3.0 via Wikimedia

Write down all the rules

Blackbird singing
Blackbird Image by Ian Lindsay from Pixabay

If you ask a birdwatcher how to identify a blackbird’s sound, they will tell you specific rules. “It’s high-pitched, not low-pitched.” “It lasts a few seconds and then there’s a silent gap before it does it again.” “It’s twittery and complex, not just a single note.” So if we wrote down all those rules in a recipe for the machine to follow, each rule a little program that could say “Yes, I’m true for that sound”, an app combining them could decide when a sound matches all the rules and when it doesn’t.

This is called an ‘expert system’ approach. One difficulty is that it can take a lot of time and effort to actually write down enough rules for enough birds: there are hundreds of bird species in the UK alone! Each would need lots of rules to be hand crafted. It also needs lots of input from bird experts to get the rules exactly right. Even then it’s not always possible for people to put into words what makes a sound special. Could you write down exactly what makes you recognise your friends’ voices, and what makes them different from everyone else’s? Probably not! However, this approach can be good because you know exactly what reasons the computer is using when it makes decisions.
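
The rule-writing idea can be sketched in a few lines of Python. This is only an illustration: the feature names (`pitch_hz`, `duration_s`, `distinct_notes`) and all the threshold numbers are made up for the example rather than taken from a bird expert, and a real app would first have to measure such features from a recording.

```python
# Each rule is a little program that answers "Am I true for this sound?".
# The thresholds are invented for illustration, not real ornithology.

def is_high_pitched(sound):
    return sound["pitch_hz"] > 1500

def lasts_a_few_seconds(sound):
    return 1.0 <= sound["duration_s"] <= 5.0

def is_twittery(sound):
    return sound["distinct_notes"] >= 5

BLACKBIRD_RULES = [is_high_pitched, lasts_a_few_seconds, is_twittery]

def failed_rules(sound, rules):
    """Return the names of the rules that the sound does not satisfy."""
    return [rule.__name__ for rule in rules if not rule(sound)]

def matches(sound, rules):
    """A sound matches only when every rule says yes."""
    return not failed_rules(sound, rules)

# Two hypothetical sounds, already reduced to measured features.
song = {"pitch_hz": 2200, "duration_s": 3.0, "distinct_notes": 9}
hum = {"pitch_hz": 120, "duration_s": 30.0, "distinct_notes": 1}

print(matches(song, BLACKBIRD_RULES))
print(failed_rules(hum, BLACKBIRD_RULES))
```

Because `failed_rules` names the rules that said no, the program can always explain its decision, which is the advantage of this approach described above.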

The sound of a European blackbird (Turdus merula) singing merrily in Finland, from Wikipedia (song 1). Public Domain via wikimedia

This is very different from the other approach which is…

Show it lots of examples

A lot of modern systems use the idea of ‘machine learning’, which means that instead of writing rules down, we create a system that can somehow ‘learn’ what the correct answer should be. We just give it lots of different examples to learn from, telling it what each one is. Once it has seen enough examples to get it right often enough, we let it loose on things we don’t know in advance. This approach is inspired by how the brain works. We know that brains are good at learning, so why not do what they do!

One difficulty with this is that you can’t always be sure how the machine comes up with its decisions. Often the software is a ‘black box’ that gives you an answer but doesn’t tell you what justifies that answer. Is it really listening to the same aspects of the sound as we do? How would we know?

On the other hand, perhaps that’s the great thing about this approach: a computer might be able to give you the right answer without you having to tell it exactly how to do that!

It means we don’t need to write down a ‘recipe’ for every sound we want to detect. If it can learn from examples, and get the answer right when it hears new examples, isn’t that all we need?
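
To make learning from examples concrete, here is a minimal Python sketch of one of the simplest such methods, a nearest-neighbour classifier. It is a stand-in chosen for brevity, not the brain-inspired kind of system mentioned above and not what any particular app uses. The features, the numbers and the `SCALE` values are all invented for illustration.

```python
import math

# Labelled examples: (pitch_hz, duration_s, distinct_notes) -> species.
# Every number here is made up for illustration.
EXAMPLES = [
    ((2200.0, 3.0, 9.0), "blackbird"),
    ((2000.0, 2.5, 8.0), "blackbird"),
    ((4500.0, 2.0, 12.0), "robin"),
    ((4800.0, 1.5, 14.0), "robin"),
]

# Rough scaling so that no single feature dominates the distance (an assumption).
SCALE = (1000.0, 1.0, 5.0)

def scaled(features):
    return tuple(value / scale for value, scale in zip(features, SCALE))

def classify(features):
    """Answer with the label of the most similar example seen so far."""
    def distance_to(example):
        example_features, _label = example
        return math.dist(scaled(features), scaled(example_features))
    _features, label = min(EXAMPLES, key=distance_to)
    return label

print(classify((4600.0, 1.8, 13.0)))
```

Nobody wrote a rule saying what a robin sounds like: the answer comes entirely from the labelled examples, so adding a new species means adding examples rather than writing new rules.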

Which way is best?

There are hundreds of bird species that you might hear in the UK alone, and many more in tropical countries. Human experts take many years to learn which sound means which bird. It’s a difficult thing to do!

So which approach should your smartphone use if you want it to help identify birds around you? You can find phone apps that use one approach or another. It’s very hard to measure exactly which approach is best, because the conditions change so much. Which one works best when there’s noisy traffic in the background? Which one works best when lots of birds sing together? Which one works best if the bird is singing in a different ‘dialect’ from the examples we used when we created the system?

One way to answer the question is to provide phone apps to people and to see which apps they find most useful. So companies and researchers are creating apps using the ways they hope will work best. The market may well then make the decision. How would you decide?

Dan Stowell, Queen Mary University of London


Inspiring Wendy Hall

This article is inspired by a keynote talk Wendy Hall gave at the ITiCSE conference in Madrid, 2008.

What inspires researchers to dedicate their lives to studying one area? In the case of computer scientist Dame Wendy Hall it was a TV programme called Hyperland, starring former Dr Who Tom Baker and writer Douglas Adams of Hitchhiker’s Guide to the Galaxy fame, that inspired her to become one of the most influential researchers in her area.

Dame Wendy is a pioneer and visionary in the area of web science, and many of her ideas have started to appear in the next generation web: the ‘great web that is yet to come’ (as Douglas Adams might put it), otherwise known as the semantic web. She has stacked up a whole bunch of accolades for her work. She is a Professor at the University of Southampton, a former President of the British Computer Society and now the first non-US President of the most influential body in computer science, the Association for Computing Machinery. She is also a Fellow of the Royal Academy of Engineering and this year she topped it all, gaining what is surely her most impressive-sounding title, by being made a Dame Commander of the Order of the British Empire.

So how did that TV programme set her going?

Douglas Adams and Tom Baker acted out a vision of the future, a vision of how TV was going to change. At the time the web didn’t exist and TV was just something you sat in front of and passively watched. The future they imagined was interactive TV. TV that was personal. TV that did more than just entertain but served all your information needs.

In the programme Douglas Adams was watching TV, vegetating in front of it…and then Tom Baker appeared on Douglas’s screen. He started asking him questions…and then he stepped out of the TV screen. He introduced himself as a software agent, someone who had all the information ever put into digital format at his fingertips. More than that he was Douglas’s personal agent. He would use that information to answer any questions Douglas had. Not just to bring back documents (Google-style) that had something to do with the question and leave you to work out what to do with it all, but actually answer the question. He was an agent that was servant and friend, an agent whose character could even be changed to fit his master’s mood.

Wendy was inspired…so inspired that she decided she was going to make that improbable vision a reality. Reality hasn’t quite caught up yet, but she is getting there.

Most people who think about it at all believe that Tim Berners-Lee invented the idea of the web and of hypertext, the links that connect web pages together. He was the one that kick-started it into being a global reality, making it happen, but actually lots of people had been working in research labs round the world on the same ideas for years before, Wendy included, with her Microcosm hypermedia system. Tim’s version of hypermedia – interactive information – was a simple version, one simple enough to get the idea off the ground. Its time is coming to an end now though.

What is coming next? The semantic web: and it will be much more powerful. It is a version of the web much closer to that TV programme, a version where the web’s data is not just linked to other data but where words, images and videos are all tagged with meaning: tags that the software agents of the future can use to understand what the data is about.
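
As a rough illustration of what tagging with meaning can look like, semantic web data is commonly written as simple three-part facts (subject, predicate, object), the shape used by the RDF standard. The Python sketch below is a toy version of that idea: the predicate names (`is_a`, `stars`, `wrote`) are made up for the example and are not a real web vocabulary.

```python
# Each fact is a (subject, predicate, object) triple.
# The predicate names are invented for illustration.
FACTS = [
    ("Hyperland", "is_a", "TV programme"),
    ("Hyperland", "stars", "Tom Baker"),
    ("Hyperland", "stars", "Douglas Adams"),
    ("Douglas Adams", "wrote", "The Hitchhiker's Guide to the Galaxy"),
]

def ask(subject=None, predicate=None, obj=None):
    """Return every fact that matches the parts that were specified."""
    return [
        (s, p, o)
        for (s, p, o) in FACTS
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

# "Who stars in Hyperland?" gets an answer, not a list of documents.
who_stars = [o for (_s, _p, o) in ask(subject="Hyperland", predicate="stars")]
print(who_stars)
```

Because the data says what each link means, a program can answer the question directly rather than handing back pages that merely mention the words.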

The structure is now there for it to happen. What is needed is for people to start to use it, to write their web pages that way, to actually make it everyday reality. Then the web programmers will be able to start innovating with new ideas, new applications that use it, and the web scientists like Wendy will be able to study it: to work out what works for people, what doesn’t and why.

Then maybe it’s your turn to be inspired and drive the next leap forward.

Paul Curzon, Queen Mary University of London



Marissa Mayer: Lemons Linking 41 Shades of Blue – A/B Testing

A bunch of lemons turned blue
Lemons image by Richard John from Pixabay – colour changed to blue

Google, one of the most powerful companies in the world, is famous for being founded by Larry Page and Sergey Brin, but a key person, the 20th person employed, was engineer, programmer and believer in detail, Marissa Mayer. Her attention to detail made a gigantic difference to Google’s success. She was involved in most of their successful products, from the search engine to Gmail to AdWords, and if she wasn’t convinced about a new feature or change, then it didn’t happen. When a designer suggested a new shade of blue for the links of ads, for example, she had to be persuaded. But how could she be sure she had made the right decisions? She used a centuries-old idea from medicine, first used to help cure scurvy in sailors, and applied it to software design: the randomized controlled trial.

Randomized controlled trials revolutionized medicine. They could revolutionize many other aspects of our lives too, from education to prison reform, if they were used more. Computer Scientists realized that, and more trials are now used on software than medicines. It’s part of the Big Data revolution and is the way to avoid relying on hunches, instead relying on scientific method to find out what the right answer really is.

But what if …?

The problem with the way we do most things is “what-if”. We make decisions, but never know what would have happened if we took the other choice. If things go well we pat ourselves on the back and tell ourselves we are right. But things might have gone even better had we only made the other decision. We will never know. However good or bad it seems, there is no way of actually knowing whether our decision was the right one if all we do is make it. We then delude ourselves, and so keep doing bad things, over and over. That’s why illness was treated by getting leeches to suck blood for centuries!

Controlled trials overcome this. The big idea boils down to making sure you do both alternatives! Not only do you make the change, you also leave things alone too! That sounds impossible, but it’s simple. Split your population (patients, users, prisoners, students, …) into two groups at random. Apply the change to one group, but leave the other group alone. Then at the end of a suitable period, long enough so you can see any difference, compare the results. You see not only the result of making the change, but also what would have happened if you didn’t. Only then, with hard data about the consequences of both possibilities, do you take the decision.
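
The split-and-compare recipe can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function names, the ten hypothetical participants and the outcome numbers are all invented, and a real trial would also use statistics to check that any difference is not just down to chance.

```python
import random

def split_at_random(population, seed=None):
    """Shuffle a copy of the population and cut it into two groups."""
    rng = random.Random(seed)
    shuffled = list(population)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def mean(values):
    return sum(values) / len(values)

def effect_of_change(changed_outcomes, unchanged_outcomes):
    """Average outcome with the change minus average outcome without it."""
    return mean(changed_outcomes) - mean(unchanged_outcomes)

# Hypothetical participants: one group gets the change, the other is left alone.
people = [f"person-{n}" for n in range(10)]
changed_group, unchanged_group = split_at_random(people, seed=2024)
print(len(changed_group), len(unchanged_group))

# Made-up measurements taken at the end of the trial, one per person.
print(effect_of_change([7, 8, 6, 9, 8], [5, 6, 5, 7, 6]))
```

Choosing the groups by shuffling, rather than by hand, is what stops hunches creeping back in through who gets picked for which group.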

The first medical trial like this involved sailors who were ill with scurvy – a disease that killed more wartime sailors than enemy action in the 18th century. Scottish naval surgeon James Lind waited until his ship had been at sea long enough for many sailors to get scurvy. He then split a dozen into 6 pairs: one pair had oranges and lemons on top of the normal food, and the others were given different alternatives like cider or vinegar instead. Within a week, the two eating fruit had virtually recovered. More to the point, there was no difference in any of the others apart from an improvement in the pair given cider. Eating fruit was clearly the right decision to cure scurvy. All new drugs are now tested in trials like this to find out if they really do make patients better or not. Because you know what happens to those not given the new treatment, you know any improvement wouldn’t have happened anyway.

So how do computer scientists use this sort of trial? The way Marissa Mayer’s team did it is a classic example. One of Google’s designers was suggesting they use a slightly different shade of blue for the links on ads in Google’s mail program. Rather than take his word that it was an improvement, they ran a trial. They created a version of the program that had multiple colours possible for the links, each a different shade of blue. They then split all the users of the program into groups and gave each a different shade of blue for their links, tracking the results. One particular shade led to more clicks on the ads than any other. That was the shade Marissa chose (and it wasn’t the shade the designer had suggested!)
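
A trial like this can be sketched in Python as follows. It is not Google's actual system: the colour codes, the user names and the click counts are all invented, and assigning users to groups by hashing their ID is just one common way to do the split (an assumption, not something the story specifies).

```python
import hashlib

# Hypothetical candidate shades of blue.
SHADES = ["#0000CC", "#0033CC", "#1A0DAB"]

def shade_for(user_id):
    """Put each user in one group, and always the same group for that user."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return SHADES[int(digest, 16) % len(SHADES)]

def click_rates(shown, clicked):
    """Fraction of ad views that led to a click, for each shade."""
    return {shade: clicked[shade] / shown[shade] for shade in shown if shown[shade]}

def best_shade(shown, clicked):
    rates = click_rates(shown, clicked)
    return max(rates, key=rates.get)

# The same user always sees the same shade.
print(shade_for("alice") == shade_for("alice"))

# Made-up tracking results: how often each shade was shown and clicked.
shown = {"#0000CC": 1000, "#0033CC": 1000, "#1A0DAB": 1000}
clicked = {"#0000CC": 30, "#0033CC": 41, "#1A0DAB": 35}
print(best_shade(shown, clicked))
```

With real data you would also want a statistical check that the winning shade's lead is bigger than random variation before acting on it.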

Software trials like this are called A/B Testing. They have become the mainstay of hi-tech companies wanting an edge. It actually leads to a new way of developing software. Rather than get a perfect product at the outset you get something basic that works well enough quickly. Then you set to work running trials on lots of small details, making what are called ‘marginal gains’, as soon as possible. One small detail may not make a big difference, but when you pile them up, each known to be a definite improvement because of the trial, then very quickly your software improves. Trials can give better results than intelligent design!

Does it make a difference? Well the one decision about that shade of blue of Marissa’s team supposedly made Google $200 million a year, as a result of more people clicking on ads. Google now run tens of thousands of trials like this each year. Add the benefits of lots and lots of small improvements and you get one of the most powerful companies on the planet.

Little Gains in Life

The idea of developing software through marginal gains is actually based on the process used by nature: evolution by natural selection. Each species of animal seems perfectly designed for its environment, not because they were designed, but because only the fittest individuals survive to have babies. Any small improvement in a baby that gives it a better ability to survive means the genes responsible for that improvement are passed on. Over many generations the marginal gains add up to give creatures perfectly adapted to their environment.

Paul Curzon, Queen Mary University of London



Joyce Wheeler: The Life of a Star

Exploding star
Star image by Dieter from Pixabay

The first computers transformed the way research is done. One of the very first computers, EDSAC (Electronic Delay Storage Automatic Calculator), contributed to the work of three Nobel prize winners: in Physics, Chemistry and Medicine. Astronomer Joyce Wheeler was an early researcher to make use of the potential of computers to aid the study of other subjects in this way. She was a Cambridge PhD student in 1954 investigating the nuclear reactions that keep stars burning. This involved doing lots of calculations to work out the changing behaviour and composition of the star.

Joyce had seen EDSAC on a visit to the university before starting her PhD, and learnt to program it from its basic programming manual so that she could get it to do the calculations she needed. She would program by day and let EDSAC number crunch using her programs every Friday night, leaving her to work on the results in the morning, and then start the programming for the following week’s run. EDSAC not only allowed her to do calculations accurately that would otherwise have been impossible, it also meant she could run calculations over and over, tweaking what was done, refining the accuracy of the results, and checking the equations quickly with sample numbers. As a result EDSAC helped her to estimate the age of stars.

– Paul Curzon, Queen Mary University of London


Magazines …

Front cover of CS4FN issue 29 - Diversity in Computing


The Devil is in the Detail: Lessons from Animal Welfare? (Temple Grandin)

What can Computer Scientists learn from a remarkable woman and the improvements she made to animal welfare and the meat processing industry?

Temple Grandin is an animal scientist – an animal welfare specialist and a remarkable innovator on top. She has extraordinary abilities that allow her to understand animals in ways others can’t. As a result her work has reduced the suffering of countless farm animals. She has designed equipment, for example, to restrain animals. It makes it easier to give them shots because, in contrast to the equipment it replaces, it does not distress the animals as they enter. By being able to see the detail that an animal perceives she is able to design equipment that overcomes the problems. Paradoxically perhaps for someone who cares so much about animals, she works with slaughterhouses – meat processing factories like those of McDonald’s.

Her aim, given people do eat meat, is to ensure the animals are treated humanely throughout the process of rearing an animal until its death. Her work has been close to miraculous in the changes she has brought about to ensure that farm animals do not suffer. She is good for business too. If cattle are spooked by something as they enter the processing factory (also known as a ‘plant’), whether by the glint of metal or a deep shadow, the plant’s efficiency drops. Fewer animals are processed per hour and that is a big problem for managers.

As a result of her work she has turned round plants, both in welfare terms and in terms of rescuing plants that might otherwise have been shut down. Suddenly plants she audits are treating their livestock humanely.

See the Bigger Picture

Where do Temple’s extraordinary abilities come from? In fact she was originally labelled as being mentally disabled. She is actually autistic. As a result her brain doesn’t quite work the way most people’s do. Autistic people, as a result of these brain differences, often have difficulties socialising with others. They can find it very hard to understand the nuances of human-human communication that the rest of us take for granted. This is in part because autistic people perceive the world differently. A non-autistic person misses vast amounts of the detail in front of their eyes. Instead just a bigger picture of what they are seeing is passed to their conscious selves. An autistic person doesn’t have that sub-conscious ability to filter out detail, but instead perceives every small thing all at once. That is why autistic people can sometimes be overcome by their surroundings, finding the world too much to cope with. They think in terms of a series of pictures full of detail, not abstractly in words.

Temple Grandin argues that that is what makes her special when it comes to understanding farm animals. In some ways they see the world very much like she does. Just as a cow does, she notices the shadows and the glint of metal, the bright patch on the floor from the overhead lights or the jacket laid over the fence that is spooking it. The plant managers and animal handlers don’t even register them, never mind see them as a problem.

Who ya gonna call?

Because of this ability to quickly spot the problems everyone else has missed, Temple gained a reputation for being the person to call when a problem seemed intractable. She has also turned it into a career as an animal welfare auditor, checking processing plants to ensure their standards are sufficiently high. This is where she has helped force through the biggest improvements, and it all boils down to checklists.


Tick that box

Checking that lists of guidelines are being adhered to is a common way to audit quality in many areas of life. Checklists are used in a computer science context as checks for usability (for example that a new version of some application is easy to use) and accessibility (could a blind person, or for that matter someone who was autistic, successfully use a website say). Checklists tend to be very long. After all it must be the case that the more you are checking, the higher the quality of the result, mustn’t it? Surprisingly that turns out not always to be true! That is why Temple Grandin has been so successful. Rather than have a checklist with hundreds of things to check she boiled her own set of questions to ask down to just 10.

Traditional animal welfare audits have checklist questions such as “Is the flooring slippery?” and “Is the electric prod used as little as possible?”. Even apart from the sheer number of questions to work through, this kind of checklist can be very hard to follow, not least due to the vagueness.

Ouch!

Temple’s checklist includes questions like: “Do all animals remain unconscious after being stunned?” and “Do no more than 3% of animals vocalise during handling or stunning?” (a “Moo” in this situation means “Ouch”). They are precise, with little room for dispute – it isn’t left to the inspector’s judgement. That also means everyone knows the target they are working towards. The fact that there are only 10 also means it is easy for everyone involved to know them all well. Perhaps most importantly they do not focus on the state of the factory, or the way things are done. Instead, they focus on the end results – that animals are humanely treated. The point is that one item covers a multitude of sins that could be causing it. If too many animals are crying out in pain then you have to fix ALL the causes, even if it is something new that no-one thought of putting on a checklist before.

Temple’s 10 point approach to checklists can apply to more than just animal welfare of course. The principles behind it could just as well apply to other areas like usability and accessibility of websites.

Some usability evaluation techniques do follow similar principles. Cognitive Walkthrough, a method of auditing that systems are easy to use on first encounter, has some of the features of this kind of approach. The original version involved a longish set of questions that an expert was to ask him/herself about a system under evaluation. After early trials the developers of the method, Cathleen Wharton, John Rieman, Clayton Lewis and Peter Polson, quickly realised this wasn’t very practical and replaced it with a 4-question version. It has since then even been replaced by a 3-question walkthrough. One of the questions, to be asked of each step in achieving a task, is: “Will a user know what to try and do at this point?” This has some of the flavour of the Grandin approach – it is about the end result not about some specific thing going wrong.

Let’s look at accessibility. Currently, where web designers think about it at all (UK law requires them to) the long checklist approach tends to be followed. Typical items to check are things like “Ensure that all information conveyed with colour is also available without colour”. Automatic systems are often used to do audits. That is good in one sense as the criteria have then to be very precise for a mere computer to make the decision. On the other hand it encourages items in the checklist to just be things a computer can check. It also encourages the long list of fine detail approach that Temple rejected. Worse, it also can lead to people conforming to the checklist without deeply understanding what the point actually is. A classic example is a web designer adding as the last item on a web page “If you are partially sighted click here”. As far as an automatic checker is concerned they may have done everything right – even providing alternative facilities that are clearly available (if you can see them). A partially sighted person however would only get to that instruction on the screen after they have struggled through the rest of the page. The designer got the right idea but missed the point.

Temple Grandin’s approach would suggest instead having checklists that ask about the outcomes of using the page: “Do 97% of partially-sighted people successfully complete their objective in using the site?” for example. That is why “user testing” is so important, at least as one of the evaluation approaches you follow. User testing involves people from a wide variety of backgrounds actually trying to use your prototype software or web pages before they are released. It allows you to focus on the big picture. Of course if you are trying to ensure a web page is accessible your users must include people with different kinds of disabilities.
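
An outcome question like that is precise enough to check mechanically from user-testing results. The Python sketch below shows one way it might be done: the function name and the session data are hypothetical, and the 97% target is simply the figure from the example question above.

```python
def outcome_check(results, required_percent):
    """True if at least `required_percent` of test sessions succeeded.

    `results` holds one True/False per participant: did they complete
    their objective? Whole-number arithmetic avoids rounding surprises.
    """
    if not results:
        raise ValueError("no user-testing sessions recorded")
    return 100 * sum(results) >= required_percent * len(results)

# Hypothetical log: 29 of 30 partially-sighted participants succeeded.
sessions = [True] * 29 + [False]
print(outcome_check(sessions, 97))
```

Here 29 out of 30 is just under 97%, so the check fails. Notice that it says nothing about why that one person got stuck: as with the animal welfare questions, a failed outcome means you have to go and find all the causes, including ones nobody thought to put on a list.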


The Big Picture

One of Temple Grandin’s main messages is that the big advantage that arises as a result of her autism is that she thinks in concrete pictures not in abstract words. Whilst thinking verbally is good in some situations it seems to make us treat small things as though they were just as important as the big issues.

So whatever you are doing, whether looking after animals or designing accessible websites, don’t get lost in the detail. Focus on the point of it all.

Paul Curzon, Queen Mary University of London

