Scéalextric Stories

If you watch a lot of movies you’ve probably noticed some recurring patterns in the way that popular cinematic stories are structured. Every hero or heroine needs a goal and a villain to thwart that goal. Every goal requires travel along a path that is probably blocked with frustrating obstacles. Heroes may not see themselves as heroes, and will occasionally take the wrong fork in the path, only to return to the one true way before story’s end. We often speak of this path as if it were a race track: a fast-paced story speeds towards its inevitable conclusion, following surprising “twists” and “turns” along the way. The track often turns out to be a circular one, with the heroine finally returning to the beginning, but with a renewed sense of appreciation and understanding. Perhaps we can use this race track idea as a basis for creating stories.

Building a track

If you’ve ever played with a Scalextric set, you will know that the curviest tracks make for the most dramatic stories, by providing more points at which our racing cars can fly off at a tight bend. In Scalextric you build your own race circuits by clicking together segments of prefabricated track, so the more diverse the set of track parts, the more dramatic your circuit can be. We can think of story generation as a similar kind of process. Imagine if you had a large stock of prefabricated plot segments, each made up of three successive bits of story action. A generator could clip these segments together to create a larger story, by connecting the pieces end-to-end. To keep the plot consistent we would only link up segments if they have overlapping actions. So if D-E-F is a segment comprising the actions D, E and F, we could create the story B-C-D-E-F-G-H by linking the segment B-C-D onto the left of D-E-F and F-G-H onto its right.
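To make the clipping idea concrete, here is a minimal Python sketch. The segments and single-letter action names are invented for illustration (the real Scéalextric segments use rich story actions, not letters): the story grows by finding a segment whose first action matches the story’s last action and clipping it on.

```python
# A sketch of "clicking the track together". Segments and action names
# are invented for illustration, not taken from the Scéalextric files.
segments = [
    ("B", "C", "D"),
    ("D", "E", "F"),
    ("F", "G", "H"),
]

def extend_right(story, segments):
    """Find a segment whose first action matches the story's last action and clip it on."""
    for seg in segments:
        if seg[0] == story[-1]:
            return story + seg[1:]   # drop the shared, overlapping action
    return story                     # no matching segment: story stays as it is

story = segments[0]                    # start with B-C-D
story = extend_right(story, segments)  # B-C-D-E-F
story = extend_right(story, segments)  # B-C-D-E-F-G-H
print("-".join(story))
```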

Use a kit

At University College Dublin (UCD) we have created a set of rich public resources that make it easy for you to build your own automated story generator. We call the bundle of resources Scéalextric, from scéal (the Irish word for story) and Scalextric. You can download the Scéalextric resources from our GitHub, but an even better place to start is our blog for people who want to build creative systems of any kind, called Best Of Bot Worlds.

In Artificial Intelligence we often represent complex knowledge structures as ‘graphs’. These graphs consist of lots of labeled lines (called edges) that show how labeled points (called nodes) are connected. That is what our story pieces essentially are. We have several agreed ways for storing these node-relation-node triples, with acronyms hiding long names, like XML (eXtensible Markup Language), RDF (Resource Description Framework) and OWL (Web Ontology Language), but the simplest and most convenient way to create and maintain a large set of story triples is actually just to use a spreadsheet! Yes, the boring spreadsheet is a great way to store and share knowledge, because every cell lies at the intersection of a row and a column: the row, the column and the cell’s contents give us the three parts of a triple.

Scéalextric is a collection of easy-to-browse spreadsheets that tell a machine how actions connect to form action sequences (like D-E-F above), how actions causally connect to each other (via and, then, but), how actions can be “rendered” in natural idiomatic English, and so on.
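As a rough sketch of how a program might read such spreadsheet triples, here is one way to do it in Python, assuming the triples have been exported to a simple CSV file. The file name, column layout and example rows are invented for illustration; the real Scéalextric spreadsheets are larger and organised differently.

```python
import csv

# A sketch only: file name, columns and contents are invented.
# Suppose plot_triples.csv holds rows of the form
#   action, connective, next_action
# for example:
#   are_friends_with, then, fall_out_with
#   fall_out_with,    but,  apologize_to
with open("plot_triples.csv", newline="") as f:
    triples = [tuple(cell.strip() for cell in row) for row in csv.reader(f)]

# Render each triple as a crude line of story about characters A and B
for action, connective, next_action in triples:
    print(f"A {action} B, {connective} A {next_action} B")
```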

Adding Character

Automated storytelling is one of the toughest challenges for a researcher or hobbyist starting out in artificial intelligence, because stories require lots of knowledge about causality and characterization. Why would character A do that to character B, and what is character B likely to do next? It helps if the audience can identify with the characters in some way, so that they can use their pre-existing knowledge to understand why the characters do what they do. Imagine writing a story involving Donald Trump and Lex Luthor as characters: how would these characters interact, and what parts of their personalities would they reveal to us through their actions?

Scéalextric therefore contains a large knowledge-base of 800 famous people. These are the cars that will run on our tracks. The entry for each one has triples describing a character’s gender, fictive status, politics, marital status, activities, weapons, teams, domains, genres, taxonomic categories, good points and bad points, and a lot more besides. A key challenge in good storytelling, whether you are a machine or a human, is integrating character and plot so that one informs the other.

A Twitterbot plot

Let’s look at a story created and tweeted by our Twitterbot @BestOfBotWorlds over a series of 12 tweets. Can you see where the joins are in our Scéalextric track? Can you recognize where character-specific knowledge has been inserted into the rendering of different actions, making the story seem funny and appropriate at the same time? More importantly, can you see how you might connect the track segments differently, choose characters more carefully, or use knowledge about them more appropriately, to make better stories and to build a better story-generator? That’s what Scéalextric is for: to allow you to build your own storytelling system and to explore the path less trodden in the world of computational creativity. It all starts with a click.

An unlikely tale generated by the Twitter storybot.

Tony Veale, University College Dublin


Further reading

Christopher Strachey came up with the first example of a computer program that could create lines of text (from lists of words). The CS4FN team developed a game called ‘Program A Postcard’ (see below) for use at festival events.



Sea sounds sink ships

You might think that under the sea things are nice and quiet, but something fishy is going on down there. Our oceans are filled with natural noise. This is called ambient noise and comes from lots of different sources: the sound of wind blowing waves on the surface, rain, distant ships and even underwater volcanoes. For marine life that relies on sonar or other acoustic ways to communicate and navigate, all the extra ocean noise pollution caused by human activities, such as undersea mining and powerful ships’ sonars, is an increasing problem. But it’s not only marine life that is affected by the levels of sea sounds: submarines also need to know something about all that ambient noise.

In the early 1900s the aptly named Submarine Signal Company made their living by installing undersea bells near lighthouses. The sound of these bells was a warning to mariners about impending navigation hazards: an auditory version of the lighthouse light.

The Second World War led to scientists taking undersea ambient noise more seriously as they developed deadly acoustic mines: explosive mines triggered by the sound of a passing ship. To make the acoustic trigger work reliably the scientists needed to measure ambient sound, or the mines would explode while simply floating in the water. Measurements of sound frequencies were taken in harbours and coastal waters, and from these a mathematical formula was computed that gave them the ‘Knudsen curves’. Named after the scientist who led the research, these curves showed how undersea sound levels at different frequencies vary with surface wind speed and wave height. They allowed the acoustic triggers to be set to make the mines most effective.

– Peter McOwan, Queen Mary University of London







A sound social venture: recognising birds

Dan Stowell was a researcher at Queen Mary University of London when he founded an early example of what is now known as a social venture: a company created to do social good. With Florence Wilkinson, he turned birdsong into a tech-based social good.

A Eurasian Wren singing on the end of a branch. Image by Siegfried Poepperl from Pixabay

His research is about designing methods that computers can use to make sense of bird sounds. One day he met Florence Wilkinson, who works with businesses and young people, and they discovered they both had the same idea: “What if we could make an app that recognises bird sounds?” They decided to create a startup company, Warblr, to make it happen. However, unlike many research-driven startups, its main aim was not to make money but to do social good. Dan and Florence built this into their company mission statement:

…to reconnect people with the natural world through technology. We want to get as many people outdoors as possible, learning about the wildlife on their doorstep and how to protect it.

Dan brought the technical computer science skills needed to create the app, and Florence brought the marketing and communication skills needed to ensure people would hear about it. Together, they persuaded Queen Mary University of London’s innovation unit to give them a start-up grant. As a result their app Warblr exists and even gained some press coverage.

It can help people connect with nature by helping them recognise birds – after all, one of the problems with bird watching is that birds are so damned hard to spot, and lots that flit by just look like little brown things! However, they are far easier to hear. Once you know what is out there, it adds an incentive to try to actually spot it. The app has another purpose too. It collects data about the birds spotted, recording the species and where and when they were heard, with that data then made freely available to researchers.
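How might an app like this work under the bonnet? One common recipe for sound recognition (a sketch of the general idea only, not necessarily what Warblr does) is to turn each recording into a mel-spectrogram – a picture of which frequencies are present over time – and then train a classifier on labelled examples. The file names and species labels below are hypothetical.

```python
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def features(path):
    """Summarise a recording as its average mel-spectrogram across time."""
    audio, rate = librosa.load(path, sr=22050)
    mel = librosa.feature.melspectrogram(y=audio, sr=rate)
    return librosa.power_to_db(mel).mean(axis=1)   # one value per frequency band

# Hypothetical labelled clips of two species
clips  = ["wren_01.wav", "wren_02.wav", "robin_01.wav", "robin_02.wav"]
labels = ["wren", "wren", "robin", "robin"]

X = np.array([features(c) for c in clips])
model = RandomForestClassifier().fit(X, labels)

# Classify a new, unknown recording
print(model.predict([features("mystery_bird.wav")]))
```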

Social ventures are a relatively new idea that universities are now supporting, to help their researchers do social good that is sustainable and not just something that lasts until the grants run out. As Dan and Florence showed, though, as a researcher you do not need to do everything yourself. To be a successful innovator you need more than technical skills: you need the ability to be part of a great team, and to recognise a sound deal!

Updated from the archive, written by Paul Curzon, Queen Mary University of London.





Oh no! Not again…

What a mess. There’s flour all over the kitchen floor. A fortnight ago I opened the cupboard to get sugar for my hot chocolate. As I pulled out the sugar, it knocked against the bag of flour which was too close to the edge… Luckily the bag didn’t burst and I cleared it up quickly before anyone found out. Now it’s two weeks later and exactly the same thing just happened to my brother. This time the bag did burst and it went everywhere. Now he’s in big trouble for being so clumsy!

Flour cascading everywhere. Cropped image by Anastasiya Komarova from Pixabay

In safety-critical industries, like healthcare and aviation, it is really important that there is a culture of reporting incidents, including near misses. It also turns out to be important that the mechanisms for reporting issues are appropriately designed, and that there is a no-blame culture, so that people are encouraged to report incidents and to do so accurately and without ambiguity.

Was the flour incident my brother’s fault? Should he have been more careful? He didn’t choose to put the sugar in a high cupboard with the flour. Maybe it was my fault? I didn’t choose to put the sugar there either. But I didn’t tell anyone about the first time it happened. I didn’t move the sugar to a lower cupboard so it was easier to reach either. So maybe it was my fault after all? I knew it was a problem, and I didn’t do anything about it. Perhaps thinking about blame is the wrong thing to do!

Now think about your local hospital.

James is a nurse, working in intensive care. Penny is really ill and is being given insulin by a machine that pumps it directly into her vein. The insulin is causing a side effect though – a drop in blood potassium level – and that is life threatening. They don’t have time to set up a second pump, so the doctor decides to stop the insulin for a while and to give a dose of potassium through a second tube controlled by the same pump. James sets up the bag of potassium and carefully programs the pump to deliver it, then turns his attention to his next task. A few minutes later, he glances at the pump again and realises that he forgot to release the clamp on the tube from the bag of potassium. Penny is still receiving insulin, not the potassium she urgently needs. He quickly releases the clamp, and the potassium starts to flow. An hour later, Penny’s blood potassium levels are pretty much back to normal: she’s still ill, but out of danger. Phew! Good job he noticed in time and no-one else knows about the mistake!

Two weeks later, James’ colleague, Julia, is on duty. She makes a similar mistake treating a different patient, Peter. Except that she doesn’t notice her mistake until the bag of insulin has emptied. Because it took so long to spot, Peter needs emergency treatment. It’s touch-and-go for a while, but luckily he recovers.

Julia reports the incident through the hospital’s incident reporting system, so at least it can be prevented from happening again. She is wracked with guilt for making the mistake, but also hopes fervently that she won’t be blamed and so punished for what happened.

Don’t miss the near misses

Why did it happen? There are a whole bunch of problems that are nothing to do with Julia or James. Why wasn’t it standard practice to always have a second pump set up for critically ill patients in case such emergency treatment is needed? Why can’t the pump detect which bag the fluid is being pumped from? Why isn’t it really obvious whether the clamp is open or closed? Why can’t the pump detect that the tube is still clamped? If the first incident – a ‘near miss’ – had been reported, perhaps some of these problems might have been spotted and fixed. How many other times has it happened but not been reported?

What can we learn from this? One thing is that there are lots of ways of setting up and using systems, and some may well make them safer. Another is that reporting “near misses” is really important. They are a valuable source of learning that can alert other people to mistakes they might make and lead to a search for ways of making the system safer, perhaps by redesigning the equipment or changing the way it is used, for example – but only if people tell others about the incidents. Reporting near-misses can help prevent the same thing happening again.

The above was just a story, but it’s based on an account of a real incident… one that has been reported so it might just save lives in the future.

Report it!

The mechanisms used to report incidents, as well as the culture around reporting them, can make a big difference to whether incidents are reported at all. And even when incidents are reported, the reporting systems and culture can help or hinder the learning that results.

Chrystie Myketiak at Queen Mary analysed actual incident reports for the kind of language used by those writing them. She found that the people doing the reporting used different strategies in the way they wrote the reports, depending on the kind of incident. In situations where there was no obvious implication that a person had made a mistake (such as where sterilisation equipment had not worked properly) they used one kind of language. Where those involved were likely to be seen as responsible, and so blamed (for example, when a wrong number had been entered into a medical device), they used a different kind of language.

In the latter reports, where “user errors” might have been involved, those doing the reporting were more likely to write in a way that hid the identity of any person involved, for example saying “The pump was programmed” or writing about ‘he’ or ‘she’ rather than naming the person. They were also more likely to write in a way that added ambiguity. For example, in the user-error reports it was less clear whether the person making the report was the one involved, or whether someone else was writing it, such as a witness or someone not involved at all.

Writing in these kinds of ways, and the fact that the language differed from reports where no one was likely to be blamed, suggests that those completing the reports were conscious of how their words might be interpreted by those who read them. The fact that people might be blamed hung over the reporting.

The result of adding what Chrystie called “precise ambiguity” might be that important information was inadvertently concealed, making it harder to understand why the incident happened and so to work out how best to avoid it. As a result, patient safety might not be improved even though the incident was reported. This shows one of the reasons why a strong culture of no-fault reporting is needed if a system is to be made as safe as possible. In the airline industry, which is incredibly safe, there is a clear system of no-fault reporting, with pilots, for example, being praised for reporting near misses rather than being punished for the mistakes that led to them.

This work was part of the EPSRC-funded CHI+MED research project, led by Ann Blandford at UCL, looking at design for patient safety. In separate work on the project, Alexis Lewis, at Swansea University, explored how best to design the actual incident reporting forms as part of her PhD. A variety of forms are used in hospitals across the UK and she examined more than 20 different ones. Many had features that would make it harder than necessary for nurses and doctors to report incidents accurately, even if they wanted to report openly so that hospital staff could learn as much as possible from the incidents that did happen. Some forms failed to ask about important facts and many didn’t encourage feedback. It wasn’t clear how much detail, or even what, should be reported. She used the results to design a new reporting form that avoided these problems and that could be built into a system that encourages the reporting of incidents. Ultimately her work led to changes to the reporting form and process used within at least one health board she was working with.

People make mistakes, but safety does not come from blaming those who make them. That just discourages a learning culture. To really improve safety you need to praise those who report near misses, as well as ensuring that the forms and mechanisms they must use to do so help them provide the information needed.

Updated from the archive, written by the CHI+MED team.





I wandered lonely as a mass of dejected vapour – try some AI poetry

by Jane Waite, Queen Mary University of London

A single fluffy white cloud brightly lit by the sun, against a deep blue sky. Image by Enrique from Pixabay

Ever used an online poem generator, perhaps to get started with an English assignment? They normally have a template and some word lists you can fill in, with a simple algorithm that randomly selects from the word lists to fill out the template. “I wandered lonely as a cloud” might become “I zoomed destitute as a rainbow” or “I danced homeless as a tree”. It would all depend on those word lists. Artificial Intelligence and machine learning researchers are aiming to be more creative.
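That kind of fill-in-the-blanks generator takes only a few lines of code. Here is a minimal Python sketch; the template and word lists are invented for illustration.

```python
import random

# A tiny template-and-word-list poem generator
template = "I {verb} {adjective} as a {noun}"
words = {
    "verb":      ["wandered", "zoomed", "danced", "drifted"],
    "adjective": ["lonely", "destitute", "homeless", "weightless"],
    "noun":      ["cloud", "rainbow", "tree", "mass of dejected vapour"],
}

# Pick a random word for each slot and fill in the template
line = template.format(**{slot: random.choice(options) for slot, options in words.items()})
print(line)
```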

Researchers at Stanford University, the University of Massachusetts and Google have created works that look like poems, by accident. They were using a machine learning Artificial Intelligence that they had previously ‘trained’ on romantic novels to research the creation of captions for images and how to translate text into different languages. They fed it a start and an end sentence, and let the AI fill in the gap. The results made sense, though they were ‘rather dramatic’. For example:

he was silent for a long moment
he was silent for a moment
it was quiet for a moment
it was dark and cold
there was a pause
it was my turn

Is this a real poem? What makes a poem a poem is in itself an area of research, with some saying that to create a poem, you need a poet and the poet should do certain things in their ‘creative act’. Researchers from Imperial College London and University College Dublin used this idea to evaluate their own poetry system. They checked to see if the poems they generated met the requirements of a special model for comparing creative systems. This involved things like checking whether the work formed a concept, and including measures such as flamboyance and lyricism.

Read some poems written by humans and compare them to poems created by online poetry generators. What counts as creativity? Maybe that’s up to you!


See also this article from our LGBT portal 🏳️‍🌈 about Christopher Strachey, who came up with the first example of a computer program that could create lines of text (from lists of words).


Jane’s article was first published on the original CS4FN website and there’s a copy on page 17 of issue 22 (“Creative Computing”) of the CS4FN magazine which you can download FREE by clicking on the link or the image of the front cover below.



Claude Shannon: Inventing for the fun of it

Image by Paul Curzon

Claude Shannon: inventor of the rocket-powered Frisbee, the gasoline-powered pogo stick and a calculator that worked using Roman numerals, and discoverer of the fundamental equation of juggling! Oh yeah, and founder of the most important theory underpinning all digital communication: information theory.

Claude Shannon is perhaps one of the most important engineers of the 20th century, but he did it for fun. Though his work changed the world, he was always playing with and designing things, simply because it amused him. Like his contemporary Richard Feynman, he did it for ‘the pleasure of finding things out.’

As a boy, Claude liked to build model planes and radio-controlled boats. He once built a telegraph system to a friend’s house half a mile away, though he got in trouble for using the barbed-wire fence around a nearby pasture. He earned pocket money delivering telegrams and repairing radios.

He went to the University of Michigan, and then worked on his Master’s at MIT. While there, he thought that the logic he learned in his maths classes could be applied to the electronic circuits he studied in engineering. This became his Master’s thesis, published in 1938. It was described as ‘one of the most important Master’s theses ever written… [it] helped to change digital circuit design from an art to a science.’

Claude Shannon is known for his serious research, but a lot of his work was whimsical. He invented a calculator called THROBAC (Thrifty Roman numerical BACkward looking computer) that performed all its operations in the Roman numeral system. His home was full of mechanical turtles that would wander around, turning at obstacles; a gasoline-powered pogo stick and rocket-powered Frisbee; a machine that juggled three balls with two mechanical hands; a machine to solve the Rubik’s cube; and the ‘Ultimate Machine’, which was just a box that, when turned on, would make an angry, annoyed sound, reach out a hand and turn itself off. As Claude once explained with a smile, ‘I’ve spent lots of time on totally useless things.’

A lot of early psychology experiments used to involve getting a mouse to run through a maze to reach some food at the end. By performing these experiments over and over in different ways, psychologists could figure out how a mouse learns. So Claude built a mouse-shaped robot called Theseus. Theseus could search a maze until it solved it, and then use this knowledge to find its way through the maze from any starting point.

Oh, and there’s one other paper of his that needs mentioning. No, not the one on the science of juggling, or even the one describing his ‘mind reading’ machine. In 1948 he published ‘A mathematical theory of communication.’ Quite simply, this changed the world, and changed how we think about information. It laid the groundwork for a lot of important theory used in developing modern cryptography, satellite navigation, mobile phone networks… and the internet.

– Paul Curzon, Queen Mary University of London.



This article was first published on the original CS4FN website and there is a copy on page 19 of the 2nd issue of the EE4FN (Electronic Engineering For Fun) magazine, which you can download below – along with all of our back issues.



Scilly cable antics


by Paul Curzon, Queen Mary University of London (from the archive)

Sunset over the Scilly Isles with a sailing boat in the foreground. Image by Mike Palmer from Pixabay

Undersea telecommunications cables let the world communicate and led to the world-spanning Internet. It was all started by the Victorians. Continents were connected, but closer islands were too, including the Scilly Isles.

Autumn 1869. There were great celebrations as the 31-mile-long telecommunications cable was finally hauled up the shore and into the hut. The Scilly Isles now had a direct cable communication link to the mainland. But would it work? Several test messages were sent and it was announced that all was fine. The journalists filed their story. The celebrations could begin.

Except it didn’t actually work! The cable wasn’t connected at all. The ship laying the cable had gone off course. Either that or someone’s maths had been shaky. The cable had actually run out 5 miles off the islands. Not wanting to spoil the party, the captain ordered the line to be cut. Then, unknown to the crowd watching, they just dragged the cut off end of the cable up the beach and pretended to do the tests. The Scilly Isles weren’t actually connected to Cornwall until the following year.


Sarah Angliss: Hugo is no song bird

by Jane Waite, Queen Mary University of London

What was the first technology for recording music: CDs? Records? 78s? The phonograph? No. Trained songbirds came before all of them.

Composer, musician, engineer and visiting fellow at Goldsmiths, University of London, Sarah Angliss usually has a robot on stage performing live with her. These robots are not slick high-tech cyber-beings, but junk-modelled automata. One, named Hugo, sports a spooky ventriloquist doll’s head! Sarah builds and programs her robots herself.

She is also a sound historian, and worked on a Radio 4 documentary, ‘The Bird Fancyer’s Delight’, uncovering how birds have been used to provide music across the ages. During the 1700s people trained songbirds to sing human-invented tunes in their homes. You could buy special manuals showing how to train your pet bird. By playing young birds a tune over and over again, and in the absence of other birds to put them right, they would adopt that song as their own. Playing the recorder was one way to train them, but special instruments were also invented to do the job automatically.

With the invention of the phonograph, home songbird popularity plummeted but it didn’t completely die out. Blackbirds, thrushes, canaries, budgies, bullfinches and other songbirds have continued to be schooled to learn songs that they would never sing in the wild.


This article was first published on our archived CS4FN site, and a copy is also on page 9 of issue 21 of the CS4FN magazine “Computing Sounds Wild”. You can download a free PDF copy from the link below, along with all of our free material.



Much ado about nothing

by Paul Curzon, Queen Mary University of London

A blurred image of a hospital ward. Image by Tyli Jura from Pixabay

The nurse types in a dose of 100.1 mg [milligrams] of a powerful drug and presses start. It duly injects 1001 mg into the patient without telling the nurse that it didn’t do what it was told. You wouldn’t want to be that patient!

Designing a medical device is difficult. It’s not creating the physical machine that causes problems so much as writing the software that controls everything that that machine does. The software is complex and it has to be right. But what do we mean by “right”? The most obvious thing is that when a nurse sets it to do something, that is exactly what it does.

Getting it right is subtler than that though. It must also be easy to use and not mislead the nurse: the human-computer interface has to be right too. It is the software that allows you to interact with a gadget – what buttons you press to get things done and what feedback you are given. There are some basic principles to follow when designing interfaces. One is that the person using it should always be clearly told what it is doing.

Manufacturers need ways to check their devices meet these principles: to know that they got it right.

It’s not just the manufacturers, though. Regulators have the job of checking that machines that might harm people are ‘right’ before they allow them to be sold. That’s really difficult given the software could be millions of lines long. Worse, they only have a short time to give an answer.

Million to one chances are guaranteed to happen.

Problems may only happen once in a million times a device is used. They are virtually impossible to find by having someone try possibilities to see what happens, the traditional way software is checked. Of course, if a million devices are bought, then a million to one chance will happen to someone, somewhere almost immediately!

Paolo Masci, at Queen Mary University of London, has come up with a way to help and, in doing so, found a curious problem. He has been working with the US regulator for medical devices – the FDA – and developed a way to use maths to find problems. It involves creating a mathematical description of what critical parts of the interface program do. Properties, like the user always knowing what is going on, can then be checked using maths. Paolo tried it out on the number-entry code of a real medical device and found some subtle problems. He showed that if you typed in certain numbers, the machine actually treated them as a number ten times bigger. Type in a dose of 100.1 and the machine really did set the dose to be 1001. It ignored the decimal point because, on such a large dose, it assumed small fractions were irrelevant. However, another part of the code allowed you to continue typing digits. Worse still, the device ignored the decimal point silently: it made no attempt to help a nurse notice the change. A busy nurse would need to be extremely vigilant to see that the tiny decimal point was missing, given the lack of warning.
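The real pump software isn’t public, but here is an illustrative reconstruction in Python of how a key-by-key number-entry routine could end up behaving this way. The threshold and structure are invented for illustration; the point is only that silently dropping a decimal point turns 100.1 into 1001.

```python
# An illustrative reconstruction (NOT the real device's code) of a
# key-by-key number-entry routine that silently drops the decimal point.
def press_keys(keys):
    value = 0.0
    decimals = 0          # how many digits we are past the decimal point
    for key in keys:
        if key == ".":
            if value < 100:       # bug: on "large" doses the point is silently ignored
                decimals = 1
        elif decimals:
            value += int(key) / (10 ** decimals)
            decimals += 1
        else:
            value = value * 10 + int(key)
    return value

print(press_keys("10.1"))    # 10.1, as expected
print(press_keys("100.1"))   # 1001.0 -- ten times the intended dose, with no warning
```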

A useful thing about Paolo’s approach is that it gives you the button presses that lead to the problem. With that you can check other devices very quickly. He found that medical devices from three other manufacturers had exactly the same problem. Different teams had all programmed in the same problem. None had thought that if their code ignored a decimal point, it ought to warn the nurse about it rather than create a number ten times bigger. It turns out that different programmers are likely to think the same way and so make the same mistakes (see ‘Double or Nothing‘).

Now the problem is known, nurses can be warned to be extra careful and the manufacturers can update the software. Better still they and the regulators now have an easy way to check their programmers haven’t made the same mistake in future devices. In future, whether vigilant or not, a nurse won’t be able to get it wrong.


Further reading

This article was first published on the CS4FN website (archived copy) and there is a copy on page 8 of issue 17 of the CS4FN magazine which you can download below.



Double or nothing: an extra copy of your software, just in case

by Paul Curzon, Queen Mary University of London

Ariane 5 on the launch pad. Photo credit: NASA/Chris Gunn, public domain, via Wikimedia Commons.

If you spent billions of dollars on a gadget you’d probably like it to last more than a minute before it blows up. That’s what happened to a European Space Agency rocket. How do you make sure the worst doesn’t happen to you? How do you make machines reliable?

A powerful way to improve reliability is to use redundancy: double things up. A plane with four engines can keep flying if one fails. Worried about a flat tyre? You carry a spare in the boot. These situations are about making physical parts reliable. Most machines are a combination of hardware and software though. What about software redundancy?

You can have spare copies of software too. Rather than a single version of a program you can have several copies running on different machines. If one program goes wrong another can take over. It would be nice if it was that simple, but software is different to hardware. Two identical programs will fail in the same way at the same time: they are both following the same instructions so if one goes wrong the other will too. That was vividly shown by the maiden flight of the Ariane 5 rocket. Less than 40 seconds from launch things went wrong. The problem was to do with a big number that needed 64 bits of storage space to hold it. The program’s instructions moved it to a storage place with only 16 bits. With not enough space, the number was mangled to fit. That led to calculations by its guidance system going wrong. The rocket veered off course and exploded. The program was duplicated, but both versions were the same so both agreed on the same wrong answers. Seven billion dollars went up in smoke.
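The rocket’s software was written in Ada, not Python, but the kind of mangling described above can be sketched in a few lines. The velocity reading below is invented; the point is simply that 16 bits cannot hold a big number, and squeezing one in produces a nonsense value.

```python
# A sketch (in Python, not the rocket's Ada) of squeezing a big number
# into 16 bits. A 16-bit signed integer can only hold -32768 to 32767.
def store_in_16_bits(value):
    low_bits = int(value) & 0xFFFF                  # keep only the lowest 16 bits
    return low_bits - 0x10000 if low_bits >= 0x8000 else low_bits  # reinterpret as signed

velocity = 100_000.0                 # an invented reading, far too big for 16 bits
print(store_in_16_bits(velocity))    # -31072: mangled, nothing like the real value
```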

Can you get round this? One solution is to get different teams to write programs to do the same thing. The separate teams may make mistakes but surely they won’t all get the same thing wrong! Run them on different machines and let them vote on what to do. Then as long as more than half agree on the right answer the system as a whole will do the right thing. That’s the theory anyway. Unfortunately in practice it doesn’t always work. Nancy Leveson, an expert in software safety from MIT, ran an experiment where different programmers were given programs to write. She found they wrote code that gave the same wrong answers. Even if it had used independently written redundant code it’s still possible Ariane 5 would have exploded.
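Here is a sketch of the voting idea in Python. The calculation (a temperature conversion) and the bug in one version are invented for illustration; as Leveson’s experiment warns, if two versions shared the same mistake the vote would confidently pick the wrong answer.

```python
from collections import Counter

# Three "independently written" versions of the same calculation.
def version_a(celsius):
    return celsius * 9 / 5 + 32

def version_b(celsius):
    return (celsius * 9) / 5 + 32

def version_c(celsius):
    return celsius * 9 / 5 + 3.2    # a deliberate typo: this version is wrong

def voted(celsius):
    answers = [version_a(celsius), version_b(celsius), version_c(celsius)]
    answer, votes = Counter(answers).most_common(1)[0]
    if votes >= 2:                  # a majority agrees: trust that answer
        return answer
    raise RuntimeError("versions disagree: fail safe rather than guess")

print(voted(100))   # 212.0 -- version_c's mistake is outvoted
```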

Redundancy is a big help but it can’t guarantee software works correctly. When designing systems to be highly reliable you have to assume things will still go wrong. You must still have ways to check for problems and to deal with them so that a mistake (whether by human or machine) won’t turn into a disaster.




This page is funded by EPSRC on research agreement EP/W033615/1.
