Hiding in Skype: cryptography and steganography


by Paul Curzon, Queen Mary University of London

Computer Science isn’t just about using language, sometimes it’s about losing it. Sometimes people want to send messages so no one even knows they exist and a great place to lose language is inside a conversation.

Cryptography is the science of making messages unreadable. Spymasters have used it for a thousand years or more. Now it’s a part of everyday life. It’s used by the banks every time you use a cash point and by online shops when you buy something over the Internet. It’s used by businesses that don’t want their industrial secrets revealed and by celebrities who want to be sure that tabloid hackers can’t read their texts.

Cryptography stops messages being read, but sometimes just knowing that people are having a conversation can reveal more than they want even if you don’t know what was said. Knowing a football star is exchanging hundreds of texts with his teammate’s girlfriend suggests something is going on, for example. Similarly, CIA chief David Petraeus, whose downfall made international news, might have kept his secret and his job if the emails from his lover had been hidden. David Bowie kept his 2013 comeback single ‘Where are we now?’ a surprise until the moment it was released. It might not have made him the front page news it did if a music journalist had just tracked who had been talking to whom amongst the musicians involved in the months before.

That’s where steganography comes in – the science of hiding messages so no one even knows they exist. Invisible ink is one form of steganography used, for example, by the French resistance in World War II. More bizarre forms have been used over the years though – an Ancient Greek slave had a message tattooed on his shaven head warning of Persian invasion plans. Once his hair had grown back he delivered it with no one on the way the wiser.

Digital communication opens up new ways to hide messages. Computers store information using a code of 0s and 1s: bits. Steganography is then about finding places to hide those bits. A team of Polish researchers led by Wojciech Mazurczyk have now found a way to hide them in a Skype conversation.

When you use Skype to make a phone call, the program converts the sounds you make to a long series of bits. They are sent over the Internet and converted back to sound at the other end. At the same time more sounds as bits stream back from the person you are talking to. Data transmitted over the Internet isn’t sent all in one go, though. It’s broken into packets: a bit like taking your conversation and tweeting it one line at a time.

Why? Imagine you run a crack team of commandos who have to reach a target in enemy territory to blow it up – a stately home where all the enemy’s Generals are having a party perhaps. If all the commandos travel together in one army truck and something goes wrong along the way probably no one will make it – a disaster. If on the other hand they each travel separately, rendezvousing once there, the mission is much more likely to be successful. If a few are killed on the way it doesn’t matter as the rest can still complete the mission.

The same applies to a Skype call. Each packet contains a little bit of the full conversation and each makes its own way to the destination across the Internet. On arriving there, they reform into the full message. To allow this to happen, each packet includes some extra data that says, for example, what conversation it is part of, how big it is and also where it fits in the sequence. If some don’t make it then the rest of the conversation can still be put back together without them. As long as too much isn’t missing, no one will notice.
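The packet idea can be sketched in a few lines of Python. This is only a toy model, not how Skype really formats its packets: the field names (`call`, `seq`, `length`, `data`) and the helper functions are made up for illustration. Each packet carries the extra data described above, and the receiver puts the pieces back in order, leaving a gap where a packet went missing.

```python
import random

def to_packets(message, size, call_id="call-1"):
    """Split a message into packets tagged with call id, sequence number and length."""
    packets = []
    for seq, start in enumerate(range(0, len(message), size)):
        chunk = message[start:start + size]
        packets.append({"call": call_id, "seq": seq, "length": len(chunk), "data": chunk})
    return packets

def reassemble(packets, total, gap="_"):
    """Rebuild the message in sequence order, marking each missing packet with a gap."""
    by_seq = {p["seq"]: p["data"] for p in packets}
    return "".join(by_seq.get(seq, gap) for seq in range(total))

packets = to_packets("the sound of silence", size=4)
arrived = [p for p in packets if p["seq"] != 2]  # packet 2 is lost on the way
random.shuffle(arrived)                          # packets may arrive in any order
print(reassemble(arrived, total=len(packets)))   # the soun_ silence
```

Even with one packet lost and the rest arriving out of order, most of the message survives.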

Skype does something special with its packets. The size of the packets changes depending on how much data needs to be transmitted. If the person is talking each packet carries a lot of information. If the person is listening then what is being transmitted is mainly silence. Skype then sends shorter packets. The Polish team realised they could exploit this for steganography. Their program, SkyDe, intercepts Skype packets looking for short ones. Any found are replaced with packets holding the data from the covert message. At the destination another copy of SkyDe intercepts them and extracts the hidden message and passes it on to the intended recipient. As far as Skype is concerned some packets just never arrive.
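Here is a rough Python sketch of that swap. It is not SkyDe's actual code: the `SHORT` threshold, the packet layout and the `covert` flag are all assumptions made for illustration. In particular, a real system would need a marker that only the intended receiver can recognise, whereas the flag here is a visible stand-in.

```python
SHORT = 8  # hypothetical threshold: payloads this small are treated as silence

def hide(packets, secret):
    """Sender side: replace short (silence) packets with chunks of the secret."""
    chunks = [secret[i:i + SHORT] for i in range(0, len(secret), SHORT)]
    out = []
    for p in packets:
        if len(p["data"]) <= SHORT and chunks:
            out.append({"seq": p["seq"], "data": chunks.pop(0), "covert": True})
        else:
            out.append(dict(p, covert=False))
    return out, chunks  # any leftover chunks did not fit into this call

def extract(packets):
    """Receiver side: pull out the covert chunks and pass the rest on to the call."""
    secret = "".join(p["data"] for p in packets if p["covert"])
    voice = [p for p in packets if not p["covert"]]
    return secret, voice

call = [
    {"seq": 0, "data": "v" * 40},  # someone talking
    {"seq": 1, "data": "s" * 4},   # silence
    {"seq": 2, "data": "v" * 35},
    {"seq": 3, "data": "s" * 4},
    {"seq": 4, "data": "s" * 4},
]
sent, leftover = hide(call, "meet at noon")
secret, voice = extract(sent)
print(secret)                      # meet at noon
print([p["seq"] for p in voice])   # [0, 2, 4]
```

From the call's point of view, packets 1 and 3 simply never arrived.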

There are several properties that matter for a good steganographic technique. One is its bandwidth: how much data can be sent using the method. Because Skype calls contain a lot of silence SkyDe has a high bandwidth: there are lots of opportunities to hide messages. A second important property is obviously undetectability. The Polish team’s experiments have shown that SkyDe messages are very hard to detect. As only packets that contain silence are used and so lost, the people having the conversation won’t notice and the Skype receiver itself can’t easily tell because what is happening is no different to a typical unreliable network. Packets go missing all the time. Because both the Skype data and the hidden messages are encrypted, someone observing the packets travelling over the network won’t see a difference – they are all just random patterns of bits. Skype calls are now common so there are also lots of natural opportunities for sending messages this way – no one is going to get suspicious that lots of calls are suddenly being made.

All in all SkyDe provides an elegant new form of steganography. Invisible ink is so last century (and tattooing messages on your head so very last millennium). Now the sound of silence is all you need to have a hidden conversation.

A version of this article was originally published on the CS4FN website and a copy also appears on pages 10-11 of Issue 16 of the magazine (see Related magazines below).

You can also download PDF copies of all of our free magazines.


Related Magazines …


This blog is funded through EPSRC grant EP/W033615/1.

CS4FN Advent – Day 1 – Woolly jumpers, knitting and coding

Welcome to the first ‘window’ of the CS4FN Christmas Computing Advent Calendar. The picture on the ‘box’ was a woolly jumper with a message in binary, three letters on the jumper itself and another letter split across the arms. Can you work out what it says? (Answer at the end).

Come back tomorrow for the next instalment in our Advent series.

Cartoon of a green woolly Christmas jumper with some knitted stars and a message “knitted” in binary (zeroes and ones). Also the symbol for wifi on the cuffs.

Wrap up warm with our first festive CS4FN article, from Karen Shoop, which is all about the links between knitting patterns and computer code. Find out about regular expressions in her article: Knitters and Coders: separated at birth?


Image credit: Regular Expressions by xkcd

Further reading

Dickens Knitting in Code – this CS4FN article, by Paul Curzon, is about Charles Dickens’ book A Tale of Two Cities. One of the characters, Madame Defarge, takes coding to the next level by encoding hidden information into her knitting, something known as steganography (basically hiding information in plain sight). We have some more information on the history of steganography and how it is used in computing in this CS4FN article: Hiding in Elizabethan binary.

In Craft, Culture, and Code Shuchi Grover also considers the links between coding and knitting, writing that “few non-programming activities have such a close parallels to coding as knitting/crocheting” (see section 4 in particular, which talks about syntax, decomposition, subroutines, debugging and algorithms).

Something to print and colour in

This is a Christmas-themed thing you might enjoy eating, if you’ve any room left of course. Puzzle solution tomorrow. This was designed by Elaine Huen.

Solving the Christmas jumper code

The jumper’s binary reads

01011000

01001101

01000001

01010011

What four letters might be being spelled out here? Each binary number represents one letter and you can find out what each letter is by looking at this binary-to-letters translator. Have a go at working out the word using the translator (but the answer is at the end of this post).

Keep scrolling

Bit more

The Christmas jumper says… XMAS
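If you would rather let a computer do the translating, the decoding above can be checked with a short Python sketch. It assumes, as the translator does, that each 8-bit pattern is the standard character code for one letter. The function name `decode` is just for illustration.

```python
def decode(rows):
    """Turn each 8-bit binary string into the character with that code number."""
    return "".join(chr(int(bits, 2)) for bits in rows)

jumper = ["01011000", "01001101", "01000001", "01010011"]
print(decode(jumper))  # XMAS
```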

Hiding in Elizabethan Binary

The great Tudor and Stuart philosopher Sir Francis Bacon was a scientist, a statesman and an author. He was also a pretty decent computer scientist. He published* a new form of cipher, now called Bacon’s Cipher, invented when he was a teenager. Its core idea is the foundation for the way all messages are stored in computers today.


The Tudor and Stuart eras were a time of plot and intrigue. Perhaps the most famous is the 1605 Gunpowder plot where Guy Fawkes tried to assassinate King James I by blowing up the Houses of Parliament. Secrets mattered! In his youth Bacon had worked as a secret agent for Elizabeth I’s spy chief, Walsingham, so knew all about ciphers. Not content with using those that existed he invented his own. The one he is best remembered for was actually both a cipher and a form of steganography. While a cipher aims to make a message unreadable, steganography is the science of secret writing: disguising messages so no one but the recipient knows there is a message there at all.

A Cipher …

Bacon’s method came in two parts. The first was a substitution cipher, where different symbols are substituted for each letter of the alphabet in the message. This idea dates back to Roman times. Julius Caesar used a version, replacing each letter with the letter a fixed number of places down the alphabet (so, with a shift of four places, A becomes E, B becomes F, and so on). Bacon’s key idea was to replace each letter of the alphabet with, not a number or letter, but its own series of a’s and b’s (see the cipher table). The Elizabethan alphabet actually had only 24 letters, so I and J have the same code, as do U and V, as they were interchangeable (J was the capital letter version of i and similarly for U and v).
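Both ciphers are easy to sketch in Python. The cipher table itself is not reproduced here, so this assumes the usual 24-letter version in which the codes simply count up (A is aaaaa, B is aaaab, and so on). The function names `caesar` and `bacon` are made up for this illustration.

```python
import string

def caesar(text, shift):
    """Shift each letter a fixed number of places down the alphabet, wrapping at Z."""
    out = []
    for ch in text.upper():
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

# Assumed 24-letter alphabet: J shares a code with I, and V with U.
BACON_ALPHABET = "ABCDEFGHIKLMNOPQRSTUWXYZ"

def bacon(text):
    """Replace each letter with its five-symbol series of a's and b's."""
    codes = []
    for ch in text.upper():
        ch = {"J": "I", "V": "U"}.get(ch, ch)
        if ch in BACON_ALPHABET:
            position = BACON_ALPHABET.index(ch)
            # Write the position as 5 binary digits, then use a for 0 and b for 1.
            codes.append(format(position, "05b").replace("0", "a").replace("1", "b"))
    return " ".join(codes)

print(caesar("AB", 4))  # EF
print(bacon("no"))      # abbaa abbab
```

Five symbols are enough because two choices in each of five places give 32 patterns, more than the 24 needed.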

In Bacon’s cipher everything is encoded in two symbols, so it is a binary encoding. The letters a and b are arbitrary. Today we would use 0 and 1. This is the first use of binary as a way to encode letters (in the West at least). Today all text stored in computers is represented in this way, though the codes are different. That is all Unicode is. It allocates each character a binary pattern used to represent it in the computer. When the characters are to be displayed, the computer program just looks up which graphic pattern (the actual symbol as drawn) is linked to that binary pattern in the code being used. Unicode aims to give a binary pattern to every symbol in every human language (an alien one, Klingon, was even proposed, though it was turned down).
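You can see these binary patterns for yourself in Python. Strictly, Unicode gives each character a number (its code point), and an encoding such as UTF-8 decides which bits are actually stored. The two helper names below are just for illustration.

```python
def code_point_bits(ch):
    """The character's Unicode number written in binary."""
    return format(ord(ch), "b")

def utf8_bits(ch):
    """The bytes actually stored when the character is encoded as UTF-8."""
    return " ".join(format(byte, "08b") for byte in ch.encode("utf-8"))

for ch in ["A", "€"]:
    print(ch, ord(ch), code_point_bits(ch), utf8_bits(ch))
# A 65 1000001 01000001
# € 8364 10000010101100 11100010 10000010 10101100
```

A plain letter like A fits in one byte, while the euro sign needs three.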

Steganography

The second part of Bacon’s cipher system was steganography. Steganography dates back to at least the Greeks, who supposedly tattooed messages on the shaved heads of slaves, then let their hair grow back before sending them as both messenger and message. The binary encoding of Bacon’s cipher was vital to make his steganography algorithm possible. However, the message was not actually written as a’s and b’s. Bacon realised that two symbols could stand for any two things. If you could make the difference hard to spot, you could hide the messages. Bacon invented two ways of handwriting each letter of the alphabet – two fonts. An ‘a’ in the encoded message meant use one font and a ‘b’ meant use the other. The secret message could then be hidden inside an innocent one. The letters written were no longer the message: the message was in the font used. As Bacon noted, once you have the message in binary you could think of other ways to hide it. One way used was with capital and lower-case letters, though only using the first letter of words to make it less obvious.

Suppose you wanted to hide the message ‘no’ in the innocuous message ‘hello world’. The message ‘no’ becomes ‘abbaa abbab’. So far this is just a substitution cipher. Next we hide it in ‘hello world’. Two different kinds of fonts are those with curls on the tails of letters, known as serif fonts, and those without curls, known as sans serif fonts. We can use a sans serif font to represent an ‘a’ in the coded message, and a serif font to represent ‘b’. We just switch between the fonts following the pattern of the a’s and b’s: ‘abbaa abbab’. The font to use for each of the ten letters of ‘hello world’ becomes

sans serif, serif, serif, sans serif, sans serif,
sans serif, serif, serif, sans serif, serif.
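Plain text cannot show two fonts, so here is a Python sketch of the same trick using the capital and lower-case variation instead: lower case stands for ‘a’ and upper case for ‘b’. Unlike the subtler first-letter-only version, this marks every letter, so it is easy to spot. The function names `hide` and `reveal` are made up for illustration.

```python
def hide(pattern, cover):
    """Write the cover text so each letter's case follows the a/b pattern."""
    marks = iter(pattern.replace(" ", ""))
    out = []
    for ch in cover:
        if ch.isalpha():
            mark = next(marks, "a")  # pad with 'a' if the cover text is longer
            out.append(ch.upper() if mark == "b" else ch.lower())
        else:
            out.append(ch)           # spaces and punctuation carry no message
    return "".join(out)

def reveal(stego):
    """Read the a/b pattern back from the case of each letter."""
    return "".join("b" if ch.isupper() else "a" for ch in stego if ch.isalpha())

stego = hide("abbaa abbab", "hello world")
print(stego)          # hELlo wORlD
print(reveal(stego))  # abbaaabbab
```

Splitting the recovered pattern into groups of five gives ‘abbaa abbab’, which the cipher table turns back into ‘no’. Note that this sketch quietly drops any part of the pattern that does not fit, so the cover text needs at least one letter per symbol.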

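Using those fonts for our message we get the final mixed-font message to send: ‘hello world’ written with h, the second l, the first o, w and the third l in sans serif, and with e, the first l, the second o, r and d in serif.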

Bacon the polymath

Bacon is perhaps best known as one of the principal advocates for rigorous science as a way of building up knowledge. He argued that scientists needed to do more than just come up with theories of how the world worked, and also guard against just seeing the results that matched their theories. He argued knowledge should be based on careful, repeated observation. This approach is the basis of the Scientific Method and one of the foundation stones of modern science.

Bacon was also a famous writer of the time, and one of many authors who has since been suggested as the person who wrote William Shakespeare’s plays. In his case it is because they claim to have found secret messages hidden in the plays in Bacon’s code. The idea that someone else wrote Shakespeare’s plays actually started just because some upper class folk with a lack of imagination couldn’t believe a person from a humble background could turn themselves into a genius. How wrong they were!

– Paul Curzon, Queen Mary University of London, Autumn 2017

*Thanks to Pete Langman, whose PhD was on Francis Bacon, for pointing out a mistake in the original version of this blog where I suggested the cipher was published in 1605, the year of the Gunpowder plot. It was actually first published in 1623 in De augmentis, which was a translation/enlargement of his 1605 Advancement of Learning.

He also pointed out that Bacon conceived the idea while working with Thomas Phelippes, the cipher expert of Elizabethan spymaster Walsingham at the time of the Babington plot to assassinate Elizabeth I, and with Amias Paulet, the jailer of Mary, Queen of Scots. Bacon also claimed the cipher was never broken!