Frequency Analysis for Fun

by Paul Curzon, Queen Mary University of London

Frequency analysis, a technique beloved by spies for centuries and one that led to the execution of at least one queen, also played a part in the development of the game Scrabble, over a hundred million copies of which have been sold worldwide.

Frequency analysis was invented by Al-Kindi, a 9th-century Muslim Arab scholar, as a way of cracking codes. He originally described it in his “A Manuscript on Deciphering Cryptographic Messages”. Frequency analysis just involves taking a large amount of normal text written in the language of interest and counting how often each letter appears. In English, for example, the letter E is the most common. With simple kinds of cyphers that is enough information to crack them: you count the frequency of each letter in the coded message and match the most frequent symbols to the most frequent letters of the language. Now large numbers of everyday people do frequency analysis just for fun, solving Cross Reference puzzles.
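The counting step described above is easy to try for yourself. Here is a minimal Python sketch, assuming we only care about the 26 letters A to Z and ignore case, spaces and punctuation. The function name and the sample sentence are made up for illustration.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, count) pairs for the letters A-Z, most common first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    return Counter(letters).most_common()

# A made-up sample; a real analysis would use a much longer text.
sample = "Meet me at the secret entrance near the green tree"
for letter, count in letter_frequencies(sample)[:3]:
    print(letter, count)
```

Even in this tiny sample E comes out on top, though a short text can easily give a misleading ranking, which is why a large amount of text is needed.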

The link between frequency analysis and puzzles goes back earlier. When the British were looking for potential code breakers to staff their secret code breaking establishment at Bletchley Park in World War II, they needed people skilled in both frequency analysis and problem solving. They found them by setting up crossword competitions and offering jobs at Bletchley to those who were fastest: possibly the earliest talent competition with career changing prizes!

Earlier still, in the 1930s, architect Alfred Mosher Butts hit on the idea of a new game that combined crosswords and anagrams, which were both popular at the time. The result was Scrabble. However, when designing the game he had a problem: he needed to decide how many of each letter the game should have and how to assign the scores. He turned to frequency analysis of the front page of the New York Times to give the answers. He broke the pattern of his frequency analysis though, including fewer letter Ss (one of the most common letters in English) than there should be, so that plurals didn’t make the game too easy.

Sherlock Holmes, of course, was a master of frequency analysis, as described in the 1903 story “The Adventure of the Dancing Men”. Sir Arthur Conan Doyle wasn’t the first author to use it as a plot device though. Edgar Allan Poe had based a short story called “The Gold Bug” around frequency analysis in 1843. It was Poe who originally popularised frequency analysis with the general public rather than just with spymasters. Poe had discovered how popular the topic was after setting a challenge in a magazine for people to send in cyphers that he would then crack, giving the impression at the time that he had near supernatural powers. The way it was done was then described in detail in “The Gold Bug”.


This article was first published on the original CS4FN website.

Frequency analysis also appears in: The dark history of algorithms.

For little kids we have some fun free kriss-kross puzzles – they’re like crosswords but you’re given the words and you have to fit them into the crossword shape. You need to think like a computer scientist and use logical thinking, pattern matching and computational thinking to complete them. (For even younger kids these can also be used as a way of practising spelling, phonics and writing out words).


EPSRC supports this blog through research grant EP/W033615/1.

The Dark History of Algorithms

Zin Derfoufi, a Computer Science student at Queen Mary, delves into some of the dark secrets of algorithms past.

Algorithms are used throughout modern life for the benefit of mankind, whether as instructions in special programs to help disabled people, computer instructions in the cars we drive or the specific steps in any calculation. The technologies that they are employed in have helped save lives and also make our world more comfortable to live in. However, beneath all this lies a deep, dark, secret history of algorithms plagued with schemes, lies and deceit.

Algorithms have played a critical role in some of History’s worst and most brutal plots even causing the downfall and rise of nations and monarchs. Ever since humans have been sent on secret missions, plotted to overthrow rulers or tried to keep the secrets of a civilisation unknown, nations and civilisations have been using encrypted messages and so have used algorithms. Such messages aim to carry sensitive information recorded in such a way that it can only make sense to the sender and recipient whilst appearing to be gibberish to anyone else. There are a whole variety of encryption methods that can be used and many people have created new ones for their own use: a risky business unless you are very good at it.

One example is the ‘Caesar Cipher’, which is named after Julius Caesar, who used it to send secret messages to his generals. The algorithm was one where each letter was replaced by the letter three places further on in the alphabet, so A became D, B became E, etc. Of course, the recipient must know the algorithm (the sequence to use) to regenerate the original letters of the text, otherwise the message would be useless. That is why a simple algorithm of “Move on 3 places in the alphabet” was used: it is easy for a general to remember. More generally, if each letter of the alphabet can be swapped for any other, there are around 400,000,000,000,000,000,000,000,000 distinct arrangements of the letters that could have been used! With that many possibilities it sounds secure. As you can imagine, this would cause any ambitious codebreaker many sleepless nights and even make them go bonkers! It became so futile to try and break such codes that people began to think the messages were divine!
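Here is a small Python sketch of the “move on 3 places” algorithm, assuming a plain 26-letter alphabet that wraps round from Z back to A. The function name and the message are invented for illustration. The last line computes 26 factorial, the number of ways of rearranging the whole alphabet, which is where a figure of around 400 million billion billion comes from. A pure shift like Caesar’s has only 25 useful keys.

```python
import math
import string

ALPHABET = string.ascii_uppercase

def caesar(text, shift):
    """Shift each letter A-Z by `shift` places, wrapping round; leave other characters alone."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)
    return "".join(result)

secret = caesar("ATTACK AT DAWN", 3)
print(secret)                # DWWDFN DW GDZQ
print(caesar(secret, -3))    # ATTACK AT DAWN
print(math.factorial(26))    # 403291461126605635584000000
```

Decrypting is just shifting back by the same amount, which is why the general only has to remember one small number.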

But then something significant happened. In the 9th century a Muslim Arab scholar changed the face of cryptography forever. His name was Abu Yusuf Ya’qub ibn Ishaq Al-Kindi, better known to the West as Alkindus. Born in Kufa (Iraq), he went to study in the famous Dar al-Hikmah (House of Wisdom) in Baghdad, the centre for learning in its time. It produced the likes of Al-Khwarizmi, the father of algebra, from whose name the word algorithm originates; the three Banu Musa brothers; and many more scholars who have shaped the fields of engineering, mathematics, physics, medicine, astrology, philosophy and every other major field of learning in some shape or form.

Al-Kindi introduced the technique of code breaking that was later to be known as ‘frequency analysis’ in his book entitled: ‘A Manuscript on Deciphering Cryptographic Messages’. He said in his book:

“One way to solve an encrypted message, if we know its language, is to find a different plaintext of the same language long enough to fill one sheet or so, and then we count the occurrences of each letter. We call the most frequently occurring letter the ‘first’, the next most occurring one the ‘second’, the following most occurring the ‘third’, and so on, until we account for all the different letters in the plaintext sample.

“Then we look at the cipher text we want to solve and we also classify its symbols. We find the most occurring symbol and change it to the form of the ‘first’ letter of the plaintext sample, the next most common symbol is changed to the form of the ‘second’ letter, and so on, until we account for all symbols of the cryptogram we want to solve”.

So basically, to decrypt a message all we have to do is find out how frequent each letter is, both in a sample of the original language and in the encrypted message, and then match the two. Obviously common sense and a degree of judgement have to be used where letters have a similar frequency. Although it was a lengthy process, it was certainly the most efficient of its time and, most importantly, the most effective.
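Al-Kindi’s recipe translates almost line for line into a program. The Python sketch below is only a first guess at a decryption, under the assumption that the message uses a simple one-letter-for-one-letter substitution. The function names and the toy texts are invented for illustration. It ranks the letters of a plaintext sample and of the cipher text, then swaps the ‘first’ for the ‘first’, the ‘second’ for the ‘second’, and so on.

```python
from collections import Counter

def rank_letters(text):
    """List the letters of `text`, most frequent first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return [letter for letter, _ in Counter(letters).most_common()]

def first_guess(ciphertext, sample_plaintext):
    """Replace the n-th most common cipher symbol with the n-th most common sample letter."""
    cipher_rank = rank_letters(ciphertext)
    plain_rank = rank_letters(sample_plaintext)
    guess = dict(zip(cipher_rank, plain_rank))
    # Cipher letters with no partner in the sample are shown as "?".
    return "".join(
        guess.get(ch, "?") if ch.isalpha() else ch
        for ch in ciphertext.upper()
    )

# Toy data chosen so that no two letters tie on frequency.
print(first_guess("xxxyyz xy", "aaabbc"))   # AAABBC AB
```

On real messages letters with similar frequencies will often be swapped in the first guess, so a human (or a cleverer program) still has to apply the judgement mentioned above to tidy up the result.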

Once decryption became possible, many plots were foiled, changing the course of history. An example of this was how Mary Queen of Scots, a Catholic, plotted along with loyal Catholics to overthrow her cousin Queen Elizabeth I, a Protestant, and establish a Catholic country. The details of the plots, carried in encrypted messages, were intercepted and decoded, and on Saturday 15 October 1586 Mary was on trial for treason. Her life depended on whether one of her letters could be decrypted or not. In the end, she was found guilty and publicly beheaded for high treason. Walsingham, Elizabeth’s spymaster, knew of Al-Kindi’s approach.

A more recent example of cryptography, cryptanalysis and espionage was its use throughout World War I to decipher messages intercepted from enemies. The British managed to decipher a message sent by Arthur Zimmermann, the then German Foreign Minister, to the Mexicans calling for an alliance between them and the Japanese to make sure America stayed out of the war, attacking them if they did interfere. Once the British showed this to the Americans, President Woodrow Wilson took his nation to war. Just imagine what the world may have been like if America hadn’t joined.

Today encryption is a major part of our lives in the form of Internet security and banking. Learn the art and science of encryption and decryption and who knows, maybe some day you might succeed in devising a new uncrackable cipher or crack an existing banking one! Either way would be a path to riches! So if you thought that algorithms were a bore … it just got a whole lot more interesting.

Further Reading

“Al Kindi: The Origins of Cryptology: The Arab Contributions” by Ibrahim A. Al-Kadi
Muslim Heritage: “Al-Kindi, Cryptography, Code Breaking and Ciphers”

“The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography” by Simon Singh, especially Chapter 1, “The Cipher of Mary Queen of Scots”

The Zimmermann Telegram
Wikipedia: Arthur Zimmermann

This article was originally published on the CS4FN website, and on page 8 in Issue 6 of the magazine.



Hiding in Skype

Steganography in a video app

Computer Science isn’t just about using language, sometimes it’s about losing it. Sometimes people want to send messages so no one even knows they exist and a great place to lose language is inside a conversation.

Cryptography is the science of making messages unreadable. Spymasters have used it for a thousand years or more. Now it’s a part of everyday life. It’s used by the banks every time you use a cash point and by online shops when you buy something over the Internet. It’s used by businesses that don’t want their industrial secrets revealed and by celebrities who want to be sure that tabloid hackers can’t read their texts.

Cryptography stops messages being read, but sometimes just knowing that people are having a conversation can reveal more than they want, even if you don’t know what was said. Knowing a football star is exchanging hundreds of texts with his team mate’s girlfriend suggests something is going on, for example. Similarly, CIA chief David Petraeus, whose downfall made international news, might have kept his secret and his job if the emails from his lover had been hidden. David Bowie kept his 2013 comeback single ‘Where are we now?’ a surprise until the moment it was released. It might not have made the front page news it did if a music journalist had tracked who had been talking to whom amongst the musicians involved in the months before.

That’s where steganography comes in – the science of hiding messages so no one even knows they exist. Invisible ink is one form of steganography used, for example, by the French resistance in World War II. More bizarre forms have been used over the years though – an Ancient Greek slave had a message tattooed on his shaven head warning of Persian invasion plans. Once his hair had grown back he delivered it with no one on the way the wiser.

Digital communication opens up new ways to hide messages. Computers store information using a code of 0s and 1s: bits. Steganography is then about finding places to hide those bits. A team of Polish researchers led by Wojciech Mazurczyk have now found a way to hide them in a video app (Skype) conversation.

Skype was one of the early popular video call applications, eventually replaced by Microsoft Teams. When you use Skype to make a phone call, the program converts the sounds you make to a long series of bits. They are sent over the Internet and converted back to sound at the other end. At the same time more sounds as bits stream back from the person you are talking to. Data transmitted over the Internet isn’t sent all in one go, though. It’s broken into packets: a bit like taking your conversation and tweeting it one line at a time.

Why? Imagine you run a crack team of commandos who have to reach a target in enemy territory to blow it up – a stately home where all the enemy’s Generals are having a party perhaps. If all the commandos travel together in one army truck and something goes wrong along the way probably no one will make it – a disaster. If on the other hand they each travel separately, rendezvousing once there, the mission is much more likely to be successful. If a few are killed on the way it doesn’t matter as the rest can still complete the mission.

The same applies to a video call. Each packet contains a little bit of the full conversation and each makes its own way to the destination across the Internet. On arriving there, they reform into the full message. To allow this to happen, each packet includes some extra data that says, for example, what conversation it is part of, how big it is and also where it fits in the sequence. If some don’t make it then the rest of the conversation can still be put back together without them. As long as too much isn’t missing, no one will notice.
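To see how the extra data lets a conversation be rebuilt, here is a toy Python sketch. It is not how Skype or the Internet actually formats packets: the `Packet` fields, the function names and the `"_"` gap marker are all assumptions made for illustration. Each packet carries a conversation label, its place in the sequence and a small piece of the message.

```python
import random
from dataclasses import dataclass

@dataclass
class Packet:
    call_id: str     # which conversation the packet belongs to
    sequence: int    # where it fits in the sequence
    payload: str     # a small piece of the conversation

def split_into_packets(call_id, message, size):
    """Chop the message into pieces of at most `size` characters."""
    starts = range(0, len(message), size)
    return [Packet(call_id, i, message[s:s + size]) for i, s in enumerate(starts)]

def reassemble(packets, total, gap="_"):
    """Rebuild the message in order, marking any missing packet with `gap`."""
    received = {p.sequence: p.payload for p in packets}
    return "".join(received.get(i, gap) for i in range(total))

packets = split_into_packets("call-1", "HELLO WORLD", 4)
random.shuffle(packets)                    # packets can arrive in any order
print(reassemble(packets, 3))              # HELLO WORLD
survivors = [p for p in packets if p.sequence != 1]
print(reassemble(survivors, 3))            # HELL_RLD
```

The sequence numbers do all the work: arrival order doesn’t matter, and a lost packet leaves a small gap rather than ruining everything.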

Skype does something special with its packets. The size of the packets changes depending on how much data needs to be transmitted. If the person is talking, each packet carries a lot of information. If the person is listening then what is being transmitted is mainly silence. Skype then sends shorter packets. The Polish team realised they could exploit this for steganography. Their program, SkyDe, intercepts Skype packets looking for short ones. Any found are replaced with packets holding the data from the covert message. At the destination another copy of SkyDe intercepts them and extracts the hidden message and passes it on to the intended recipient. As far as Skype is concerned some packets just never arrive.
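The swap-the-short-packets idea can be mimicked in a few lines of Python. This is a toy model only, not the real SkyDe program: the size threshold, the function names and the `"voice"`/`"covert"` labels are all invented here, and the labels simply stand in for whatever SkyDe really does to recognise its own packets at the far end.

```python
def embed(payloads, secret_chunks, silence_limit=20):
    """Replace short ('silent') payloads with chunks of the covert message."""
    chunks = iter(secret_chunks)
    sent = []
    for payload in payloads:
        # Only look for a covert chunk when the packet is short enough to be silence.
        chunk = next(chunks, None) if len(payload) < silence_limit else None
        if chunk is None:
            sent.append(("voice", payload))
        else:
            sent.append(("covert", chunk))
    return sent

def extract(sent):
    """Pull out the hidden message and pass on only the genuine voice packets."""
    hidden = "".join(data for kind, data in sent if kind == "covert")
    voice = [data for kind, data in sent if kind == "voice"]
    return hidden, voice

# Pretend call: long payloads are speech, short ones are silence.
call = [b"v" * 60, b"s" * 5, b"v" * 60, b"s" * 5, b"s" * 5]
hidden, voice = extract(embed(call, ["HI", "!"]))
print(hidden)        # HI!
print(len(voice))    # 3
```

Two of the five packets never reach the pretend call at the other end, which to the receiver looks just like ordinary packet loss.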

There are several properties that matter for a good steganographic technique. One is its bandwidth: how much data can be sent using the method. Because Skype calls contain a lot of silence SkyDe has a high bandwidth: there are lots of opportunities to hide messages. A second important property is obviously undetectability. The Polish team’s experiments have shown that SkyDe messages are very hard to detect. As only packets that contain silence are used and so lost, the people having the conversation won’t notice and the Skype receiver itself can’t easily tell because what is happening is no different to a typical unreliable network. Packets go missing all the time. Because both the Skype data and the hidden messages are encrypted, someone observing the packets travelling over the network won’t see a difference – they are all just random patterns of bits. Skype calls are now common so there are also lots of natural opportunities for sending messages this way – no one is going to get suspicious that lots of calls are suddenly being made.

All in all SkyDe provides an elegant new form of steganography. Invisible ink is so last century (and tattooing messages on your head so very last millennium). Now the sound of silence is all you need to have a hidden conversation.

Paul Curzon, Queen Mary University of London



A version of this article was originally published on the CS4FN website and a copy also appears on pages 10-11 of Issue 16 of the magazine.


Cryptography: You are what you know


“Carter headed into the trees, his hat pulled low. Up ahead was a dark figure, standing in the shadow of a tree. As he drew close, Carter gave the agreed code phrase confirming he was the new agent: “Could I borrow a match?” The dark figure stepped away from the tree, but rather than completing the exchange as Carter expected, he pulled a silenced gun. Before Carter could react, he heard the quiet spit of the gun and felt an excruciating pain in his chest. A moment later he was dead. Felix put the gun away, and quickly dragged the body into the bushes out of sight. He then went back to waiting. Soon another figure approached, but from the other direction. This time it was Felix who gave the pass phrase, which he now knew. “Could I borrow a match?” The new figure confidently responded, “Doesn’t everyone use a lighter these days?” Felix hadn’t known what he would say, but was happy to assume this was Carter’s real contact. He was in. “Hello. I’m Carter.” …”

The trouble with using spy novel style passphrases to prove who you are is you still have to trust the other person. If they might have nefarious intentions, you want to prove who you are without giving anything else away. You certainly don’t want them to be able to take the information you give and use it to pretend to be you. Unfortunately, the above story is pretty much how passwords work, and why attacks like phishing, where someone sends emails pretending to be from your bank, are such a problem.

This is why phishing works

The story outlines the essential problem faced by all authentication systems trying to prove who someone is or that they possess some secret information. You give up the secret in the process to anyone there to hear. Security protocols somehow need ways one agent can prove to another who they are in a way that no one can masquerade as them in future. Creating a secure authentication system is harder than you might think! To do it well takes serious skill. What you don’t do is just send a password!

A simple solution for some situations is sometimes used by banks. Rather than ask you for a whole account number, they ask you for a random set of its digits: perhaps the third, fifth and eighth digits one time, but completely different ones the next. Though anyone listening in has learnt some of the secret, they can’t masquerade as you, as they will be asked for different digits when they try. Take this idea to an extreme and you get the “Zero Knowledge Proof”, where none of the secret is given up: possibly one of the cleverest ideas of computer science.
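The bank’s trick can be sketched in Python as a tiny challenge and response. Everything here is hypothetical, including the function names, the eight-digit secret and the choice of three positions; a real bank’s system will differ.

```python
import secrets

def make_challenge(secret_length, how_many=3):
    """Pick distinct random positions to ask for (counting from 1)."""
    rng = secrets.SystemRandom()
    return sorted(rng.sample(range(1, secret_length + 1), how_many))

def respond(secret, positions):
    """What the genuine customer replies: only the digits asked for."""
    return [secret[p - 1] for p in positions]

def check(secret, positions, answer):
    """The bank compares the reply with the digits it expects."""
    return answer == respond(secret, positions)

account = "40317952"                        # made-up secret
overheard = respond(account, [3, 5, 8])     # an eavesdropper hears this reply
print(overheard)                            # ['3', '7', '2']
print(check(account, [3, 5, 8], overheard)) # True
print(check(account, [1, 2, 4], overheard)) # False: different digits were asked for
print(make_challenge(len(account)))         # a fresh random challenge each time
```

Each reply still leaks a few digits, so after listening to enough exchanges an eavesdropper could piece the whole number together. That is the gap the zero knowledge proof closes.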

– Paul Curzon, Queen Mary University of London



This article was first published on CS4FN and a copy can also be found on page 5 in ‘Keep Out’ – Issue 24 of CS4FN magazine, on Cyber Security and Privacy.

All of our material is free to download from: https://cs4fndownloads.wordpress.com



Cryptography: Shafi Goldwasser and the Zero Knowledge Proof

Shafi Goldwasser is one of the greatest living computer scientists, having won the Turing Award in 2012 (equivalent to a Nobel Prize). Her work helped turn cryptography from a dark art into a science. If you’ve ever used a credit card through a web browser, for example, her work was helping you stay secure. Her greatest achievement, with Silvio Micali and Charles Rackoff, is the “Zero knowledge proof”.

Zero knowledge proofs deal with the problem that, to be really secure, security protocols often need to prove that some statement is true without giving anything else away (see “You are what you know”). A specific case is where an agent (software or human) wants to prove they know some secret, without actually giving the secret up.

Satisfy me this

There are three properties a zero knowledge proof must satisfy. Suppose Peggy is trying to convince Victor that some statement about a secret is true. Firstly, if Peggy’s statement is true then Victor must be convinced of this at the end. Secondly, if it is not actually true, there must only be a tiny chance that Peggy can convince Victor that it is true. Finally, Victor must not be able to cheat in any way that means he learns more about the secret beyond the truth of the statement. Shafi and colleagues not only came up with the idea, but showed that such proofs, unlikely as they seem, were possible.

Biosecurity break-in

Imagine the following situation (based on a scenario by Jean-Jacques Quisquater). A top secret biosecurity laboratory is protected so only authorised people can get in and out. The lab is at the end of a corridor that splits. Each branch goes to a door at the opposite end of the lab. These two doors are the only ways in or out. The rest of the room is totally sealed (see diagram).

Now, Peggy claims she knows how to get in, and has told Victor she can steal a sample of the secret biotoxin held there if he pays her a million dollars. Victor wants to be sure she can get in, before paying. She wants to prove her claim is true, but without giving anything more away, and certainly not by showing him how she does it, or giving him the toxin. She doesn’t even want him to have any hard evidence he could use to convince others that she can get in, as then he could use it against her. How does she do it?

“I can get in”

[Figure: plan of the top secret lab. Image by Paul Curzon.]

She needs a Zero knowledge proof of her claim “I can get in”! Here is one way. Victor waits in the foyer, unable to see the corridor. Peggy goes to the fork, and chooses a branch to go down then waits at the door. Victor then goes to the fork, unable to see where she is but able to see both exit routes. He then chooses an exit corridor at random and tells Peggy to appear there. Peggy does, passing through the lab if need be.

If they do this enough times, with Victor choosing at random which side she should appear, then he can be very confident that she really does know how to get in. After all, that is the only way to appear at the other side, and someone who couldn’t get through the lab would be caught out about half the time on each try. More to the point, he still cannot get in himself, and even if he videoed everything he saw, he would have no way to convince anyone else that Peggy can get in: a video showing Peggy appearing from the correct corridor would be easy to fake. Peggy has shown she can get into the room, but without giving up the secret of how, or giving Victor a way to prove she can do it to anyone else.
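Why does repeating the test make Victor so sure? A Peggy who can’t get through the lab only passes a round if Victor happens to ask for the branch she is already in, a one in two chance, so passing 20 rounds in a row by luck has a chance of about one in a million. The Python sketch below simulates this. It is a simplified model with invented function names, and it assumes both Peggy and Victor choose their branches by fair coin flips.

```python
import random

def run_rounds(knows_secret, rounds, rng):
    """Return True if Peggy appears from the requested branch in every round."""
    for _ in range(rounds):
        peggy_branch = rng.choice(["left", "right"])
        asked_branch = rng.choice(["left", "right"])
        # Without the secret she cannot cross the lab, so she is stuck where she is.
        if not knows_secret and peggy_branch != asked_branch:
            return False
    return True

def cheat_success_rate(rounds, trials=10_000, seed=1):
    """Estimate how often a Peggy who is bluffing gets through all the rounds."""
    rng = random.Random(seed)
    wins = sum(run_rounds(False, rounds, rng) for _ in range(trials))
    return wins / trials

print(run_rounds(True, 20, random.Random(0)))   # True: the real Peggy always passes
for rounds in (1, 5, 10, 20):
    print(rounds, cheat_success_rate(rounds), "expected about", 0.5 ** rounds)
```

The first two properties of a zero knowledge proof show up directly: the real Peggy always convinces Victor, and a bluffer’s chances shrink by half with every extra round.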

So, strange as it seems, it is possible to prove you know a secret without giving anything more away about the secret. Thanks to Shafi and her co-researchers the idea is now a core part of computer security.

– Paul Curzon, Queen Mary University of London


This article was first published on CS4FN and a copy can also be found on pages 4-5 in ‘Keep Out’ – Issue 24 of CS4FN magazine, on Cyber Security and Privacy.
