Artificial Intelligences are just tools that do nothing but follow their programming. They are not self-aware and have no capacity for self-determination. They are a what, not a who. So what is it like to be a robot just following its (complex) program, making decisions based on data alone? What is it like to be an artificial intelligence? What is the real difference between being self-aware and not? What is the difference to being human? These are the themes explored by the dystopian (or is it utopian?) and funny science fiction novel “Service Model” by Adrian Tchaikovsky.
In a future where the tools of computer science and robotics have been used to make human lives as comfortable as conceivably possible, Charles(TM) is a valet robot looking after his Master’s every whim. His every action is controlled by a task list turned into sophisticated, human-facing interaction. Charles is designed to be totally logical but also totally loyal. What could go wrong? Everything, it turns out, when he apparently murders his master. Why did it happen? Did he actually do it? Is there a bug in his program? Has he been infected by a virus? Was he being controlled by others as part of an uprising? Has he become self-aware and able to make his own decision to turn on his evil master? And what should he do now? Will his task list continue to guide him once he is in a totally alien context he was never designed for, and where those around him are apparently being illogical?
The novel explores important topics we all need to grapple with, in a fun but serious way. It looks at what AI tools are for and the difference between a tool and a person, even when they are doing the same jobs. Is it actually good to replace the work of humans with programs just because we can? Who actually benefits and who suffers? AI is being promoted as a silver bullet that will solve our economic problems. But we have been replacing humans with computers for decades now based on that promise, and yet prices still go up and inequality seems to do nothing but rise, with ever more children living in poverty. Who is actually benefiting? A small number of billionaires certainly are. Is everyone? We have many better “toys” that superficially make life easier and more comfortable: we can buy anything we want from the comfort of our sofas, self-driving cars will soon take us anywhere we want, we can get answers to any question we care to ask, and ever more routine jobs are done by machines. Many areas of work, boring or otherwise, are becoming a thing of the past with a promise of utopia. But are we solving problems or making them with our drive to automate everything? Is it good for society as a whole or just good for vested interests? Are we losing track of what is most important about being human? Charles will perhaps help us find out.
Thinking about the consequences of technology is an important part of any computer science education, and all CS professionals should think about the ethics of what they are involved in. Reading great science fiction such as this is one good way to explore the consequences, though as Ursula Le Guin has said, the best science fiction doesn’t predict the future: it tells us about ourselves in the present. Following in the tradition of “The Machine Stops” and “I, Robot”, “Service Model” (and the short story “Human Resources” that comes with it) does that, if in a satirical way. It is a must read for anyone involved in the design of AI tools, especially those promoting the idea of utopian futures.
DNA is the molecule of life. Our DNA stores the information of how to create us. Now it can be hacked.
DNA consists of two strands coiling round each other in a double helix. It’s made of four building blocks, or ‘nucleotides’, labelled A, C, G, T. Different orders of letters give the information of how to build each unique creature, you and me included. Sequences of DNA are analysed in labs by a machine called a gene sequencer. It works out the order of the letters and so tells us what’s in the DNA. When biologists talk of sequencing the human (or another animal or plant’s) genome they mean using a gene sequencer to work out the specific sequences in the DNA for that species. Gene sequencers are also used by forensic scientists to work out who might have been at the scene of a crime, and to predict whether a person has genetic disorders that might lead to disease.
DNA can be used to store information other than that of life: any information in fact. This may be the future of data storage. Computers use a code made of 0s and 1s. There is no reason why you can’t encode all the same information using A, C, G, T instead. For example, a string of 1s and 0s might be encoded by having each pair of bits represented by one of the four nucleotides: 00 = A, 01 = C, 10 = G and 11 = T. The idea has been demonstrated by Harvard scientists who stored a video clip in DNA.
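To see how such an encoding might work, here is a minimal Python sketch of the 2-bits-per-nucleotide scheme described above. The function names and example message are just for illustration, and this is not necessarily the scheme the Harvard team actually used.

# A minimal sketch of the 2-bits-per-nucleotide encoding described above:
# 00 = A, 01 = C, 10 = G, 11 = T. Names and example are for illustration only.

BIT_PAIR_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BIT_PAIR = {base: bits for bits, base in BIT_PAIR_TO_BASE.items()}

def bits_to_dna(bits: str) -> str:
    """Encode a string of 0s and 1s (even length) as DNA letters."""
    return "".join(BIT_PAIR_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def dna_to_bits(dna: str) -> str:
    """Decode DNA letters back into the original bit string."""
    return "".join(BASE_TO_BIT_PAIR[base] for base in dna)

message = "0100100001101001"        # the bits of the ASCII text "Hi"
dna = bits_to_dna(message)          # gives "CAGACGGC"
assert dna_to_bits(dna) == message  # decoding gets the original bits back
print(message, "->", dna)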
It also leads to whole new cyber-security threats. A program is just data too, so can be stored in DNA sequences, for example. Researchers from the University of Washington have managed to hide a malicious program inside DNA that can attack the gene sequencer itself!
The gene sequencer does not just work out the sequence of DNA symbols. As it is a computer, it also converts that sequence into a binary form that can then be processed as normal. As DNA sequences are long, the sequencer compresses them. The attack made use of a common kind of bug that malware often exploits: a ‘buffer overflow’ error. These arise when the person writing a program includes instructions to set aside a fixed amount of space to store data, but then doesn’t include code to make sure only that amount of data is stored. If more data is stored, it overflows into the memory area beyond the space allocated to it. If executable code is stored there, then the effect can be to overwrite the program with new malicious instructions.
When the gene sequencer reaches that malware DNA, the hidden program emerges as it is converted back into 1s and 0s. If those bits are treated as instructions and executed, it launches its attack and takes control of the computer that runs the sequencer. In principle, an attack like this could be used to fake results for subsequent DNA tests (subverting court cases), disrupt hospital testing, steal sensitive genetic data, or corrupt DNA-based memory.
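Real buffer overflows happen in low-level languages like C, and Python itself prevents them, but a toy simulation of a flat block of “memory” can show the idea: data written past the space set aside for it spills over whatever sits next to it. Everything here (the memory layout, the pretend “program”) is made up purely for illustration.

# Toy simulation of a buffer overflow. A Python list stands in for a flat
# block of memory so we can watch data spill past the space set aside for it.

memory = ["."] * 16                   # 16 cells of pretend "memory"
BUFFER_START, BUFFER_SIZE = 0, 8      # cells 0-7 are the data buffer
memory[8:16] = list("RUN CODE")       # cells 8-15 hold the "program"

def careless_copy(data: str) -> None:
    """Copy data into the buffer WITHOUT checking it fits (the bug)."""
    for i, ch in enumerate(data):
        memory[BUFFER_START + i] = ch  # no check against BUFFER_SIZE!

careless_copy("ACGTACGT" + "EVIL!!!!")  # 16 characters into an 8-cell buffer
print("".join(memory))                  # the "program" has been overwritten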
Fortunately, the risks of exactly this attack causing any problems in the real world are very low but the team wanted to highlight the potential for DNA based attacks, generally. They pointed out how lax the development processes and controls were for much of the software used in these labs. The bigger risk right now is probably from scientists falling for spear phishing scams (where fake emails pretending to be from someone you know take you to a malware website) or just forgetting to change the default password on the sequencer.
CREDIT: Randall Munroe, xkcd.com https://xkcd.com/936 – reprinted under a CC Attribution-NonCommercial 2.5 License
How do you decide whether a password is strong? Computer scientists have a mathematical way to do it. Based on an idea called information entropy, it’s part of “Information Theory”, invented by electrical engineer Claude Shannon back in 1948. This XKCD cartoon for computer scientists uses the idea to compare two different passwords. Unless you understand information theory, though, the detail is a bit mind-blowing to work out… so let’s explain the computer science!
Entropy is based on the number of guesses someone would need to make trying all the possibilities for a password one at a time – doing what computer scientists call a brute force attack. Think about a PIN on a mobile phone or cashpoint. Suppose it was a single digit – there would be 10 possibilities to try. If 2-digit PINs were required then there are 100 different possibilities. With the normal 4 digits you need 10,000 (10^4 = 10x10x10x10) guesses to be sure of getting it. Different symbol sets lead to more possibilities. If you had to use lower case letters instead of digits, there are 26 possibilities for length 1, so over 450,000 (26^4 = 26x26x26x26) guesses are needed for a 4-letter password. If upper case letters are possible too, that goes up to more than 7 million (52 letters, so 52^4 = 52x52x52x52) guesses. If you know they used a word, though, you don’t have to try all the possibilities, just the words. There are only about 5000 of those, so far fewer guesses are needed. So password strength depends on the number of symbols that could be used, but also on whether the PIN or password was chosen randomly (words aren’t a random sequence of letters!)
To make everything standard, Shannon used binary to do entropy calculations, so assumed a symbol set of only 0 and 1 (so all the answers become powers of 2 because he used 2 symbols). He then measured entropy in the number of ‘bits’ needed to count all the guesses. Any other groups of symbols are converted to binary first. If a cashpoint only had buttons A, B, C and D for PINs, then to do the calculation you count those 4 options in binary: 00 (A), 01 (B), 10 (C), 11 (D) and see you need 2 bits to do it (2^2 = 2×2 choices). Real cashpoints have 10 digits and need just over 3 bits to represent all of a 1-digit PIN (2^3 = 2x2x2 = 8 so not enough, 2^4 = 2x2x2x2 = 16 so more than you need, so the answer is more than 3 but less than 4). Its entropy would be just over 3. To count the possibilities for a 4-digit PIN you need just over 13 bits, so that is its entropy. A single lower case letter needs just under 5 bits to store the number of possibilities (2^4 = 16 is too few, 2^5 = 32 is more than enough), so its entropy is about 5, and so on.
So entropy is not measured directly in terms of guesses (which would be very big numbers) but instead indirectly in terms of bits. If you determine the entropy to be 28 bits (as in the cartoon), that means the number of different possibilities to guess would fit in 28 bits if each guess was given its own unique binary code. 28 bits of binary can be used to represent over 268 million different things (2^28), so that is the number of guesses actually needed. It is a lot of guesses but not so many a computer couldn’t try them all fairly quickly as the cartoon points out!
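If you want to check the arithmetic yourself, a few lines of Python reproduce the numbers above: entropy is just log2 of the number of possibilities. This is only a sketch to make the sums concrete.

# Entropy in bits = log2 of the number of possibilities.
import math

def entropy_bits(possibilities: int) -> float:
    """Number of bits needed to give each possibility its own binary code."""
    return math.log2(possibilities)

print(entropy_bits(10))       # one digit of a PIN: ~3.3 bits ("just over 3")
print(entropy_bits(10 ** 4))  # a 4-digit PIN: ~13.3 bits ("just over 13")
print(entropy_bits(26))       # one lower case letter: ~4.7 bits ("about 5")
print(entropy_bits(52 ** 4))  # 4 upper/lower case letters: ~22.8 bits
print(2 ** 28)                # 28 bits of entropy = 268,435,456 guesses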
Where do those 28 bits come from? Well, they assume the hacker is just trying to crack passwords that follow a common pattern people use (so are not that random). The hacker assumes the person did the following: take a word; maybe make the first letter a capital; swap digits in place of some similar-looking letters; and finally add a punctuation symbol and a digit at the end. It follows the general advice for passwords, and looks random … but is it?
How do we work out the entropy of a password invented that way? First, think about the base word the password is built around. The cartoon estimates it as 16 bits for a word of up to 9 lower-case letters, so is assuming there are 2^16 (i.e. about 65,000) possible such base words. There are about 40,000 nine-letter words in English, so that’s an overestimate if you assume you know the length – though perhaps not if you allow things like fictional names and shorter words too.
As the pattern followed is that the first letter could be uppercase, that adds 1 more bit for the two guesses now needed for each word tried: check it with an upper-case first letter, then check it all lower-case. Similarly, as any letter ‘o’ might have been swapped for 0, and any ‘a’ for 4 (as people commonly do), this adds 3 more bits (assuming there are at most 3 such opportunities per word). Finally, we need 3 bits for the digit added and another 4 bits for the common punctuation characters added on the end. Another bit is added for the two possible orders of punctuation and digit. Add up all those bits and we have the 28 bits suggested in the cartoon (where bits are represented by little squares).
Now do a similar calculation for the other way of creating passwords suggested in the cartoon. If there are only about 2000 really common words a person might choose from, we need 11 bits per word. If we string 4 such words together completely at random (not a recognisable phrase: no link between the words) we get the much larger entropy of 44 bits overall. More bits means harder to crack, so this password will be much, much harder than the first. It takes over 17 trillion guesses rather than 268 million.
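To see where the cartoon’s totals come from, here is the same bit-by-bit tally as a small Python sketch. The individual estimates are the cartoon’s rough figures, not exact counts.

# Tally up the cartoon's bit estimates for the two password styles.
import math

# "Tr0ub4dor&3"-style password:
troubadour_bits = (16   # uncommon base word of up to 9 lower-case letters
                   + 1  # first letter may or may not be capitalised
                   + 3  # common substitutions like o -> 0, a -> 4
                   + 3  # a digit tacked on the end
                   + 4  # a punctuation character tacked on the end
                   + 1) # digit and punctuation could be in either order
print(troubadour_bits, 2 ** troubadour_bits)   # 28 bits, ~268 million guesses

# "correct horse battery staple"-style password: 4 random common words.
word_bits = math.log2(2048)                    # ~2000 common words: ~11 bits each
passphrase_bits = 4 * word_bits
print(passphrase_bits, 2 ** passphrase_bits)   # 44 bits, ~17.6 trillion guesses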
The serious joke of the cartoon is that the rules we are told to follow lead to people creating passwords that are not very random at all, precisely because they have been created following rules. That means they are easy to crack (but still hard to remember). If instead you use the 4 longish and unconnected words method, which doesn’t obey any of the rules we are told to follow, you actually get a password that is much easier to remember (if you turn it into a surreal picture and remember the picture), but harder to crack because it actually has more randomness in it! That is the real lesson. However you create a password, it has to have lots of randomness for it to be strong. Entropy gives you a way to check how well you have done.
You might be surprised at how many people have something short, simple (and stupid!) like ‘password’ as their password. Some people add a number to make it harder to guess (‘password1’) but unfortunately that doesn’t help. For decades the official advice has been to use a mixture of lower (abc) and upper case (ABC) characters as well as numbers (123) and special characters (such as & or ^). To meet these rules some people substitute numbers for letters (for example 0 for O or 4 for A and so on). Following these rules might lead you to create something like “P4ssW0^d1” which looks like it might be difficult to crack, but isn’t. The problem is that people tend to use the same substitutions, so password-crackers can predict, and so break, them too.
Hackers know the really common passwords people use like ‘password’, ‘qwerty’ and ‘12345678’ (and more) so will just try them as a matter of course until they very quickly come across one of the many suckers who used one. Even apparently less obvious passwords can be easy to crack, though. The classic algorithm used is a ‘dictionary attack’.
The simple version of this is to run a program that just tries each word in an online dictionary one at a time as a password until it finds a word that works. It takes a program fractions of seconds to check every word like this. Using foreign words doesn’t help as hackers make dictionaries by combining those for every known language into one big universal dictionary. That might seem like a lot of words but it’s not for a computer.
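Here is a toy Python sketch of that loop. The word list and the “stolen” password are made up for illustration; real attacks run against stolen password hashes and word lists of many millions of entries, but the basic loop is essentially this simple.

# Toy sketch of a dictionary attack: try every word in a word list as the
# password. Real attacks use stolen hashes and vastly bigger word lists.

def dictionary_attack(is_correct, wordlist):
    """Try each candidate word; return the first one that works, if any."""
    for guess in wordlist:
        if is_correct(guess):
            return guess
    return None

wordlist = ["password", "qwerty", "12345678", "dragon", "sunshine"]  # made up
secret = "dragon"                                 # the password being attacked
found = dictionary_attack(lambda guess: guess == secret, wordlist)
print("Cracked:", found)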
You might think you can use imaginary words from fiction instead – names of characters in Lord of the Rings, perhaps, or the names of famous people. However, it is easy to compile lists of words like that too and add them to the password cracking dictionary. If it is a word somewhere on the web then it will be in a dictionary for hacking use.
Going a step further, a hacking program can take all these words and create versions with numbers added, 4 swapped for A, and so on. These new potential passwords become part of the attack dictionary too. More can be added by taking short words and combining them, including ones that appear in well known phrases like ‘starwars’ or ‘tobeornottobe’.
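As a rough sketch, this is the kind of “mangling” a cracking program might automate to grow its dictionary; the particular substitution rules here are just examples of the sort hackers use.

# Rough sketch of "mangling" dictionary words into extra candidate passwords:
# capitalise, swap letters for look-alike digits, tack a digit on the end.

def variants(word):
    candidates = {word, word.capitalize()}
    swapped = word.replace("o", "0").replace("a", "4").replace("e", "3")
    candidates.add(swapped)
    for base in list(candidates):
        for digit in "0123456789":
            candidates.add(base + digit)
    return candidates

print(sorted(variants("password")))   # 'p4ssw0rd', 'Password7', ... all appear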
The list gets bigger and bigger, but computers are fast and hackers are patient, so that’s no big deal for them. Make sure your password isn’t in their dictionary!
– Jo Brodie and Paul Curzon, Queen Mary University of London
Computer hackers are the bad guys, aren’t they? They cause mayhem: shutting down websites, releasing classified information, stealing credit card numbers, spreading viruses. They can cause lots of harm, even when they don’t mean to. Not all hackers are bad though. Some, called white hat hackers, are ethical hackers, paid by companies to test their security by actively trying to break in – it’s called penetration testing. It’s not just business, though: it has also been turned into a card game.
Perhaps the most famous white hat hacker is Kevin Mitnick. He started out as a bad guy – the most-wanted computer criminal in the US. Eventually the FBI caught him, and after spending 5 years in prison he reformed and became a white hat hacker who went on to run his own computer security company. The way he hacked systems had nothing to do with computer skills and everything to do with language skills. He did what’s called social engineering. A social engineer uses their skills of persuasion to con people into telling them confidential information or maybe even actually doing things for them, like downloading a program that contains spyware code. Professional white hat hackers have to have all-round skills though: network, hardware or software hacking skills, not just social engineering ones. They need to understand a wide range of potential threats if they are to properly test a company’s security and help them fix all the vulnerabilities.
Breaking the law and ending up in jail, like Kevin Mitnick, isn’t a great way to learn the skills for your long-term career though. A more normal way to become an expert is to go to university and take classes. Wouldn’t playing games be a much more fun way to learn than sitting in lectures, though? That was what Tamara Denning, Tadayoshi Kohno, and Adam Shostack, computer security experts from the University of Washington, wondered. As a result, they teamed up with Steve Jackson Games and came up with the card game Control-Alt-Hack(TM) (www.controlalthack.com), sadly no longer available. It was based on the cult tabletop card game, Ninja Burger. Rather than being part of a Ninja Burger delivery team as in that game, in Control-Alt-Hack(TM) you are an ethical white hat hacker working for an elite security company. You have to complete white hat missions using your Ninja hacking skills: from shutting down an energy company to turning a robotic vacuum cleaner into a pet. The game is lots of fun, but the idea was that by playing it you would understand a lot more about the part that computer security plays in everyone’s lives and about the kinds of threats that security experts have to protect against.
We could all do with more of that. Lots of people like gaming, so why not learn something useful at the same time as having fun? Let’s hope more fun, and commercial, games about cyber security are invented in future. It would make a good cooperative game in the style of Pandemic perhaps, and there must be simple board game possibilities that would raise awareness of cyber security threats. It would be great if one day such games could inspire more people to a career as a security expert. We certainly need lots more cybersecurity experts keeping us all safe.
The traditional story of how World War II was won is that of inspiring leaders, brilliant generals and plucky Brits with “Blitz Spirit”. In reality it is usually better technology that wins wars. Once that meant better weapons, but in World War II, mathematicians and computer scientists were instrumental in winning the war by cracking the German codes using both maths and machines. It is easy to be a brilliant general when you know the other side’s plans in advance! Less celebrated but just as important, weathermen and electronic engineers were also instrumental in winning World War II, and especially the Battle of Britain, with the invention of RADAR. It is much easier to win an air battle when you know exactly where the opposition’s planes are. It was largely down to meteorologist and electronic engineer Robert Watson-Watt and his assistant, Arnold Wilkins. Their story is told in the wonderful, but under-rated, film Castles in the Sky, starring Eddie Izzard.
****SPOILER ALERT****
In the 1930s, Nazi Germany looked like an ever increasing threat as it ramped up its militarisation, building a vast army and air force. Britain was way behind in the size of its air force. Should Germany decide to bomb Britain into submission it would be a totally one-sided battle. Something needed to be done.
A hopeful plan was hatched in the mid 1930s to build a death ray to zap pilots in attacking planes. One of the engineers asked to look into the idea was Robert Watson-Watt, who worked for the Met Office. He was an expert in the practical use of radio waves. He had pioneered the idea of tracking thunderstorms using the radio emissions from lightning as a warning system for planes, developing the idea as early as 1915. This ultimately led to the invention of “Huff-Duff”, shorthand for High Frequency Direction Finding, where radio sources could be accurately tracked from the signals they emitted. That system helped Britain win the U-Boat war in the North Atlantic, as it allowed anti-submarine ships to detect and track U-Boats when they surfaced to use their radio. As a result Huff-Duff helped sink a quarter of the U-Boats that were attacked. That in itself was vital for Britain to survive the siege the U-Boats were enforcing by sinking convoys of supplies from the US.
However, by the 1930s Watson-Watt was working on other applications based on his understanding of radio. His assistant, Arnold Wilkins, quickly proved that the death ray idea would never work, but pointed out that planes seemed to affect radio waves. Together they instead came up with the idea of creating a radio detection system for planes. Many others had played with similar ideas, including German engineers, but no one had made a working system.
Because the French coast was only 20 minutes flying time away the only way to defend against German bombers would be to have planes patrolling the skies constantly. But that required vastly more planes than Britain could possibly build. If planes could be detected from sufficiently far away, then Spitfires could instead be scrambled to intercept them only when needed. That was the plan, but could it be made to work, when so little progress had been made by others?
Watson-Watt and Wilkins set to work making a prototype which they successfully demonstrated could detect a plane in the air (if only when it was close by). It was enough to get them money and a team to keep working on the idea. Watson-Watt followed a maxim of “Give them the third best to go on with; the second best comes too late, the best never comes”. With his radar system he did not come up with a perfect system, but with something that was good enough. His team just used off-the-shelf components rather than designing better ones specifically for the job. Also, once they had something that worked they put it into action. Unlike later, better systems, their original radar system didn’t involve sweeping radar signals that bounced off a plane when the sweep pointed at it, but a radio signal blasted in all directions. The position of the plane was determined by a direction-finding system Watson-Watt designed, based on where the radio signal bounced back from. That meant it took lots of power. However, it worked, and a network of antennas was set up in time for the Battle of Britain. Their radar system, codenamed Chain Home, could detect planes 100 miles away. That gave plenty of time to scramble planes. The real difficulty was actually getting the information to the airfields to scramble the pilots quickly. That was eventually solved with a better communication system.
The Germans were aware of all the antennas appearing along the British coast but decided they must be a communications system. Carrots also helped fool them! You may have heard that carrots help you see in the dark. That was just war-time propaganda invented to explain away the ability of the Brits to detect bombers so soon… a story was circulated that, due to rationing, Brits were eating lots of carrots and so had incredible eye-sight as a result!
The Spitfires and their fighter pilots got all the glory and fame, but without radar they would not even have been off the ground before the bombers had dropped their payloads. Practical electronic engineering, Robert Watson-Watt and Arnold Wilkins were the real unsung heroes of the Battle of Britain.
– Paul Curzon, Queen Mary University of London
Postscript
In the 1950s Watson-Watt was caught speeding by a radar speed trap. He wrote a poem about it:
A Rough Justice
by Sir Robert Watson-Watt
Pity Sir Watson-Watt, strange target of this radar plot
And thus, with others I can mention, the victim of his own invention.
His magical all-seeing eye enabled cloud-bound planes to fly
but now by some ironic twist it spots the speeding motorist
and bites, no doubt with legal wit, the hand that once created it.
My poetry collection, «Αλγόριθμοι Σιωπής» (Algorithms of Silence), explores the quiet, often unseen structures that shape our inner lives. As a computer scientist and a poet, I’m fascinated by the language we use to describe these systems – whether they are emotional, social, or computational.
The following piece is an experiment that embodies this theme. It presents a single core idea – about choice, memory, and predetermination – in three different languages: the original Greek poem “Αυτόματον Μεταβατικόν,” an English transcreation, and a pseudocode version that translates the poem’s philosophical questions into the logic of an automaton.
– Vasileios Klimis, Queen Mary University of London
Transitional Automaton
Once, a decision – small, like a flaw in a cogwheel – tilted the whole system toward a version never written.
In the workshop of habits, every choice left behind a trace of activation; you don’t see it, but it returns like a pulse through a one-way gate.
•
I walk through a matrix of transitions where each state defines the memory of the next. Not infinite possibilities – only those the structure permits.
Is this freedom? Or merely the optimal illusion of a system with elastic rules?
•
In moments of quiet (but not of silence) I feel the null persisting not as absence, but as a repository in waiting. Perhaps that is where it resides, all that was never activated.
•
If there is a continuation, it will resemble a debug session more than a crisis.
Not a moral crisis; a recursion. Who passes down to the final terminal the most probable path?
•
The question is not what we lived. But which of the contingencies remained active when we stopped calculating.
Αυτόματον μεταβατικόν
Κάποτε, μια απόφαση – μικρή, σαν στρέβλωση σε οδοντωτό τροχό – έγερνε το σύνολο προς μια εκδοχή που δεν γράφτηκε ποτέ.
•
Στο εργαστήριο των συνηθειών κάθε επιλογή άφηνε πίσω της ένα ίχνος ενεργοποίησης· δεν το βλέπεις, αλλά επιστρέφει σαν παλμός σε μη αντιστρεπτή πύλη.
•
Περπατώ μέσα σ’ έναν πίνακα μεταβάσεων όπου κάθε κατάσταση ορίζει τη μνήμη της επόμενης. Όχι άπειρες πιθανότητες – μόνον όσες η δομή επιτρέπει. Είναι ελευθερία αυτό; Ή απλώς η βέλτιστη πλάνη ενός συστήματος με ελαστικούς κανόνες;
•
Σε στιγμές σιγής (αλλά όχι σιωπής) νιώθω το μηδέν να επιμένει όχι ως απουσία, αλλά ως αποθήκη αναμονής. Ίσως εκεί διαμένει ό,τι δεν ενεργοποιήθηκε.
•
Αν υπάρξει συνέχεια, θα μοιάζει περισσότερο με debug session παρά με κρίση.
•
Όχι κρίση ηθική· μία αναδρομή. Ποιος μεταβιβάζει στο τερματικό του τέλους το πιο πιθανό μονοπάτι;
•
Η ερώτηση δεν είναι τι ζήσαμε. Αλλά ποιο από τα ενδεχόμενα έμεινε ενεργό όταν εμείς σταματήσαμε να υπολογίζουμε.
Pseudocode Poem version
Pseudocode poems are poems written in pseudocode: the semi-formalised language used for writing algorithms and planning the design of a program. Here is the above poem as a pseudocode poem.
FUNCTION life_automaton(initial_state)
DEFINE State_Transitions AS Matrix;
DEFINE active_path AS Log;
DEFINE potential_paths AS Set = {all_versions_never_written};
current_state = initial_state;
system.log("Initializing in the workshop of habits.");
REPEAT
WAIT FOR event.decision;
// a decision — small, like a flaw in a cogwheel
IF (event.decision.is_subtle) THEN
previous_state = current_state;
current_state = State_Transitions.calculate_next
(previous_state, event.decision);
// it returns like a pulse through a one-way gate
active_path.append(previous_state -> current_state);
potential_paths.remove(current_state.version);
END IF
// Is this freedom? Or merely the optimal illusion
// of a system with elastic rules?
IF (system.isQuiet) THEN
// I feel the null persisting
// not as absence, but as a repository in waiting.
// Perhaps that is where it resides, all that was never activated.
PROCESS potential_paths.contemplate();
END IF
UNTIL system.isTerminated;
// If there is a continuation,
// it will resemble a debug session more than a crisis.
// Not a moral crisis; a recursion.
DEBUG_SESSION.run(active_path);
// The question is not what we lived.
// But which of the contingencies remained active
// when we stopped calculating.
RETURN final_state = active_path.getLast();
END FUNCTION
How can computer scientists improve computer memory, making sure that saving things really is secure? If Vasileios Klimis of Queen Mary University of London’s Theory research group has his way, they will be learning from bats.
Imagine spending hours building the perfect fortress in Minecraft, complete with lava moats and secret passages; or maybe you’re playing Halo, and you’ve just customised your SPARTAN with an epic new helmet. You press ‘Save’, and breathe a sigh of relief. But what happens next? Where does your digital castle or new helmet go to stay safe?
It turns out that when a computer saves something, it’s not as simple as putting a book on a shelf. The computer has lots of different places to put information, and some are much safer than others. Bats are helping us do it better!
The Bat in the Cave
Imagine you’re a bat flying around in a giant, dark cave. You can’t see, so how do you know where the walls are? You let out a loud shout!
SQUEAK!
A moment later, you hear the echo of your squeak bounce back to you. If the echo comes back really, really fast, you know the wall is very close. If it takes a little longer, you know the wall is further away. By listening to the timing of your echoes, you can build a map of the entire cave in your head without ever seeing it. This is called echolocation.
It turns out we can use this exact same idea to “see” inside a computer’s memory!
Fast Desks and Safe Vaults
A computer’s memory is a bit like a giant workshop with different storage areas.
There’s a Super-Fast Desk right next to the computer’s brain (the CPU). This is where it keeps information it needs right now. It’s incredibly fast to grab things from this desk, but there’s a catch: if the power goes out, everything on the desk is instantly wiped away and forgotten! If your data is here, it is not safe!
Further away, there’s a Big, Safe Vault. It takes a little longer to walk to the vault to store or retrieve things. But anything you put in the vault is safe, even if the power goes out. When you turn the computer back on, the information is still there.
When you press ‘Save’ in your game, you want your information to go from the fast-but-forgetful desk to the slower-but-safe vault. But how can we be sure it got there? We can’t just open up the computer and look!
Shouting and Listening for Echoes
This is where we use our bat’s trick. To check where a piece of information is, a computer scientist can tell the computer to do two things very quickly:
SHOUT! First, it “shouts” by writing a piece of information, like your game score.
LISTEN! Immediately after, it tries to read that same piece of information back. This is like listening for the “echo”.
If the echo comes back almost instantly, we know the information is still on the Super-Fast Desk nearby. But if the echo takes a little longer, it means the information had to travel all the way to the Big, Safe Vault and back!
By measuring the time of that echo, computer scientists can tell exactly where the write went. We can confirm that when you pressed ‘Save’, your information really did make it to the safe place.
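The real measurements need very precise, low-level timing of writes to cache and non-volatile memory, which you cannot do from a high-level language. But as a rough analogy only (not the researchers’ actual technique), you can get a feel for the idea by timing a save to ordinary memory, the fast desk, against a save forced all the way out to disk, standing in for the safe vault.

# A rough ANALOGY of "memory echolocation" timing, not the real technique:
# a Python dict (RAM) stands in for the fast desk and a file forced out to
# disk stands in for the safe vault. Closer storage answers faster.

import os, tempfile, time

fast_desk = {}                                     # data kept in RAM
vault_path = os.path.join(tempfile.gettempdir(), "vault.txt")

def time_it(action):
    start = time.perf_counter()
    action()
    return time.perf_counter() - start

def save_to_desk():
    fast_desk["score"] = "9001"

def save_to_vault():
    with open(vault_path, "w") as f:
        f.write("9001")
        f.flush()
        os.fsync(f.fileno())                       # insist it really hits the disk

print(f"Desk  echo: {time_it(save_to_desk):.6f} s")   # comes back almost instantly
print(f"Vault echo: {time_it(save_to_vault):.6f} s")  # noticeably slower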
The Real Names
In computer science, we have official names for these ideas:
The Super-Fast Desk is called the Cache.
The Big, Safe Vault is called Non-Volatile Memory (or NVM for short), which is a fancy way of saying it doesn’t forget when the power is off.
The whole system of close and far away memory is the Memory Hierarchy.
And this cool trick of shouting and listening is what we call Memory Echolocation.
So next time you save a game, you can imagine the computer shouting a tiny piece of information into its own secret cave and listening carefully for the echo to make sure your progress is safe and sound.
– Vasileios Klimis, Queen Mary University of London
Why should AI tools explain why? Erhan Pisirir and Evangelia Kyrimi, researchers at Queen Mary University of London, explain why.
From the moment we start talking, we ask why. A three-year-old may ask fifty “whys” a day. ‘Why should I hold your hand when we cross the road?’ ‘Why do I need to wear my jacket?’ Every time their parent provides a reason, the toddler learns and makes sense of the world a little bit more.
Even when we are no longer toddlers trying to figure out why the spoon falls on the ground and why we should not touch the fire, it is still in our nature to question the reasons. The decisions and the recommendations given to us have millions of “whys” behind them. A bank might reject our loan application. A doctor might urge us to go to hospital for more tests. And every time, our instinct is to ask the same question: Why? We trust advice more when we understand it.
Nowadays the advice and recommendations come not only from other humans but also from computers with artificial intelligence (AI), such as a bank’s computer systems or health apps. Now that AI systems are giving us advice and making decisions that affect our lives, shouldn’t they also explain themselves?
That’s the promise of Explainable AI: building machines that can explain their decisions or recommendations. These machines must be able to say what is decided, but also why, in a way we can understand.
From trees to neurons
For decades we have been trying to make machines think for us. A machine does not have the thinking, or reasoning, abilities of humans, so we need to give it instructions on how to think. When computers were less capable, these instructions were simpler. For example, they could look like a tree: think of a tree where each branch is a question with several possible answers, and each answer creates a new branch. Do you have a rash? Yes. Do you have a temperature? Yes. Do you have nausea? Yes. Are the spots purple? Yes. If you push a glass against them do they fade away? No… Go to the hospital immediately.
The tree of decisions naturally gives whys connected to the tips of the paths taken: you should go to the hospital because your collection of symptoms (a rash of purple spots, a temperature and nausea), and especially the fact that the spots do not fade under a glass, mean it is likely you have Meningitis. Because it is life-threatening and can get worse very quickly, you need to get to a hospital urgently. An expert doctor can check reasoning like this and decide whether that explanation is actually good reasoning about whether someone has Meningitis or not, or, more to the point, should rush to the hospital.
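The toy tree above can be written directly as nested questions in code. This is only the article’s simplified example turned into a Python sketch, not real medical advice and not how real diagnosis works.

# The article's toy decision tree written as questions in code. This is only
# the simplified example above, NOT real medical advice.

def toy_advice(rash, temperature, nausea, purple_spots, fades_under_glass):
    if rash and temperature and nausea and purple_spots and not fades_under_glass:
        return ("Go to the hospital immediately: these symptoms together "
                "suggest meningitis, which can get worse very quickly.")
    return "This toy tree gives no answer; see a doctor if you are concerned."

print(toy_advice(rash=True, temperature=True, nausea=True,
                 purple_spots=True, fades_under_glass=False))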
Over time, humans made computers capable of much more complex tasks. With this, their thinking instructions became more complex too. Nowadays they might look like more complicated networks instead of trees with branches. They might look like a network of neurons in a human brain, for example. These complex systems make computers great at answering more difficult questions successfully. But unlike with a tree of decisions, humans can no longer understand how the computer reaches its final answer at a glance of its system of thinking. It is no longer the case that following a simple path of branches through a decision tree gives a definite answer, never mind a why. Now there are loops and backtracks, splits and joins, and the decisions depend on weightings of answers, not just a definite Yes or No. For example, with Meningitis, according to the NHS website, there are many more symptoms than above and they can appear in any order or not at all. There may not even be a rash, or the rash may fade when pressure is applied. It is complicated and certainly not as simple as our decision tree suggests (the NHS says “Trust your instincts and do not wait for all the symptoms to appear or until a rash develops. You should get medical help immediately if you’re concerned about yourself or your child.”) Certainly, the situation is NOT simple enough to say from a decision tree, for example, “Do not worry, you do not have Meningitis because your spots are not purple and did fade in the glass test”. An explanation like that could kill someone. The decision has to be made from a complex web of inter-related facts. AI tools require you to just trust their instincts!
Let us, for a moment, forget about branches and networks, and imagine that AI is a magician’s hat: something goes in (a white handkerchief) and something else at the tap of a wand magically pops out (a white rabbit). With a loan application, for example, details such as your age, income, or occupation go in, and a decision comes out: approved or rejected.
Inside the magician’s hat
Nowadays researchers are trying to make the magician’s hat transparent so that you can have a sneak peek of what is going on in there (it shouldn’t seem like magic!). Was the rabbit in a secret compartment, did the magician move it from the pocket and put it in at the last minute or did it really appear out of nowhere (real magic)? Was the decision based on your age or income, or was it influenced by something that should be irrelevant like the font choice in your application?
Currently, explainable AI methods can answer different kinds of questions (though not always effectively), as the toy loan sketch after these examples hints:
Why: Your loan was approved because you have a regular income record and have always paid back loans in the past.
Why not: Your loan application was rejected because you are 20 years old and are still a student.
What if: If you earned £1000 or more each month, your loan application would not have been rejected.
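Here is a toy Python sketch of those three kinds of explanation for the loan example. The rule and the £1000 threshold are invented purely for illustration and have nothing to do with how any real bank, or any real explainable AI method, works.

# Toy sketch of "why", "why not" and "what if" for a made-up loan rule.
# The rule and the threshold are invented purely for illustration.

MIN_MONTHLY_INCOME = 1000

def decide(monthly_income, always_repaid_before):
    """Return the decision plus the reasons behind it."""
    reasons = []
    if monthly_income >= MIN_MONTHLY_INCOME:
        reasons.append(f"income of £{monthly_income} meets the £{MIN_MONTHLY_INCOME} minimum")
    else:
        reasons.append(f"income of £{monthly_income} is below the £{MIN_MONTHLY_INCOME} minimum")
    if always_repaid_before:
        reasons.append("past loans were always repaid")
    approved = monthly_income >= MIN_MONTHLY_INCOME and always_repaid_before
    return approved, reasons

approved, reasons = decide(monthly_income=800, always_repaid_before=True)
print("Approved" if approved else "Rejected", "because:", "; ".join(reasons))

# "What if": the smallest change that would flip the decision.
if not approved:
    print(f"What if: with an income of £{MIN_MONTHLY_INCOME} or more, "
          "the application would have been approved.")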
Researchers are inventing many different ways to give these explanations: for example, heat maps that highlight the most important pixels in an image, lists of pros and cons that show the factors for and against a decision, visual explanations such as diagrams or highlights, or natural-language explanations that sound more like everyday conversations.
What explanations are good for
The more interactions people have with AI, the more we see why AI explanations are important.
Understanding why AI made a specific recommendation helps people TRUST the system more; for example, doctors (or patients) might want to know why AI flagged a tumour before acting on its advice.
The explanations might expose whether AI recommendations involve discrimination and bias, increasing FAIRNESS. Think about the loan rejection scenario again: what if the explanation shows that the reason for the AI’s decision was your race? Is that fair?
The explanations can help researchers and engineers with DEBUGGING, helping them understand and fix problems with AI faster.
AI explanations are also becoming more and more required by LAW. The General Data Protection Regulation (GDPR) gives people a “right to explanation” for some automated decisions, especially in high-stakes areas such as healthcare and finance.
The convincing barrister
One thing to keep in mind is that the presence of explanations does not automatically make an AI system perfect. Explanations themselves can be flawed. The biggest catch is when an explanation is convincing when it shouldn’t be. Imagine a barrister with charming social skills who can spin a story and let a clearly guilty client walk free. AI explanations should not aim to be blindly convincing whether the AI is right or wrong. In the cases where the AI got it all wrong (and from time to time it will), the explanations should make this clear rather than falsely reassuring the human.
The future
Explainable AI isn’t an entirely new concept. Decades ago, early expert systems in medicine already included “why” buttons to justify their advice. But only in recent years has explainable AI become a major trend, because AI systems have become more powerful and concerns have grown about AI surpassing human decision-making while potentially making some bad decisions.
Researchers are now exploring ways to make explanations more interactive and human friendly, similar to how we can ask ChatGPT questions like ‘what influenced this decision the most?’ or ‘what would need to change for a different outcome?’ They are trying to tailor the explanation’s content, style and representation to the users’ needs.
So next time AI makes a decision for you, ask yourself: could it tell me why? If not, maybe it still has some explaining to do.
–Erhan Pisirir and Evangelia Kyrimi, Queen Mary University of London
A perceptron winter: Winter image by Nicky ❤️🌿🐞🌿❤️ from Pixabay. Perceptron and all other images by CS4FN.
Back in the 1960s there was an AI winter… after lots of hype about how Artificial Intelligence tools would soon be changing the world, the reality fell short of the hype and the bubble burst: funding disappeared and progress stalled. One of the things that contributed was a simple theoretical result about the apparent shortcomings of a little device called a perceptron. It was the computational equivalent of an artificial brain cell and all the hype had been built on its shoulders. Now, variations of perceptrons are the foundation of the neural networks and machine learning tools which are taking over the world… so what went wrong in the 1960s? A much misunderstood mathematical result about what a perceptron can and can’t do was part of the problem!
The idea of a perceptron dates back to the 1940s but Frank Rosenblatt, a researcher at Cornell Aeronautical Laboratory, first built one in 1958 and so popularised the idea. A perceptron can be thought of as a simple gadget, or as an algorithm for classifying things. The basic idea is that it has lots of inputs of 0s or 1s and one output, also 0 or 1 (so it is equivalent to taking true / false inputs and returning a true / false output). So, for example, a perceptron working as a classifier of whether something is a mammal or not might have inputs representing lots of features of an animal. These would be coded as 1 to mean that feature was true of the animal or 0 to mean false: INPUT: “A cow gives birth to live young” (true: 1), “A cow has feathers” (false: 0), “A cow has hair” (true: 1), “A cow lays eggs” (false: 0), etc. OUTPUT: (true: 1) meaning a cow has been classified as a mammal.
A perceptron makes decisions by applying weightings to all the inputs that increase the importance of some, and lessen the importance of others. It then adds the results together, also adding in a fixed value, the bias. If the sum it calculates is greater than or equal to 0 then it outputs 1, otherwise it outputs 0. Each perceptron has different values for the bias and the weightings, depending on what it does. A simple perceptron is just computing the following bit of code for inputs in1, in2, in3 etc (where we use a full stop to mean multiply):
IF bias + w1.in1 + w2.in2 + w3.in3 ... >= 0
THEN OUTPUT 1
ELSE OUTPUT 0
Because it uses binary (1s and 0s), this version is called a binary classifier. You can set a perceptron’s weights, essentially programming it to do a particular job, or you can let it learn the weightings (by applying learning algorithms to them). In the latter case it learns for itself the right answers. Here, we are interested in the fundamental limits of what perceptrons could possibly learn to do, so we do not need to focus on the learning side, just on what a perceptron’s limits are. If we can’t program it to do something then it can’t learn to do it either!
Machines made of lots of perceptrons were created and experiments were done with them to show what AIs could do. For example, Rosenblatt built one called Tobermory with 12,000 weights designed to do speech recognition. However, you can also explore the limits of what can be done computationally through theory: using maths and logic, rather than just by invention and experiments, and that kind of theoretical computer science was what others did about perceptrons. A key question in theoretical computer science about computers is “What is computable?” Can your new invention compute anything a normal computer can? Alan Turing had previously proved an important result about the limits of what any computer could do, so what about an artificial intelligence made of perceptrons? Could it learn to do anything a computer could or was it less powerful than that?
As a perceptron is something that takes 1s and 0s and returns a 1 or 0, it is a way of implementing logic: AND gates, OR gates, NOT gates and so on. If it can be used to implement all the basic logical operators then a machine made of perceptrons can do anything a computer can do, as computers are built up out of basic logical operators. So that raises a simple question: can you actually implement all the basic logical operators with appropriately set perceptrons? If not, then no perceptron machine will ever be as powerful as a computer made of logic gates! Two of the giants of the area, Marvin Minsky and Seymour Papert, investigated this. What they discovered contributed to the AI winter (but only because the result was misunderstood!)
Let us see what it involves. First, can we implement an AND gate with appropriate weightings and bias values with a perceptron? An AND gate has the following truth table, so that it only outputs 1 if both its inputs are 1:
Truth table for an AND gate
So to implement it with a perceptron, we need to come up with a positive or negative number for the bias, and other numbers for w1 and w2 that weight the two inputs. The numbers chosen need to lead to it giving output 1 only when the two inputs (in1 and in2) are both 1, and otherwise giving output 0.
See if you can work out the answer before reading on.
A perceptron for an AND gate needs values set for bias, w1 and w2
It can be done by setting the value of the bias to -2 and making both weightings, w1 and w2, value 1. Then, because the two inputs, in1 and in2, can only be 1 or 0, it takes both inputs being 1 to overcome the bias of -2 and so raise the sum up to 0:
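Here is the same perceptron written as a few lines of Python, so you can check that a bias of -2 and weights of 1 and 1 really do behave like an AND gate. It is just a sketch of the definition above.

# The article's perceptron in Python, checking that bias = -2 and
# weights of 1 and 1 behave like an AND gate.

def perceptron(bias, weights, inputs):
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= 0 else 0

AND_BIAS, AND_WEIGHTS = -2, (1, 1)
for in1 in (0, 1):
    for in2 in (0, 1):
        print(in1, in2, "->", perceptron(AND_BIAS, AND_WEIGHTS, (in1, in2)))
# Prints 0, 0, 0, 1: only when both inputs are 1 does the sum reach 0.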
So far so good. Now, see if you can work out weightings to make an OR gate and a NOT gate.
Truth table for an OR gateTruth table for a NOT gate
It is possible to implement both OR and NOT gate as a perceptron (see answers at the end).
However, Minsky and Papert proved that it was impossible to create another kind of logical operator, an XOR gate, with any values of bias and weightings in a perceptron. This is a logic gate that outputs 1 if its inputs are different, and outputs 0 if its inputs are the same.
Truth table for an XOR gate
Can you prove it is impossible?
They had seemingly shown that a perceptron could not compute everything a computer could. Perceptrons were not as expressive, so not as powerful (and never could be as powerful) as a computer. There were things they could never learn to do, as there were things as simple as an XOR gate that they could not represent. This led some to believe the result meant AIs based on perceptrons were a dead end. It was better to just work with traditional computers and traditional computing (which by this point were much faster anyway). Along with the way that the promises of AI had been over-hyped with exaggerated expectations, and the fact that the applications that had emerged so far had been fairly insignificant, this seemingly damning theoretical blow on top of it all led to funding for AI research drying up.
However, as current machine learning tools show, it was never that bad. The theoretical result had been misunderstood, and research into neural networks based on perceptrons eventually took off again in the 1990s.
Minsky and Papert’s result is about what a single perceptron can do, not about what multiple ones can do together. More specifically, if you have perceptrons in a single layer, each with its own inputs feeding its own output, the theoretical limitations apply. However, if you make multiple layers of perceptrons, with the outputs of one layer of perceptrons feeding into the next, the negative result no longer applies. After all, we can make AND, OR and NOT gates from perceptrons, and by wiring them together so the outputs of one are the inputs of the next, we can build an XOR gate just as we can with normal logic gates!
An XOR gate from layers of perceptrons set as AND, OR and NOT operators
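As a sketch, here is one way to wire AND, OR and NOT perceptrons into two layers to get XOR. The weights used for OR and NOT are one possible choice (the article’s own answers are at the end).

# One way to wire AND, OR and NOT perceptrons into two layers to build XOR.
# The bias and weight values are just one possible choice.

def perceptron(bias, weights, inputs):
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= 0 else 0

def AND(a, b): return perceptron(-2, (1, 1), (a, b))
def OR(a, b):  return perceptron(-1, (1, 1), (a, b))
def NOT(a):    return perceptron(0, (-1,), (a,))

def XOR(a, b):
    # Layer 1: OR(a, b) and NOT(AND(a, b)); Layer 2: AND them together.
    return AND(OR(a, b), NOT(AND(a, b)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", XOR(a, b))   # prints 0, 1, 1, 0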
We can therefore build an XOR gate from perceptrons. We just need multi-layer perceptrons, an idea that was actually known about in the 1960s including by Minsky and Papert. However, without funding, making further progress became difficult and the AI winter started where little research was done on any kind of Artificial Intelligence, and so little progress was made.
The theoretical result about the limits of what perceptrons could do was an important and profound one, but the limitations of the result needed to be understood too, and that means understanding the assumptions it is based on (it is not about multi-layer perceptrons). Now AI is back, though arguably being over-hyped again, so perhaps we should learn from the past! Theoretical work on the limits of what neural networks can and can’t do is an active research area that is as vital as ever. Let’s just make sure we understand what results mean before we jump to any conclusions. Right now theoretical results about AI need more funding, not a new winter!
– Paul Curzon, Queen Mary University of London
This article is based on an introductory segment of a research seminar on the expressive power of graph neural networks by Przemek Walega, Queen Mary University of London, October 2025.