A simple Bayesian network for having a virus

Computer tools based on what are called “Bayesian networks” give accurate ways to determine how likely things are. For example, they give a good way, based on evidence, to determine how likely a given person has COVID. As you collect more evidence, the probability the network gives becomes more accurate.

A Bayesian network for having a virus

How likely is it that you have COVID? There is lots of evidence you might collect to decide whether or not you do. It causes many (but not all) people to cough. So if you do have a cough, that is useful evidence. Other things like flu, however, also cause people to cough. Catching COVID is also known to be caused by breathing the same air as infected people. The more socialising you have done the higher the chance you have caught it, but also the more people in your area with the disease the more likely it is that you have caught it by socialising. You can also take a test – having COVID will cause a positive result. However, tests are not fully accurate, so even with a positive test you may or may not have COVID…

Deciding how likely it is you have COVID relies on knowing lots of facts about the causes of COVID and about the symptoms it causes. It also relies on knowing the probabilities of things such as how likely it is that COVID causes a cough. Finally it relies on knowing lots of facts about you such as whether you have had a positive test result or not.

A Bayesian network is just a way of drawing a diagram that collects all this information in one place. Once created, it can be used to determine how likely something is to be true, such as whether you have COVID, based on known facts, known causes, and the chances of one thing causing another. It gives a powerful way to reason about these facts and probabilities based on “causal relationships”. That reasoning allows accurate probabilities to be calculated about the things you are interested in knowing. Given I have a cough and no other symptoms, have had a negative test but have recently socialised outside my family, am I 80 per cent certain I have COVID or is the chance I have it only 2 per cent?

We can take all the evidence for and against our having the virus and draw a Bayesian network as shown in the diagram. For each bubble the percentages show the chance that for a random person in the population this thing is currently true. Arrows show which things can cause others. So, in the diagram, this means that 0.5 per cent of the population currently have the virus (as 1 in 200 have the virus, so probability 0.005, and to turn a probability into a percentage you just multiply by 100); 0.4 per cent of the population have been in recent contact with an infected person; 10 per cent have a cough; 2 per cent have flu, and so on. This is all general evidence we can collect about the country as a whole. (Note we have made up these numbers for the example as they may change over time, but they are the kinds of data scientists collect to help policy makers make decisions.)

The model also includes probabilities not shown, like the chance of a person getting the virus if they have been in recent contact with an infected person and the probability of a positive test depending on whether they do, or do not, have the virus. Once a particular Bayesian network like this has been created it can form the basis of a decision making tool that does all the calculations.

We then want to know about you. Do you have a cough, have you lost your sense of taste or smell, what was the result of your test, and have you been in contact with an infected relative? From this information, we can update the probabilities in the Bayesian network (using a theorem called Bayes’ theorem) to give a new probability for how likely it is that you have the virus. Computer software can do this for us, though the more complicated the Bayesian network, the longer it takes to do all the calculations.
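
To make the updating step concrete, here is a minimal sketch of a four-bubble network (virus, flu, cough, test result) that works out the chance of having the virus by adding up every possible state that matches the evidence. The 0.5 per cent and 2 per cent figures come from the articles, and the test is assumed, as in the companion Bayes' theorem article, to always be positive for someone with the virus. The cough probabilities in P_COUGH are invented purely for illustration, as are all the function names; real tools use far larger networks and cleverer algorithms.

```python
from itertools import product

# Figures used in the articles: 1 in 200 have the virus, 2 per cent have flu,
# and the test has a 2 per cent false positive rate.
P_VIRUS = 0.005
P_FLU = 0.02
# ASSUMPTION (from the companion article): everyone with the virus tests positive.
P_POSITIVE = {True: 1.0, False: 0.02}  # P(positive test | has virus?)

# ASSUMPTION: made-up chances of a cough given (has virus?, has flu?).
P_COUGH = {
    (True, True): 0.9,
    (True, False): 0.7,
    (False, True): 0.6,
    (False, False): 0.08,
}


def bernoulli(p_true, value):
    """Probability that a yes/no variable with P(yes) = p_true takes `value`."""
    return p_true if value else 1.0 - p_true


def joint(virus, flu, cough, positive):
    """Probability of one complete state of the network."""
    return (
        bernoulli(P_VIRUS, virus)
        * bernoulli(P_FLU, flu)
        * bernoulli(P_COUGH[(virus, flu)], cough)
        * bernoulli(P_POSITIVE[virus], positive)
    )


def prob_virus(**evidence):
    """P(virus | evidence), summing the joint over all states matching the evidence."""
    names = ("virus", "flu", "cough", "positive")
    with_virus = total = 0.0
    for values in product([True, False], repeat=len(names)):
        state = dict(zip(names, values))
        if any(state[name] != seen for name, seen in evidence.items()):
            continue  # this state contradicts what we know about you
        p = joint(**state)
        total += p
        if state["virus"]:
            with_virus += p
    return with_virus / total


print(f"No evidence:             {prob_virus():.3f}")
print(f"Cough only:              {prob_virus(cough=True):.3f}")
print(f"Positive test only:      {prob_virus(positive=True):.3f}")
print(f"Cough and positive test: {prob_virus(cough=True, positive=True):.3f}")
```

Each extra piece of evidence changes the answer, which is the sense in which the network gives a personalised probability. Checking every state like this doubles in cost with each bubble added, which hints at why bigger networks take longer to compute.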

The result, though, is that the computer can give you a personalised risk assessment of how likely it is that you have the virus based on the specific evidence about you. You can find such a comprehensive personal COVID risk calculator, based on a Bayesian network with much more data, at covid19.apps.agenarisk.com/

– Norman Fenton and Paul Curzon, Queen Mary University of London, Spring 2021

Download Issue 27 of the cs4fn magazine on Smart Health here.

This post and issue 27 of the cs4fn magazine have been funded by EPSRC as part of the PAMBAYESIAN project.

Here

A mannequin in shadow, arms spread wide waiting to be dressed
Image by Zaccaria Boschetti from Pixabay

Amy Dowse wondered if an app might help people suffering with anxiety. One way to overcome panic attacks is a mindfulness technique where you focus on the here and now – your surroundings rather than your internal feelings. For her university MSc project, she created an app to help people do this, called Here. It prompts you to look for coloured objects in the real world then use them to build a picture in the app. For example, you look at the colour of the clothes that people around you are wearing and try to fully dress a figure on the app using what you see.

– Paul Curzon, Queen Mary University of London, Spring 2021

A graphical explanation of Bayes’ theorem

A diagrammatic proof

If you take a test how do you work out how likely it is that you have the virus? Bayesian reasoning is one way (see “What are the chances of that”). Here is a graphical version of what that kind of reasoning is actually about.

If recent data shows that the virus currently affects one in 200 of the population, then it is reasonable to start with the assumption that the probability YOU have the virus is one in 200 (we call this the ‘prior probability’). Another way of saying that is that the prior probability is 0.5 per cent.

A better estimate

Suppose the probability a random person has the virus is 1 in 200 or 0.5 per cent. With no other evidence, your best guess that you have the virus is then also 0.5 per cent. You have also taken a test, however, and it was positive. The test is not perfect: for every 100 people taking it, 2 will test positive when they actually do NOT have the virus. This means that the false positive rate is 2 per cent.

How? Bayes worked out a general equation for calculating this new, more accurate probability, called the ‘posterior’ probability (see page 8). It is based, here, on the probability of having the virus before testing (the original, prior probability) and any new evidence, which here is the test result.

A surprising result

How likely is it that you have the virus? With only this evidence, the probability you have the virus is still only 20 per cent.
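
One way to see where the 20 per cent comes from is to count people rather than juggle probabilities. The sketch below does that for an imagined group of 1000 people. The group size and the function name are just choices made for illustration, and it assumes (as the companion article does) that everyone who has the virus tests positive.

```python
def count_positives(population, prevalence, false_positive_rate, sensitivity=1.0):
    """Return (true positives, false positives) expected in a tested population.

    ASSUMPTION: sensitivity defaults to 1.0, meaning everyone who has the
    virus tests positive.
    """
    infected = population * prevalence
    healthy = population - infected
    true_positives = infected * sensitivity
    false_positives = healthy * false_positive_rate
    return true_positives, false_positives


# 1 in 200 have the virus; 2 per cent of uninfected people wrongly test positive.
true_pos, false_pos = count_positives(1000, prevalence=0.005, false_positive_rate=0.02)
chance_infected = true_pos / (true_pos + false_pos)

print(f"Positive and infected:     {true_pos:.1f}")
print(f"Positive but not infected: {false_pos:.1f}")
print(f"Chance a positive result means infection: {chance_infected:.1%}")
```

Roughly 5 of the 1000 people are infected and test positive, while about 20 of the 995 uninfected people also test positive, so only around 5 in every 25 positive results belong to someone who actually has the virus.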

– Norman Fenton, Queen Mary University of London, Spring 2021

What are the chances of that? The church minister’s hobby and clever machines

The hobby of a church minister over 250 years ago is helping computers make clever decisions.

Crowd of ghost people
Image by Free-photos from Pixabay

Thomas Bayes was an English church minister who died in 1761. His hobby was a type of maths that today we call probability and statistics, though his writings were never really recognised during his own lifetime. So, how is the hobby of this 18th century church minister driving computers to become smarter than ever? His work is now being used in applications as varied as: helping to diagnose and treat various diseases; deciding whether a suspect’s DNA was at a crime scene; accurately recommending which books and films we will like; setting insurance premiums for rare events; filtering out spam emails; and more.

How likely is that?

Bayes was interested in calculating how likely things were to happen (their probability) and particularly things that cannot be observed directly. Suppose, for example, you want to know the probability that you have an infectious virus, something you can’t just tell by looking. Perhaps you’re going to a concert of your favourite band – one for which you’ve already paid a lot of money. So you need to know you are not infected. If recent data shows that the virus currently affects one in 200 of the population, then it is reasonable to start with the assumption that the probability YOU have the virus is one in 200 (we call this the ‘prior probability’). Another way of saying that is that the prior probability is 0.5 per cent.

A better estimate

However, you can get a much better estimate of how likely it is that you have the virus if you can gather more evidence of your personal situation. With a virus you can get tested. If the test was always correct, then you would know for certain. Tests are never perfect though. Let’s suppose that for every 100 people taking the test, two will test positive when they actually do NOT have the virus. Scientists call this the false positive rate: here two per cent. You take the test and it is positive. You can use this information to get a better idea of the likelihood you have the virus.

How? Bayes worked out a general equation for calculating this new, more accurate probability, called the ‘posterior’ probability. It is based, here, on the probability of having the virus before testing (the original, prior probability) and any new evidence, which here is the test result.

A surprising result

If we assume in our example that every person who does have the virus is certain to test positive then plugging the numbers into Bayes’ theorem tells us there is actually a surprisingly low, one in five (i.e., 20 per cent) chance you have the virus after testing positive. See A Graphical Explanation of Bayes’ theorem for why the answer is correct. Although this is much higher than the probability of having the virus without testing (0.5 per cent), it still means you are unlikely to have the virus despite the positive test result!
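
For readers who would like to see the numbers plugged in, here is the calculation written out. The notation is our own shorthand rather than anything from the original article: V stands for "you have the virus" and + for "your test is positive".

```latex
\begin{align*}
P(V \mid +) &= \frac{P(+ \mid V)\,P(V)}{P(+ \mid V)\,P(V) + P(+ \mid \neg V)\,P(\neg V)} \\
            &= \frac{1 \times 0.005}{1 \times 0.005 + 0.02 \times 0.995} \\
            &= \frac{0.005}{0.0249} \approx 0.20
\end{align*}
```

The top of the fraction covers people who have the virus and test positive. The bottom covers everyone who tests positive, whether infected or not, and it is the large number of uninfected people that keeps the final answer low.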

If you understand Bayes’ theorem, you might feel it unfair if your doctor still insists that you have the virus and must miss the trip. In fact, many people find the result very surprising; generally, doctors who do not know Bayes’ theorem massively overestimate the likelihood that patients have a disease after a positive test result. But that is why Bayes’ theorem is so important.

To go or not to go

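Suppose 1,000 concert goers each take the test. Around five of them have the virus and test positive, but about 20 of the 995 uninfected people (two per cent) also test positive, making 25 positive results in all. Of course, no one knows which five of those 25 are the ones infected. If all 25 ignore their doctor that means there are five people mingling in the crowd, passing on the virus, which would mean lots more people catch the virus who pass it on to lots more, who … (see Ping pong vaccination).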

We have seen that, with a little extra information (such as a test result), we can work out a more accurate probability and so have better information upon which to make decisions. In practice, there are many different kinds of information that we can use to improve our estimate of the real probability. There are symptoms such as lack of taste/smell which are quite specific to the virus. Others, like a cough, are common in people with the virus but also in people with flu. There are also factors that can cause a person to have the virus in the first place such as close contact with an infected relative. So, instead of just inferring the probability of having the virus from one piece of information, like the test result, we can consider lots of interconnected data, each with its own prior probability. This is where computers come in: to do all the calculations for us.

We first need to tell the computer about what causes what. A convenient way to do this is to draw a diagram of the connections and probabilities called a ‘Bayesian network’ (see A Simple Bayesian Network – to come). Once a computer has been given the Bayesian network, it can not only work out more accurate probabilities, but it can also use them to start making decisions for us. This is where all those applications come in. Deciding whether a suspect’s DNA was at a crime scene, for example, needs the same kind of reasoning as deciding whether you have the virus.

Obviously, it is more complex to apply Bayes’ theorem in realistic situations and, until quite recently, even the fastest computers weren’t able to do the calculations. However, breakthroughs by computer scientists developing new algorithms mean that very complex Bayesian networks, with lots of inter-connected causes, can now be computed efficiently. Because of this, Bayesian networks can now be applied to a multitude of important problems that were previously impossible to solve. And that is why, perhaps surprisingly, the ideas of Thomas Bayes, from over 250 years ago, are showing us how to build machines that make smarter decisions when things are uncertain.

– Norman Fenton, Queen Mary University of London, Spring 2021

The ping pong vaccination programming challenge

Vaccination programmes work best when the majority of the population are vaccinated. One way scientists simulate the effects of disease and vaccination programmes is by using computer simulations. But what is a computer simulation?

Lots of multi-coloured ping pong balls

You can visualise what a simulation is with ping pong balls bouncing around a crowd. Imagine having a large room full of people. A virus is represented by a ping pong ball, bouncing from person to person, infecting each person it touches. Each person who is hit by a ping pong ball and not already infected becomes infected. That means they toss that ping pong ball back into the crowd to infect more people, but they also toss an extra one too (and then they sit down: dead). Start with a few ping pong balls. Quickly the virus spreads everywhere and lots of people sit down (die). You have run a physical simulation of how a virus spreads!

Now start again but ‘vaccinate’ 80 per cent of the people first: give them a baseball cap to wear to show who is who. If those people get a ping pong ball, they just destroy it: they infect no one else. Start with the same number of ping pong balls. This time, the virus quickly dies out and only a few people sit down (die). Not only are the vaccinated people protected but they protect many of the unprotected people too who might have died.

Now (if you can program) you can write a program to do the same thing, and so simulate and explore the spread of infection, which is perhaps easier than getting a thousand people to chuck ping pong balls about. Create a grid (an array) of 1000 cells. Each represents a person. They can be infected or not. They can also be vaccinated or not. Start by marking five random cells as infected. Then run a series of rounds. In each round, every newly infected cell randomly chooses two others to infect. If a chosen cell is not already infected and not vaccinated, it becomes newly infected. If it is already infected or vaccinated, it does not pass the infection on.
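
If you would like a starting point, here is one possible sketch in Python. It is only one of many ways to meet the challenge. The function name, its parameters and the optional seed (which makes a run repeatable) are our own choices, and it assumes the first five infections land on unvaccinated people.

```python
import random


def simulate(population=1000, initially_infected=5, vaccinated_fraction=0.0,
             contacts=2, seed=None):
    """Run the ping pong simulation and return how many people were ever infected."""
    rng = random.Random(seed)

    # Hand out the 'baseball caps': mark a random set of people as vaccinated.
    vaccinated = [False] * population
    for person in rng.sample(range(population), round(population * vaccinated_fraction)):
        vaccinated[person] = True

    # ASSUMPTION: the starting infections are chosen among unvaccinated people.
    unvaccinated = [p for p in range(population) if not vaccinated[p]]
    newly_infected = rng.sample(unvaccinated, min(initially_infected, len(unvaccinated)))
    infected = [False] * population
    for person in newly_infected:
        infected[person] = True

    # Each round, everyone infected in the previous round picks others to infect.
    while newly_infected:
        next_round = []
        for person in newly_infected:
            picks = rng.sample(range(population), contacts + 1)
            targets = [t for t in picks if t != person][:contacts]  # never pick yourself
            for target in targets:
                if not infected[target] and not vaccinated[target]:
                    infected[target] = True
                    next_round.append(target)
        newly_infected = next_round

    return sum(infected)


print("Nobody vaccinated:      ", simulate(seed=1), "infected")
print("80 per cent vaccinated: ", simulate(vaccinated_fraction=0.8, seed=1), "infected")
```

Try running it a few times without the seed: the exact totals change from run to run, but the difference between the two scenarios should stay easy to spot.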

You can run lots of different experiments with different conditions. For example, experiment with different proportions of people infected at the start or explore what percentage of people need to be vaccinated for the virus to quickly die out. Is 50 per cent enough? You could also change how many people one person infects, or for how long a person can infect others before dying. Perhaps they each keep causing new infections for three rounds before stopping instead of only one. In what situations does the virus infect lots of people and when does it die out quickly?
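
Experiments like these can be sketched as a loop that tries several vaccination levels and averages over a number of runs, since any single run depends on luck. The version below is a hypothetical, self-contained example: the names outbreak_size and average_size, the 20 runs per setting and the particular percentages tried are all our own choices, and it again assumes the first infections land on unvaccinated people.

```python
import random


def outbreak_size(vaccinated_fraction, contacts=2, infectious_rounds=1,
                  population=1000, initially_infected=5, rng=random):
    """Simulate one outbreak and return how many people were ever infected."""
    vaccinated = set(rng.sample(range(population),
                                round(population * vaccinated_fraction)))
    # ASSUMPTION: the starting infections are chosen among unvaccinated people.
    unvaccinated = [p for p in range(population) if p not in vaccinated]
    starters = rng.sample(unvaccinated, min(initially_infected, len(unvaccinated)))
    # rounds_left[p] = how many more rounds person p can infect others for
    rounds_left = {p: infectious_rounds for p in starters}
    ever_infected = set(starters)

    while rounds_left:
        newly_infected = []
        for person in list(rounds_left):
            picks = rng.sample(range(population), contacts + 1)
            for target in [t for t in picks if t != person][:contacts]:
                if target not in vaccinated and target not in ever_infected:
                    ever_infected.add(target)
                    newly_infected.append(target)
            rounds_left[person] -= 1
            if rounds_left[person] <= 0:
                del rounds_left[person]  # no longer infectious
        for target in newly_infected:
            rounds_left[target] = infectious_rounds  # infectious from next round
    return len(ever_infected)


def average_size(vaccinated_fraction, runs=20, **settings):
    """Average outbreak size over several runs (fixed seed so results repeat)."""
    rng = random.Random(0)
    sizes = [outbreak_size(vaccinated_fraction, rng=rng, **settings) for _ in range(runs)]
    return sum(sizes) / runs


for percent in (0, 25, 50, 75, 90):
    one_round = average_size(percent / 100)
    three_rounds = average_size(percent / 100, infectious_rounds=3)
    print(f"{percent:2d}% vaccinated: {one_round:6.1f} infected on average "
          f"(infectious for 1 round), {three_rounds:6.1f} (3 rounds)")
```

Changing contacts or infectious_rounds lets you explore the other questions above, such as what happens when each person keeps causing new infections for longer.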

What you are doing here is computer modelling or simulating the effects of the virus in different scenarios, and that is essentially how computer scientists make the predictions that governments use to make decisions about lockdowns and mask wearing, if they are “following the science”. Of course, such models are only as good as the data that goes into them, such as how many other people each person infects. In reality, this is data provided by surveys, experimental studies, and so on.

– Paul Curzon, Queen Mary University of London, Spring 2021

Smart health: decisions, decisions, decisions

The trouble with healthcare is that it’s becoming ever more expensive: new drugs, new treatments, more patients, the ever-increasing time needed with experts. Smart healthcare might be able to help.

Cover of cs4fn issue 27 on smart health - a spiders web covered in droplets of dew

We want everyone to get the care they need, but the costs are growing. Perhaps computer scientists can help? Research groups worldwide are exploring ways to create computing technology to improve healthcare, and intelligent programs that can support patients at home, helping monitor them and make decisions about what to do.

For example, say you are on powerful drugs to manage a long term illness: should you have the vaccine? Can you have a baby? Is a flare up of your disease about to hit you and how can you avoid it? Is that new ache a side effect of the drugs? Do you need to change medicines? Do you need to see a specialist?

If smart programs can help support patients then the doctors and nurses can spend more time with those who actually need it, hospitals can save on expensive drugs that aren’t working, and patients can have better lives. But what kind of technology can deliver this sort of service?

In the current issue of cs4fn magazine, we explore one particular way being developed on the EPSRC funded PAMBAYESIAN project at Queen Mary University of London, based on an area of computing called Bayesian networks, that might just be the answer. We also look at other ways computers can help deliver better healthcare for all and other uses of Bayesian networks.

We will be blogging each article here over the coming days or you can download Issue 27 of the cs4fn magazine on Smart Health here and read it all now.

This post and issue 27 of the cs4fn magazine have been funded by EPSRC as part of the PAMBAYESIAN project. UK schools that subscribe will be sent copies in the coming weeks.

Smart tablets (to swallow)

The first ever smart pill has been approved for use. It’s like any other pill except that this one has a sensor inside it and it comes with a tracking device patch you wear to make sure you take it.

A big problem with medicine is remembering to take it. It’s common for people to be unsure whether they did take today’s tablet or not. Getting it wrong regularly can make a difference to how quickly you recover from illness. Many medicines are also very, very expensive. Mass-produced electronics, on the other hand, are cheap. So could the smart pill be a new, potentially useful, solution? The pill contains a sensor that is triggered when the pill dissolves and the sensor meets your stomach acids. When it does, the patch you wear detects its signal and sends a message to your phone to record the fact. The specially made sensor itself is harmless and safe to swallow. Your phone’s app can then, if you allow it, tell your doctor so that they know whether you are taking the pills correctly or not.

Smart pills could also be invaluable for medical researchers. In medical trials of new drugs, knowing whether patients took the pills correctly is important but difficult to know. If a large number of patients don’t, that could be a reason why the drugs appeared less effective than expected. Smart pills could allow researchers to better work out how regularly a drug needs to be taken to still work. 

More futuristically still, such pills may form part of a future health artificial intelligence system that is personalised to you. It would collect data about you and your condition from a wide range of sensors recording anything relevant: from whether you’ve taken pills to how active you’ve been, your heart rate, blood pressure and so on: in fact anything useful that can be sensed. Then, using big data techniques to crunch all that data about you, it will tailor your treatment. For example, such a system may be better able to work out how a drug affects you personally, and so be better able to match doses to your body. It may be able to give you personalised advice about what to eat and drink, even predicting when your condition could be about to get better or worse. This could make a massive difference to life for those with long term illnesses like rheumatoid arthritis or multiple sclerosis, where symptoms flare up and die away unpredictably. It could also help the doctors who currently must find the right drug and dose for each person by trial and error.

Computing in future could be looking after your health personally, as long as you are willing to wear it both inside and out.

– Paul Curzon, Queen Mary University of London, Spring 2021
