The obscure maths theorem that governs the reliability of Covid testing

Tom Chivers

Maths quiz. If you take a Covid test that only gives a false positive one time in every 1,000, what’s the chance that you’ve actually got Covid? Surely it’s 99.9%, right?

No! The correct answer is: you have no idea. You don’t have enough information to make the judgment.

This is important to know when thinking about “lateral flow tests” (LFTs), the rapid Covid tests that the government has made available to everyone in England, free, up to twice a week. The idea is that in time they could be used to give people permission to go into crowded social spaces – pubs, theatres – and be more confident that they do not have, and so will not spread, the disease. They’ve been used in secondary schools for some time now.

There are concerns over LFTs. One is whether they’ll miss a large number of cases, because they’re less sensitive than the slower but more precise polymerase chain reaction (PCR) test. Those concerns are understandable, although defenders of the test say that PCR testing is too sensitive, able to detect viral material in people who had the disease weeks ago, while LFTs should, in theory, only detect people who are infectious.

But another concern is that they will tell people that they do have the disease when in fact they don’t – that they will return false positives.

The government says – accurately – that the “false positive rate”, the chance of a test returning a positive result in a person who does not have the disease, is less than one in 1,000. And that’s where we came in: you might think that that means, if you’ve had a positive result, that there’s a less than one in 1,000 chance that it’s false.

It’s not. And that’s because of a fascinating little mathematical anomaly known as Bayes’s theorem, named after the Rev Thomas Bayes, an 18th-century clergyman and maths nerd.

Bayes’s theorem is written, in mathematical notation, as P(A|B) = (P(B|A)P(A))/P(B). It looks complicated. But you don’t need to worry about what all those symbols mean: it’s fairly easy to understand when you think of an example.

Imagine you undergo a test for a rare disease. The test is amazingly accurate: if you have the disease, it will correctly say so 99% of the time; if you don’t have the disease, it will correctly say so 99% of the time.

But the disease in question is very rare; just one person in every 10,000 has it. This is known as your “prior probability”: the background rate in the population.

So now imagine you test 1 million people. There are 100 people who have the disease: your test correctly identifies 99 of them. And there are 999,900 people who don’t: your test correctly identifies 989,901 of them.

But that means that your test, despite giving the right answer in 99% of cases, has told 9,999 people that they have the disease, when in fact they don’t. So if you get a positive result, in this case, your chance of actually having the disease is 99 in 10,098 (the 99 true positives out of all 10,098 positive results), or just under 1%. If you took this test entirely at face value, then you’d be scaring a lot of people, and sending them for intrusive, potentially dangerous medical procedures, on the back of a misdiagnosis.

Without knowing the prior probability, you don’t know how likely it is that a result is false or true. If the disease was not so rare – if, say, 1% of people had it – your results would be totally different. Then you’d have 9,900 false positives, but also 9,900 true positives. So if you had a positive result, it would be 50% likely to be true.
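If you would rather check the sums than take them on trust, here is a minimal sketch in Python. The function and argument names are made up for this illustration, and it assumes, as the example does, a test that is right 99% of the time for people with and without the disease.

```python
def chance_positive_is_true(prior, sensitivity, false_positive_rate):
    """Chance that a positive result is genuine, by Bayes's theorem.

    prior: share of the people tested who really have the disease
    sensitivity: chance of a positive result for someone who has it
    false_positive_rate: chance of a positive result for someone who doesn't
    """
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Same test both times; only the prior probability changes.
rare = chance_positive_is_true(prior=1 / 10_000, sensitivity=0.99, false_positive_rate=0.01)
common = chance_positive_is_true(prior=1 / 100, sensitivity=0.99, false_positive_rate=0.01)

print(f"one person in 10,000 has it: {rare:.1%}")
print(f"one person in 100 has it:    {common:.1%}")
```

The test itself is identical in both calls. All that moves the answer is the prior.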

This is not a hypothetical problem. One review of the literature found that 60% of women who have annual mammograms for 10 years have at least one false positive; another study found that 70% of prostate cancer screening positives were false. An antenatal screening procedure for foetal chromosomal disorders which claimed “detection rates of up to 99% and false positive rates as low as 0.1%” would have actually returned false positives between 45% and 94% of the time, because the diseases are so rare, according to one paper.

A lateral flow test in progress. Photograph: SlavkoSereda/Getty Images

Of course, it’s not that a positive test would immediately be taken as gospel – patients who have a positive test will be given more comprehensive diagnostic checkups – but false positives will scare a lot of patients who don’t have cancer, or foetal abnormalities.

A misunderstanding of Bayes’s theorem isn’t just a problem in medicine. There is a common failure in the law courts, the “prosecutor’s fallacy”, which hinges on it too.

In 1990, a man called Andrew Deen was convicted of rape and sentenced to 16 years, partly on the basis of DNA evidence. An expert witness for the prosecution said that the chance that the DNA came from someone else was just one in 3m.

But as a professor of statistics explained at Deen’s appeal, this was mixing up two questions: first, how likely would it be that a person’s DNA matched the DNA in the sample, given that they were innocent; and second, how likely would they be to be innocent, if their DNA matched that of the sample? The “prosecutor’s fallacy” is to treat those two questions as the same.

We can treat it exactly as we did with the cancer screenings and Covid tests. Let’s say you have simply picked your defendant at random from the British population (which of course you wouldn’t, but for simplicity…), which at the time was about 60 million. So your prior probability that any random person is the perpetrator is one in 60m.

If you ran your DNA test on all of those 60 million people, you’d identify the perpetrator – but you’d also get false positives on about 20 innocent people. So even though the DNA test only returns false positives one time in 3m, there’s still about a 95% chance that someone who gets a positive test is innocent.
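The same count can be sketched in a few lines of Python. The variable names are invented for the illustration, and it assumes exactly one guilty person in the population, whose DNA always matches.

```python
from fractions import Fraction

population = 60_000_000                      # rough British population at the time
false_match_rate = Fraction(1, 3_000_000)    # chance an innocent person's DNA matches

guilty = 1                                   # assumption: one perpetrator, who always matches
innocent = population - guilty

# Innocent people we would expect to match if everyone were tested.
expected_false_matches = innocent * false_match_rate

# Of everyone who matches, what share is innocent?
chance_match_is_innocent = expected_false_matches / (expected_false_matches + guilty)

print(f"innocent people expected to match: {float(expected_false_matches):.1f}")
print(f"chance a matching person is innocent: {float(chance_match_is_innocent):.0%}")
```

Raising the prior, which is what other evidence does, amounts to shrinking `population` to the much smaller pool of plausible suspects.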

Of course, in reality, you wouldn’t pick your defendant at random – you’d have other evidence, and your prior probability would be greater than one in 60m. But the point is that knowing the probability of a false positive on a DNA test doesn’t tell you how likely it is that someone is innocent: you need some assessment of how likely it was that they were guilty to begin with. You need a prior probability. In December 1993, the court of appeal quashed Deen’s conviction, saying it was unsafe – precisely because the judge and the expert witness had been taken in by the prosecutor’s fallacy. (It’s worth noting that he was convicted in the retrial.)

And in 1999, the heartbreaking case of Sally Clark turned on the prosecutor’s fallacy. She was convicted of murdering her two children, after another expert witness said that the chance of two babies dying of sudden infant death syndrome (Sids) in one family was one in 73m. But the witness failed to take into account the prior probability – that is, the likelihood that someone was a double murderer, which is, mercifully, even rarer than Sids. That, taken with other problems – the expert witness didn’t take into account the fact that families which have already had one case of Sids are more likely to have another – led to Clark’s conviction also being overturned, in 2003.

Let’s go back to the LFTs. Assume that the one-in-1,000 false positive rate is accurate. Even so, if you get a positive result, you don’t know how likely it is that you have the virus. What you need to know first is (roughly) how likely it was, before you took the test, that you might have had it: your prior probability.

At the peak of the second wave, something like one person in every 50 (2% of the population) in England was infected with the virus, according to the Office for National Statistics’ prevalence survey. That was carried out with PCR tests, not LFTs, but let’s use that as the standard.

Say you tested 1 million people, chosen at random, with LFTs (and, for the sake of simplicity, say that they detect all the real cases – that definitely won’t be true in reality). About 20,000 people would have the disease, and of the 980,000 who don’t, it would wrongly tell about 980 that they do, for a total of 20,980 positive results. So if you tested positive, your chance of a false positive would be 980/20,980, or nearly 5%. Or, to put it another way, it’d be almost 95% likely that you really had the disease.

Now, though, the prevalence has dropped enormously – down to about one person in every 340 in England. If we run through the same process, we get a very different picture: of your million people, about 2,950 will have had it. Again assuming your test identifies all of them (and again remembering that won’t be true in reality), you’ll have 2,950 true positives, and about 997 false ones. Suddenly the share of your positive results that are false is 997/3,947, or about 25%. In fact, last week government data showed that 18% of positive LFT results since 8 March were false. This proportion will rise if prevalence falls – which might become problematic if, for instance, it means an entire class of children has to take time off school.
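Here is the same comparison as a rough Python sketch. The function name is hypothetical, and it keeps the simplifying assumptions used above: a million people tested at random, a one-in-1,000 false positive rate, and a test that catches every real case. Because it does not round the prevalence, its figures differ by a handful from the ones quoted.

```python
def positives_per_million(one_in, false_positive_rate=1 / 1000):
    """Expected true and false positives when testing a million random people."""
    population = 1_000_000
    infected = population / one_in
    true_positives = infected  # assumption: the test misses no real cases
    false_positives = (population - infected) * false_positive_rate
    return true_positives, false_positives


for label, one_in in [("one in 50", 50), ("one in 340", 340)]:
    true_pos, false_pos = positives_per_million(one_in)
    share_false = false_pos / (true_pos + false_pos)
    print(
        f"prevalence {label}: about {true_pos:,.0f} true positives, "
        f"{false_pos:,.0f} false, so {share_false:.0%} of positives are false"
    )
```

Notice that the number of false positives barely changes between the two runs. It is the collapse in true positives that pushes the false share up.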

These sums only apply, of course, if you’re truly testing the population at random. If people are using the tests because they think there’s a good reason why they might be positive – perhaps they have symptoms, or were recently exposed to someone who had the disease – then your prior probability would be higher, and the positive test would be stronger evidence.

Even doctors struggle with Bayesian reasoning. In one 2013 study, 5,000 qualified American doctors were asked to give the probability that someone had cancer, if 1% of the population had the disease and they received a positive result on a 90% accurate test. The correct answer was about one in 10, but even when given multiple-choice options, almost three-quarters of the doctors answered wrongly.
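The doctors’ question plugs straight into the formula P(A|B) = (P(B|A)P(A))/P(B), with A as “has cancer” and B as “tests positive”. The sketch below assumes that “90% accurate” means the test is right 90% of the time both for people with cancer and for people without it. That is one reading of the question, not something the study is quoted as spelling out.

```python
p_cancer = 0.01                    # P(A): 1% of the population has the disease
p_positive_given_cancer = 0.90     # P(B|A): assumed from "90% accurate"
p_positive_given_healthy = 0.10    # assumed: wrong 10% of the time for healthy people

# P(B): overall chance of a positive result, from sick and healthy people combined.
p_positive = (
    p_positive_given_cancer * p_cancer
    + p_positive_given_healthy * (1 - p_cancer)
)

# P(A|B): chance of cancer given a positive result.
p_cancer_given_positive = p_positive_given_cancer * p_cancer / p_positive

print(f"chance of cancer after a positive test: {p_cancer_given_positive:.1%}")
```

Under those assumptions the answer comes out at one in 12, which is where the “about one in 10” comes from.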

None of this means LFTs are a bad idea – I think, cautiously, that they will be useful, especially since positive results will be confirmed by PCR, and if the PCR comes back negative the patient can return to work or school or whatever. But it’s worth remembering that, if you read that a test is 99.9% accurate, it doesn’t mean that there’s a 99.9% chance that your test result is correct. In fact, it’s much more complicated than that.

Tom Chivers is the science editor at UnHerd

• This article is an adapted extract from How to Read Numbers: A Guide to Stats in the News (And Knowing When to Trust Them) by Tom Chivers and David Chivers (Orion, £12.99). To order a copy go to guardianbookshop.com. Delivery charges may apply