Here’s a riddle for you: How can a medical test that is 99% accurate be wrong half the time, even when it is performed correctly?
The answer: When the disease is rare. That doesn’t seem to make much sense – testing is even more important when a disease is rare, isn’t it? – but it’s true.
The explanation involves statistics (yuck!), which means many people won’t believe it. But this fact, which is well known in public health, is important if we want to make good decisions about our own health and understand decisions that are made for public health policy.
Why bring this up? When I hear debates about health issues, people often act as if we could solve all problems if we just did enough testing. This result, which I like to call the Positive Predictive Paradox or Triple-P because that name sounds cool, lets us know that testing isn’t always the get-out-of-jail-free card that it seems to be.
So here’s how the paradox works. (Warning: It includes numbers and arithmetic!)
Imagine a town of 10,000 people in which 1% of the population has a deadly disease. You want to find out who those people are using a test that’s 99% accurate, so you give the test to everybody.
There are 100 people in town who have the disease – that’s 1% of the total population – and the test finds 99 of them, missing just a single sick person. Well done!
But remember that 9,900 of the people who were tested do not have the disease. The test is 99% accurate, so it correctly identifies 9,801 of them as healthy. What a relief!
But since the test is sometimes wrong, it incorrectly says that 1% of the healthy people are infected. That’s 99 people, all of whose lives are turned upside down when they’re told (incorrectly) that they have a deadly disease.
Look at those results again. Testing the whole population for a rare disease produced 198 positive results, but half of them were wrong: 99 of those people really were sick, but the other 99 were healthy.
A 99% accurate test was wrong half the time. Truly a Positive Predictive Paradox. Triple-P in action!
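If you’d rather let a computer check the arithmetic, here is a small Python sketch of the town scenario. It only restates the numbers given above; the function name screen_town is my own label for illustration, not anything standard.

```python
def screen_town(population, prevalence, accuracy):
    """Count test outcomes for a whole town, assuming the test is
    right the same fraction of the time for sick and healthy people."""
    sick = population * prevalence              # people who have the disease
    healthy = population - sick                 # people who do not
    true_positives = sick * accuracy            # sick people the test catches
    false_positives = healthy * (1 - accuracy)  # healthy people wrongly flagged
    total_positives = true_positives + false_positives
    share_wrong = false_positives / total_positives
    return true_positives, false_positives, share_wrong

# The town from this column: 10,000 people, 1% sick, 99% accurate test.
tp, fp, wrong = screen_town(10_000, 0.01, 0.99)
print(round(tp), round(fp), round(wrong, 2))   # 99 true, 99 false: half the positives are wrong
```

Half of all positive results are wrong, exactly as the hand arithmetic showed.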
The situation is worse in real life because no medical test is that accurate. Even 95% accuracy is excellent, yet in the above scenario a 95% test would produce 495 false positives – people told incorrectly that they had a deadly disease – for just 95 correct positives, roughly five wrong results for every correct one.
Imagine the outrage and chaos that would result if a population were given a medical test for a deadly disease but five-sixths of the positive results were wrong.
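The same sketch, with the accuracy knob turned down, reproduces those numbers:

```python
# Same town, same disease, but a 95% accurate test.
tp, fp, wrong = screen_town(10_000, 0.01, 0.95)
print(round(tp), round(fp), round(wrong, 2))   # 95 true, 495 false, 0.84 -- five-sixths wrong
```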
This explains why we don’t do widespread tests for rare diseases, even though it seems to be the obvious action. The reluctance isn’t due to bad tests or to cost or to pressure from “big pharma” or the “natural medicine” lobby. It’s due to math.
The situation is different for diseases that are common. For example, if one-third of our fictional town had the illness in question, then a test that is 99% accurate would correctly identify about 3,300 truly sick people and give a false diagnosis to only about 67, an error rate that is probably acceptable.
However, if a disease is that common it might be cheaper and faster just to treat everybody without bothering to test. Better yet, vaccinate them if a vaccine is available.
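For what it’s worth, the sketch bears out the common-disease case too: raise the prevalence to one-third and the false positives nearly vanish next to the true ones.

```python
# The common-disease case: one-third of the town is sick.
tp, fp, wrong = screen_town(10_000, 1/3, 0.99)
print(round(tp), round(fp), round(wrong, 3))   # 3300 true, 67 false, only 2% of positives wrong
```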
In other words, public health is complicated. In some ways, I think it’s the most complicated thing that society does, since it has to factor in biology, economics, individual preferences, group behavior and politics.
And statistics. Whatever you do, don’t forget statistics.
I respectfully dissent. New Hampshire’s newborn screening program DOES test for rare diseases, and well it should. Plus it’s widespread, reaching essentially every baby born in the state. The paradox you describe notwithstanding — that “99 percent accurate” can still yield a lot of false positives — this is good public policy for two reasons. First, the value of identifying the TRUE positives can be astronomically high, saving lives and abating suffering in a manner that vastly outweighs the social cost of false positives. Second, positives don’t result in mindless uncritical diagnosis in this scenario — they result in more refined testing and careful monitoring so that nobody actually gets incorrectly diagnosed.
All of this is why, about 15 years ago, I worked like hell to get cystic fibrosis (CF) added to the state’s newborn screening program. We succeeded, and I’m still proud of that achievement. As a CF dad I saw firsthand the terrible cost of not screening every baby for CF at birth. As I write, I’m in Nashville at this year’s North American Cystic Fibrosis Conference, hearing about how a massive scientific effort has brought us to the edge of a cure for this ‘rare’ disease. (Rare, by the way, is a funny concept. One in 25 people, me included, carry a CF mutation.) Delivering these emerging therapies to every CF kid, starting at birth, is the greatest story in medicine right now. Nothing false about that!
It would not hurt to use the relevant statistical terms: sensitivity and specificity, as opposed to accuracy. While in general you want both to be extremely high, the prevalence of the disease and its implications can let you sacrifice one for the other (e.g., for something highly contagious but easily overlooked, like Covid, you want maximum sensitivity and are perhaps willing to accept some false positives). Your observation is not a new one; it’s the primary reason that mammography has fallen out of favor. There is a real cost to telling people who do not have a disease that they have it. It’s often financial (more testing, procedures, etc., which in this country people typically pay for) and of course emotional. Being told you have a disease causes a lot of stress, and stress is without doubt bad for your health.
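To connect this comment’s vocabulary back to the column’s arithmetic: what the column calls “accuracy” is really two numbers, sensitivity and specificity, and Bayes’ rule combines them with prevalence to give the chance that a positive result is genuine. Here is a minimal sketch, assuming (as the column implicitly does) that both numbers equal 99%; the function name is my own.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' rule: the probability that a positive result is genuine.
    Sensitivity = share of sick people the test catches;
    specificity = share of healthy people it correctly clears."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# The column's 99%-accurate test on a 1%-prevalence disease: a coin flip.
print(round(positive_predictive_value(0.99, 0.99, 0.01), 2))   # 0.5
```

Trade sensitivity against specificity, as the comment suggests, and this number shifts accordingly.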