The Bayesian Trap

Picture this: you wake up one morning and you feel a little bit sick. No particular symptoms, just not 100%. So you go to the doctor, and she also doesn’t know what’s going on with you, so she suggests running a battery of tests. After a week goes by, the results come back: you tested positive for a very rare disease that affects about 0.1% of the population. It’s a nasty disease with horrible consequences; you don’t want it.
So you ask the doctor, “How certain is it that I have this disease?” and she says, “Well, the test will correctly identify 99% of people that have the disease and only incorrectly identify 1% of people who don’t have the disease.” So that sounds pretty bad. I mean, what are the chances that you actually have this disease? I think most people would say 99%, because that’s the accuracy of the test. But that is not actually correct!
You need Bayes’ Theorem to get some perspective. Bayes’ Theorem can give you the probability that some hypothesis (say, that you actually have the disease) is true given an event (that you tested positive for the disease). To calculate this, you take the prior probability that the hypothesis is true, that is, how likely you thought it was that you had this disease before you got the test results, and multiply it by the probability of the event given that the hypothesis is true, that is, the probability that you would test positive if you had the disease. Then you divide that by the total probability of the event occurring, that is, of testing positive. This term is a combination of your probability of having the disease and correctly testing positive plus your probability of not having the disease and being falsely identified.
The prior probability that a hypothesis is true is often the hardest part of this equation to figure out and, sometimes, it’s no better than a guess. But in this case, a reasonable starting point is the frequency of the disease in the population, so 0.1%.
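As a rough sketch of the calculation just described, the numbers from the story can be plugged into Bayes’ Theorem in a few lines of Python. The function and variable names here are illustrative choices, not something from the video; the only inputs are the 0.1% prior, the 99% detection rate and the 1% false positive rate quoted above.

```python
def posterior(prior, p_event_given_h, p_event_given_not_h):
    """Bayes' Theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    # Total probability of the event (a positive test):
    # sick and correctly flagged, plus healthy and falsely flagged.
    p_event = p_event_given_h * prior + p_event_given_not_h * (1 - prior)
    return p_event_given_h * prior / p_event


# Numbers from the story (names are hypothetical):
p_disease = posterior(
    prior=0.001,               # 0.1% of the population has the disease
    p_event_given_h=0.99,      # test flags 99% of people who have it
    p_event_given_not_h=0.01,  # test wrongly flags 1% of people who don't
)
print(f"Chance of disease after one positive test: {p_disease:.1%}")
```

The denominator is dominated by the false positives (0.01 × 0.999) rather than the true positives (0.99 × 0.001), which is why the answer lands near 9% instead of 99%.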
And if you plug in the rest of the numbers, you find that you have a 9% chance of actually having the disease after testing positive. Which is incredibly low if you think about it. Now, this isn’t some sort of crazy magic. It’s actually common sense applied to mathematics. Just think about a sample of 1000 people. Now, one person out of that thousand is likely to actually have the disease, and the test would likely identify them correctly as having the disease. But out of the 999 other people, 1%, or about 10 people, would falsely be identified as having the disease. So, if you’re one of those people who has a positive test result, and everyone’s just selected at random, well, you’re actually part of a group of 11 where only one person has the disease. So your chances of actually having it are 1 in 11, or 9%. It just makes sense.
When Bayes first came up with this theorem, he didn’t actually think it was revolutionary. He didn’t even think it was worthy of publication. He didn’t submit it to the Royal Society, of which he was a member, and in fact it was discovered in his papers after he died; he had abandoned it for more than a decade. His relatives asked his friend, Richard Price, to dig through his papers and see if there was anything worth publishing in there. And that’s where Price discovered what we now know as the origins of Bayes’ Theorem.
Bayes originally considered a thought experiment where he was sitting with his back to a perfectly flat, perfectly square table, and he would ask an assistant to throw a ball onto the table. Now this ball could obviously land and end up anywhere on the table, and he wanted to figure out where it was. So what he’d ask his assistant to do was to throw on another ball and then tell him if it landed to the left or to the right, or in front of or behind the first ball. He would note that down and then ask for more and more balls to be thrown onto the table.
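The table experiment can be sketched as a small simulation. This is a simplified illustration under stated assumptions, not a reconstruction of Bayes’ own working: it tracks only the left/right coordinate, treats the table as the interval from 0 to 1, assumes each new ball lands uniformly at random, and keeps the belief on a discrete grid. All names (`locate_first_ball`, `grid_size`, and so on) are hypothetical.

```python
import random


def locate_first_ball(true_position, throws, grid_size=101, seed=0):
    """Estimate the first ball's left/right position from 'left'/'right' reports."""
    rng = random.Random(seed)
    # Candidate positions for the first ball, evenly spaced across the table.
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Flat prior: before any throws, every position is equally plausible.
    belief = [1.0 / grid_size] * grid_size

    for _ in range(throws):
        # The assistant throws a new ball and reports which side it landed on.
        landed_left = rng.random() < true_position
        # If the first ball sits at x, a uniformly thrown ball lands to its
        # left with probability x and to its right with probability 1 - x.
        likelihood = [x if landed_left else 1.0 - x for x in grid]
        unnormalized = [l * b for l, b in zip(likelihood, belief)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]  # posterior becomes the new prior

    estimate = sum(x * b for x, b in zip(grid, belief))  # posterior mean
    return estimate, belief


for n in (0, 10, 100, 1000):
    estimate, _ = locate_first_ball(true_position=0.3, throws=n)
    print(f"after {n:4d} throws the estimate is {estimate:.3f}")
```

With no throws the estimate sits in the middle of the table; as reports accumulate it should drift toward the true position without ever becoming exactly certain.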
What he realized was that through this method he could keep updating his idea of where the first ball was. Now of course, he would never be completely certain, but with each new piece of evidence, he would get more and more accurate, and that’s how Bayes saw the world. It wasn’t that he thought the world was not determined, that reality didn’t quite exist, but that we couldn’t know it perfectly, and all we could hope to do was update our understanding as more and more evidence became available.
When Richard Price introduced Bayes’ Theorem, he made an analogy to a man coming out of a cave. Maybe he’d lived his whole life in there, and he saw the Sun rise for the first time and thought to himself: “Is this a one-off, is this a quirk, or does the Sun always do this?” And then, every day after that, as the Sun rose again, he could get a little bit more confident that, well, that was the way the world works. So Bayes’ Theorem wasn’t really a formula intended to be used just once; it was intended to be used multiple times, each time gaining new evidence and updating your probability that something is true.
So if we go back to the first example, when you tested positive for a disease, what would happen if you went to another doctor, got a second opinion and had that test run again, but let’s say by a different lab, just to be sure that those tests are independent, and let’s say that test also comes back positive? Now what is the probability that you actually have the disease? Well, you can use Bayes’ formula again, except this time, for your prior probability that you have the disease, you have to put in the posterior probability, the probability that we worked out before, which is 9%, because you’ve already had one positive test. If you crunch those numbers, the new probability based on two positive tests is 91%. There’s a 91% chance that you actually have the disease, which kind of makes sense.
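The two-test update can be checked with a short loop in which each posterior is fed back in as the next prior. This is a minimal sketch that assumes, as the story does, that the two tests are independent and share the same 99% and 1% rates; the names `update` and `history` are illustrative.

```python
def update(prior, sensitivity=0.99, false_positive_rate=0.01):
    """One application of Bayes' Theorem for a positive test result."""
    true_positive = sensitivity * prior
    false_positive = false_positive_rate * (1 - prior)
    return true_positive / (true_positive + false_positive)


belief = 0.001  # start from the population frequency, 0.1%
history = []
for test_number in (1, 2):
    # Yesterday's posterior is today's prior.
    belief = update(belief)
    history.append(belief)
    print(f"after positive test {test_number}: {belief:.1%}")
```

Carrying the unrounded posterior forward gives roughly 9.0% and then 90.8%, which rounds to the 91% quoted above.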
Two positive results from different labs are unlikely to just be chance, but you’ll notice that the probability is still not as high as the reported accuracy of the test.
Bayes’ Theorem has found a number of practical applications, notably including filtering your spam. You know, traditional spam filters actually do a kind of bad job: there are too many false positives, and too much of your email ends up in spam. But using a Bayesian filter, you can look at the various words that appear in emails and use Bayes’ Theorem to give a probability that an email is spam, given that those words appear.
Now Bayes’ Theorem tells us how to update our beliefs in light of new evidence, but it can’t tell us how to set our prior beliefs, and so it’s possible for some people to hold that certain things are true with 100% certainty, and other people to hold that those same things are true with 0% certainty. What Bayes’ Theorem shows us is that in those cases, there is absolutely no evidence, nothing anyone could do, to change their minds. And so, as Nate Silver points out in his book The Signal and the Noise, we should probably not have debates between people with a 100% prior certainty and a 0% prior certainty because, well, really, they’ll never convince each other of anything.
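The point about 0% and 100% priors can be seen by running the update rule repeatedly. The sketch below is an illustration with made-up likelihoods (evidence that is 99 times more likely under one hypothesis than the other) and hypothetical function names; it only shows that a prior of exactly 0 or 1 comes out of the formula unchanged, while any prior strictly in between can move.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a claim after seeing one piece of evidence."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1 - prior)
    return numerator / denominator


def after_evidence(prior, rounds, p_if_true=0.99, p_if_false=0.01):
    """Apply the same piece of evidence `rounds` times in a row."""
    belief = prior
    for _ in range(rounds):
        belief = update(belief, p_if_true, p_if_false)
    return belief


# Strong evidence FOR the claim cannot move a prior of exactly 0 ...
print(after_evidence(0.0, rounds=50))
# ... and strong evidence AGAINST it cannot move a prior of exactly 1.
print(after_evidence(1.0, rounds=50, p_if_true=0.01, p_if_false=0.99))
# A tiny but nonzero prior, by contrast, is quickly overwhelmed.
print(after_evidence(0.001, rounds=5))
```

In both extreme cases one term of the formula is multiplied by zero, so no amount of evidence changes the result.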
Most of the time when people talk about Bayes’ Theorem, they discuss how counterintuitive it is and how we don’t really have an inbuilt sense of it. But recently my concern has been the opposite: that maybe we’re too good at internalizing the thinking behind Bayes’ Theorem. The reason I’m worried about that is that I think in life we can get used to particular circumstances. We can get used to results, maybe getting rejected or failing at something or getting paid a low wage, and we can internalize that as though we are that man emerging from the cave: we see the Sun rise every day, and we keep updating our beliefs to a point of near certainty that this is basically the way nature is, it’s the way the world is, and there’s nothing we can do to change it.
You know, there’s Nelson Mandela’s quote that ‘Everything is impossible until it’s done’, and I think that is kind of a very Bayesian viewpoint on the world. If you have no instances of something happening, then what is your prior for that event? It will seem completely impossible; your prior may be 0 until it actually happens.
You know, the thing we forget in Bayes’ Theorem is that our actions play a role in determining outcomes, and in determining how true things actually are. But if we internalize that something is true, and maybe we’re 100% sure that it’s true and that there’s nothing we can do to change it, well, then we’re going to keep on doing the same thing, and we’re going to keep on getting the same result. It’s a self-fulfilling prophecy. So I think a really good understanding of Bayes’ Theorem implies that experimentation is essential. If you’ve been doing the same thing for a long time and getting the same result that you’re not necessarily happy with, maybe it’s time to change. So is there something like that that you’ve been thinking about? If so, let me know in the comments.
Hey, this episode of Veritasium was supported in part by viewers like you on Patreon and by Audible. Audible is a leading provider of spoken audio information, including an unmatched selection of audiobooks, original programming, news, comedy and more. So if you’re thinking about trying something new and you haven’t tried Audible yet, you should give them a shot. For viewers of this channel, they offer a free 30-day trial just by going to audible.com/Veritasium.
You know, the book I’ve been listening to on Audible recently is called ‘The Theory That Would Not Die’ by Sharon Bertsch McGrayne, and it is an incredible in-depth look at Bayes’ Theorem. I’ve learned a lot just listening to this book, including the crazy fact that Bayes never came up with the mathematical formulation of his rule; that was done independently by the mathematician Pierre-Simon Laplace. So really, I think he deserves a lot of credit for this theory, but Bayes gets naming rights because he was first. If you want, you can download this book and listen to it, as I have, when I’ve just been driving in the car or going to the gym, which I’m doing again. And so if there’s a part of your day that you feel is kind of boring, then I can highly recommend trying out audiobooks from Audible. Just go to audible.com/Veritasium. So as always, I want to thank Audible for supporting me, and I want to thank you for watching.

100 comments

  1. At around 3:22, the aerial shots were disturbing. I lost concentration on the verbal narrative and had to go back to watch it!

  2. Thanks for sharing this great thought. According to my limited knowledge of psychology, there is a term called "learned helplessness" that could be one of the Bayesian Traps this video describes. It is a self-reinforcing process of believing that one is incapable of doing something and failing again and again and again, and that failure fulfills one's image of incapability. The reason for this trap is probably that every event is closely correlated with the previous events instead of being "independent". So one will become even less willing to try hard or try a different way. The way to avoid or at least become aware of this trap is to reflect on whether I am just inheriting the old belief of incapability from past failures or whether I have learned lessons from them so as not to make the same mistakes again. "Independent" means trying things differently and not making the same mistakes again as best as one can.

  3. Doesn't the false positive rate for the test also apply to people who are tested yet feel perfectly fine and have no reason to go to the doctor? If so, does that mean the first prior is off since this person specifically went to the doctor because they felt unwell?

  4. Did anyone even notice that the calculation of the probability of the person having the disease at the start of the video was wrong … as he put .001% instead of .01%, and that's why he got a 9% probability and not a 90%

  5. must admit I was confused af at 2:00 (when substitution happened)
    so given only
    P(H) = 0.001
    P(E|H) = 0.99
    we end up substituting 0.01 for P(E|-H), leading to the false impression that P(E|H) + P(E|-H) = 1 (like in the P(E, H) + P(E, -H) case)
    so at the end of day, I must conclude that P(E|-H) was given (though its really hard to catch)
    and that's bad example of oversimplification (trying to have nice numbers that hide complexity)

  6. I can't believe how good the quality of his content is. Connecting something I learnt in school, like Bayes' theorem, to reality and how it can change our perspective on real life #MindBlown

  7. But Bayes' theorem comes to mind like an intuition without any thinking involved if you have even a fair amount of mathematical IQ. I think it getting published was more like nobody else cared to publish that water is wet so Mr. James got it published. And since it's a theorem … it's not required to have gigantic proofs that a Law requires. It's just common sense proven with example. I don't know… I don't know why we have it formalised as a theorem… I find Pythagoras' theorem more non-intuitive, even to a 15-year-old school student, than Bayes' theorem.

  8. The comment from ‘The signal and the noise’ is interesting but problematic. People might say they’re 100% certain about x but how certain are they really about their certainty? How certain can we be about the certainty they claim? We can’t, hence we argue.

  9. With or without Bayesian reasoning, it's 100% certain that an "unexpected" catastrophe will affect the planet.
    By selecting some specific causal factor like Nuclear First Strike, or Global Warming / Climate Change etc, the preparation for defense against consequences, like moving to Mars or Moon, is ridiculously expensive and is an inappropriate distraction.
    Muddling about with Mathematical Abstractions is good for entertainment and not much else in the face of certainty, as is the topic of the video?

  10. This was a dope video for me because I just started back to school to pursue an engineering career after almost 10 years in the auto field. I was learning about electricity and came to Youtube because i was curious about how exactly we determine the number of particles in an atom. Watched a couple videos, got what I was looking for and decided to watch this from the suggested row. It correlates with my recent life changes perfectly and I said all that just to say thank you. I've been hesitant and nervous about "starting over" but this really helped me. In so many ways. If i keep typing this comment will be a book. Thank you again

  11. Oh my god, I still am not clear about Bayes' Theorem but I'm much more positive about getting to understand it. Thanks.

  12. Thanks so much for this video, your explanation really helped me understand Bayes' Theorem. Much more confident for my test/exam now!

  13. I understand the point you made by "updating" the data with the new lab and the new test. However, this need not be the case in the real world. You'd have to assume that the tests are independent. I'd argue that the probability of the second test being positive is not equal to the probability of being positive under the condition of the first test being positive. I hope this makes sense…;)

  14. Thanks, that's a philosophy getting closer and closer to truth without defining a definite truth excluding other observations. Our Identity in fact is formed kind of this Naive Bayes way, We are born with a prior probability of 0+Delta(x), a small portion of everything, we get in touch and confirmed that things always connected to us, the hand, foot, body and mother, father, our self identity start to form; Through Primary and Middle School, Probability turning to 100% sure about our body, family and country, and our social identity established; Middle Age, we are turning to change to different friends, community, family members, and the Probability getting down to 50%, as we began to doubt everything we are so sure about before; when died, 2 outcomes, 1 way that you get not sure about anything again, questioning if the body is my own body? Another way, you finally get out of this Naive form of probability and feel happy and confident about a world with something that is more real, constant and eternal than your changing identity, over time and space, find eternality in a moment : )

  15. A man goes to his physician mentioning that over the past several weeks he has had intermittent bouts of chest pain. His physicians, says…. better send to you to a cardiologist to undergo an exercise stress test. "They can tell if this is coronary artery narrowing by the results of the EKG while you are exercising on the treadmill. They find 'ST depression' during exercise if you have coronary artery disease." Bayesian theorem applies to this problem. If the man is a 58 year old hard driving executive, who smokes like a chimney, has a cholesterol of 250, never exercises, has hypertension and father who died of a heart attack at age 45, the EKG is probably not necessary. (He should go directly to a coronary angiogram). In this setting, he is so likely to have coronary artery narrowing that even a negative treadmill exercise test is probably a false negative. On the other hand, if the man is a 26 year old athlete, who never smoked, has stayed trim, has a cholesterol of 135, and a blood pressure of 110/65 and has a no family history of heart attacks, even a "positive" treadmill is likely a false positive. (an angiogram is not necessary). It is the 48 year old man who is a little out of shape, doesn't smoke but has mildly elevated blood pressure, a cholesterol of 180 and a weak family history (or a similar combination) where a treadmill can help decide if an angiogram is necessary. This all pertains to people with "atypical" chest pain. For a convincing history of pressure-like chest pain radiating down the left arm upon strenuous exertion would favor to a greater degree getting an angiogram. I believe that it is this setting (does this individual with atypical chest pain deserve an exercise stress test or not) is the first recognized setting where Bayesian theorem was applied. (at least that is how I learned about it in internal medicine residency, long ago). Very important concept…. and I am afraid that very few physicians understand it.

  16. You can use bayes's theorem to solve all sorts of things, if you are clever enough to use it the right way. Kind of like the method of Lagrange Multipliers.

  17. this is excellent. I wonder if a finite number of tests could provide 100% probability. Intuitively it seems so, but I can't work that out easily. (I guess we are to assume the test has a 0% chance of a false negative). I guess maybe if there are a finite number of tests required to indicate 100% probability, a formula could be known to tell us how many it would take to provide that?
    Could a false positive reoccur at some point among those falsely tested positive, and would that affect that crazy sequence I mention that "probably" no one is going to work out?

  18. and as a side topic, the distribution of those false positives could be anywhere in any order… so it could have been depicted as the first 10, the last 10, every 100th, etc.; none of those would be incorrect and not really that misleading. Though the distributions are more likely to appear random, there is no more likelihood of them occurring in any exact way vs any other exact way. Another consideration is that the 1% chance of false positives is not necessarily going to occur in a sample size of 1000. The larger the sample size, the closer to 1% it will be.

  19. Hey Ve,
    Does the "sort by view count" filter on YouTube output the best video on any topic (e.g. Bayes' theorem), or are other filters like "sort by rating/relevance/upload date" better?
    Please share your statistical views

  20. Running with your man from a cave idea.
    Man comes out of cave and sees the sunrise and sunset.
    Sees the day is 11:50 minutes long.
    Tomorrow sees the sunrise and sunset.
    The day is 11:44 minutes
    The third day, sunrise and sunset.
    The day is 11:37 minutes long.
    He concludes that the days are gradually getting shorter until the night will be complete and everlasting. Is he wrong? How can you prove to him he is?
    How can he prove that he isn't?
    Not until months later will you have proof.

  21. Which was it—.1% or .001%? Initially stated as .001%, then later, stated verbally as .1%—but shown on-screen as .001%. Mistakes like this one have a massive effect.

  22. This morning I was thinking about probability. I spent 20 years as a private consultant for significant Insurance losses. One day it dawned on me how is that mistakes Insurance companies make in their estimates always are a benefit to their bottom line and never to the insured. This was never seen by the insureds as the estimates were computer-generated with software containing complicated formats of codes, percentages, measurements, depreciation, and items not covered by the policy.

    One day I took a senior adjuster from a famous company to lunch as he was retiring that week. We chatted for a while, and I asked him if he could impart any wisdom to me. He asked what kind of knowledge. I told him about our experience that the errors in the estimates were always in the company's favor and asked how that was possible.

    He took a sip of his wine and grinned like a Cheshire cat. Algorithms, Chaz, algorithms. He revealed a few more tricks they use to cut costs using contractors. I asked how much the savings to the company are. He said 10-12 percent. If you catch them, they just pay you and say it was a computer error. Pragmatic profit?

    For some reason, we now trust computers as if they always tell us the truth. NOT!! In fact, the opposite. Imagine all those who want to be plugged into an AI computer 24/7 managed by DARPA/Google?

    Since that time I have found these algorithms in my billings from big corporations, Debit card bills and even Mercedes dealership bills as if that's how they make their second profit to pay the CEOs. Where ever a high level of trust (gullibility) is found with the public these algorithms are probably at work.

    I must conclude Trust is a form of gullibility that they prey on. Years ago I understood Churches are a form of Insurance which has a premium people pay in offerings, free labor services, discounts, etc. Their policy promises you security, healings (verbal placebos called prayer), paradise, and God's love (placebo) if you obey the provisions of the Holy Book (Policy). Just imagine how much all the corporate religions' investments are worth on a global scale. The Vatican bank is now in the Trillions and that's just one religion. The church members cannot even make a claim; thus, the Church pays them back nothing. This is a perfect model of capitalizing on people's ignorance and trust.

    Does a degree or even a Ph.D. exempt you from being extorted?
    Extortion

    Extortion is a criminal offense of obtaining money, property, or services from an individual or institution, through coercion. The threat of death and hell.

    Hum?

  23. Very well explained. While watching it though I so wanted a view from another camera that zooms back to show that you have been lecturing a bird of prey perched on your arm.

  24. I have to admit, I'm still confused by it. The test is 99% accurate yet is 10 times more likely to give a false-positive than a true-positive? How can it be called 99% accurate, then? Or is it that if you have the disease the test is 99% likely to identify it, but will give a false-positive more often than it identifies a person who actually has the disease?

  25. Stopped playing videogames for this exact reason in this exact way. After years and years of "you would not even imagine how many" hours. I naturally gave my thoughts about what I gave me, what I achieved doing it and how it aligned to my dreams. Also the probabilities of any wanted outcomes and the probability of those outcomes sustaining end game goals and keeping my biggest dreams alive. This mindset saved my life, I had to fill the void with another addiction which now is random science podcasts and more youtube. Next step is more training, more science, and no nicotine.

  26. Thank you for a clear explanation of this topic. In medschool we don't talk nearly enough about this in the context of lab testing. No method is 100% specific and 100% sensitive, more often the analyses are highly uncertain and we need to question the numbers and learn to put them into perspective with the clinical picture of the patient in front of us.

  27. Was the premise, that this machine can accurately tell you if you have a disease, then what's your chance of having the disease.

    Those two things aren't correlated at all.

  28. Might just be me, but there seems to be a hole in this analogy. Given the doctor would usually only give you a test if they thought it was possible you had the disease, that would increase your odds of the test results being true. Say the doctor was right 50% of the time when giving people the test, surely that would be what you would have in place of “testing the population at random” instead the probability should be around 50% likely to be true

  29. You are correct about someone starting with a prior probability of 0, but not with those starting with 100%. It will take more evidence, but in the end they will be convinced. Any non zero positive number will cause the function to converge. That suggests it might help to add a small epsilon to the prior if starting with 0 when you have the opportunity for multiple tests.

  30. interesting. what are the odds that you're not on an active fault line close to a nuclear reactor w/ only 3 bridges, none damaged and easy to escape on, given the ground you're standing on isn't shaking?

  31. If people would pay 10 million for a 3 bedroom shack on the Hayward fault, would that make the next impending earthquake more remote?

  32. Veritasium made an error applying Bayes' at about 5:30 when he stuck the 9% (0.09) in the equation because both tests would have used the same test methodology (usually). So, the second test doesn't provide any new info except, at best, regarding lab worker performance (meaning it's less likely the first lab test was performed poorly). Anyway, the error at 5:30 is so big I hope Veritasium is neither a Berkeley alumnus, a statistician nor a professional who uses probability and statistics.

  33. The thought experiment with the balls on the table… Sounds like he was the original inventor of the board game "Battleship!" 🙂

  34. 2:20 The only, and seriously massive, flaw in this is that you are NOT a person in a group of 1000 random samples of the population. You went in specifically because of showing symptoms. This dramatically increases your likelihood of being positive.

  35. This video makes me wonder what percentage of medical doctors have a firm grasp on the accurate statistical interpretation of test results and how many are willing/prepared to meaningfully convey that interpretation to their patients?

  36. In data science, you don't start at 0% or 100% certainty, you start off with random probabilities and can refine them with bayes' theorem. Philosophically, we can apply the same, striving to come up with new bonkers ideas instead of settling for what we know best.
