ML Interview Q Series: Bayesian Analysis: Reconciling a Positive Test with Negative Contact Results
Bob received a positive result on a test for a certain illness, but the six close friends he was in contact with all tested negative for the same illness. The test in question yields 1% false positives and 15% false negatives. What is the probability that Bob is in fact not infected?
Comprehensive Explanation
This scenario is a classic application of Bayesian reasoning, where we have a test with certain false positive and false negative rates, and additional information about six close contacts who tested negative. The question is how all of these observations update our belief about Bob’s actual health status. Below is an outline of the reasoning.
Assumptions
• We assume there is some “prior probability” that Bob has the disease, denoted as P(A). Often, such puzzle-like questions use a simple prevalence assumption, for example P(A) = 0.01 (i.e., 1%). If the question does not explicitly provide the prevalence, one typical approach is to choose a small but non-trivial prior probability.
• If Bob truly has the disease, we assume the six close friends also have it with probability 1 (i.e., perfect transmission from Bob to them). This is a simplifying assumption that highlights the Bayesian update from the friends’ negative test results; in real-world settings, the transmission probability to each friend is certainly less than 1, but the puzzle is typically simplified to illustrate the dramatic effect that multiple negative results in close contacts can have on the posterior.
• The test characteristics are given:
– False Positive Rate (FPR) = 1%: if a person is actually negative, the test incorrectly says “positive” with probability 0.01.
– False Negative Rate (FNR) = 15%: if a person is actually positive, the test incorrectly says “negative” with probability 0.15 (and correctly says “positive” with probability 0.85).
Key Notation
Let A be the event “Bob is actually infected” and A^c be its complement “Bob is actually not infected.” Let B denote the compound event “Bob tests positive, and all six close friends test negative.”
We want to find P(A^c | B), i.e., the probability Bob is not infected given all of the observations.
Breaking Down P(B | A)
If Bob is actually infected (A), two things must happen for B to be true:
Bob tests positive. Probability = 0.85 (since false negative rate is 15%, correct positive rate is 85%).
All six friends test negative. If we assume they are all infected if Bob is infected, then each of them would test negative with probability 0.15 (the false negative probability). Therefore, all six test negative with probability 0.15^6.
Hence, P(B | A) = 0.85 * (0.15^6).
Breaking Down P(B | A^c)
If Bob is actually not infected (A^c), two things must happen for B:
Bob tests positive even though he is negative, which occurs with probability 0.01 (false positive).
All six friends test negative. Here, if Bob is not infected, we typically assume the friends are not infected from him, hence they are truly negative. A truly negative individual tests negative with probability 0.99 (since the false positive rate is 1%). For six friends, the probability all test negative is 0.99^6.
Hence, P(B | A^c) = 0.01 * (0.99^6).
Bayes’ Theorem
To combine these with the prior probabilities, we use the classic Bayesian update:

P(A | B) = [P(B | A) * P(A)] / [P(B | A) * P(A) + P(B | A^c) * P(A^c)]

Once we find P(A | B), we get the quantity of interest:
P(A^c | B) = 1 – P(A | B).
Numerical Example with a 1% Disease Prevalence
Let’s assume P(A) = 0.01 and thus P(A^c) = 0.99. We compute:
• P(B | A) = 0.85 * 0.15^6. Since 0.15^6 is a very small number (approximately 1.1391e-5), we get P(B | A) ≈ 0.85 * 1.1391e-5 ≈ 9.682e-6.
• Multiplying by P(A) = 0.01 gives about 9.68e-8.
• P(B | A^c) = 0.01 * 0.99^6. Since 0.99^6 is around 0.94148, P(B | A^c) ≈ 0.01 * 0.94148 = 0.0094148.
• Multiplying by P(A^c) = 0.99 gives about 0.009320652.
Hence, Bayes’ formula gives:
P(A | B) = 9.68e-8 / (9.68e-8 + 0.009320652) ≈ 1.04e-5, which is extremely small. Therefore, the probability Bob is actually infected, given that he tested positive but all six friends tested negative, is only about 0.001% (depending on rounding).
Consequently,
P(A^c | B) ≈ 99.999%.
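As a sanity check, the entire calculation above can be reproduced in a few lines of Python. All numbers are the article’s assumed values (1% prior, perfect transmission, 1% FPR, 15% FNR):

```python
# Posterior computation under the article's simplified assumptions.
prior = 0.01          # assumed disease prevalence P(A)
fpr = 0.01            # false positive rate
fnr = 0.15            # false negative rate
n_friends = 6

# P(B | A): Bob (infected) tests positive, all six infected friends test negative
p_b_given_a = (1 - fnr) * fnr ** n_friends

# P(B | A^c): Bob (healthy) falsely tests positive, all healthy friends test negative
p_b_given_not_a = fpr * (1 - fpr) ** n_friends

# Bayes' theorem
numerator = p_b_given_a * prior
evidence = numerator + p_b_given_not_a * (1 - prior)
p_infected = numerator / evidence

print(f"P(A | B)   = {p_infected:.6e}")     # ≈ 1.04e-5
print(f"P(A^c | B) = {1 - p_infected:.6f}")  # ≈ 0.999990
```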
This result is strikingly high: the evidence of six negative close contacts shifts our belief dramatically toward the conclusion that Bob’s positive test result was just a false alarm.
Potential Follow-up Questions
How would the result differ if the probability of transmission to each friend is less than 1?
If there is only a partial chance p that Bob infects each friend (independently), then even if Bob is truly infected, each friend can test negative in two ways: by never catching the disease, or by catching it and returning a false negative. Each friend therefore tests negative with probability (1 – p) * 0.99 + p * 0.15, where the first term keeps the 1% false positive rate for a truly uninfected friend (as in the main calculation) and the second term is the false negative probability for an infected friend. For six friends, that entire quantity is raised to the power 6. This changes P(B | A) and modifies the final posterior probability. The key takeaway is that the lower the probability that Bob transmits the disease to his friends, the smaller the effect the friends’ negative tests have on Bob’s posterior probability of actually being infected.
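This can be turned into a quick sensitivity check. A minimal sketch, where the transmission probabilities tried are purely illustrative and a truly uninfected friend is still assumed to test negative with probability 0.99:

```python
def posterior_not_infected(p_trans, prior=0.01, fpr=0.01, fnr=0.15, n=6):
    """P(Bob not infected | Bob positive, all n friends negative),
    with independent transmission probability p_trans per friend."""
    # A friend tests negative either by never catching the disease
    # (and avoiding a false positive) or by a false negative when infected.
    p_friend_neg_if_bob_inf = (1 - p_trans) * (1 - fpr) + p_trans * fnr
    p_b_given_a = (1 - fnr) * p_friend_neg_if_bob_inf ** n
    p_b_given_not_a = fpr * (1 - fpr) ** n
    num = p_b_given_not_a * (1 - prior)
    den = num + p_b_given_a * prior
    return num / den

for p_trans in (1.0, 0.5, 0.25):
    print(p_trans, posterior_not_infected(p_trans))
```

With perfect transmission the posterior that Bob is healthy is essentially 1, but at a 25% transmission probability it drops to roughly 83%, illustrating how much weaker the friends’ evidence becomes.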
What if the disease prevalence is much higher or much lower?
The prior P(A) has a strong impact on the final posterior probability. If the disease is extremely common, your prior that Bob is infected becomes much higher, and it might override the six negative results in friends to a greater degree. Conversely, if the disease is extremely rare, seeing a single positive result (with a known false positive rate) might be very suspicious, and combining that with negative results in close contacts drives the posterior probability of true infection even lower.
What about real-world complications like test dependence?
In real scenarios, tests on contacts might not be independent. For example, if there was a bad batch of tests, or if environmental conditions affect the accuracy of tests in a location. Additionally, if Bob’s friends are tested on different days or if they have different immune responses, the correlation structure in these tests can complicate the analysis. The typical puzzle-like approach, however, assumes independence for simplicity.
How does this illustrate the importance of Bayes’ Theorem?
The result highlights the classic idea that tests with modest false positives and false negatives may lead to misleading conclusions if not combined properly with prior probabilities and additional contextual information. Seeing multiple negative tests in close contacts can drastically shift the belief about whether a single positive is correct or not.
Could the false negative rate for friends matter in a different way?
Yes. Even if friends are infected, a 15% false negative rate is not negligible. If Bob truly was infected and transmitted the disease, each friend still has a significant chance (15%) of showing a negative result. However, the probability that all six show a negative result simultaneously (0.15^6) becomes incredibly small. This is the main driver behind the large posterior probability that Bob’s test is actually a false positive, rather than a true positive.
Summary of Key Insights
• Observing multiple negative tests in close contacts who could plausibly have been infected greatly reduces the likelihood that the single positive test is correct.
• Bayesian updates with realistic assumptions can drastically alter the conclusion from a naive reading of “Bob tested positive, so he probably has the disease.”
• Real-world applications must account for more complex factors like partial transmission probabilities, possible correlation in tests, and varying prior prevalence.
In this scenario, under the simplified assumptions (100% transmission if Bob is truly infected and around 1% disease prevalence), the probability that Bob is actually negative ends up exceeding 99.999%.
Below are additional follow-up questions
What if Bob took multiple tests on consecutive days?
If Bob were to take the same test repeatedly over several days, the probability that all of those tests come back incorrectly positive (if Bob is truly negative) or incorrectly negative (if Bob is truly positive) would need to be factored into a Bayesian update. In particular:
• Repeated positive results make it far more likely Bob is truly positive, because the probability of multiple false positives in a row diminishes rapidly if the tests are statistically independent. For instance, if the false positive rate is 1% (0.01), then two consecutive false positives happen with probability 0.01 * 0.01 = 0.0001.
• Conversely, if Bob got multiple negative tests in a row (while believed to be infected) and each test had a 15% false negative rate, the probability of repeatedly testing negative if truly infected becomes 0.15^n for n repeated tests, which becomes very small as n grows.
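The independent-repeats case is easy to sketch for Bob’s own tests (ignoring the friends). The 1% prior is the same illustrative assumption as in the main example:

```python
def posterior_after_positives(n, prior=0.01, fpr=0.01, fnr=0.15):
    """P(infected | n independent positive test results)."""
    like_inf = (1 - fnr) ** n     # each true positive occurs with prob 0.85
    like_healthy = fpr ** n       # each false positive occurs with prob 0.01
    num = like_inf * prior
    return num / (num + like_healthy * (1 - prior))

for n in (1, 2, 3):
    print(n, posterior_after_positives(n))
```

A single positive leaves the posterior below 50% at this prior, while two or three independent positives push it close to certainty, matching the "diminishes rapidly" intuition above.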
However, in real-world usage, repeated testing often relies on similar technology within a short time window; the tests themselves might not be truly independent. For instance, if Bob has a low viral load, each test might be systematically more likely to yield a negative result. Or if Bob has some unique trait that triggers a false positive, every test might also be false positive. Accounting for these correlated errors introduces complexities in the Bayesian update, as we can’t multiply probabilities in a straightforward manner. It requires a model for the correlation between different test outcomes on the same individual.
Could the stage of infection or disease progression affect these probabilities?
Yes. A key pitfall in relying on a single false positive rate or false negative rate is that these rates can depend on the stage of infection, the type of test (e.g., PCR, antigen), and the actual viral load:
• Early in the infection, viral load might be too low for detection, increasing the practical false negative rate.
• Later in infection, if the person’s immune response is significant or if the test type changes (e.g., antigen vs. antibody test), the reliability also changes.
• Some tests are significantly more sensitive during certain windows post-exposure.
In real-world application, we’d need to consider a time-based probability model, where P(test positive | infected) changes over time. If Bob’s test was taken very early (or very late) relative to exposure, that might alter the false negative probability. Similarly, if the friends were tested at a suboptimal time, their negative results might not be as reassuring as if they had been tested at the optimal detection window.
What if only some of the six friends were tested at the same time as Bob, and others were tested days apart?
Timing differences introduce complexities:
• If there’s a delay between Bob testing positive and the friends being tested, a friend who was infected but not yet showing a detectable viral load might test negative (a false negative due to timing).
• On the flip side, if the friends were tested earlier than Bob, perhaps they were still in an incubation period.
• Variation in timing leads to conditional probabilities that depend on the latent period of the infection and the window of maximum test sensitivity.
Thus, the conclusion that six negative results strongly indicate Bob’s positive was a false alarm may weaken if the testing wasn’t synchronized around the same point in the disease progression. You’d need a more sophisticated model that accounts for the probability of detection in each friend based on time since potential exposure.
How could the sample size or specificity of the test for the general population impact these results?
The test’s specificity is the true negative rate (i.e., 1 – false positive rate). Even a small shift in specificity can massively change outcomes when applied across a large population where prevalence is low:
• If specificity is 0.99 (a 1% false positive rate), that’s typically considered good. However, when applied broadly in a low-prevalence setting, the number of false positives can still outnumber the true positives.
• If specificity were 0.995 (a 0.5% false positive rate), the probability that a single positive is truly positive in a low-prevalence scenario rises considerably.
For Bob’s scenario, an unrecognized slight variation in specificity or prevalence can change P(A^c | B). If the test specificity is overestimated (say it’s actually 0.98 instead of 0.99), the chance that Bob’s test is a false positive becomes even higher, which strengthens the conclusion that he is likely not infected.
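A quick sensitivity sketch over specificity, holding everything else from the main example fixed (the 0.98/0.99/0.995 values tried are illustrative):

```python
def p_not_infected(specificity, prior=0.01, fnr=0.15, n=6):
    """P(Bob not infected | evidence) as a function of test specificity,
    under the main example's perfect-transmission assumption."""
    fpr = 1 - specificity
    p_b_given_a = (1 - fnr) * fnr ** n          # infected Bob, six false negatives
    p_b_given_not_a = fpr * (1 - fpr) ** n      # healthy Bob, six true negatives
    num = p_b_given_not_a * (1 - prior)
    return num / (num + p_b_given_a * prior)

for spec in (0.98, 0.99, 0.995):
    print(spec, p_not_infected(spec))
```

The posterior stays overwhelmingly in favor of "not infected" at all three settings, but a lower specificity (higher false positive rate) nudges it even closer to 1, as the text argues.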
Could differing risk factors for Bob and his friends complicate the Bayesian calculation?
In real life, Bob and his friends might have different risk levels based on behavior, co-morbidities, or exposure to other infected individuals. For instance:
• Bob could have had multiple exposures beyond these six friends, making his prior P(A) higher than a baseline population prevalence.
• The friends might also have separate exposures that influence their own probabilities of testing positive or negative.
When applying Bayesian updates, you’d want a model that accounts for these individualized priors. If Bob’s pre-test probability is much higher than 1% (due to known risk factors), even six negative results in friends might not reduce Bob’s posterior probability of infection as drastically. Alternatively, if Bob has minimal risk factors, the test’s positive result is more likely to be false, and the negative results in friends would reinforce that perspective.
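A sketch of how individualized priors move the posterior, keeping the test characteristics and the perfect-transmission assumption fixed (the prior values swept over are illustrative):

```python
def p_infected(prior, fpr=0.01, fnr=0.15, n=6):
    """P(Bob infected | positive test, six negative friends) for a given prior."""
    p_b_given_a = (1 - fnr) * fnr ** n
    p_b_given_not_a = fpr * (1 - fpr) ** n
    num = p_b_given_a * prior
    return num / (num + p_b_given_not_a * (1 - prior))

for prior in (0.001, 0.01, 0.1, 0.5):
    print(prior, p_infected(prior))
```

Notably, under the perfect-transmission assumption even a 50% individual prior leaves the posterior around 0.1%: the six negative contacts dominate. With partial transmission (previous follow-up), the prior matters far more.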
What if the disease is only partially transmissible even if Bob is truly infected?
Not all close contacts of an infected person become infected themselves. For a respiratory disease, typical transmission probabilities are often nowhere near 1. Maybe there’s a 20%–30% chance of infecting a close contact. Then:
• Even if Bob is truly infected, each friend might remain negative simply by not catching it, so the probability of friend_i being negative is (1 – p_trans) + p_trans * P(negative test if infected).
• The probability that all six are negative would be that quantity raised to the sixth power.
This makes the “multiple negative friends” piece of evidence less conclusive. You would need to re-derive P(B | A) with a more realistic probability of infection for each friend. It could still be very small that all six remain negative, but it would be noticeably larger than if p_trans = 1.
What are the potential mistakes in interpreting conditional probabilities in this scenario?
A common pitfall is to interpret P(Bob is negative | Bob tested positive) purely from the test’s false positive rate alone, ignoring all other context (such as the six negative friends, or the prevalence). This is an example of the base rate fallacy if one disregards the prior probability. Another mistake is to assume test independence incorrectly. If all seven tests were processed in the same lab machine that had a consistent bias, or if the same reagent batch is faulty, the outcomes might be correlated in a way that’s not accounted for by a simple product of probabilities.
Could the test be less accurate for Bob than for his friends for any reason?
Yes. Certain demographic factors, health conditions, or sample collection procedures might systematically affect test accuracy for Bob differently than for his friends:
• For instance, if Bob’s sample was collected improperly (causing a high chance of a false positive or contamination), his test result might be unreliable.
• If the friends had near-perfect sample collection protocols, then their 1% false positive rate might actually be an overestimate, so the chance of them all being negative is even more certain.
These real-world nuances indicate that test performance metrics (1% false positive, 15% false negative) are usually population averages. Individual-specific factors can deviate considerably, impacting the Bayesian update.
Could there be an alternative explanation for Bob’s positive result?
Yes. Alternative explanations might include:
• Lab contamination or a clerical mix-up. This is a scenario in which Bob’s test result is not even a “random false positive” but a full technical error.
• Cross-reactivity with another pathogen (some tests register positive for similar viruses).
• Bob could have had a recent vaccination or medical treatment that affects the test outcome (for instance, in antibody tests, recent immunization can cause positive results).
From a Bayesian standpoint, these alternative possibilities further increase the likelihood of Bob’s test result being a false positive, making it even more likely he is not actually infected, especially when weighed against the six negative friends.
What if there is a different distribution of false positives and false negatives for different subgroups?
Test characteristics can vary across sub-populations. For example, older individuals or those with certain co-existing conditions might have a different sensitivity or specificity. If Bob and his friends fall into different demographic or clinical categories, the 1%/15% numbers might not apply equally:
• Maybe for younger, healthy individuals (like Bob’s friends), the false negative rate is 10%, whereas for Bob (an older individual or one with certain comorbidities), the false negative rate is 20%.
• In that case, we need to use different test characteristics for each person when computing probabilities.
This leads to more complicated Bayesian modeling because you handle each friend’s test probability differently from Bob’s.
What if some of the friends test positive, but Bob tests negative next time?
This scenario flips the problem around. If a friend or two tested positive later, while Bob subsequently tested negative, it might suggest Bob was never the index case or Bob’s initial positive test was mistaken. Or, Bob might have recovered quickly while the friends only later showed symptoms. The interpretation depends on timing, test dependence, and disease progression. Ultimately, partial test outcomes among close contacts can reinforce or refute Bob’s original test result, but a full analysis requires carefully modeling the likelihood of each sequence of test results over time.
Could a Bayesian network representation help handle these complexities?
Yes. Instead of working with a single equation, a Bayesian network (a graphical model) can capture more complex dependencies:
• Nodes for Bob’s infection status, each friend’s infection status, and each test outcome.
• Edges reflecting conditional dependencies, such as the probability of transmission from Bob to a friend, or the probability of a test outcome given infection status.
• This network approach allows you to systematically incorporate partial transmission probabilities, different test accuracies, disease progression timelines, and any known correlations.
Such a network makes the problem more tractable computationally, especially when dealing with multiple tests, varying times, and heterogeneous population assumptions.
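A minimal sketch of exact inference by enumeration on this tiny network, assuming an illustrative 25% transmission probability (the friends’ hidden infection statuses are marginalized out; unlike the simplified closed-form earlier, a truly uninfected friend here tests negative only with probability 0.99):

```python
from itertools import product

# Illustrative parameters: prior, transmission prob, error rates, friend count.
PRIOR, P_TRANS, FPR, FNR, N = 0.01, 0.25, 0.01, 0.15, 6

def p_test_neg(infected):
    """Probability a person tests negative given their true status."""
    return FNR if infected else 1 - FPR

def joint(bob_inf, friend_states):
    """P(Bob status, friend statuses, observed evidence).
    Evidence: Bob tests positive, all friends test negative."""
    p = PRIOR if bob_inf else 1 - PRIOR
    p *= (1 - FNR) if bob_inf else FPR          # Bob's positive test
    for f_inf in friend_states:
        p_inf = P_TRANS if bob_inf else 0.0     # friends infected only via Bob
        p *= p_inf if f_inf else 1 - p_inf
        p *= p_test_neg(f_inf)                  # each friend's negative test
    return p

# Marginalize over all 2^6 combinations of hidden friend statuses
num = sum(joint(True, fs) for fs in product([False, True], repeat=N))
den = num + sum(joint(False, fs) for fs in product([False, True], repeat=N))
print("P(Bob infected | evidence) =", num / den)
```

At 25% transmission the posterior that Bob is infected rises to roughly 17%, consistent with the partial-transmission follow-up above; brute-force enumeration like this only works for small networks, which is where dedicated graphical-model libraries come in.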
How could real epidemiological data affect our understanding of the posterior probability?
Epidemiological data might show, for instance, a higher prevalence in Bob’s community at the time or more accurate estimates of the transmission rate between close contacts. If the local prevalence is high, Bob’s prior P(A) might be significantly larger than 1%. This would increase the chance that his test result is a true positive. Conversely, if the local prevalence is much lower (e.g., 0.1%), then the false positive scenario becomes even more likely. Moreover, real contact-tracing data might show that not every close contact has the same risk. One friend might be immunocompromised (more likely to get infected), or one might have strictly quarantined (less likely to be infected). All of these factors refine the Bayesian update.