ML Interview Q Series: Three Prisoners Puzzle: Updating Beliefs with Conditional Probability.
Three prisoners, A, B, and C, are held in separate cells. Two are to be executed. The warder knows exactly who is to be executed and who is to be freed, whereas the prisoners know only that two of them are to be executed. Prisoner A reasons that his probability of being freed is 1/3 until he learns more. Observing that at least one of B and C must be executed, he asks the warder to name one prisoner other than himself who will be executed. Once the warder names that prisoner, A reasons that either he himself or the unnamed prisoner is the one to be freed, which should mean a 1/2 chance for A to be freed. Investigate whether A’s reasoning (that his probability improves from 1/3 to 1/2) is correct. [Hint: compute the conditional probability that A is freed given that the warder names B to be executed.]
Short Compact Solution
A’s reasoning is incorrect because it overlooks the flexibility the warder has in choosing whom to name. Define A_F as the event that A is freed, and W_B as the event that the warder names B when asked to name someone (other than A) who will be executed. By Bayes’ rule,
P(A_F | W_B) = P(W_B | A_F) P(A_F) / P(W_B).
Since the three prisoners are equally likely to be the one freed, we have:
If A is freed (which happens with probability 1/3), then B and C are both executed. In that case the warder can name B or C with equal probability, so P(W_B | A_F) = 1/2.
If B is freed (probability 1/3), B cannot be named (since B is not executed), so P(W_B | B_F) = 0.
If C is freed (probability 1/3), then A and B are the ones executed, so the warder must name B, making P(W_B | C_F) = 1.
Hence P(W_B) = (1/2)(1/3) + (0)(1/3) + (1)(1/3) = 1/2. Finally,
P(A_F | W_B) = [(1/2)(1/3)] / (1/2) = 1/3.
Thus learning that B is the one named by the warder does not actually raise A’s probability of being freed above 1/3.
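As a quick sanity check, here is a minimal Python sketch (exact rational arithmetic, with variable names mirroring the events defined above) that reproduces the Bayes-rule computation:

```python
from fractions import Fraction

# Prior: each prisoner is equally likely to be the one freed.
p_freed = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# P(W_B | X freed): probability the warder names B in each scenario.
p_WB_given_freed = {
    "A": Fraction(1, 2),  # B and C both executed; warder picks one at random
    "B": Fraction(0),     # B is freed, so B cannot be named
    "C": Fraction(1),     # A and B executed; warder must name B
}

# Law of total probability: P(W_B) = sum over the three scenarios.
p_WB = sum(p_WB_given_freed[x] * p_freed[x] for x in p_freed)

# Bayes' rule: P(A freed | W_B).
p_A_given_WB = p_WB_given_freed["A"] * p_freed["A"] / p_WB

print(p_WB)          # 1/2
print(p_A_given_WB)  # 1/3
```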
Comprehensive Explanation
The key misunderstanding in A’s reasoning is the assumption that once B is named, the situation becomes “A or the other unnamed prisoner,” each with probability 1/2. However, the warder’s choice of name is not uniform over all prisoners who might be executed: the warder already knows who will actually be executed, and if A is the one to be freed, the warder has two equally viable candidates (B and C) to name. In the scenarios where exactly one of B or C is executed (because the other is the freed prisoner), the warder has no choice at all and must name the one who is actually executed. This asymmetry in “naming power” is what prevents A’s probability from jumping to 1/2.
The easiest way to see this rigorously is to list the three equiprobable scenarios from the perspective of the prior distribution of who is freed:
Scenario 1: A freed, B executed, C executed.
Scenario 2: B freed, A executed, C executed.
Scenario 3: C freed, A executed, B executed.
If A is freed (scenario 1), the warder can name B or C with 50-50 probability. If B is freed (scenario 2), the warder must name C when asked to name someone else who is executed. If C is freed (scenario 3), the warder must name B. The event that the warder names B (W_B) therefore arises with probability 1/2 within scenario 1 and with probability 1 within scenario 3. Weighting each scenario by its prior probability of 1/3 and adding these up gives P(W_B) = (1/2)(1/3) + (1)(1/3) = 1/2.
When we compute the conditional probability that A is freed given the warder names B, we see that it is (1/2 of scenario 1) over (all W_B occurrences). That fraction is:
(1/2)(1/3) / (1/2) = 1/3.
Hence, A’s chance remains 1/3 even after hearing B named as an executed prisoner.
Further Discussion and Potential Follow-up Questions
How does this compare to the Monty Hall Problem?
This problem is closely related to the Monty Hall puzzle. In Monty Hall, the host knows the location of the prize and deliberately opens a door that does not contain it. The key similarity is that both the warder and Monty are knowledgeable, and their nonrandom choices shape the conditional probabilities. A crucial difference is that in Monty Hall the contestant can act on the revealed information by switching doors (and switching wins with probability 2/3); here, prisoner A has no decision to make, so the puzzle focuses purely on how his belief changes, or fails to change, upon hearing which other prisoner is named.
What if the warder had a different rule for choosing which prisoner to name?
If the warder’s rule for choosing a name were different (for example, if the warder always named a particular prisoner whenever possible), the value of P(W_B) would change, and so could the posterior. The crucial point is that A’s inference depends on precisely how the warder chooses whom to name. Under the standard assumption of naming uniformly at random among the executed prisoners (other than A), the final probability remains 1/3. If the warder had a bias, the numbers could differ, but the same conditional-probability logic would still apply, as the worked example below shows.
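To make this concrete, suppose (purely as an illustrative assumption) that the warder always names B whenever B is among the executed. Then P(W_B | A_F) = 1, P(W_B | B_F) = 0, and P(W_B | C_F) = 1, so P(W_B) = (1)(1/3) + (0)(1/3) + (1)(1/3) = 2/3, and
P(A_F | W_B) = (1)(1/3) / (2/3) = 1/2.
Under this biased rule, hearing “B” genuinely would raise A’s chance to 1/2. Conversely, under the same rule the warder names C only when B is freed, in which case A is certainly executed, so P(A_F | W_C) = 0.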
Why does the intuitive reasoning fail?
People often think “once a doomed prisoner is named, the remaining two must have equal chances.” That mistake arises because we tend to oversimplify how information is revealed, forgetting that the person who provides the information (the warder) is constrained by what they already know. Conditional probability calculations show that the probability does not necessarily split evenly among the remaining possibilities unless the selection process is truly random in a uniform sense that excludes hidden constraints.
Could we verify the result by a direct enumeration approach?
Yes. We can simulate or directly enumerate each of the three equally likely original possibilities (A freed, B freed, C freed). Then we check in which fraction of those worlds the warder would name B, and among those, how many times A is actually freed. This procedure precisely matches the formal conditional probability calculation. The result of 1/3 emerges in a straightforward way.
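A minimal Monte Carlo sketch of this enumeration in Python (the function name and structure are my own, not part of the original solution):

```python
import random

def simulate(trials=1_000_000, seed=0):
    """Estimate P(A freed | warder names B) under the standard warder rule."""
    rng = random.Random(seed)
    named_B = 0              # trials in which the warder names B
    named_B_and_A_freed = 0  # ... and A is the one freed
    for _ in range(trials):
        freed = rng.choice("ABC")        # uniform prior over who is freed
        if freed == "A":
            named = rng.choice("BC")     # both executed; warder picks at random
        elif freed == "B":
            named = "C"                  # warder must name C
        else:
            named = "B"                  # warder must name B
        if named == "B":
            named_B += 1
            named_B_and_A_freed += (freed == "A")
    return named_B_and_A_freed / named_B

print(simulate())  # ≈ 0.333, not 0.5
```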
What lessons does this puzzle teach for practical Machine Learning or Data Science?
While this puzzle itself may not appear directly in ML practice, it illustrates a common pitfall in probabilistic reasoning: incorrectly updating beliefs in the presence of partial, selectively revealed information. This is directly relevant to problems like missing data imputation or inference from systematically collected or biased samples. If a sample (or piece of information) is not drawn uniformly at random, naive assumptions about updated probabilities can lead to incorrect conclusions. Understanding how to properly condition on the data-collection (or data-revealing) mechanism is crucial in many real-life data science and machine learning applications.
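As a toy illustration of that pitfall (a hypothetical setup I am introducing here, not part of the puzzle), the sketch below reveals data points with a probability that depends on their value, a non-uniform revelation mechanism. The naive mean of the observed values is then biased away from the true population mean, just as A’s naive update is biased by ignoring how the warder reveals information:

```python
import math
import random

rng = random.Random(42)

# True population: standard normal, so the true mean is close to 0.
population = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Hypothetical non-uniform revelation mechanism: larger values are more
# likely to be observed (missing not at random).
observed = [x for x in population if rng.random() < 1.0 / (1.0 + math.exp(-x))]

true_mean = sum(population) / len(population)
naive_mean = sum(observed) / len(observed)

print(f"true mean:  {true_mean:.3f}")   # close to 0
print(f"naive mean: {naive_mean:.3f}")  # biased upward, noticeably above 0
```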