ML Interview Q Series: Conditional Probability: Probability of Two Red Balls Given At Least One Drawn
A bag contains four balls. One is blue, one is white, and two are red. Someone draws two balls from the bag and observes that at least one is red. What is the probability that the other ball is also red?
Short Compact solution
Label the two red balls as R1 and R2, and label the blue and white balls as B and W. There are six possible ways to pick any two balls from the set {R1, R2, B, W}: {R1, R2}, {R1, B}, {R1, W}, {R2, B}, {R2, W}, {B, W}. Each pair is equally likely.
Define:
A = event that both drawn balls are red (i.e. {R1, R2}).
B = event that at least one red ball is among the two drawn.
We see that P(A) = 1/6 because {R1, R2} is the only pair with two red balls, and P(B) = 5/6 because {B, W} is the only pair with no red ball.
Hence,
P(A|B) = P(A ∩ B) / P(B) = P(A) / P(B) = (1/6) / (5/6) = 1/5.
Therefore, the probability that the other ball is also red, given that at least one of the two is red, is 1/5.
Comprehensive Explanation
Understanding the problem
We have four distinct balls in the bag:
Red 1 (R1)
Red 2 (R2)
Blue (B)
White (W)
Two are drawn at once. We then learn that among these two drawn balls, there is at least one red ball. We want the probability that the other one is also red.
Constructing the sample space
If we label the two red balls as R1 and R2, we have these six equally likely outcomes when drawing two balls without regard to order:
{R1, R2}
{R1, B}
{R1, W}
{R2, B}
{R2, W}
{B, W}
Out of these six, exactly one subset consists of two red balls: {R1, R2}.
Event definitions
Event A: Both chosen balls are red. This corresponds to the subset {R1, R2}.
Event B: At least one chosen ball is red. This corresponds to all subsets except {B, W}, so it is {R1, R2}, {R1, B}, {R1, W}, {R2, B}, {R2, W}.
Computing the probabilities
Since the total number of equally likely subsets is 6:
Probability of A = number of ways to get two reds / total = 1/6.
Probability of B = number of ways to get at least one red / total = 5/6.
Using conditional probability
We recall the key conditional probability formula:
P(A|B) = P(A ∩ B) / P(B)
Since A = “two reds” is automatically contained in B = “at least one red,” P(A ∩ B) is just P(A). Hence:
P(A|B) = [1/6] / [5/6] = 1/5.
Thus, after learning that there is at least one red ball, the probability that both are red is 1/5.
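The calculation above can be checked by brute-force enumeration. The following is a minimal sketch using only the Python standard library; the function name `conditional_both_red` and the string labels for the balls are illustrative choices, not part of the original solution.

```python
from fractions import Fraction
from itertools import combinations

def conditional_both_red(balls=("R1", "R2", "B", "W")):
    """Return P(both red | at least one red) by enumerating unordered pairs."""
    pairs = list(combinations(balls, 2))  # 6 equally likely pairs
    # Event B: at least one red ball in the pair.
    at_least_one_red = [p for p in pairs if any(b.startswith("R") for b in p)]
    # Event A: both balls red (automatically a subset of B).
    both_red = [p for p in at_least_one_red if all(b.startswith("R") for b in p)]
    return Fraction(len(both_red), len(at_least_one_red))

print(conditional_both_red())  # 1/5
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.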
Why 1/5 makes sense intuitively
There are five equally likely outcomes consistent with the event “at least one is red.” Exactly one of those five is the case “both are red.” Hence, 1 out of 5 is the probability that the other ball is also red.
Follow-up questions
What if we did not label the red balls distinctly?
Even if we treat the two red balls as indistinguishable, the probability is unchanged, because the physical draw does not depend on our labels: each pair of balls is still equally likely. The labels are only a counting device. Note that the color patterns themselves are not equally likely (“one red and one blue” arises from two pairs, while “two reds” arises from one), which is why we count labeled pairs. With labels we see 6 possible outcomes; removing the outcome with no reds leaves 5, and only one of these 5 is the pair of both reds. Hence the result remains 1/5.
Could we have used a combinatorial formula instead of listing outcomes?
Yes. We could use combinations to count outcomes:
Total ways to choose 2 from 4: C(4,2) = 6.
B excludes the one combination that has no red. The number of ways to pick 0 red from 2 red and 2 non-red is C(2,0)*C(2,2)=1. So B has 6 - 1 = 5 ways.
A ∩ B is just A, which is the one combination that includes both red balls. So there is 1 way.
Hence P(A|B) = (1/6)/(5/6) = 1/5.
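The same counts can be reproduced with `math.comb` from the standard library. This is only a sketch of the arithmetic in the list above; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

total = comb(4, 2)                  # 6 ways to choose 2 of the 4 balls
no_red = comb(2, 0) * comb(2, 2)    # 1 way: 0 of the 2 reds, 2 of the 2 non-reds
at_least_one_red = total - no_red   # 5 ways
both_red = comb(2, 2) * comb(2, 0)  # 1 way: 2 of the 2 reds, 0 non-reds

p_a_given_b = Fraction(both_red, at_least_one_red)
print(p_a_given_b)  # 1/5
```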
How does this differ from the “someone picks a red ball and shows it to you” scenario?
There is a subtle but famous difference between “knowing that at least one of the chosen balls is red” and “someone specifically shows you one red ball.” In the latter case, the answer depends on how the person decides which ball to show, so the conditioning event (and therefore the probability) can differ, much as the host's behavior matters in the Monty Hall puzzle. Here, the statement is simply “among the two drawn, there is at least one red,” which leads to a straightforward 1/5 result. If the problem had been phrased as “the first ball drawn is red,” the conditional probability would be 1/3 instead.
What if the person tells you the color of a particular ball?
If the person specifically points out one ball (say “the left-hand ball is red”) rather than just saying “at least one of the two is red,” we would re-calculate with B = “the first (or left-hand) ball is red”:
Here the conditioning event concerns a specific position, so we work with ordered draws. The first ball is red with probability 2/4, and given that, the second is red with probability 1/3 (1 red left among the 3 remaining balls). So P(both red) = (2/4)(1/3) = 1/6, and P(both red | first is red) = (1/6) / (1/2) = 1/3. This is a different event from “at least one of them is red,” which is why the answer is 1/3 rather than 1/5.
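To see the effect of conditioning on a specific position, one can enumerate ordered draws. The sketch below assumes the two balls are drawn one after the other without replacement and that “left-hand” means “first drawn”; the helper `is_red` and the labels are illustrative.

```python
from fractions import Fraction
from itertools import permutations

balls = ("R1", "R2", "B", "W")
ordered = list(permutations(balls, 2))  # 12 equally likely ordered draws

def is_red(ball):
    return ball.startswith("R")

# Condition on the FIRST ball being red, not on "at least one red".
first_red = [d for d in ordered if is_red(d[0])]    # 6 outcomes
both_red = [d for d in first_red if is_red(d[1])]   # 2 outcomes

p_given_first_red = Fraction(len(both_red), len(first_red))
print(p_given_first_red)  # 1/3
```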
How might you solve it using a Bayesian perspective?
Define the two events and apply Bayes’ rule:
Let B be “at least one red is drawn,” A be “both drawn are red.”
Bayes’ rule is exactly the standard conditional probability expression P(A|B) = P(B|A)*P(A) / P(B). Here, P(B|A)=1 because if both are red, it is certain that “at least one is red.” The rest proceeds as before.
Does the order of drawing matter for this problem?
In this scenario, the problem is phrased such that the two balls are drawn together, and order is irrelevant. If we picked them sequentially, then at the final step we only care about the unordered pair. This means we are summing over all ways we can arrive at an unordered pair. The conclusion is the same: 1/5 once we know at least one is red. The difference would matter if the question or evidence given was about the sequence or if we were told something about which ball was red first.
How might you implement a quick simulation in Python?
Below is a tiny example of a Monte Carlo approach that randomly draws two balls many times, checks the condition that at least one is red, and counts how often both are red among those draws.
```python
import random

def simulate(num_simulations=10_000_000):
    balls = ['R', 'R', 'B', 'W']
    count_condition = 0  # draws with at least one red
    count_both_red = 0   # draws with two reds
    for _ in range(num_simulations):
        draw = random.sample(balls, 2)  # two balls without replacement
        if 'R' in draw:
            count_condition += 1
            if draw.count('R') == 2:
                count_both_red += 1
    return count_both_red / count_condition

print(simulate())
```
Running this should yield a number close to 0.2, confirming the theoretical 1/5 result.
Below are additional follow-up questions
What if the statement “there is at least one red ball in the draw” might be unreliable or prone to error?
In real-world scenarios, an observer might misidentify the color of a ball. For example, they might see an orange ball but mistakenly claim it is red, or they might incorrectly report that they spotted a red ball when they actually did not. This adds a layer of uncertainty to the event:
Suppose the observer states “at least one ball is red,” but there is a known probability p that the observer correctly identifies a red ball and a probability q that they incorrectly identify a non-red as red.
In that case, we need to adjust event B (the statement “at least one red”) to account for potential false positives or false negatives. The actual conditional probability P(A|B) is then influenced by the reliability of the observer, requiring a more nuanced Bayesian calculation.
This approach involves modeling P(B|A) and P(B|A^c) (where A^c is the event “not both red,” i.e., 0 or 1 red) under observer error, then updating with Bayes’ rule. Pitfalls here include failing to account for the observer’s accuracy, which can lead to overconfidence in the 1/5 outcome if the observer’s statement might be wrong.
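One way to make this concrete is sketched below. It rests on a simplified error model that is an assumption, not something fixed by the problem: each truly red ball is perceived as red with probability p, each non-red ball is perceived as red with probability q, the two perceptions are independent, and the observer says “at least one red” exactly when at least one ball is perceived as red. The function name `posterior_both_red` is hypothetical.

```python
from fractions import Fraction

def posterior_both_red(p, q):
    """P(both red | observer reports 'at least one red') under the assumed model."""
    # Prior probabilities of 0, 1, or 2 red balls in the pair (1, 4, 1 of 6 pairs).
    prior = {0: Fraction(1, 6), 1: Fraction(4, 6), 2: Fraction(1, 6)}

    def p_report(k):
        # Probability that at least one ball is perceived as red
        # when k of the two balls are truly red.
        none_perceived_red = (1 - p) ** k * (1 - q) ** (2 - k)
        return 1 - none_perceived_red

    evidence = sum(prior[k] * p_report(k) for k in prior)
    return prior[2] * p_report(2) / evidence

print(posterior_both_red(Fraction(1), Fraction(0)))          # 1/5 (perfect observer)
print(posterior_both_red(Fraction(9, 10), Fraction(1, 10)))  # 99/482
```

With a perfect observer (p = 1, q = 0) the sketch recovers 1/5; with the illustrative values p = 0.9 and q = 0.1 the posterior is slightly higher, at about 0.205.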
How does the calculation generalize if there are more than two red balls in the bag?
If the bag contained, say, three red balls and one non-red ball, the question would become: “Given that at least one ball is red, what is the probability both drawn balls are red?” The logic remains similar, but the counts change:
The total number of ways to choose 2 out of 4 is still 6, but the outcomes change with the composition. With 3 reds and 1 blue, the pairs are {R1, R2}, {R1, R3}, {R2, R3}, {R1, B}, {R2, B}, {R3, B}. Every pair contains at least one red (a pair with no red would need two non-red balls, and there is only one), so the condition carries no information and P(A|B) = 3/6 = 1/2.
More generally, you can use the combinatorial approach: number of ways to choose 2 reds out of the total red balls, divided by the number of ways to choose pairs that include at least one red. Errors arise if you assume the original 1/5 result without recalculating for the new composition.
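The combinatorial approach can be written as a small function. This is a sketch for drawing exactly two balls without replacement; the function name and parameter names are illustrative, and it assumes at least one red ball and at least two balls in total.

```python
from fractions import Fraction
from math import comb

def p_both_red_given_at_least_one(num_red, num_total):
    """P(both red | at least one red) when 2 balls are drawn without replacement."""
    both_red = comb(num_red, 2)
    no_red = comb(num_total - num_red, 2)  # 0 if fewer than 2 non-red balls
    at_least_one_red = comb(num_total, 2) - no_red
    return Fraction(both_red, at_least_one_red)

print(p_both_red_given_at_least_one(2, 4))  # 1/5 (original problem)
print(p_both_red_given_at_least_one(3, 4))  # 1/2 (3 reds, 1 non-red)
```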
What if you learn that exactly one of the balls is red rather than at least one?
If you are told “exactly one is red,” that changes event B to “exactly one red in the pair.” This excludes not only the pair with no reds but also the pair with two reds:
In the original scenario (2 red, 1 blue, 1 white), if B becomes “exactly one red,” the set of possible pairs is {R1, B}, {R1, W}, {R2, B}, {R2, W}. Now there are 4 possible outcomes for B. The event A = “both red” is incompatible with B in that scenario, making P(A|B) = 0.
A real pitfall is to keep using the prior statement “at least one is red” logic. Mixing up “at least one” with “exactly one” leads to incorrect probabilities. Always redefine your event carefully.
How would the result change if sampling is done with replacement?
With replacement, you first draw a ball and note its color, then put it back and draw again. The same physical ball can therefore appear on both draws. Mathematically:
The probability of drawing 2 red in two draws with replacement from a bag of 2 red and 2 non-red (4 total) is (2/4) * (2/4) = 1/4 in the unconditional sense.
If we are told “at least one of these two draws is red,” we must calculate P(A|B) with the updated sampling model. The sample space for two draws with replacement has 16 ordered outcomes (4 choices for the first draw x 4 choices for the second). Of these, 4 have no red (2 x 2 non-red choices), so 12 remain in B, and 4 have both red (2 x 2 red choices). This gives P(A|B) = 4/12 = 1/3 rather than 1/5. Failing to distinguish “with replacement” from “without replacement” is a classic pitfall that changes the sample space counting.
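The 16 ordered outcomes for sampling with replacement can be enumerated directly. The sketch below uses `itertools.product`; the ball labels are illustrative.

```python
from fractions import Fraction
from itertools import product

balls = ("R1", "R2", "B", "W")
draws = list(product(balls, repeat=2))  # 16 equally likely ordered draws

at_least_one_red = [d for d in draws if any(b.startswith("R") for b in d)]
both_red = [d for d in at_least_one_red if all(b.startswith("R") for b in d)]

print(len(at_least_one_red), len(both_red))            # 12 4
print(Fraction(len(both_red), len(at_least_one_red)))  # 1/3
```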
Can the probability shift if we repeat the experiment and reveal partial information after each draw?
Yes. If multiple draws are performed sequentially and after each draw the partial outcome is revealed (“we saw at least one red, or we saw at least one blue”), the knowledge state changes for subsequent draws, and that influences how we update probabilities. A pitfall is to treat each draw as an isolated event when in fact we are accumulating information:
For instance, if we draw two balls, see at least one red, then return them and draw again, you might track how often this event occurs. Over many draws, your perception of the underlying probability may shift if you start suspecting the bag composition is different from the official description.
Properly handling repeated draws typically involves Bayesian updating over the composition or the statements. If you keep seeing contradictory statements or results, you might suspect an error in your assumptions.
What if the bag’s content was uncertain, and you only have a prior belief about how many red balls might be inside?
Sometimes, you do not know exactly how many red balls are in the bag. You might have a prior distribution: for example, there is a 50% chance the bag has 2 red balls and 50% chance it has 3 red balls. Once you observe “at least one ball is red,” you update that distribution. Then you compute P(A|B) across these possible compositions:
You would use total probability by weighting the scenarios. For instance, P(A|B) = P(A|B, 2-red scenario)*P(2-red scenario|B) + P(A|B, 3-red scenario)*P(3-red scenario|B). Each of those terms is derived from counting outcomes in the 2-red or 3-red case, combined with updated posterior probabilities for each scenario. Failing to incorporate uncertainty about the bag composition is a typical real-world pitfall, because in many practical settings you do not have perfect knowledge of how many red items are present.
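The weighting described above can be sketched as follows. The 50/50 prior comes from the example; the assumption that the bag holds 4 balls in both scenarios is added here for illustration, and the helper name `likelihoods` is hypothetical.

```python
from fractions import Fraction
from math import comb

def likelihoods(num_red, num_total=4):
    """Return (P(A | scenario), P(B | scenario)) for a bag with num_red reds."""
    total = comb(num_total, 2)
    p_a = Fraction(comb(num_red, 2), total)                      # both red
    p_b = Fraction(total - comb(num_total - num_red, 2), total)  # at least one red
    return p_a, p_b

prior = {2: Fraction(1, 2), 3: Fraction(1, 2)}  # scenario -> prior probability

# Posterior over scenarios after observing B = "at least one red".
evidence = sum(prior[r] * likelihoods(r)[1] for r in prior)
posterior = {r: prior[r] * likelihoods(r)[1] / evidence for r in prior}

# P(A|B) = sum over scenarios of P(A | B, scenario) * P(scenario | B).
p_a_given_b = sum(
    posterior[r] * (likelihoods(r)[0] / likelihoods(r)[1]) for r in prior
)
print(posterior[2], posterior[3])  # 5/11 6/11
print(p_a_given_b)                 # 4/11
```

Observing at least one red shifts belief slightly toward the 3-red scenario, and the resulting 4/11 lies between the per-scenario answers 1/5 and 1/2.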
How does this logic translate if we were to use a continuous variable instead of discrete color labels?
If the property of interest is continuous (like weight or size) and you only know that at least one of the drawn items is above a certain threshold, you would need to integrate over the relevant continuous densities. For example:
Let X be the random variable representing the weight of the first ball, and Y the weight of the second. Suppose you only know that max(X, Y) exceeds some cutoff, i.e., at least one weight is above it. Then the conditional probability that X and Y are both above that cutoff, i.e., that min(X, Y) also exceeds it, can be derived from integrals of the joint density.
A common pitfall is to reuse the discrete counting argument without recognizing that continuous scenarios require a carefully defined probability density function and region of integration. The concept remains the same, but the implementation is more involved.
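As a rough illustration, the sketch below estimates the probability that both weights exceed a cutoff, given that at least one does. It assumes, purely for illustration, that the two weights are independent and uniform on (0, 1); under that assumption the exact answer is (1 - c)^2 / (1 - c^2) = (1 - c) / (1 + c) for a cutoff c. The function name and parameters are hypothetical.

```python
import random

def estimate_both_above(cutoff=0.5, trials=200_000, seed=0):
    """Monte Carlo estimate of P(both > cutoff | at least one > cutoff)."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    at_least_one = 0
    both = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()  # assumed i.i.d. Uniform(0, 1) weights
        if max(x, y) > cutoff:             # at least one above the cutoff
            at_least_one += 1
            if min(x, y) > cutoff:         # both above the cutoff
                both += 1
    return both / at_least_one

cutoff = 0.5
exact = (1 - cutoff) / (1 + cutoff)  # closed form under the uniform assumption
print(estimate_both_above(cutoff), exact)
```

For other weight distributions the closed form changes, but the simulation only needs a different sampling line.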
How might alternative enumerations (like tree diagrams or explicit permutations) help confirm the result?
Another approach is to lay out each individual draw scenario in a tree, especially if order matters:
For instance, if you model the draws as two steps (first ball, second ball) from the bag without replacement, you can write out 4 x 3 = 12 ordered outcomes. Then, “at least one red” includes all but the 2 outcomes with no red in either step, (B, W) and (W, B), leaving 10.
Of those 12 outcomes, 2 have both draws red, (R1, R2) and (R2, R1), so the ratio of “both red” to “at least one red” is 2/10 = 1/5, as before. A pitfall is to mix the two conventions, counting some events as ordered and others as unordered. Used consistently, the tree approach is a solid check: each of the 6 unordered pairs corresponds to exactly 2 of the 12 ordered outcomes, so both counts give the same ratio.
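The cross-check between unordered and ordered counting can be sketched in a few lines. The helper names `is_red` and `count_events` are illustrative.

```python
from itertools import combinations, permutations

balls = ("R1", "R2", "B", "W")

def is_red(ball):
    return ball.startswith("R")

def count_events(outcomes):
    """Return (total, at least one red, both red) for a list of draws."""
    at_least_one = [o for o in outcomes if any(is_red(b) for b in o)]
    both = [o for o in at_least_one if all(is_red(b) for b in o)]
    return len(outcomes), len(at_least_one), len(both)

print(count_events(list(combinations(balls, 2))))  # (6, 5, 1)  unordered pairs
print(count_events(list(permutations(balls, 2))))  # (12, 10, 2) ordered draws
```

Both rows give the same ratio of “both red” to “at least one red”: 1/5.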