ML Interview Q Series: Calculating At Least One Match Probability Using the Binomial Distribution
Two sisters maintain that they can communicate telepathically. To test this assertion, you place the sisters in separate rooms and show sister A a series of cards. Each card is equally likely to depict a circle, a star, or a square. For each card presented to sister A, sister B writes down "circle," "star," or "square," depending on what she believes sister A is looking at. If ten cards are shown, what is the probability that sister B correctly matches at least one?
Short Compact Solution
To solve this under the assumption of purely random guesses (no telepathy), we note that for each card there is a 1/3 chance sister B guesses correctly, so a 2/3 chance she is incorrect on any single card. Because the guesses are independent, the probability that she is incorrect on all ten cards is (2/3)^10 ≈ 0.0173. Hence, the probability of getting at least one correct match is 1 − 0.0173 = 0.9827.
Comprehensive Explanation
One way to understand this is by defining a Bernoulli trial for each card: success means a correct guess, failure means an incorrect guess. The probability of success on a single card is 1/3. We are dealing with ten independent Bernoulli trials, so the number of correct guesses follows a Binomial distribution with parameters n=10 (ten trials) and p=1/3 (probability of success on each trial).
The probability of zero successes (i.e., zero correct guesses) is given by the binomial formula for k=0: (n choose 0) * p^0 * (1−p)^n, which reduces to (1−1/3)^10 = (2/3)^10 ≈ 0.0173. Therefore, the probability of at least one success (at least one correct guess) is the complement of that value:
P(at least one correct match) = 1 − (2/3)^10
where
P(at least one correct match) is the probability that sister B matches at least one card exactly.
2/3 is the probability that sister B guesses incorrectly for any single card.
10 is the total number of cards shown.
Plugging in the values, we get: 1 − (2/3)^10 ≈ 1 − 0.0173 = 0.9827.
Hence, if sister B is guessing and the guesses are truly random for each card, the chance that she gets at least one card correct across ten cards is about 98.27%.
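As a sanity check on the closed-form answer, the experiment can also be simulated. The sketch below is illustrative only: the function name, the number of simulated experiments, and the seed are assumptions chosen for this example, and it models both the cards and the guesses as independent uniform draws.

```python
import random


def simulate_at_least_one_match(n_cards=10, n_symbols=3, n_trials=200_000, seed=0):
    """Estimate P(at least one correct guess) under purely random guessing."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(n_trials):
        # One experiment: each card and each guess is an independent uniform draw.
        matched = any(
            rng.randrange(n_symbols) == rng.randrange(n_symbols)
            for _ in range(n_cards)
        )
        if matched:
            hits += 1
    return hits / n_trials


estimate = simulate_at_least_one_match()
exact = 1 - (2 / 3) ** 10
print(f"simulated: {estimate:.4f}, exact: {exact:.4f}")
```

With this many simulated experiments, the estimate should land within a few thousandths of the exact value.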
Potential Follow-up Questions
Why do we use 2/3 for the probability of an incorrect guess?
Each card can be one of three symbols: circle, star, or square. If sister B is guessing randomly among these three choices, her probability of being correct on any single card is 1/3. Therefore, the probability of being incorrect on a single card is 2/3.
Are the attempts truly independent?
We typically assume independence if sister B's guess for each card does not influence her guess for any other card. In a real scenario, she might change her guesses based on perceived patterns or intuition, which would introduce correlation among guesses. However, under the simplest assumption of random, independent guesses, multiplying the probabilities is justified.
What if sister B keeps track of previous symbols and tries to balance her guesses?
If sister B uses a strategy (for example, trying not to guess the same symbol too many times in a row), her guesses are no longer independent of one another, and a full model would require knowledge of how her guesses are distributed. Nonetheless, if sister A's actual cards are independent and equally likely to be circle, star, or square, and sister B's strategy carries no information about the cards, each guess is still correct with probability 1/3, because the card is uniform regardless of what she writes down. The probability only departs from 1/3 if the strategy is based on genuine communication or on a real bias in how the cards are generated.
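To illustrate this point, here is a small simulation of one hypothetical strategy: sister B never repeats her previous guess. The strategy, function name, trial count, and seed are all assumptions made for this sketch. The cards are still drawn uniformly and independently, and sister B receives no information about them.

```python
import random


def simulate_no_repeat_strategy(n_cards=10, n_trials=50_000, seed=1):
    """Simulate a guesser who never repeats her previous guess (hypothetical strategy)."""
    rng = random.Random(seed)
    symbols = ["circle", "star", "square"]
    correct_total = 0
    experiments_with_match = 0
    for _trial in range(n_trials):
        previous_guess = None
        correct = 0
        for _card_index in range(n_cards):
            card = rng.choice(symbols)  # uniform card, independent of any guess
            # Guess uniformly among the symbols that differ from the last guess.
            options = [s for s in symbols if s != previous_guess]
            guess = rng.choice(options)
            if guess == card:
                correct += 1
            previous_guess = guess
        correct_total += correct
        if correct > 0:
            experiments_with_match += 1
    per_card_accuracy = correct_total / (n_trials * n_cards)
    return per_card_accuracy, experiments_with_match / n_trials


per_card_accuracy, p_at_least_one = simulate_no_repeat_strategy()
print(f"per-card accuracy: {per_card_accuracy:.4f}")
print(f"P(at least one match): {p_at_least_one:.4f}")
```

Under these assumptions, both numbers should stay close to the random-guessing values (1/3 and about 0.9827), since a pattern in the guesses alone tells sister B nothing about the cards.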
How would we generalize this to more or fewer symbols?
If there were m equally likely symbols rather than 3, the probability of being correct on a single guess becomes 1/m. The probability of being incorrect on a single guess becomes (m−1)/m. Thus, for n cards, the probability of getting at least one match under random guessing would be 1 − [(m−1)/m]^n.
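The general formula translates directly into a small helper. The function name and the input checks below are illustrative choices, not part of the original solution.

```python
def prob_at_least_one_match(n_cards, n_symbols):
    """P(at least one correct guess) with n_symbols equally likely symbols."""
    if n_cards < 0 or n_symbols < 1:
        raise ValueError("n_cards must be >= 0 and n_symbols must be >= 1")
    p_wrong = (n_symbols - 1) / n_symbols  # chance of missing a single card
    return 1 - p_wrong ** n_cards


# Compare a few symbol counts for ten cards.
for m in (2, 3, 5, 10):
    print(f"m={m}: {prob_at_least_one_match(10, m):.4f}")
```

As the number of symbols grows, each guess is less likely to be right, so the probability of at least one match over the same ten cards falls.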
What statistical test could be used to validate the sisters’ telepathy claim?
One approach is to use a binomial test. Under the null hypothesis (no telepathy, random guessing), the chance of k correct guesses out of n is given by the binomial probability. If the sisters score significantly better than chance, one can compute a p-value and see if it is below a threshold like 0.05. If so, we reject the null hypothesis in favor of the alternative hypothesis that they have some predictive power above random guessing (though not necessarily telepathy).
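A one-sided version of this test can be sketched with the standard library alone. The function name and the example score of 7 correct out of 10 are assumptions for illustration; the p-value here is the probability of scoring at least as well as observed under random guessing.

```python
import math


def binomial_tail_p_value(k_observed, n, p):
    """One-sided p-value P(X >= k_observed) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_observed, n + 1)
    )


# Hypothetical outcome: sister B gets 7 of 10 cards right.
p_value = binomial_tail_p_value(7, 10, 1 / 3)
print(f"p-value for 7/10 correct: {p_value:.4f}")
print("reject null at 0.05" if p_value < 0.05 else "fail to reject null at 0.05")
```

If SciPy is available, `scipy.stats.binomtest(7, 10, 1/3, alternative="greater")` should report the same one-sided p-value.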
Can we implement this probability calculation in Python?
Yes. Here is a quick example:
```python
import math


# Probability of exactly k successes in n Bernoulli trials
def binomial_pmf(k, n, p):
    return math.comb(n, k) * (p**k) * ((1 - p)**(n - k))


# Probability of at least one success
def prob_at_least_one_success(n, p):
    return 1 - binomial_pmf(0, n, p)


# Example usage
n = 10
p = 1/3
prob = prob_at_least_one_success(n, p)
print(prob)  # Should be around 0.9827
```
This code computes the binomial probability mass function using math.comb (available in Python 3.8+) for combinations, then calculates the probability of at least one success as 1 minus the probability of zero successes.
Could small sample sizes lead to misleading conclusions?
Yes. Although we computed the probability of at least one match in 10 trials, that alone is not sufficient to conclude telepathy if sister B manages to get one or more matches in a small sample. Real tests of extrasensory perception or telepathy generally require a larger number of trials, careful controls for biases, and rigorous statistical significance testing over many such experiments.
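To make the sample-size point concrete, the sketch below finds the smallest number of correct guesses that would be significant at the 0.05 level for a few values of n. The helper names, the 0.05 threshold, and the chosen values of n are assumptions for this example.

```python
import math


def tail_probability(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def smallest_significant_count(n, p=1 / 3, alpha=0.05):
    """Smallest k with P(X >= k) < alpha, or None if no such k exists."""
    for k in range(n + 1):
        if tail_probability(k, n, p) < alpha:
            return k
    return None


for n in (10, 30, 100):
    k = smallest_significant_count(n)
    print(f"n={n}: need at least {k} correct ({k / n:.0%} accuracy)")

# "At least one match" in 10 cards is what chance alone almost always produces.
print(f"P(X >= 1) with n=10: {tail_probability(1, 10, 1 / 3):.4f}")
```

With only ten cards, sister B would need a very high hit rate before the result looks unusual, while a longer test can detect a much smaller edge over chance.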