ML Interview Q Series: Random Baby Assignment: Finding Correct Match Probability with Inclusion-Exclusion
Three babies are given a weekly health check at a clinic, and then returned randomly to their mothers. What is the probability that at least one baby goes to the right mother?
Short Compact Solution
Let E1, E2, E3 be the events that baby 1, baby 2, and baby 3, respectively, are reunited with their correct mothers. We want P(E1 ∪ E2 ∪ E3). By the principle of inclusion-exclusion, we have:
P(E1) = P(E2) = P(E3) = 1/3
P(E1 ∩ E2) = P(E1 ∩ E3) = P(E2 ∩ E3) = 1/6
P(E1 ∩ E2 ∩ E3) = 1/6
Using the standard formula:
P(E1 ∪ E2 ∪ E3) = P(E1) + P(E2) + P(E3) - P(E1 ∩ E2) - P(E1 ∩ E3) - P(E2 ∩ E3) + P(E1 ∩ E2 ∩ E3)
Substitute the values:
P(E1 ∪ E2 ∪ E3) = 1/3 + 1/3 + 1/3 - 1/6 - 1/6 - 1/6 + 1/6 = 2/3
Hence, the probability that at least one baby is returned to the correct mother is 2/3.
Comprehensive Explanation
Intuitive Understanding with Permutations
We can label the babies as B1, B2, B3 and the mothers as M1, M2, M3, where Bi's correct mother is Mi.
There are 3! = 6 possible ways to assign the three babies to the three mothers (i.e., 6 permutations of three items).
We list them as (B1→M1, B2→M2, B3→M3), (B1→M1, B2→M3, B3→M2), and so on until we have all permutations.
To find the probability that at least one baby is correctly matched:
Count how many permutations match at least one Bi to Mi.
Divide by the total number of permutations (which is 6).
A common shortcut is to recognize that exactly 2 of the 6 permutations fail to match any baby to its correct mother (such permutations are known as “derangements”). Thus, the probability of “no baby matched” is 2/6 = 1/3, and the probability that at least one is matched is 1 - 1/3 = 2/3.
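This counting argument is easy to verify by brute force. The short snippet below (an illustrative check, not part of the original solution) enumerates all 6 permutations with itertools and counts those with at least one fixed point:
from itertools import permutations

# All 3! = 6 equally likely assignments of babies 0..2 to mothers 0..2.
perms = list(permutations(range(3)))

# A permutation has at least one match if some baby i goes to mother i.
matched = [p for p in perms if any(i == p[i] for i in range(3))]

print(len(matched), "of", len(perms), "permutations")          # 4 of 6
print("P(at least one match) =", len(matched) / len(perms))    # 0.666...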
Inclusion-Exclusion Principle Detail
To see why the formula works numerically:
Probability that B1 is correct is 1/3. (Similarly for B2, B3.)
When B1 is correct and B2 is correct simultaneously, we are effectively fixing two positions, which forces the last baby into the only remaining spot. Exactly 1 of the 6 equally likely permutations satisfies this, so the probability is 1/6.
All three being correct is 1/6 because out of 6 equally likely permutations, exactly 1 has B1→M1, B2→M2, B3→M3.
Hence the sum of these probabilities, with overlaps subtracted and the triple overlap added back, leads us to 2/3.
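Each of these term values can be confirmed by exact enumeration rather than by formula. A small illustrative check:
from itertools import permutations

perms = list(permutations(range(3)))  # the 6 equally likely assignments

def prob(event):
    # Exact probability of an event under the uniform distribution on permutations.
    return sum(1 for p in perms if event(p)) / len(perms)

p_single = prob(lambda p: p[0] == 0)              # P(E1) = 1/3
p_pair = prob(lambda p: p[0] == 0 and p[1] == 1)  # P(E1 ∩ E2) = 1/6
p_triple = prob(lambda p: p == (0, 1, 2))         # P(E1 ∩ E2 ∩ E3) = 1/6

# Inclusion-exclusion: 3*(1/3) - 3*(1/6) + 1/6 = 2/3
print(3 * p_single - 3 * p_pair + p_triple)       # 0.666...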
Alternative Calculation Using “Derangements”
For completeness, the probability of no matches is found by enumerating derangements for 3 items:
A derangement is a permutation in which no element appears in its original position. For 3 distinct objects, the number of derangements is 2.
Since there are 6 total permutations, the probability is 2/6 = 1/3.
Therefore, probability of at least one correct match is 1 - 1/3 = 2/3.
Python Simulation Example
Below is a quick Python snippet to simulate this scenario and empirically estimate the probability:
import random

def simulate_prob_at_least_one_correct(n_sim=1_000_000):
    correct_count = 0
    # Label both mothers and babies as 0, 1, 2; baby i's correct mother is i.
    for _ in range(n_sim):
        mothers = [0, 1, 2]
        random.shuffle(mothers)  # random assignment
        # mothers[i] denotes to which mother baby i is assigned.
        # Baby i is correctly matched if i == mothers[i].
        if any(i == mothers[i] for i in range(3)):
            correct_count += 1
    return correct_count / n_sim

empirical_probability = simulate_prob_at_least_one_correct()
print("Empirical Probability:", empirical_probability)
When run, you should see the estimated probability close to 0.666..., reflecting 2/3.
Follow-up Question 1: What if we had n babies and n mothers? How do we find the probability that at least one baby goes to the right mother?
For n babies (and n mothers), the problem generalizes to permutations of n elements. The probability that no baby is returned to the correct mother (i.e., the probability of a derangement) is given by the formula:
Derangements of n = n! * (1 - 1/1! + 1/2! - 1/3! + ... + (-1)^n * 1/n!).
Thus, the probability that none of the n babies is matched correctly is:
(number of derangements) / n!, which simplifies to 1 - 1/1! + 1/2! - 1/3! + ... + (-1)^n / n!.
Hence, the probability that at least one baby is matched correctly is:
1 - (1 - 1/1! + 1/2! - 1/3! + ... + (-1)^n / n!).
The alternating sum in parentheses converges to 1/e as n grows, where e is the base of the natural logarithm, approximately 2.71828. Therefore, as n grows large, the probability that at least one baby is matched with the correct mother approaches 1 - 1/e (approximately 0.6321205588), and the convergence is very fast.
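For illustration, a short snippet along these lines (an assumed helper, not from the original write-up) computes this probability directly from the alternating series and shows how quickly it approaches 1 - 1/e:
import math

def p_at_least_one_match(n):
    # P(no match) = sum_{k=0}^{n} (-1)^k / k!  (the derangement probability)
    p_none = sum((-1) ** k / math.factorial(k) for k in range(n + 1))
    return 1 - p_none

for n in (3, 5, 10, 20):
    print(n, p_at_least_one_match(n))

print("limit:", 1 - 1 / math.e)  # 0.6321205588...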
Follow-up Question 2: Why is P(E1 ∩ E2 ∩ E3) = 1/6 in the three-baby case?
In a random assignment of three distinct babies to three distinct mothers, we are uniformly choosing one of the 6 permutations. Only one specific permutation places all three babies to their correct mothers. Hence out of 6 equally likely permutations, exactly 1 satisfies E1 ∩ E2 ∩ E3, so the probability is 1/6.
Follow-up Question 3: Are there situations where the assumption of uniform assignment might be violated?
Yes. In real-world scenarios, the matching could be non-uniform. For instance:
There could be biases in how the babies are assigned (perhaps someone typically places them in alphabetical order).
There might be partial knowledge that influences the assignment probability.
If the distribution over permutations is not uniform, the calculations change accordingly. One would need the actual probability distribution of each possible baby-to-mother assignment to compute the correct probability.
Follow-up Question 4: Could we apply the same principle of inclusion-exclusion if some events were not equally likely?
Yes. The principle of inclusion-exclusion in its general form does not require each event to be equally likely; it only requires proper accounting of the probabilities of each event, their pairwise intersections, their triple intersections, and so forth. So if P(E1), P(E2), etc., differ from 1/3, we could still use:
P(E1 ∪ E2 ∪ E3) = P(E1) + P(E2) + P(E3) - P(E1 ∩ E2) - P(E1 ∩ E3) - P(E2 ∩ E3) + P(E1 ∩ E2 ∩ E3),
provided we have all those intersection probabilities. The difficulty is often in finding accurate values for these intersections in complex situations.
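To make this concrete, here is a small sketch with a made-up, non-uniform distribution over the 6 permutations (the weights below are invented purely for illustration and only need to sum to 1). Inclusion-exclusion still reproduces the direct union probability:
from itertools import permutations

# Hypothetical non-uniform distribution over the 6 permutations.
perms = list(permutations(range(3)))
weights = [0.4, 0.2, 0.15, 0.1, 0.1, 0.05]  # invented for illustration
dist = dict(zip(perms, weights))

def prob(event):
    return sum(w for p, w in dist.items() if event(p))

p_singles = [prob(lambda q, i=i: q[i] == i) for i in range(3)]
p12 = prob(lambda q: q[0] == 0 and q[1] == 1)
p13 = prob(lambda q: q[0] == 0 and q[2] == 2)
p23 = prob(lambda q: q[1] == 1 and q[2] == 2)
p123 = prob(lambda q: q == (0, 1, 2))

inclusion_exclusion = sum(p_singles) - (p12 + p13 + p23) + p123
direct = prob(lambda q: any(i == q[i] for i in range(3)))
print(inclusion_exclusion, direct)  # the two values agree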
Follow-up Question 5: How could we use Bayesian reasoning if we had partial knowledge about which babies might belong to which mothers?
In a Bayesian framework, each baby–mother pairing would have a prior probability distribution (for example, you might have partial identification hints). After collecting new evidence (like a test or a physical trait), you would update those probabilities to a posterior distribution. Then, you might choose the pairing that maximizes the posterior probability or compute the probability that at least one baby is correctly matched by summing over all pairings that have at least one match and weighting by their posterior probabilities. This approach is common in real-world identity matching problems where perfect information is rarely available.
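As a rough sketch of that idea (all prior and likelihood values below are hypothetical, chosen only to illustrate the mechanics), one could maintain a distribution over the 6 permutations and update it with Bayes' rule:
from itertools import permutations

perms = list(permutations(range(3)))

# Hypothetical prior over assignments (uniform here) and a made-up likelihood
# for some piece of evidence, e.g. a trait suggesting baby 0 belongs to mother 0.
prior = {p: 1 / 6 for p in perms}
likelihood = {p: (0.9 if p[0] == 0 else 0.3) for p in perms}

# Bayes' rule: posterior(p) is proportional to prior(p) * likelihood(p).
unnormalized = {p: prior[p] * likelihood[p] for p in perms}
z = sum(unnormalized.values())
posterior = {p: v / z for p, v in unnormalized.items()}

# Probability that at least one baby is correctly matched, under the posterior.
p_match = sum(w for p, w in posterior.items() if any(i == p[i] for i in range(3)))
print(p_match)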