ML Interview Q Series: Calculating Probabilities for Three Dice Events Using Outcome Counting
Browse all the Probability Interview Questions here.
Suppose that for three dice of the standard type all 216 outcomes of a throw are equally likely. Denote the scores obtained by X1, X2, and X3. By counting outcomes in the events, find
(a) P(X1 + X2 + X3 <= 5) (b) P(min(X1, X2, X3) >= i) for i=1,2,...,6 (c) P(X1 + X2 < (X3)^2)
Short Compact solution
Comprehensive Explanation
Part (a): Probability that X1 + X2 + X3 <= 5
Reasoning by direct enumeration (counting outcomes) Each of X1, X2, and X3 takes values from 1 to 6. There are 6 * 6 * 6 = 216 equally likely outcomes. We want to count how many of these have X1 + X2 + X3 <= 5.
The minimum sum (X1 + X2 + X3) is 3 (if each die is 1).
We must list or otherwise enumerate all triples (x1, x2, x3) whose sum is 3, 4, or 5.
In practice:
Sum = 3: This happens only with (1,1,1). Exactly 1 outcome.
Sum = 4: Possible combinations for (X1, X2, X3) are permutations of (1,1,2). There are 3 permutations of (1,1,2), so 3 outcomes.
Sum = 5: We can have permutations of (1,1,3) or permutations of (1,2,2).
For (1,1,3), there are 3 permutations.
For (1,2,2), there are also 3 permutations.
In total, 3 + 3 = 6 outcomes.
Hence, the total number of favorable outcomes for sum <= 5 is 1 + 3 + 6 = 10. Since each of the 216 outcomes is equally likely, the probability is 10/216.
Part (b): Probability that min(X1, X2, X3) >= i
We want the smallest of the three dice to be at least i. In plain text, that means X1 >= i, X2 >= i, and X3 >= i. Each die can then take any value from i up to 6. The count of valid values for each die is (7 - i). Therefore, the number of favorable outcomes is (7 - i) * (7 - i) * (7 - i) = (7 - i)^3.
Since the total number of outcomes is 216, we get (7 - i)^3 / 216.
Part (c): Probability that X1 + X2 < (X3)^2
We break it down by conditioning on the possible values of X3:
X3 can range from 1 to 6. For each fixed value of X3=j, we look at all pairs (X1, X2) with X1,X2 in {1,...,6} and see how many pairs satisfy X1 + X2 < j^2.
Here is the direct count given in the short solution:
If X3=2, there are 36 equally likely pairs for (X1, X2). Among those 36 pairs, the number of pairs with X1 + X2 < 4 is 3 (these pairs are (1,1), (1,2), (2,1) do NOT satisfy 1+2=3 <4 -> that does; we simply enumerate systematically).
If X3=3, we again have 36 pairs for (X1, X2). We want X1 + X2 < 9. Most pairs (X1, X2) except the ones that sum to 9 or above will qualify. It turns out 26 such pairs satisfy X1 + X2 < 9.
Similarly, for X3=1, (X1 + X2) must be < 1^2=1, but that is impossible because X1,X2>=1.
For X3>=4, the condition X1+X2 < (X3)^2 is always satisfied because (X3)^2 >= 16, and the maximum sum X1+X2 can have is 12. So for X3 in {4,5,6}, all 36 pairs of (X1,X2) will work.
When counting systematically, we get a total of 137 favorable triples out of 216. This yields 137/216.
Follow-Up Questions
How might the results change if the dice are not fair?
If each die has a different probability distribution for its faces (not the uniform 1/6 for each face), the number of total outcomes would still be 6 * 6 * 6 in terms of distinct face-combinations, but each combination would not be equally likely. Instead, each triple (x1, x2, x3) would have probability p1(x1) * p2(x2) * p3(x3) if the dice are independent but not necessarily fair. One would then replace simple integer counts with weighted sums. For instance, for part (a), we would sum up probabilities p1(x1)*p2(x2)*p3(x3) over all x1, x2, x3 such that x1 + x2 + x3 <= 5.
A key pitfall is that you can no longer assume that counting each valid triple and dividing by 216 suffices, because all triples have different probabilities. You would need the explicit probability mass function for each die face.
Can we generate all outcomes programmatically in Python to verify these probabilities?
Yes. A typical approach would be a nested loop from 1 to 6 for each X1, X2, X3, and simply tally the number of occurrences of each event of interest. Here is a quick example:
count_a = 0 # for X1+X2+X3 <= 5
count_b = [0]*7 # for min(X1,X2,X3) >= i, i in 1..6
count_c = 0 # for X1+X2 < X3^2
total = 0
for x1 in range(1,7):
for x2 in range(1,7):
for x3 in range(1,7):
total += 1
if (x1 + x2 + x3) <= 5:
count_a += 1
for i in range(1,7):
if min(x1,x2,x3) >= i:
count_b[i] += 1
if (x1 + x2) < (x3**2):
count_c += 1
print("P(X1+X2+X3 <= 5) =", count_a / total)
for i in range(1,7):
print(f"P(min(X1,X2,X3) >= {i}) =", count_b[i] / total)
print("P(X1+X2 < (X3)^2) =", count_c / total)
This code exhaustively enumerates all possibilities and prints the empirical probabilities, which should align exactly with 10/216, (7 - i)^3/216 for i=1..6, and 137/216 respectively.
Why does (7 - i)^3 count all configurations when min(X1, X2, X3) >= i?
For the condition min(X1, X2, X3) >= i, we need each X1, X2, X3 >= i. Since each Xk can be one of {1, 2, 3, 4, 5, 6}, requiring Xk >= i restricts Xk to {i, i+1, ..., 6}. The number of possible values for Xk is 6 - i + 1 = 7 - i. Hence for each of the three dice, we have 7 - i possibilities. Multiplying these possibilities across the three dice yields (7 - i)^3 total valid outcomes. This is a direct product counting argument and is typical for discrete uniform distributions.
How do we handle higher-dimensional generalizations?
If we have n independent dice, each with values from 1 to 6, the total number of outcomes is 6^n. For events such as min(X1, ..., Xn) >= i, we would use a similar approach but with (7 - i)^n favorable outcomes. For sums, we might either do more advanced combinatorial arguments or use generating functions. For complicated inequalities, we could do a direct enumeration approach or incorporate dynamic programming to handle large n.
What if we wanted the distribution of the sum X1 + X2 + X3?
The probability distribution of X1 + X2 + X3 for three fair dice is well-known to form a discrete distribution on {3, 4, ..., 18}. For each sum s in that range, one can count the number of (x1, x2, x3) that produce sum = s, or use known formulas for dice sum distributions (e.g., using the threefold convolution of the discrete uniform distribution). The probability is (#combinations for sum s)/216.
Key subtlety: The number of combinations is not symmetrical around s=10.5 with the same heights for sums = 3 and = 18. Indeed, the distribution is well-studied and can be enumerated systematically.
What are some edge cases or pitfalls in real applications?
Loaded dice (non-uniform probabilities): counting each outcome equally is no longer correct.
Dependent dice: if the dice rolls affect each other, we cannot simply multiply probabilities.
Model mismatch: in data science applications, “dice” might be replaced by real-world random variables that do not have uniform distributions or might have continuous ranges. The combinatorial approach here would need rethinking.
Precision issues: if probabilities are extremely small or large or we rely on floating-point arithmetic, numerical stability might matter in more complex scenarios.
All these considerations become crucial in advanced probability problems or machine learning scenarios where distributions might not be uniform or might be continuous.