ML Interview Q Series: Modeling Daily Retail Order Probabilities Using the Binomial Distribution.
A wholesaler supplies products to 10 retail stores, each of which will independently make an order on a given day with probability 0.35. What is the probability of getting exactly 2 orders? Find the most probable number of orders per day and the probability of this number of orders. Finally, find the expected number of orders per day.
Short Compact solution
Because each store orders independently with probability 0.35, the total number of orders follows a Binomial(10, 0.35) distribution. The probability of exactly 2 orders is computed by looking at all possible ways to choose which 2 of the 10 stores place orders (there are 45 such ways), and then multiplying by (0.35^2)(0.65^8). Numerically:
Probability of exactly 2 orders: 45 * (0.35^2) * (0.65^8) ≈ 0.1757
The most probable (mode) number of orders is 3. Its probability is 120 * (0.35^3) * (0.65^7) ≈ 0.2522
The expected number of orders E(X) = n p = 10 × 0.35 = 3.5
Comprehensive Explanation
Binomial Setting
When a process is repeated n times (in this case, n=10 retail stores) and each trial (store deciding to order) has two outcomes—“order” with probability p=0.35 or “not order” with probability 0.65—and all trials are independent, the number of “successful” trials X (the number of stores that place an order) follows a Binomial distribution.
Probability Mass Function
The binomial probability mass function for X is:

$$
P(X = x) = \binom{10}{x} (0.35)^{x} (0.65)^{10 - x}, \quad x = 0, 1, \dots, 10
$$

Here:
10 is the total number of stores (n=10).
x is the number of orders, ranging from 0 to 10.
0.35 is the probability that any one store places an order (p=0.35).
0.65 is the probability that the store does not place an order (1–p).
The binomial coefficient (10 choose x) counts the number of ways to pick x stores out of 10 to be the ones that place orders.
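As a minimal sketch of this formula, the whole distribution can be tabulated with the standard library alone. The helper name `binom_pmf` and the variable `pmf_table` are illustrative choices, not names from the original text.

```python
from math import comb  # Python 3.8+


def binom_pmf(x, n=10, p=0.35):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Tabulate the PMF for every possible number of orders, 0 through 10.
pmf_table = [binom_pmf(x) for x in range(11)]

for x, prob in enumerate(pmf_table):
    print(f"P(X={x:2d}) = {prob:.4f}")

# Sanity check: the probabilities over all outcomes should sum to 1.
print(f"Total probability: {sum(pmf_table):.6f}")
```

Tabulating every outcome at once makes it easy to read off the probability of exactly 2 orders, the mode, and the mean from a single list.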
Probability of Exactly 2 Orders
To find P(X=2), we plug x=2 into the formula. Equivalently, we can build the result up in three steps:
Choosing which 2 of the 10 stores order (there are 45 ways to do that).
Multiplying by (0.35^2) to account for those chosen stores ordering.
Multiplying by (0.65^8) for the remaining 8 stores not ordering.
Hence: P(X=2) = 45 × (0.35^2) × (0.65^8) ≈ 0.1757
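Spelling out the arithmetic, with intermediate values rounded for display:

$$
P(X = 2) = \binom{10}{2} (0.35)^{2} (0.65)^{8} \approx 45 \times 0.1225 \times 0.031864 \approx 0.1757
$$

Here $\binom{10}{2} = \frac{10 \times 9}{2} = 45$, $(0.35)^2 = 0.1225$, and $(0.65)^8 \approx 0.031864$.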
Most Probable (Mode) Number of Orders
For a Binomial(n, p) distribution, the mode is the integer part (floor) of (n+1)p whenever (n+1)p is not an integer; if (n+1)p is an integer, both (n+1)p and (n+1)p − 1 are modes. With n=10 and p=0.35, (10+1) × 0.35 = 3.85, and taking the floor gives 3. So the mode is 3.
Plugging x=3 into the PMF: P(X=3) = 120 × (0.35^3) × (0.65^7) ≈ 0.2522
This is the highest single probability across all possible x values.
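The floor rule comes from the ratio of consecutive probabilities, P(X=x+1)/P(X=x) = (n−x)p / ((x+1)(1−p)), which exceeds 1 exactly when x < (n+1)p − 1. A minimal sketch that cross-checks the rule against a brute-force search over all outcomes (the variable names are illustrative):

```python
from math import comb, floor

n, p = 10, 0.35


def binom_pmf(x):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


pmf = [binom_pmf(x) for x in range(n + 1)]

# Mode found by brute force: the x with the largest probability.
mode_by_search = max(range(n + 1), key=lambda x: pmf[x])

# Mode from the closed-form rule floor((n + 1) * p).
mode_by_formula = floor((n + 1) * p)

print(f"Mode by search:  {mode_by_search}")
print(f"Mode by formula: {mode_by_formula}")
print(f"P(X = mode):     {pmf[mode_by_search]:.4f}")
```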
Expected Value
The expectation E(X) of a Binomial(n, p) random variable is n*p. Here, that is 10 × 0.35 = 3.5. Note that 3.5 is not an integer, so while the expected value is 3.5, the most probable number of orders is 3, which is the integer that maximizes the binomial probability.
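The shortcut E(X) = n*p can be checked against the definition of expectation, the sum of x × P(X=x) over all outcomes. A small sketch using only the standard library (variable names are illustrative):

```python
from math import comb

n, p = 10, 0.35

# PMF of Binomial(n, p) for x = 0, ..., n.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# Expectation straight from the definition: sum of x * P(X = x).
mean_from_pmf = sum(x * prob for x, prob in enumerate(pmf))

print(f"Sum of x * P(X=x): {mean_from_pmf:.6f}")
print(f"n * p:             {n * p:.6f}")
```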
Why the Most Probable Number Can Differ from the Mean
Because the binomial distribution is discrete, its mode is always an integer, while its mean n*p need not be. The two coincide only when n*p is an integer. Otherwise the mode is one of the two integers adjacent to the mean, as selected by the floor of (n+1)*p; here n*p = 3.5 and the mode is 3. It is perfectly normal in discrete distributions for the expectation to be a non-integer.
Numerical Computation Tips
In Python, one can use libraries like scipy.stats (specifically scipy.stats.binom.pmf) to compute binomial probabilities directly. For large n, one might consider approximate distributions (such as the Poisson or normal approximations) if needed for efficiency. In this case, n=10 is small enough that direct binomial calculation is trivial.
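A minimal sketch of the SciPy route, assuming SciPy is installed. The call `binom.pmf(k, n, p)` returns P(X = k). The pure-Python fallback is added here only so the snippet still runs where SciPy is unavailable.

```python
from math import comb

n, p = 10, 0.35

try:
    # Assumes SciPy is installed; binom.pmf(k, n, p) gives P(X = k).
    from scipy.stats import binom

    def pmf(k):
        return float(binom.pmf(k, n, p))

except ImportError:
    # Fallback so the sketch still runs without SciPy.
    def pmf(k):
        return comb(n, k) * p**k * (1 - p) ** (n - k)


p2 = pmf(2)
p3 = pmf(3)
print(f"P(X=2) = {p2:.4f}")
print(f"P(X=3) = {p3:.4f}")
```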
Possible Follow-Up Questions
How would the probability change if p varied from day to day?
If the probability p is not fixed but changes daily, say p_t on day t, then the number of orders on day t still follows a Binomial(10, p_t) distribution, so each day has its own binomial distribution. For a long-term average, you can average the daily expected values E(X_t) = 10 * p_t. The total number of orders over T days, however, is a sum of binomials with different success probabilities, which is not itself binomial; it reduces to Binomial(10T, p) only when p_t = p on every day and the days are independent.
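To make this concrete, here is a sketch under stated assumptions: the three `daily_p` values are hypothetical, chosen only for illustration, and days are assumed independent of each other. It computes the daily expected values and the exact distribution of the multi-day total by convolving the daily PMFs.

```python
from math import comb

n = 10
daily_p = [0.30, 0.35, 0.40]  # hypothetical p_t values, for illustration only


def binom_pmf_list(n, p):
    """PMF of Binomial(n, p) as a list indexed by x = 0, ..., n."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]


def convolve(a, b):
    """PMF of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def variance(pmf):
    mean = sum(k * q for k, q in enumerate(pmf))
    return sum((k - mean) ** 2 * q for k, q in enumerate(pmf))


# Expected orders per day, E(X_t) = 10 * p_t, and their average.
daily_means = [n * p_t for p_t in daily_p]
average_daily_mean = sum(daily_means) / len(daily_means)

# Exact distribution of total orders across all days.
total_pmf = [1.0]
for p_t in daily_p:
    total_pmf = convolve(total_pmf, binom_pmf_list(n, p_t))

# Reference: a single binomial with the same trial count and the average p.
average_p = sum(daily_p) / len(daily_p)
reference_pmf = binom_pmf_list(n * len(daily_p), average_p)

var_total = variance(total_pmf)
var_reference = variance(reference_pmf)

print(f"Average daily mean:               {average_daily_mean:.3f}")
print(f"Variance of total (varying p_t):  {var_total:.4f}")
print(f"Variance of Binomial(30, avg p):  {var_reference:.4f}")
```

The two totals share the same mean but not the same variance, which is one way to see that the aggregate is no longer binomial when p_t varies.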
Could we use a Poisson approximation here?
Yes, in principle: when n is large and p is small, the Binomial(n, p) distribution is well approximated by a Poisson(λ = n p). Here n=10 is small and p=0.35 is not small, so the approximation would be rough; for larger n and smaller p it can be a useful shortcut.
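A quick way to judge the approximation for this problem is to put the exact binomial probabilities next to the Poisson(3.5) ones. This is a sketch using only the standard library; the function names are illustrative.

```python
from math import comb, exp, factorial

n, p = 10, 0.35
lam = n * p  # Poisson rate matched to the binomial mean


def binom_pmf(x):
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x):
    return exp(-lam) * lam**x / factorial(x)


print(" x  binomial  poisson")
for x in range(n + 1):
    print(f"{x:2d}  {binom_pmf(x):.4f}    {poisson_pmf(x):.4f}")

# Largest pointwise gap between the two PMFs over x = 0, ..., n.
max_abs_error = max(abs(binom_pmf(x) - poisson_pmf(x)) for x in range(n + 1))
print(f"Largest absolute difference: {max_abs_error:.4f}")
```

Note that the Poisson distribution also puts a small amount of probability on values above 10, which are impossible with only 10 stores.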
Why is independence assumed, and what if it is violated?
The binomial model relies on the assumption that each store’s decision to order is independent of the others. In reality, external factors can induce correlation (e.g., a special promotional event might make all stores more likely to order on the same day). If the decisions are not independent, the binomial formula does not strictly apply, and you might need an alternative model, such as a correlated Bernoulli model, or a beta-binomial in which the shared probability p is itself random.
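One simple way to see the effect of a shared external factor is a two-state mixture. The scenario below is hypothetical: 25% of days are promotion days with p = 0.65 and the rest have p = 0.25, values chosen so that the average p stays at 0.35. It is a sketch of one possible correlated model, not the only one.

```python
from math import comb

n = 10
# Hypothetical scenario: on promotion days every store is more likely to order.
# Weights and probabilities are chosen so the average p is still 0.35.
promo_weight, p_promo, p_normal = 0.25, 0.65, 0.25


def binom_pmf(x, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def mean_and_variance(pmf):
    mean = sum(x * q for x, q in enumerate(pmf))
    var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))
    return mean, var


# Independent stores with a fixed p = 0.35.
independent_pmf = [binom_pmf(x, 0.35) for x in range(n + 1)]

# Stores share a common daily state, so their decisions are correlated.
mixture_pmf = [
    promo_weight * binom_pmf(x, p_promo)
    + (1 - promo_weight) * binom_pmf(x, p_normal)
    for x in range(n + 1)
]

mean_indep, var_indep = mean_and_variance(independent_pmf)
mean_mix, var_mix = mean_and_variance(mixture_pmf)

print(f"Independent: mean={mean_indep:.3f}, variance={var_indep:.3f}")
print(f"Correlated:  mean={mean_mix:.3f}, variance={var_mix:.3f}")
```

Both models give the same expected number of orders, but the correlated one spreads probability toward very low and very high order counts, so a plain binomial would understate the day-to-day variability.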
How do we interpret the expected value being non-integer but the mode being an integer?
The expectation is a theoretical average over many repetitions. If you repeated the scenario a large number of days, the long-run average number of orders per day would approach 3.5. However, on any single day, the number of orders must be an integer. The mode is simply the most likely single outcome in a given day.
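A small simulation illustrates the distinction. The seed and the number of simulated days are arbitrary choices made for this sketch.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the run is reproducible

n_stores, p, n_days = 10, 0.35, 100_000

# Each day, every store independently orders with probability p.
daily_orders = [
    sum(random.random() < p for _ in range(n_stores))
    for _ in range(n_days)
]

# Long-run average: should be close to the expected value n * p.
long_run_average = sum(daily_orders) / n_days

# Most frequent daily count: an empirical estimate of the mode.
counts = Counter(daily_orders)
most_common_count = counts.most_common(1)[0][0]

print(f"Average orders per day over {n_days} days: {long_run_average:.3f}")
print(f"Most frequent daily order count: {most_common_count}")
```

Every simulated day yields a whole number of orders, while the average over many days is free to settle at a non-integer value.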
Implementation in Python
Below is a small Python snippet illustrating how to compute these probabilities in practice:
```python
from math import comb  # Python 3.8+

p = 0.35   # probability that a single store places an order
q = 1 - p  # probability that it does not
n = 10     # number of stores


def binomial_pmf(x):
    """Probability of exactly x orders out of n stores."""
    return comb(n, x) * (p**x) * (q**(n - x))


# Probability of exactly 2 orders
prob_x_2 = binomial_pmf(2)

# Probability of exactly 3 orders (the mode)
prob_x_3 = binomial_pmf(3)

# Expected value
expected_val = n * p

print(f"Probability of exactly 2 orders: {prob_x_2:.4f}")
print(f"Probability of exactly 3 orders: {prob_x_3:.4f}")
print(f"Expected number of orders: {expected_val}")
```
This confirms the binomial results and highlights how straightforward it is to implement such calculations in code.