ML Interview Q Series: Calculating Yearly Accident Risk: Applying Bernoulli Trials & Complementary Probability.
A certain person considers that he can drink and drive: usually he believes he has a negligible chance of being involved in an accident, whereas he believes that if he drinks two pints of beer, his chance of being involved in an accident on the way home is only one in five hundred. Assuming that he drives home from the same pub every night, having drunk two pints of beer, what is the chance that he is involved in at least one accident in one year? Are there any assumptions that you make in answering the question?
Short Compact solution
We assume that each daily drive home is independent of every other drive. Let p be the probability that the driver does not have an accident on any given day. Since he believes the chance of an accident is 1/500, we have p = 1 − (1/500) = 0.998. Over 365 days in a year, the probability that he has no accidents at all is p^365 = (0.998)^365. Therefore, the probability of at least one accident in a year is 1 − (0.998)^365, which is approximately 0.5184 (or about 51.84%).
Comprehensive Explanation
The key idea is to treat the event of having an accident on any single day as an independent Bernoulli trial with success probability 1/500, where “success” means “having an accident.” This approach is typical in introductory probability questions where no further information is provided regarding day-to-day variability or other confounding factors. The crucial probability is p = 0.998 for “no accident on a single day.” When events are independent, the probability of “no accident across 365 consecutive days” is the product of the “no accident” probabilities for each day, namely (0.998)^365. The probability of “at least one accident” is then computed as the complement of that, which is 1 − (0.998)^365.
Below is the central formula:

$$P(\text{at least one accident in a year}) = 1 - (0.998)^{365} \approx 0.5184$$

Where:
(0.998) is the probability of not having an accident on a single day.
Raising (0.998) to the power of 365 gives the probability that no accidents happen over 365 independent days.
Subtracting this from 1 gives the probability that at least one accident will happen in that same period.
The assumptions that go into this calculation include:
Independence: The chance of an accident on any given day does not depend on whether or not an accident took place on a previous day.
Constant probability: The probability of an accident remains the same (1/500) on every day of the year. In other words, we ignore variation in weather, driver behavior, car maintenance, and other external factors.
One daily trip: We assume there is exactly one drive from the pub each day, so we have 365 drives per year.
These assumptions simplify the scenario to a basic Bernoulli process.
To illustrate how one might compute this directly in Python:
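p_no_accident_day = 0.998  # 1 - 1/500: no accident on a single drive
p_no_accident_year = p_no_accident_day**365  # no accident on any of 365 independent drives
p_at_least_one_accident = 1 - p_no_accident_year  # complement: at least one accident
print(p_at_least_one_accident)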
This code will output the approximate probability (about 0.5184).
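As a sanity check on the closed-form value 1 − (0.998)^365, one could also simulate many years of independent daily drives and count how often at least one accident occurs. The sketch below is only illustrative: the function names, the number of simulated years, and the seed are arbitrary choices, not part of the original problem.

```python
import random

def simulate_year(rng, p_accident=1 / 500, drives=365):
    """Return True if at least one accident occurs in one simulated year."""
    # any() stops at the first accident, which is all we need to know
    return any(rng.random() < p_accident for _ in range(drives))

def estimate_risk(n_years=10_000, seed=0):
    """Monte Carlo estimate of P(at least one accident in a year)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(simulate_year(rng) for _ in range(n_years))
    return hits / n_years

exact = 1 - 0.998**365
estimate = estimate_risk()
print(f"exact={exact:.4f}, simulated={estimate:.4f}")
```

With this many simulated years, the estimate should typically land within about one percentage point of the exact answer.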
Potential Follow-up Question: Why use a complement approach?
When we say “What is the probability of at least one accident in one year?” it can be simpler to compute the probability of the complementary event—that is, “the probability of zero accidents in one year.” The logic is that summing up probabilities for one accident, two accidents, and so on, can become cumbersome, whereas the complement uses a single multiplication of daily survival (no-accident) probabilities.
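To make the comparison concrete, the sketch below computes the answer both ways: by summing the binomial probabilities of exactly 1, 2, ..., 365 accidents, and by taking the complement of zero accidents. The variable names (`q` for the per-drive accident probability, `binom_pmf`) are introduced here for illustration only.

```python
from math import comb

n = 365        # drives per year
q = 1 / 500    # per-drive accident probability (q is a name chosen for this sketch)

def binom_pmf(k, n, q):
    """Probability of exactly k accidents in n independent drives."""
    return comb(n, k) * q**k * (1 - q) ** (n - k)

# Long way: add up P(exactly k accidents) for k = 1, ..., n
direct_sum = sum(binom_pmf(k, n, q) for k in range(1, n + 1))

# Short way: complement of "zero accidents"
via_complement = 1 - (1 - q) ** n

print(direct_sum, via_complement)
```

Both routes agree up to floating-point rounding, but the complement needs one exponentiation instead of 365 terms.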
Potential Follow-up Question: What if the probability of an accident changes day to day?
If the probability of an accident is not constant (perhaps due to different road conditions, daily weather patterns, different amounts of alcohol consumption, or driver fatigue), then p might differ across days. In that case, we could denote p_i as the probability of “no accident” on day i. As long as the days remain independent, the probability of no accidents all year would then be p_1 * p_2 * ... * p_365, and the probability of at least one accident would be 1 − (p_1 * p_2 * ... * p_365). If the daily probabilities are themselves uncertain, or depend on what happened on earlier days, we might need more sophisticated methods (such as a Markov chain or another time-dependent model) to model the scenario properly.
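A minimal sketch of the independent, day-varying case follows. The daily risks are purely hypothetical (a doubled risk of 1/250 on two days of each week, 1/500 otherwise) and are chosen only to show the product p_1 * p_2 * ... * p_365 in code.

```python
from math import prod

def accident_prob(day):
    """Hypothetical per-day accident probability (illustrative values only)."""
    # Assume, for illustration, that two days in each week carry double the risk
    return 1 / 250 if day % 7 in (5, 6) else 1 / 500

# p_i = probability of no accident on day i
p_no_accident = [1 - accident_prob(day) for day in range(365)]

p_none_all_year = prod(p_no_accident)   # p_1 * p_2 * ... * p_365
p_at_least_one = 1 - p_none_all_year

print(round(p_at_least_one, 4))
```

Setting every daily value to 1/500 recovers the constant-probability answer 1 − (0.998)^365.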
Potential Follow-up Question: Is independence always realistic?
Independence might be violated if an accident (or near-accident) one day alters the driver’s behavior or the condition of the car on subsequent days. If the driver becomes more careful or if the car is damaged and that increases or decreases future accident probability, then the daily events are not strictly independent. In practice, more advanced models would be required to capture this dependency.
Potential Follow-up Question: Could a Poisson approximation be used here?
The probability of an accident per day, 1/500, is small, and there are 365 trials in a year. The number of accidents in a year can therefore be approximated by a Poisson distribution whose mean is the expected number of accidents, λ = (1/500) * 365 = 0.73. Under the Poisson model, the probability of zero accidents is e^(−λ) = e^(−0.73) ≈ 0.4819, giving a probability of at least one accident of about 0.5181. This is close to the exact value 1 − (0.998)^365 ≈ 0.5184. The Poisson approximation works well for rare events over many trials, but it is only exact in the limit where the number of trials grows without bound while the per-trial accident probability shrinks so that their product stays fixed at λ.
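The comparison between the exact binomial answer and the Poisson approximation can be checked in a few lines. The variable names below are chosen for this sketch.

```python
import math

n = 365          # drives per year
q = 1 / 500      # per-drive accident probability
lam = n * q      # expected number of accidents per year (0.73)

exact = 1 - (1 - q) ** n        # exact: 1 - (0.998)^365
poisson = 1 - math.exp(-lam)    # Poisson approximation: 1 - e^(-lambda)

print(f"lambda={lam:.2f} exact={exact:.4f} poisson={poisson:.4f}")
```

The two values differ only in the fourth decimal place, which is why the approximation is considered adequate here.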
Potential Follow-up Question: How might the result differ if there were multiple drives per day?
If the person drives multiple times per day under the same assumptions, the number of Bernoulli trials would increase accordingly. For example, if there were two drives each day, each with the same accident probability of 1/500, then there would be 730 trials per year. The probability of no accident would be (0.998)^730, and the chance of at least one accident would be 1 − (0.998)^730 ≈ 0.768, noticeably higher than the roughly 0.518 of the single-drive scenario.
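The same calculation can be wrapped in a small helper that takes the number of drives per day as a parameter. The function name `yearly_risk` and its defaults are assumptions made for this sketch.

```python
def yearly_risk(drives_per_day, p_accident=1 / 500, days=365):
    """P(at least one accident in a year), assuming independent, identical drives."""
    total_drives = drives_per_day * days
    return 1 - (1 - p_accident) ** total_drives

for drives in (1, 2, 3):
    print(drives, round(yearly_risk(drives), 4))
```

Each additional daily drive multiplies the yearly no-accident probability by another factor of (0.998)^365, so the risk rises quickly but never reaches 1.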
Potential Follow-up Question: Could we model changing driver behavior in a Bayesian framework?
Yes. If we suspect that the accident probability might shift over time (e.g., the driver’s skill, caution, or the car’s mechanical reliability changes), we could place a prior distribution on the driver’s accident probability. We would update the probability whenever accidents or near-accidents provide evidence that influences our estimate of the driver’s daily accident risk. Such a Bayesian model would no longer simply multiply constant probabilities; instead, it would involve updating a posterior distribution after each trip, reflecting evolving beliefs about the likelihood of an accident.
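One simple way to sketch such updating is a Beta-Bernoulli model, which treats the per-trip accident probability as fixed but unknown (it does not by itself model drift over time). The prior Beta(1, 499), with mean 1/500, and the trip history below are hypothetical values chosen only to show the mechanics.

```python
def update(alpha, beta, accident):
    """Conjugate Beta-Bernoulli update after observing one trip."""
    # An accident adds one to alpha; an accident-free trip adds one to beta
    return (alpha + 1, beta) if accident else (alpha, beta + 1)

# Hypothetical prior: Beta(1, 499), whose mean is 1/500
alpha, beta = 1.0, 499.0

# Hypothetical history: 100 safe trips, one accident, then 50 safe trips
history = [False] * 100 + [True] + [False] * 50

for accident in history:
    alpha, beta = update(alpha, beta, accident)

posterior_mean = alpha / (alpha + beta)  # updated estimate of per-trip accident risk
print(alpha, beta, round(posterior_mean, 5))
```

In this toy history the single accident raises the estimated per-trip risk from 1/500 to 2/651, roughly 0.0031.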
These questions highlight many subtle points that go beyond the simple independent Bernoulli assumption. However, when an interview problem provides no further context or data, the standard assumption is that each daily trial is identical and independent, yielding the straightforward calculation 1 − (0.998)^365.