ML Interview Q Series: Probability of First Repeat: Finding the Most Likely Position in Random Selections.
Twenty-five people each choose at random a number from {1, 2, …, 100}, independently. Next the chosen numbers are announced one by one. The first person (if any) who announces a number that has been announced before wins a bonus. Which person has the largest probability to win the bonus?
Short Compact solution
Let pk be the probability that the k-th announcement is the first repeated number. From the product rule for probabilities, we get:
p1 = 0 (the first announcement can never repeat a previously announced number).
p2 = 1/100.
p3 = (99/100) * (2/100).
In general, for k = 4, …, 25:

pk = (99/100) × (98/100) × … × ((102 − k)/100) × ((k − 1)/100).
Numerical evaluation shows that p11 is the largest among these probabilities, with a value of approximately 0.0628.
Comprehensive Explanation
Understanding the Setup
We have 25 individuals, each independently picking an integer from 1 to 100 with uniform probability. These chosen numbers are then announced in some order (say, person 1 announces first, person 2 second, and so on). The moment someone announces a number that has already been announced by any previous person, that announcer wins a bonus. We want to find which person index k (among the 25 people) maximizes the probability of being the first to repeat a previously announced number.
Decomposing the Probability pk
We define pk as “the probability that the k-th person’s announcement is the very first time a number is repeated.” To achieve this:
The first (k − 1) announcements must be all distinct from each other.
The k-th announced number must match exactly one of the previously announced (k − 1) numbers.
Probability that the first (k − 1) announcements are all distinct
Each of the first (k − 1) people picks a number from 1 to 100. The probability that all (k − 1) chosen values are different can be reasoned as follows:
The first person can pick any number freely; with no previous draws, the probability of being distinct is 1.
The second person must pick a different number from the first person’s pick. The probability is 99/100.
The third person must pick a different number from the first two’s picks. The probability is 98/100.
Continue this pattern until the (k − 1)-th person picks a number that is different from all previously chosen (k − 2) numbers. The probability is (100 − (k − 2))/100 = (102 − k)/100.
Multiplying these probabilities together yields:
(100/100) × (99/100) × (98/100) × … × (100 − (k − 2))/100.
Because the first factor (100/100) is 1, it is often omitted in written form, so effectively we have:
(99/100) × (98/100) × … × (100 − (k − 2))/100.
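As a quick sanity check (a minimal sketch; the helper name is ours), this running product is just the falling factorial 100 × 99 × … × (102 − k) divided by 100^(k−1), which Python exposes via math.perm:

```python
import math

def prob_all_distinct(k, M=100):
    """Probability that the first (k - 1) uniform picks from {1, ..., M}
    are pairwise distinct."""
    # math.perm(M, k - 1) = M * (M - 1) * ... * (M - k + 2)
    return math.perm(M, k - 1) / M ** (k - 1)
```

For k = 3 this returns 99/100 = 0.99, matching the product above.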
Probability the k-th announcement is a repeat of one of the first (k − 1)
Now, person k must pick a number that has already been announced. There are (k − 1) distinct numbers in the set of previously announced numbers (under the condition they were all distinct). So the probability that person k picks one of these (k − 1) previously chosen numbers is (k − 1)/100.
Putting it together
Thus, pk = [Probability that the first (k − 1) picks are all distinct] × [Probability that the k-th pick is in that set of (k − 1) distinct values]. In the fully expanded product form:

pk = (99/100) × (98/100) × … × ((102 − k)/100) × ((k − 1)/100).

This formula is valid for k ≥ 2, but note that p1 = 0 (no chance to repeat on the very first pick).
Why the Maximum is around the 11th Person
When k is small (like 2 or 3), there are very few previously announced numbers to match, so the chance of repeating is relatively small. When k is large (say above 20), it becomes likely that a repeat has already occurred before that point, lowering the probability that the k-th announcement is the very first repeat. There is a sweet spot somewhere in the middle where there are enough previously distinct numbers to create a decent chance of collision, but not so many that a collision probably happened earlier. By direct calculation (or by writing a short script to compute pk for k = 1..25), one sees the maximum occurs at k = 11, with approximate probability 0.0628.
Potential Follow-up Questions
If the number of people were different, how would that change things?
The same formula extends naturally. If there are N people choosing from M possible numbers, the probability that the k-th person is the first repeat is:
Probability that the first (k − 1) picks are all distinct = (M/M) × ((M−1)/M) × ((M−2)/M) × … × ((M−(k−2))/M).
Then multiply by (k − 1)/M for the k-th to coincide with one of the (k − 1) distinct picks.
You could vary N (the total number of participants) and see which k from 1..N maximizes that quantity.
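As a sketch of that generalization (function names are ours), one can compute pk for arbitrary N and M and pick the maximizing position:

```python
import math

def first_repeat_prob(k, M):
    """P(the k-th announcement is the first repeat), uniform picks from M values."""
    if k < 2:
        return 0.0
    # first k-1 picks all distinct, then the k-th hits one of those k-1 values
    return math.perm(M, k - 1) / M ** (k - 1) * (k - 1) / M

def best_position(N, M):
    """The position k in 1..N most likely to be the first repeat."""
    return max(range(1, N + 1), key=lambda k: first_repeat_prob(k, M))
```

With N = 25 and M = 100 this recovers k = 11 with probability about 0.0628.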
How does this relate to the Birthday Paradox?
The Birthday Paradox is a well-known probability phenomenon dealing with collisions in random selections, often with M=365 (days in a year). The “first repeat” notion is closely connected to that idea, though the classical Birthday Paradox focuses on the probability that at least one collision occurs among N picks rather than identifying which specific pick is the first to collide. However, the combinatorial arguments have similar structure. The maximum-likelihood index for the first repeat often ends up near the point where collisions become probable in the general “birthday” sense (roughly around the square root of M).
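To illustrate that square-root heuristic (a rough sketch, not a precise statement; helper names are ours), one can compare the exact maximizing position with sqrt(M) for a few values of M:

```python
import math

def first_repeat_prob(k, M):
    if k < 2:
        return 0.0
    return math.perm(M, k - 1) / M ** (k - 1) * (k - 1) / M

def best_position(M):
    # the peak sits near sqrt(M), so searching up to ~4*sqrt(M) is plenty
    limit = int(4 * math.sqrt(M)) + 1
    return max(range(1, limit + 1), key=lambda k: first_repeat_prob(k, M))

for M in (100, 365, 1000):
    print(M, best_position(M), round(math.sqrt(M), 1))
```

In each case the exact argmax lands within a couple of positions of sqrt(M); for M = 365 (the classical birthday setting) it is the well-known position 20.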
Is there a simple closed-form solution for which k maximizes pk?
Not a neat closed form. Typically, one relies on either:
Direct computation by evaluating pk for each k (especially if N and M are not too large).
Approximate analysis using Poisson or birthday-type approximations to estimate the value of k that maximizes the probability.
How to quickly compute this in Python?
You could do:
```python
import math

M = 100  # range of numbers
N = 25   # total participants

def probability_first_repeat(k):
    # Probability that the first (k - 1) picks are all distinct
    prob_distinct = 1.0
    for i in range(k - 1):
        prob_distinct *= (M - i) / M
    # Probability that the k-th pick repeats one of those (k - 1)
    return prob_distinct * (k - 1) / M

# Evaluate for k = 1..N and keep the maximizer
best_k = None
best_val = 0.0
for k in range(1, N + 1):
    val = probability_first_repeat(k)
    if val > best_val:
        best_val = val
        best_k = k

print("Best k:", best_k, "with probability:", best_val)
```
This code systematically checks each k in {1,2,…,25}, calculates pk, and reports which k gives the highest probability.
Could no one win the bonus?
Yes, if all 25 picks happened to be distinct, then no repeat occurs. The probability that all picks are distinct is:
(100/100) × (99/100) × (98/100) × … × (100 − (25 − 1))/100.
That quantity is nonzero, so it can happen that no bonus is ever awarded (although it’s somewhat unlikely with 25 picks out of 100).
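This can be checked numerically: the events "first repeat at position k" for k = 2..25 and "all 25 picks distinct" are mutually exclusive and exhaustive, so their probabilities must sum to 1 (a small sketch; helper names are ours):

```python
import math

M, N = 100, 25

def first_repeat_prob(k):
    if k < 2:
        return 0.0
    return math.perm(M, k - 1) / M ** (k - 1) * (k - 1) / M

p_no_repeat = math.perm(M, N) / M ** N   # all 25 announcements distinct
p_some_repeat = sum(first_repeat_prob(k) for k in range(1, N + 1))
print(p_no_repeat, p_no_repeat + p_some_repeat)
```

p_no_repeat comes out to roughly 0.04, so in about one game in twenty-five no bonus is awarded.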
Practical Real-World Implications
In practical large-scale settings (where M is large and N is also large), we typically reference the birthday paradox results to gauge how quickly collisions might occur. This interview-style question specifically focuses on the likelihood that each announcement could be the first collision, pinpointing which announcement index is the “sweet spot” for collisions—often around the 10th to 12th pick for 100 possible numbers and 25 participants.
All of these insights reinforce that the person at position k=11 (in this scenario of 25 participants picking from 1..100) has the highest probability, about 0.0628, of being the first to repeat a number.
Below are additional follow-up questions
What if each individual does not choose their number uniformly at random?
When the assumption of uniformity is relaxed, the problem of determining pk becomes more complex. Instead of each person having a 1/100 chance of picking each number, suppose some numbers are more likely than others. For instance, imagine a scenario where half of the people have a high preference for smaller numbers.
Effects on pk:
The probability that the first (k − 1) announcements are distinct now depends on the joint probability distribution of these (k − 1) picks. If certain numbers have significantly higher probability than others, collisions might be more likely even for smaller k.
The chance that the k-th pick matches one of the earlier (k − 1) picks depends on how “overrepresented” certain numbers have been. If the distribution skews toward a specific subset of numbers, the probability of collision can go up or down for different k.
Pitfalls and Edge Cases:
Highly skewed distributions: A few numbers might have extremely large probability mass, making collisions likely very early. This could mean the peak of pk might shift to a smaller k.
Participant-specific distributions: If each participant chooses from a different distribution (e.g., each has a distinct bias), the analysis must consider every combination of who might choose which number. The final expression for pk can become unwieldy without simplifying assumptions.
Dependence on population size: With fewer participants, biases can be less predictable. With more participants, large groups might yield early repeats because many are likely to pick from that popular subset of numbers.
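A Monte Carlo sketch (all names and the specific skew level are our own illustrative choices) makes the shift visible: with a distribution where a few numbers carry most of the mass, early collisions absorb probability that the uniform case would place near k = 11:

```python
import random

def estimate_first_repeat(weights, n_people, trials=20000, seed=0):
    """Monte Carlo estimate of P(first repeat at position k), k = 0..n_people,
    when every participant draws i.i.d. from the distribution `weights`."""
    rng = random.Random(seed)
    values = range(len(weights))
    counts = [0] * (n_people + 1)   # counts[k] = first repeat at position k
    for _ in range(trials):
        seen = set()
        for k in range(1, n_people + 1):
            x = rng.choices(values, weights=weights)[0]
            if x in seen:
                counts[k] += 1
                break
            seen.add(x)
    return [c / trials for c in counts]

skewed = [10] * 10 + [1] * 90   # ten "hot" numbers, ten times as likely
uniform = [1] * 100
```

Comparing estimate_first_repeat(skewed, 25) with estimate_first_repeat(uniform, 25) shows the skewed distribution concentrating far more mass on the early positions.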
How do correlations between picks change the solution?
In the original problem, each participant picks independently. However, if participants coordinate or if they try to avoid previously chosen numbers (or, conversely, mimic them), there will be correlations among picks.
Effects on pk:
If participants avoid numbers already seen, the probability of early repeats might drop (they are intentionally trying to prevent duplication). This strategy would reduce pk for smaller k but might cause a sudden jump in pk for some middle or later k if eventually a forced collision arises.
If participants mimic earlier picks, collisions become more likely and might occur at very small k.
Pitfalls and Edge Cases:
Partial knowledge: Perhaps each new participant only knows some fraction of previously announced numbers or incorrectly remembers them. The induced correlation pattern can be quite complex.
Strategic behavior: If the bonus is desirable, participants might attempt to pick in a way that maximizes their personal chance of collision (e.g., the 10th person might intentionally pick a number frequently chosen by people before them).
Unintended correlations: Real-world scenarios (like numbering patterns or groupthink) might create subtle correlations that distort the distribution from the idealized independent model.
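One toy model of such correlation (entirely hypothetical; the names and the parameter q are ours): each participant, with probability q, deliberately avoids all previously announced numbers, and otherwise picks uniformly at random:

```python
import random

def simulate_with_avoidance(q, M=100, N=25, trials=10000, seed=1):
    """Estimate P(first repeat at position k) when each participant avoids
    announced numbers with probability q (toy correlation model)."""
    rng = random.Random(seed)
    counts = [0] * (N + 1)
    for _ in range(trials):
        seen = set()
        for k in range(1, N + 1):
            if rng.random() < q:
                # deliberate avoidance: pick uniformly among unseen numbers
                x = rng.choice([v for v in range(M) if v not in seen])
            else:
                x = rng.randrange(M)
            if x in seen:
                counts[k] += 1
                break
            seen.add(x)
    return [c / trials for c in counts]
```

With q = 1 and N ≤ M no repeat can ever occur, while q = 0 reduces to the original independent model (where some repeat happens about 96% of the time).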
What happens if the order of announcements is randomized rather than fixed?
In the usual problem setup, we assume person 1 announces first, person 2 second, etc. But suppose all 25 chosen numbers are known privately in advance, and then a random permutation decides the announcement order.
Effects on pk:
The random ordering of announcements means each permutation is equally likely. The index k is no longer tied to a specific person but to the k-th position in the random permutation.
We might still want to know “Which position in the random ordering has the highest chance of being the first repeated announcement?” The probabilities can be recomputed by counting how often that position turns out to be the first repeat in all permutations.
Pitfalls and Edge Cases:
Multiple announcements of the same value: If several participants chose the same number from the start, the question becomes: in the random ordering, where does the second member of such a group appear? The earliest position at which any already-announced value reappears is the first repeat.
Symmetry: If all participants are identical and choose from the same distribution, symmetry arguments can sometimes simplify calculations. But large combinatorial expansions might be needed if the number of participants is large or the distribution is not uniform.
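In fact, under the original i.i.d. model the picks are exchangeable, so randomly permuting the announcement order leaves every positional probability unchanged; a quick empirical sketch (names are ours):

```python
import random

def first_repeat_position(numbers):
    """1-based index of the first announcement that repeats, or None."""
    seen = set()
    for k, x in enumerate(numbers, start=1):
        if x in seen:
            return k
        seen.add(x)
    return None

rng = random.Random(42)
M, N, trials = 100, 25, 20000
fixed_counts = [0] * (N + 1)
shuffled_counts = [0] * (N + 1)
for _ in range(trials):
    picks = [rng.randrange(M) for _ in range(N)]
    k = first_repeat_position(picks)
    if k is not None:
        fixed_counts[k] += 1
    rng.shuffle(picks)               # announce in a random order instead
    k = first_repeat_position(picks)
    if k is not None:
        shuffled_counts[k] += 1
```

Both count vectors agree to within sampling noise, with the peak near position 11 in each case.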
What if there are more participants than 100 (i.e., N > M)?
While the original question had 25 participants picking from 100 possible values, we might consider N > 100. Then it becomes more probable that a repeat occurs at some point.
Effects on pk:
For N > M=100, by the pigeonhole principle, there must be at least one duplicate among the picks. Hence there is a 100% chance of some repeat eventually.
However, finding the index k that maximizes pk still involves the same logic: the first (k − 1) picks must be all distinct, and the k-th is one of those earlier numbers. Once N exceeds M, you will never reach the final participant without a collision, but the question is which participant is most likely to be that first collision.
Pitfalls and Edge Cases:
Very large N: Under the i.i.d. uniform model, pk for a fixed k does not depend on N at all, so the peak does not move as N grows; adding participants only appends later positions, whose probabilities shrink rapidly and are exactly zero beyond k = M + 1 (the first M + 1 picks cannot all be distinct).
Overlap among many picks: If many participants pick the same number, the first collision might happen extremely fast, overshadowing mid-range positions that would normally have the highest probability in more balanced scenarios.
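A short check under the i.i.d. uniform model (helper name is ours): with N = 150 > M = 100, the probabilities pk sum to 1 because a repeat is certain, yet the peak stays at k = 11 since pk itself does not depend on N:

```python
import math

def first_repeat_prob(k, M):
    if k < 2:
        return 0.0
    # math.perm(M, k - 1) is 0 once k - 1 > M, so p_k vanishes beyond M + 1
    return math.perm(M, k - 1) / M ** (k - 1) * (k - 1) / M

M, N = 100, 150                       # more participants than numbers
probs = [first_repeat_prob(k, M) for k in range(1, N + 1)]
total = sum(probs)                    # 1 up to rounding: a repeat is certain
best = 1 + max(range(N), key=probs.__getitem__)
print(best, total)
```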
How to generalize to continuous distributions or very large discrete sets?
Sometimes, you might have a continuous range of values (e.g., real numbers in [0,1]) or a huge discrete space. Then the notion of the “first repeat” might become extremely rare if the sample size is relatively small.
Effects on pk:
Continuous range: Mathematically, the probability of an exact repeat in continuous space (assuming a truly continuous uniform distribution) is zero if you only have a finite number of picks, since collisions occur with probability 0 in an uncountably infinite set.
Huge discrete space: If M is extremely large (e.g., on the order of billions or more) and N is small, repeats are also unlikely: pk is tiny for every position. Since pk increases with k up to roughly sqrt(M), the maximum over k = 1..N simply sits at k = N whenever N is well below sqrt(M).
Pitfalls and Edge Cases:
Floating-point comparisons: In real-world computer systems, if picks are floating-point numbers, rounding might cause “accidental collisions” if you treat certain approximate matches as a collision. This introduces an implementation-level complication where the definition of “repeat” might not be truly identical values but “close enough” values.
Approximation: With very large M, we often use approximate formulas (e.g., Poisson approximation or birthday problem approximations) to estimate where the first repeat is likely to appear. Relying on direct combinatorial multiplication can be numerically unstable or computationally expensive.
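One common birthday-style approximation (a sketch using the standard substitution 1 − i/M ≈ exp(−i/M) for each distinctness factor; the helper names are ours) is pk ≈ ((k − 1)/M) · exp(−(k − 1)(k − 2)/(2M)), whose maximizer lies near sqrt(M). For large M it tracks the exact value closely while avoiding the long product:

```python
import math

def exact_pk(k, M):
    if k < 2:
        return 0.0
    return math.perm(M, k - 1) / M ** (k - 1) * (k - 1) / M

def approx_pk(k, M):
    # replace each factor (1 - i/M) by exp(-i/M) and sum the exponents
    if k < 2:
        return 0.0
    return (k - 1) / M * math.exp(-(k - 1) * (k - 2) / (2 * M))

M, k = 10 ** 6, 1000          # large space, k near sqrt(M)
print(exact_pk(k, M), approx_pk(k, M))
```

For M = 10^6 and k = 1000 the two values agree to well under a percent.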
How would the probability distribution shift if each pick can be repeated by the same person (multiple picks per person)?
In some scenarios, each participant might pick more than one number (or re-pick after a certain event). You could see a single participant effectively “spamming” picks. That changes the structure of the problem considerably.
Effects on pk:
The total number of announcements might exceed the number of participants. Let’s say each of the 25 participants picks multiple times in some sequence. The pk now references the k-th pick overall, but not necessarily the k-th participant.
The chance of a repeat depends on how often each participant re-picks and whether or not participants wait to see previous picks before choosing again (if that is even allowed).
Pitfalls and Edge Cases:
Sequential picking by the same person: If a person picks multiple times in close succession, they might be more likely to repeat a number they themselves announced earlier, especially if picks are not forced to be distinct.
Dependency on re-picking strategy: If the participant tries to “avoid” or “seek” their previous picks, we create correlation structures that break the simple independence assumption.
Does the result depend on whether people learn about all previous picks before choosing?
If participants only hear partial announcements or if they are not paying attention, their picks might still behave like independent draws. But if each participant is fully informed about prior picks, some might consciously or subconsciously be influenced.
Effects on pk:
Complete knowledge and rational players: They might attempt to optimize their probability of winning, e.g., intentionally picking a known existing number if they want to secure the bonus and if they happen to go at a certain time. This breaks the i.i.d. assumption thoroughly.
Partial knowledge: If a participant only knows some subset of previously announced numbers (e.g., they missed the first 5), the probability distribution for their picks shifts but not necessarily in a straightforward way.
Pitfalls and Edge Cases:
Game theory: In a game-theoretic sense, each participant might adopt a strategy to maximize their personal chance of winning the bonus, which could cause earlier collisions or sometimes cause participants to wait for a collision to occur from someone else. The overall distribution of pk could differ drastically from the naive calculation.
Real-world scenario: Could ties or near-simultaneous announcements affect who “wins”?
In some real-world processes, multiple announcements might come in at nearly the same time, or there might be a slight delay in hearing them.
Effects on pk:
Simultaneous collision: If two participants’ announcements are effectively simultaneous, the notion of “first repeat” can become ambiguous. There might need to be a tiebreaker rule or an assumption about the exact ordering.
Technical glitch or latency: In an online environment, one participant’s announcement might arrive a fraction of a second later even though they said it earlier. The order of announcements might differ from real-time order, adding unpredictability.
Pitfalls and Edge Cases:
Defining who wins: A strict ordering must be enforced or a fallback rule specified. Without it, the entire concept of pk (the k-th announcement) loses clarity.
Implementation details: If this process runs on a distributed system, network latencies or asynchronous updates might reorder messages, potentially skewing the actual index of the repeated pick.