ML Interview Q Series: Calculating Normal Distribution Probabilities and Percentiles via Standardization

May 09, 2025

Suppose X is N(10, 1). Find (i) P[X > 10.5], (ii) P[9.5 < X < 11], (iii) x such that P[X < x] = 0.95. Use Standard Normal tables.

Short Compact Solution

Let Z denote a N(0,1) random variable. Then:

• For part (i): P[X > 10.5] = 1 - Φ(0.5) = 0.3085
• For part (ii): P[9.5 < X < 11] = Φ(1) - Φ(-0.5) = 0.5328
• For part (iii): We need x such that P[X < x] = 0.95. Transforming to Z: (x - 10)/1 = 1.645, which gives x = 11.645

Comprehensive Explanation

Transforming X to the Standard Normal Variable

Because X has mean μ = 10 and standard deviation σ = 1, we define the standard normal variable Z as:

$$Z = \frac{X - \mu}{\sigma}$$

Here:

X is our original normally distributed variable with mean 10 and standard deviation 1.
μ is the mean of X (10).
σ is the standard deviation of X (1).
Z is a standard normal random variable, which means Z ~ N(0, 1).

Part (i): Computing P[X > 10.5]

We first convert the event X > 10.5 into an event on Z by standardizing the threshold: (10.5 - 10) / 1 = 0.5.

Hence, P[X > 10.5] = P[Z > 0.5] = 1 - P[Z <= 0.5] = 1 - Φ(0.5).

From standard normal tables, Φ(0.5) is approximately 0.6915, so P[X > 10.5] = 1 - 0.6915 = 0.3085.
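The standardization and table lookup for part (i) can be sketched in a few lines. This is a minimal illustration, assuming Φ is evaluated through the error function via the identity Φ(z) = ½(1 + erf(z/√2)); the helper names `standardize` and `phi` are hypothetical and not part of the original solution.

```python
import math

def standardize(x, mu, sigma):
    """Map a value x of X ~ N(mu, sigma^2) to its z-score (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = standardize(10.5, mu=10, sigma=1)  # z-score of the threshold 10.5
p_upper = 1.0 - phi(z)                 # P[X > 10.5] = 1 - Phi(0.5)
print(round(z, 4), round(p_upper, 4))  # 0.5 0.3085
```

The printed probability agrees with the table-based answer to four decimal places.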

Part (ii): Computing P[9.5 < X < 11]

We convert the bounds to Z:

Lower bound: (9.5 - 10) / 1 = -0.5
Upper bound: (11 - 10) / 1 = 1

Therefore, P[9.5 < X < 11] = P[-0.5 < Z < 1] = Φ(1) - Φ(-0.5).

Using standard normal tables: Φ(1) ≈ 0.8413 and Φ(-0.5) ≈ 0.3085, so P[-0.5 < Z < 1] = 0.8413 - 0.3085 = 0.5328.
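Many printed tables list Φ only for non-negative z. In that case Φ(-0.5) follows from the symmetry of the standard normal density about zero. The short worked step below uses only the table values already quoted in parts (i) and (ii):

```latex
\begin{aligned}
\Phi(-z) &= 1 - \Phi(z), \\
\Phi(-0.5) &= 1 - \Phi(0.5) \approx 1 - 0.6915 = 0.3085, \\
P[9.5 < X < 11] &= \Phi(1) - \Phi(-0.5) \approx 0.8413 - 0.3085 = 0.5328.
\end{aligned}
```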

Part (iii): Finding x such that P[X < x] = 0.95

We want the 95th percentile of X. We know that:

P[X < x] = 0.95

Transforming to Z:

P[Z < (x - 10)/1] = 0.95.

The 95th percentile of the standard normal distribution (z-value) is often denoted as z_0.95 ≈ 1.645. Hence,

(x - 10)/1 = 1.645, so x = 10 + 1.645 = 11.645.
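Reading a table "backwards" to find z_0.95 amounts to inverting Φ. As a rough sketch of that idea, assuming a plain bisection search is acceptable, the snippet below finds the z whose CDF value is 0.95 and then maps it back to X. The names `phi` and `inverse_phi` and the search bracket [-10, 10] are illustrative choices, not part of the original solution.

```python
import math

def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_phi(p, lo=-10.0, hi=10.0, tol=1e-10):
    """Find z with phi(z) = p by bisection; works because phi is increasing."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < p:
            lo = mid   # target z lies to the right of mid
        else:
            hi = mid   # target z lies at or to the left of mid
    return 0.5 * (lo + hi)

z_95 = inverse_phi(0.95)
x_95 = 10 + 1 * z_95   # undo the standardization: x = mu + sigma * z
print(round(z_95, 3), round(x_95, 3))  # 1.645 11.645
```

In practice a library quantile function is the better choice; the bisection only makes the table lookup explicit.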

Follow-up Question 1

Why do we use Z ~ N(0,1) in solving problems involving X ~ N(μ, σ²)?

A standard normal variable Z is a special case of the normal distribution with mean 0 and variance 1. By converting any normal random variable X ~ N(μ, σ²) to Z using the formula (X - μ)/σ, we can leverage standardized tables or well-known software functions that give the cumulative distribution function (CDF) for Z. This standardization is a universal way of handling any normal distribution without compiling separate tables for each possible (μ, σ).

Follow-up Question 2

How do we compute these probabilities in Python without using manual tables?

In Python, one can use libraries such as scipy.stats to compute normal distribution probabilities and quantiles. For example:

from scipy.stats import norm

# Part (i): P[X > 10.5], X ~ N(10, 1)
p_i = 1 - norm.cdf(10.5, loc=10, scale=1)

# Part (ii): P[9.5 < X < 11]
p_ii = norm.cdf(11, loc=10, scale=1) - norm.cdf(9.5, loc=10, scale=1)

# Part (iii): x such that P[X < x] = 0.95
x_95 = norm.ppf(0.95, loc=10, scale=1)

print(p_i, p_ii, x_95)

Here:

norm.cdf(x, loc=μ, scale=σ) returns the value Φ((x - μ)/σ).
norm.ppf(q, loc=μ, scale=σ) is the inverse CDF (i.e., the quantile function).
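If SciPy is not installed, Python's standard library (version 3.8 and later) provides `statistics.NormalDist`, which has `cdf` and `inv_cdf` methods. The sketch below reproduces the three answers under that assumption:

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=1)   # X ~ N(10, 1)

p_i = 1 - X.cdf(10.5)            # Part (i): P[X > 10.5]
p_ii = X.cdf(11) - X.cdf(9.5)    # Part (ii): P[9.5 < X < 11]
x_95 = X.inv_cdf(0.95)           # Part (iii): 95th percentile of X

print(round(p_i, 4), round(p_ii, 4), round(x_95, 3))  # 0.3085 0.5328 11.645
```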

Follow-up Question 3

Are there any edge cases if σ ≠ 1?

Nothing changes in principle. For any σ > 0 the transformation is still Z = (X - μ)/σ; the only difference is that the division by σ now actually changes the value. You still look up Z in standard normal tables or use software for Φ. That is the main reason standardization is widely used: it always brings the distribution back to a form with mean 0 and standard deviation 1. For example, if X ~ N(10, 2²), then P[X > 12] = P[Z > (12 - 10)/2] = P[Z > 1] = 1 - Φ(1) ≈ 1 - 0.8413 = 0.1587.
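The σ = 2 example can be checked numerically. The sketch below, which assumes μ = 10 as in the original problem and uses the standard-library `statistics.NormalDist`, computes the tail probability twice: once through the z-score and once directly on the N(10, 2²) distribution. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 10, 2   # assumed parameters: X ~ N(10, 2^2)
x = 12

z = (x - mu) / sigma                          # z-score of the threshold
p_via_z = 1 - NormalDist().cdf(z)             # 1 - Phi(1) on the standard normal
p_direct = 1 - NormalDist(mu, sigma).cdf(x)   # same probability, no manual standardization

print(z, round(p_via_z, 4), round(p_direct, 4))  # 1.0 0.1587 0.1587
```

Both routes give the same number, which is the point of standardization: the z-score carries all the information needed for the lookup.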

Follow-up Question 4

Why is 1.645 used for the 95th percentile instead of 1.96?

The value 1.645 is associated with a one-sided 95th percentile, meaning P[Z < 1.645] = 0.95. On the other hand, 1.96 is associated with a two-sided 95% confidence interval, where we typically want the central area between -1.96 and 1.96 to be 0.95. So 1.96 is the z-value used when we talk about two-sided coverage. If the question explicitly asks for the one-sided 95th percentile cut-off, 1.645 is correct.

These nuances often arise when interpreting confidence intervals vs. percentile cut-offs. In summary, 1.645 leaves 5% in a single upper tail, whereas 1.96 leaves 2.5% in each tail (so 1.96 is the 97.5th percentile of Z), for a total of 5% split across both tails.
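The difference between the two critical values is easy to verify. A minimal sketch, assuming the standard-library `statistics.NormalDist` and illustrative variable names:

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal, N(0, 1)

z_one_sided = Z.inv_cdf(0.95)    # 5% in the upper tail only
z_two_sided = Z.inv_cdf(0.975)   # 2.5% in each tail

upper_tail = 1 - Z.cdf(z_one_sided)                       # area above 1.645
central_area = Z.cdf(z_two_sided) - Z.cdf(-z_two_sided)   # area between -1.96 and 1.96

print(round(z_one_sided, 3), round(z_two_sided, 2))   # 1.645 1.96
print(round(upper_tail, 2), round(central_area, 2))   # 0.05 0.95
```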

Rohan's Bytes
