ML Interview Q Series: Analyzing Linear Probability Density Functions: Normalization, CDF, Quantiles, and Interval Probabilities

May 09, 2025


Question

Let the random variable X have the density

f(x) = k x for 0 <= x <= 3,

and 0 otherwise. Find the constant k that makes this a valid probability density function. Next, find x1 and x2 such that P(X <= x1) = 0.1 and P(X <= x2) = 0.95. Finally, compute P(|X - 1.8| < 0.6).


Short Compact Solution

To ensure the integral from 0 to 3 of f(x) dx equals 1, we solve for k and find k = 2/9.

For any c in (0, 3), the cumulative distribution function is P(X <= c) = c^2 / 9. Therefore:

x1 is found by solving x1^2 / 9 = 0.1, which gives x1 = sqrt(0.9) ≈ 0.9487
x2 is found by solving x2^2 / 9 = 0.95, which gives x2 = sqrt(8.55) ≈ 2.9240

The event |X - 1.8| < 0.6 is the same as 1.2 < X < 2.4, so

P(|X - 1.8| < 0.6) = (2.4^2 - 1.2^2) / 9 = 0.48.

Comprehensive Explanation

Finding k

A probability density function (pdf) f(x) must integrate to 1 over its domain. Given f(x) = k x for 0 <= x <= 3 and 0 otherwise, we solve:

∫[0 to 3] k x dx = 1.

When integrating k x from 0 to 3:

The indefinite integral of k x is k * (x^2 / 2).
Evaluating from 0 to 3, we get k * (3^2 / 2) - k * (0^2 / 2) = k * (9 / 2).

So we have k * (9/2) = 1. Solving for k:

9k / 2 = 1 => k = 2/9.

Hence the valid pdf is f(x) = (2/9) x for 0 <= x <= 3.
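As a quick sanity check, the normalization can be confirmed numerically. The sketch below is only an illustration: the helper names `pdf` and `midpoint_integral` are made up for this example, and the midpoint Riemann sum is one of many ways to approximate the integral.

```python
def pdf(x, k):
    """Linear density k*x on [0, 3], zero elsewhere."""
    return k * x if 0.0 <= x <= 3.0 else 0.0


def midpoint_integral(f, lo, hi, n=1000):
    """Approximate the integral of f over [lo, hi] with a midpoint Riemann sum."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


k = 2.0 / 9.0
total_mass = midpoint_integral(lambda x: pdf(x, k), 0.0, 3.0)
print(f"k = {k:.6f}, total probability = {total_mass:.6f}")
```

Because the integrand is linear, the midpoint rule is exact up to floating-point rounding, so the total should come out at 1 to many decimal places.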

Cumulative Distribution Function (CDF)

For 0 <= x <= 3, the CDF F(x) = P(X <= x) is obtained by integrating the pdf from 0 to x:

F(x) = ∫[0 to x] (2/9) t dt = (2/9) * (x^2 / 2) = x^2 / 9.

When x < 0, F(x) = 0, and when x > 3, F(x) = 1, because the density is zero outside [0, 3] and all of the probability mass lies inside that interval.
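The piecewise CDF translates directly into code. This is a minimal sketch for the specific case b = 3; the function name `cdf` is an illustrative choice.

```python
def cdf(x):
    """CDF of the density f(x) = (2/9) x on [0, 3]."""
    if x <= 0.0:
        return 0.0          # no probability mass below 0
    if x >= 3.0:
        return 1.0          # all probability mass lies at or below 3
    return x * x / 9.0      # F(x) = x^2 / 9 inside the support


for point in (-1.0, 0.0, 1.5, 3.0, 4.0):
    print(f"F({point}) = {cdf(point):.4f}")
```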

Finding x1 and x2 for given probabilities

We want x1 such that P(X <= x1) = 0.1. Using the CDF for 0 <= x <= 3:

x1^2 / 9 = 0.1.

Solving for x1:

x1^2 = 0.9, so x1 = sqrt(0.9) ≈ 0.9487.

Similarly, for x2 such that P(X <= x2) = 0.95:

x2^2 / 9 = 0.95.

So x2^2 = 8.55, giving x2 = sqrt(8.55) ≈ 2.9240.
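Both values come from inverting F(x) = x^2 / 9, which gives x = 3 sqrt(p). A small sketch of that inverse is shown below; the name `quantile` is hypothetical and used only for illustration.

```python
import math


def quantile(p):
    """Inverse of F(x) = x^2 / 9 on [0, 3]: the x with P(X <= x) = p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return 3.0 * math.sqrt(p)


x1 = quantile(0.10)
x2 = quantile(0.95)
print(f"x1 = {x1:.4f}, x2 = {x2:.4f}")
```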

Finding P(|X - 1.8| < 0.6)

The event |X - 1.8| < 0.6 is equivalent to 1.8 - 0.6 < X < 1.8 + 0.6, i.e. 1.2 < X < 2.4. We compute:

P(1.2 < X < 2.4) = F(2.4) - F(1.2),

where F(x) = x^2 / 9. Thus:

F(2.4) = 2.4^2 / 9 = 5.76 / 9 = 0.64, F(1.2) = 1.2^2 / 9 = 1.44 / 9 = 0.16,

so P(1.2 < X < 2.4) = 0.64 - 0.16 = 0.48.

Hence the probability that |X - 1.8| < 0.6 is 0.48.
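The arithmetic can be reproduced exactly with rational numbers, which avoids any floating-point doubt about the 0.48. The sketch below assumes nothing beyond the CDF derived above; `cdf_exact` is an illustrative name.

```python
from fractions import Fraction


def cdf_exact(x):
    """F(x) = x^2 / 9 for x in [0, 3], computed with exact rationals."""
    return x ** 2 / 9


center, half_width = Fraction("1.8"), Fraction("0.6")
lower, upper = center - half_width, center + half_width   # 6/5 and 12/5

prob = cdf_exact(upper) - cdf_exact(lower)
print(prob, float(prob))
```

The result is the fraction 12/25, which is 0.48.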

Follow-up Question 1: What if the domain were [0, b] instead of [0, 3]?

If the domain of X were changed to 0 <= x <= b (for some positive b), the pdf would be f(x) = k x on that interval. We would still require:

∫[0 to b] k x dx = 1.

Performing the integral:

k * (b^2 / 2) = 1 => k = 2 / b^2.

Then the corresponding CDF for x in [0, b] would be x^2 / b^2.

To find a quantile c such that P(X <= c) = α for some α in (0,1), we solve:

c^2 / b^2 = α => c = b * sqrt(α).

Hence, the overall pattern is straightforward to generalize.
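The generalization to [0, b] can be captured in one small factory function. This is a sketch under the assumptions of this follow-up (density k x on [0, b] with b > 0); the name `linear_density_on` and its return convention are hypothetical.

```python
import math


def linear_density_on(b):
    """Return (k, cdf, quantile) for the density f(x) = k x on [0, b]."""
    if b <= 0:
        raise ValueError("b must be positive")
    k = 2.0 / b ** 2                      # from k * b^2 / 2 = 1

    def cdf(x):
        if x <= 0.0:
            return 0.0
        if x >= b:
            return 1.0
        return x ** 2 / b ** 2

    def quantile(alpha):
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        return b * math.sqrt(alpha)

    return k, cdf, quantile


k, cdf, quantile = linear_density_on(3.0)
print(k, quantile(0.1), cdf(quantile(0.1)))
```

With b = 3 this reproduces k = 2/9 and the quantile x1 = sqrt(0.9) found earlier.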

Follow-up Question 2: Is this distribution related to any known family?

The function f(x) = k x on [0, b] is a triangular distribution whose mode sits at the upper endpoint b (a right triangle rising from 0 to b), and it is also a scaled version of the Beta(2,1) distribution. Specifically, if Y ~ Beta(2,1) on [0,1], then Y has pdf 2y. Scaling Y by b gives X = bY, which has the pdf (2 / b^2) x for x in [0,b]. Although it has no standard name of its own (unlike the normal or uniform), it can be viewed as a Beta(2,1) distribution rescaled to [0,b].
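The Beta(2,1) connection can be checked empirically with the standard library's `random.betavariate`. The sketch below is illustrative only: the sample size, seed, and check point c = 1.5 are arbitrary choices, and `sample_scaled_beta` is a made-up helper name.

```python
import random


def sample_scaled_beta(b, n, seed=0):
    """Draw n values of X = b * Y with Y ~ Beta(2, 1)."""
    rng = random.Random(seed)
    return [b * rng.betavariate(2, 1) for _ in range(n)]


b = 3.0
samples = sample_scaled_beta(b, 100_000)

c = 1.5
empirical = sum(s <= c for s in samples) / len(samples)   # fraction of draws <= c
theoretical = c ** 2 / b ** 2                             # CDF x^2 / b^2 at c
print(f"empirical F({c}) = {empirical:.4f}, theoretical = {theoretical:.4f}")
```

The empirical fraction should land close to 0.25, up to Monte Carlo noise.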

Follow-up Question 3: How do we find the median or a general q-th quantile?

To find the median m, we solve P(X <= m) = 0.5, which means F(m) = 0.5. In this problem (with b=3, k=2/9):

m^2 / 9 = 0.5 => m^2 = 4.5 => m = sqrt(4.5) ≈ 2.1213.

In general, for the q-th quantile with q in (0, 1), we set x^2 / 9 = q, which gives x = 3 sqrt(q). The same formula yields any percentile; the median is the case q = 0.5.
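One practical use of the closed-form quantile is inverse-transform sampling: feeding a uniform draw u through x = 3 sqrt(u) produces a draw from this distribution. The sketch below uses that idea to compare a sample median with the exact one; the helper name, seed, and sample size are assumptions made for illustration.

```python
import math
import random
import statistics


def sample_linear(n, seed=0):
    """Inverse-transform sampling: x = 3 * sqrt(u) with u ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    return [3.0 * math.sqrt(rng.random()) for _ in range(n)]


draws = sample_linear(100_001)
sample_median = statistics.median(draws)
exact_median = 3.0 * math.sqrt(0.5)      # equals sqrt(4.5)
print(f"sample median = {sample_median:.4f}, exact median = {exact_median:.4f}")
```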

Follow-up Question 4: How would we compute E[X] and Var(X)?

Once we know f(x) = (2/9)x for x in [0,3], we can compute expectations:

E[X] = ∫[0 to 3] x f(x) dx = ∫[0 to 3] x * (2/9)x dx = (2/9) ∫[0 to 3] x^2 dx.

That integral is (2/9) * (3^3 / 3) = (2/9) * (27 / 3) = (2/9) * 9 = 2.

So E[X] = 2.

To find Var(X), we use Var(X) = E[X^2] - (E[X])^2. First we compute E[X^2]:

E[X^2] = ∫[0 to 3] x^2 * (2/9)x dx = (2/9) ∫[0 to 3] x^3 dx = (2/9) * (3^4 / 4) = (2/9) * (81 / 4) = 162 / 36 = 4.5.

Hence Var(X) = 4.5 - (2)^2 = 4.5 - 4 = 0.5.
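Both integrals follow the same pattern: E[X^n] = (2/9) * 3^(n+2) / (n+2). A short sketch with exact rational arithmetic confirms the mean and variance; `raw_moment` is an illustrative name, not something from the original derivation.

```python
from fractions import Fraction


def raw_moment(n):
    """E[X^n] for f(x) = (2/9) x on [0, 3], i.e. (2/9) * 3^(n+2) / (n+2)."""
    return Fraction(2, 9) * Fraction(3 ** (n + 2), n + 2)


mean = raw_moment(1)
second_moment = raw_moment(2)
variance = second_moment - mean ** 2
print(f"E[X] = {mean}, E[X^2] = {second_moment}, Var(X) = {variance}")
```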

Follow-up Question 5: What if we want P(|X - a| < d) for general a and d?

In general, P(|X - a| < d) = P(a - d < X < a + d). We would use the CDF:

F(x) = x^2 / 9, for 0 <= x <= 3, and clamp it to 0 or 1 if x is outside [0,3].

So the probability would be:

P(a - d < X < a + d) = F(a + d) - F(a - d),

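where d > 0. With F clamped as above, the formula holds even when an endpoint falls outside [0, 3]: an endpoint below 0 contributes F = 0 and an endpoint above 3 contributes F = 1, which is the same as clipping the endpoints to [0, 3] before applying x^2 / 9.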

This approach is a common step in dealing with absolute-value inequalities involving random variables.
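The general recipe, including the clipping of endpoints, fits in a few lines. This is a minimal sketch for the b = 3 case; `prob_within` is a hypothetical helper name, and returning 0 for non-positive d is an assumption about how an empty interval should be treated.

```python
def cdf(x):
    """Clamped CDF: clip x to [0, 3], then apply F(x) = x^2 / 9."""
    clipped = min(max(x, 0.0), 3.0)
    return clipped ** 2 / 9.0


def prob_within(a, d):
    """P(|X - a| < d) = F(a + d) - F(a - d) for the density (2/9) x on [0, 3]."""
    if d <= 0:
        return 0.0          # the open interval (a - d, a + d) is empty
    return cdf(a + d) - cdf(a - d)


print(prob_within(1.8, 0.6))   # interval (1.2, 2.4), fully inside the support
print(prob_within(0.5, 1.0))   # lower endpoint -0.5 is clipped to 0
print(prob_within(3.0, 1.0))   # upper endpoint 4.0 is clipped to 3
```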
