ML Interview Q Series: How would you evaluate if extra driver pay during peak hours meets consumer demand on a delivery platform?

May 05, 2025

Comprehensive Explanation

One straightforward approach to evaluate how effectively the extra compensation meets consumer demand is to measure relevant performance metrics before and after the policy is introduced. These metrics may include average delivery time, driver availability, order acceptance rates, consumer satisfaction ratings, and overall delivery fulfillment rates during peak times. When a company increases pay during peak periods, it generally aims to ensure an adequate supply of drivers to meet elevated demand promptly. Hence, analyzing each metric in depth can clarify whether the policy has met its intended goals.

A common strategy for obtaining reliable estimates of the policy's true impact is to run a controlled experiment or, if a randomized controlled trial is not feasible, to adopt a quasi-experimental design such as a difference-in-differences analysis. For instance, one region or time window can serve as the "treatment" group where the pay incentive is offered, and another, very similar region or time window can serve as the "control" where no pay incentive is applied.

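The difference-in-differences estimate can be formalized as:

$$
\text{DiD} = \left(Y_{\text{treatment, post}} - Y_{\text{treatment, pre}}\right) - \left(Y_{\text{control, post}} - Y_{\text{control, pre}}\right)
$$

Where: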

Y_treatment, post is the average performance metric in the treatment group after introducing the extra pay.
Y_treatment, pre is the average performance metric in the treatment group before introducing the extra pay.
Y_control, post is the average performance metric in the control group over the same time period as Y_treatment, post.
Y_control, pre is the average performance metric in the control group over the same time period as Y_treatment, pre.

The performance metric Y might be something like “average fulfillment time,” “percent of orders completed,” or “driver acceptance rate.” By subtracting the before-and-after difference observed in the control group from the corresponding before-and-after difference in the treatment group, this method attempts to remove the effects of other simultaneous factors (e.g., general market shifts, seasonalities, or external events).
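
As a minimal sketch of the subtraction described above, the following Python snippet computes the difference-in-differences estimate from four samples of a metric. The function name `did_estimate` and all of the delivery-time numbers are hypothetical and chosen only for illustration; a real analysis would also need standard errors and a check that the two groups trended similarly before the policy.

```python
from statistics import mean

def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Difference-in-differences on group means (point estimate only)."""
    treat_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treat_change - control_change

# Hypothetical average delivery times in minutes (illustrative values only).
treat_pre = [42.0, 40.0, 44.0, 42.0]     # treatment region, before extra pay
treat_post = [35.0, 36.0, 34.0, 35.0]    # treatment region, after extra pay
control_pre = [41.0, 43.0, 42.0, 42.0]   # control region, same "before" window
control_post = [40.0, 41.0, 39.0, 40.0]  # control region, same "after" window

effect = did_estimate(treat_pre, treat_post, control_pre, control_post)
print(effect)  # -5.0
```

In this made-up example the treatment region improves by 7 minutes while the control region improves by 2 minutes, so roughly 5 minutes of the improvement would be attributed to the extra pay under the method's assumptions.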

In addition to difference-in-differences, researchers might also consider a pure A/B test if random assignment is possible. For instance, the platform could randomly select half of the drivers to receive the incentive during peak times while the other half does not, then compare key outcome measures between the two groups. While pure randomization is ideal, in practice logistical constraints or fairness considerations might complicate it.
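
One way such a comparison could be carried out is with a permutation test, sketched below using only the Python standard library. This is an illustrative choice rather than the only valid test; the function name `permutation_test`, the metric (completed peak-hour deliveries per driver), and every number are assumptions made for the example.

```python
import random
from statistics import mean

def permutation_test(treated, control, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Hypothetical completed peak-hour deliveries per driver (illustrative only).
treated = [14, 16, 15, 17, 13, 18, 16, 15]   # drivers offered the incentive
control = [11, 12, 10, 13, 12, 11, 14, 12]   # drivers not offered it

observed, p_value = permutation_test(treated, control)
print(f"difference in means: {observed:.3f}, p-value: {p_value:.4f}")
```

A small p-value here would suggest the observed gap is unlikely to arise from random assignment alone, although it says nothing about whether the gap is large enough to justify the cost.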

If truly randomized experiments are not feasible, even simpler observational before-and-after comparisons (i.e., a pre-post analysis) can be utilized, but they introduce more risks of confounding factors. There might be seasonal trends or concurrent changes, so it is critical to isolate the influence of extra pay from any other external shifts. Carefully analyzing historical data around the same period in prior weeks or months could help mitigate some of these biases.

The choice of specific success metrics should align with the ultimate business objective. If the primary concern is speed of delivery, measuring average delivery time can be crucial. If the aim is to ensure high fulfillment rates for all incoming orders, then order acceptance rates and fulfillment rates are vital. The financial implications must also be examined, such as whether the additional cost of peak pay is offset by gains in consumer satisfaction, increased tip amounts, or higher platform usage leading to more revenue.
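
The financial trade-off can be illustrated with simple arithmetic. The sketch below assumes the incentive is paid per peak-hour delivery and that its benefit can be summarized as incremental orders multiplied by a contribution margin per order; both simplifications, the function names, and all figures are hypothetical.

```python
def incentive_net_value(extra_pay_per_delivery, peak_deliveries,
                        incremental_orders, contribution_per_order):
    """Net value of the incentive: incremental contribution minus extra pay."""
    cost = extra_pay_per_delivery * peak_deliveries
    benefit = incremental_orders * contribution_per_order
    return benefit - cost

def break_even_orders(extra_pay_per_delivery, peak_deliveries,
                      contribution_per_order):
    """Incremental orders needed for the incentive to pay for itself."""
    return extra_pay_per_delivery * peak_deliveries / contribution_per_order

# Hypothetical figures (illustrative only).
net = incentive_net_value(
    extra_pay_per_delivery=2.0,   # extra dollars paid per peak delivery
    peak_deliveries=10_000,       # deliveries that received the bonus
    incremental_orders=1_500,     # orders attributed to the policy
    contribution_per_order=6.0,   # margin earned per additional order
)
needed = break_even_orders(2.0, 10_000, 6.0)
print(net)            # -11000.0
print(round(needed))  # 3333
```

With these made-up inputs the incentive costs more than it returns, and about 3,333 incremental orders would be needed to break even. Longer-term effects such as retention are not captured in this sketch.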

How the Data Could be Analyzed

The data analysis might involve looking at the distribution of driver supply throughout the day, focusing on the peak windows. If driver coverage remains inadequate even after introducing the bonus, further increases or adjustments to the incentive structure might be necessary. Conversely, if the data shows that supply now exceeds demand, the extra cost might not be justified. Another key analysis could involve consumer feedback on wait times before and after the policy, consumer satisfaction ratings, and potential changes in retention of both drivers and customers.
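
A rough sketch of the hourly supply check described above follows. It assumes hourly order counts and active-driver counts are already available, and that each driver can complete a fixed number of deliveries per hour; that capacity figure, the labeling thresholds, the function name `flag_coverage`, and the sample data are all hypothetical.

```python
def flag_coverage(orders_by_hour, drivers_by_hour,
                  deliveries_per_driver_hour=3.0,
                  oversupply_ratio=1.5):
    """Label each hour by the ratio of estimated driver capacity to orders."""
    status = {}
    for hour, orders in sorted(orders_by_hour.items()):
        capacity = drivers_by_hour.get(hour, 0) * deliveries_per_driver_hour
        ratio = capacity / orders if orders else float("inf")
        if ratio < 1.0:
            label = "undersupplied"   # capacity cannot cover demand
        elif ratio > oversupply_ratio:
            label = "oversupplied"    # incentive may be overpaying
        else:
            label = "balanced"
        status[hour] = (round(ratio, 2), label)
    return status

# Hypothetical evening peak window (hour of day -> count), illustrative only.
orders_by_hour = {17: 300, 18: 450, 19: 400, 20: 200}
drivers_by_hour = {17: 110, 18: 120, 19: 140, 20: 120}

coverage = flag_coverage(orders_by_hour, drivers_by_hour)
for hour, (ratio, label) in coverage.items():
    print(hour, ratio, label)
```

Running the same check on data from before and after the bonus would show whether undersupplied hours shrink and whether any hours tip into oversupply.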

It is often advisable to blend quantitative and qualitative data. Surveying drivers about whether the incentive encourages them to work more during peak periods can provide insights into the mechanism behind any observed changes. Meanwhile, consumer feedback on order speeds or experience complements the numeric measures.