ML Interview Q Series: Newsvendor Model for Cost-Optimal Electricity Supply Forecasting
Imagine that each year, PG&E must predict how much electricity to deliver to a particular town. If the supply is insufficient, blackouts can occur, but if too much is provided, unnecessary costs will be incurred. What approach could be used to determine the optimal amount of electricity to provide?
Comprehensive Explanation
A highly effective way to address such a forecasting and supply-level determination problem is to treat it as a form of the “newsvendor problem” or a general cost-minimization forecasting approach. The central insight is to balance the cost of under-supplying electricity (leading to power outages and economic or social penalties) against the cost of over-supplying (resulting in unused capacity and monetary waste).
In practical terms, if the random variable for demand is denoted by X, then one can specify a probability distribution F for X. The model’s objective is to find a supply quantity Q that minimizes the expected total cost of under-supply and over-supply. Suppose the cost for being one unit short is C_u, and the cost for having one unit in surplus is C_o.
Under fairly standard assumptions, the well-known solution to this minimization problem is to pick Q such that the probability that demand is less than or equal to Q is given by the ratio C_u / (C_u + C_o). In more formal notation:
Q^* = F^{-1}( C_u / (C_u + C_o) )
Here, Q^* is the optimal level of supply, F is the cumulative distribution function (CDF) of the random demand X, and F^{-1} is the inverse of this CDF. The expression C_u / (C_u + C_o) is known as the critical fractile. If, for example, you have a higher penalty for being short (C_u is large) relative to over-supplying (C_o is smaller), the critical fractile will be larger, suggesting you should supply more capacity to reduce the likelihood of a shortage.
Inline explanation:
C_u is the cost (per unit) of providing less electricity than the actual demand. This can include the financial penalties of outages, the reputational damage, and any other losses associated with under-supply.
C_o is the cost (per unit) of supplying more electricity than required, which might include wasted costs of generation or the opportunity cost of allocating that capacity elsewhere.
X is a random variable representing the future electricity demand. In real-world applications, X might be modeled with a distribution F estimated from historical electricity consumption data, population growth, economic indicators, and even weather patterns.
In a time-series context, one might begin by building a forecasting model (such as an ARIMA model, a gradient boosting regression, or a neural network like an LSTM) to predict the demand distribution. From this model, you either directly obtain or approximate the cumulative distribution function F. You can then determine the supply Q that meets the desired service level fraction, C_u / (C_u + C_o).
In many real-world scenarios, it is not enough to have a single-point forecast. Instead, you need a full distribution to understand variability and possible extremes of demand. Constructing a probabilistic forecast, or using historical data to estimate empirical or parametric distributions, is critical for deciding Q^* accurately.
A standard pipeline might look like: Train a time-series model on historical consumption data → produce a forecast distribution for next year’s demand → compute Q^* based on the cost trade-offs of under-supply vs. over-supply.
Below is a simplified Python snippet that demonstrates how one might approach this problem, using a hypothetical distribution (for example, a normal approximation) to illustrate the concept of finding the optimal supply Q.
import numpy as np
from scipy.stats import norm
# Suppose we have estimated that the demand is normally distributed
# with mean mu and standard deviation sigma
mu = 1000 # Example mean demand
sigma = 200 # Example standard deviation
# Under-supply and over-supply costs (per unit)
C_u = 5 # e.g., 5 USD cost per kWh short
C_o = 1 # e.g., 1 USD cost per kWh surplus
# Compute critical fractile
critical_fractile = C_u / (C_u + C_o)
# Inverse of the cumulative distribution function (CDF) at the critical fractile
Q_star = norm.ppf(critical_fractile, loc=mu, scale=sigma)
print(f"Optimal supply level: {Q_star:.2f}")
In this snippet:
norm.ppf is the inverse of the normal distribution’s CDF.
critical_fractile is the fraction that trades off the cost of under- vs over-supply.
Once Q_star is found, you have the optimal amount of electricity to produce and supply, given the assumed distribution of demand.
In practice, you may refine this approach by incorporating:
Seasonal patterns: Factor in weather-related demand spikes or dips (heating in winter, air conditioning in summer).
Trends: Address underlying growth in electricity consumption due to population or economic changes.
External events: Consider major industrial expansions or special events that increase short-term demand.
Furthermore, you would estimate demand variability over time by analyzing day-of-week effects, holiday usage patterns, and exogenous factors such as demand response programs or distributed energy resources. Accurately estimating F is pivotal, because an incorrect demand distribution leads to a suboptimal Q^*.
How Would This Be Implemented at Scale?
At a larger utility scale, one might rely on a rolling forecast approach, updating demand forecasts monthly or quarterly to account for new data. A sophisticated pipeline can incorporate machine learning algorithms that handle both wide historical time series data and real-time signals (like weather forecasts, local generation from renewables, or major industrial expansions). The cost parameters C_u and C_o may also be time-varying, influenced by contractual penalties, dynamic electricity markets, or operational constraints on generation capacity.
How Would You Evaluate This Model?
Evaluation involves replaying the model’s supply decisions on historical data, or running a Monte Carlo simulation with an assumed distribution for X. You compare different cost combinations (C_u vs. C_o) and measure performance in terms of total expected cost. If the chosen Q^* performs well under repeated simulation with varying assumptions, you have a resilient forecasting model.
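As a concrete illustration, the sketch below simulates demand from the normal distribution assumed in the earlier snippet (the candidate grid, sample size, and cost values are purely illustrative) and evaluates the expected total cost of each candidate supply level.
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 1000, 200      # demand distribution assumed in the earlier snippet
C_u, C_o = 5, 1            # under- and over-supply costs per unit

demand = rng.normal(mu, sigma, size=100_000)   # Monte Carlo demand scenarios
candidates = np.linspace(800, 1600, 81)        # candidate supply levels Q

# Expected total cost of each candidate Q under the simulated scenarios
expected_cost = [
    np.mean(C_u * np.maximum(demand - Q, 0) + C_o * np.maximum(Q - demand, 0))
    for Q in candidates
]

best_Q = candidates[int(np.argmin(expected_cost))]
print(f"Supply level with lowest simulated cost: {best_Q:.1f}")
The minimizer should land close to the analytical Q^* from the critical-fractile formula, which serves as a useful sanity check on the simulation.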
What If We Do Not Know the Demand Distribution?
If the distribution is unknown, we might employ non-parametric methods or bootstrapping on historical data. In the case of uncertain or changing distributions, Bayesian approaches or dynamic online learning algorithms can adapt the probability distribution as new data arrives. Over time, the model’s predicted distribution F should become more accurate, refining Q^* further.
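A minimal sketch of the non-parametric route, assuming a hypothetical historical_demand array: the empirical quantile at the critical fractile serves as Q^*, and a bootstrap shows how uncertain that estimate is.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical historical demand observations (e.g., daily peak load in kWh)
historical_demand = rng.normal(1000, 200, size=300)

C_u, C_o = 5, 1
critical_fractile = C_u / (C_u + C_o)

# Non-parametric estimate: empirical quantile of observed demand at the critical fractile
Q_star = np.quantile(historical_demand, critical_fractile)

# Bootstrap resampling to gauge the sampling variability of Q_star
boot = [
    np.quantile(rng.choice(historical_demand, size=historical_demand.size, replace=True),
                critical_fractile)
    for _ in range(2000)
]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"Empirical Q*: {Q_star:.1f} (95% bootstrap interval: {lo:.1f} to {hi:.1f})")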
Follow-Up Questions
Could We Incorporate Seasonality or Cyclical Behaviors in the Modeling?
Yes. Seasonality can be built into the time series forecast model (ARIMA with seasonal terms, SARIMAX, or LSTM with exogenous features). The distribution F then reflects seasonal variations in demand, leading to a time-dependent Q^* for each relevant period.
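As a hedged sketch of that idea on a synthetic monthly series, assuming the SARIMAX forecast errors are approximately normal (the order, seasonal_order, and data below are purely illustrative), you can read a period-specific Q^* off each month's predictive distribution.
import numpy as np
import pandas as pd
from scipy.stats import norm
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Hypothetical monthly demand series with a yearly seasonal cycle
rng = np.random.default_rng(2)
idx = pd.date_range("2015-01-01", periods=96, freq="MS")
y = pd.Series(1000 + 150 * np.sin(2 * np.pi * idx.month / 12) + rng.normal(0, 50, len(idx)),
              index=idx)

model = SARIMAX(y, order=(1, 0, 0), seasonal_order=(1, 1, 0, 12))
res = model.fit(disp=False)

C_u, C_o = 5, 1
critical_fractile = C_u / (C_u + C_o)

# Treat each month's forecast as roughly normal and take the critical fractile per period
forecast = res.get_forecast(steps=12)
Q_star = norm.ppf(critical_fractile, loc=forecast.predicted_mean, scale=forecast.se_mean)
print(pd.Series(Q_star, index=forecast.predicted_mean.index).round(1))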
How Do We Handle Rare But High-Impact Events?
One strategy is to stress-test your model using extreme demand scenarios. This could involve heavy-tailed distributions (like a Pareto or generalized extreme value distribution) for modeling spikes in electricity usage. By incorporating these tails, you ensure you are better prepared for outliers in demand.
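A small sketch of how the tail assumption changes the answer, using scipy's generalized extreme value distribution with an illustrative heavy-tail shape (in scipy's parametrization, a negative c produces a heavy upper tail):
from scipy.stats import norm, genextreme

C_u, C_o = 5, 1
q = C_u / (C_u + C_o)   # critical fractile, roughly 0.833

# Same location and scale, but the GEV has a heavy right tail (illustrative shape c = -0.2)
normal_Q = norm.ppf(q, loc=1000, scale=200)
gev_Q = genextreme.ppf(q, c=-0.2, loc=1000, scale=200)

print(f"Normal-based Q*: {normal_Q:.1f}")
print(f"GEV-based Q*:    {gev_Q:.1f}  # the heavier tail calls for a larger supply buffer")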
Is It Always Appropriate to Use a Single Q^*?
Not necessarily. You might decide on multiple tiered solutions, such as a baseline supply plus additional on-demand capacity that can be quickly ramped up (e.g., peaker plants or stored energy). The single Q^* solution is most relevant to scenarios where the supply must be decided in advance with limited flexibility.
Would Neural Networks Outperform Traditional Methods in This Task?
Neural networks, particularly LSTM or Transformer-based models, can capture complex temporal patterns in demand data. However, simpler statistical or gradient boosting methods may suffice if demand patterns are not too complex or if you have limited data. Model performance should be driven by empirical evidence—comparing forecast accuracy and cost outcomes across multiple model classes.
How Do We Balance Model Complexity with Interpretability?
At a utility scale, interpretability is crucial because decisions often require regulatory approval and explanation to stakeholders. While deep neural networks can outperform simpler models in certain contexts, the black-box nature of these methods might be a drawback. Many utilities prefer methods that provide transparent reasoning for supply decisions, or they combine advanced models with post-hoc interpretability measures.
What If Data Is Insufficient?
In scenarios with limited demand history, you might rely on data from similar towns or external benchmarks. Another option is to start with expert judgment or industry standards and update the model’s distribution F as new data arrives, effectively using a Bayesian updating or an online-learning approach.
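A minimal sketch of that Bayesian updating idea, assuming a normal demand model with a known observation standard deviation and a prior on the mean taken from comparable towns (all numbers hypothetical):
import numpy as np
from scipy.stats import norm

# Prior belief about mean demand from expert judgment or comparable towns (hypothetical)
prior_mean, prior_sd = 950.0, 150.0
obs_sd = 200.0                                     # assumed known observation noise

observations = np.array([1020.0, 980.0, 1100.0])   # the few demand readings we do have

# Conjugate normal-normal update of the mean
n = len(observations)
post_var = 1.0 / (1.0 / prior_sd**2 + n / obs_sd**2)
post_mean = post_var * (prior_mean / prior_sd**2 + observations.sum() / obs_sd**2)

# Predictive distribution for the next period: N(post_mean, post_var + obs_sd^2)
C_u, C_o = 5, 1
Q_star = norm.ppf(C_u / (C_u + C_o), loc=post_mean, scale=np.sqrt(post_var + obs_sd**2))
print(f"Posterior mean demand: {post_mean:.1f}, updated Q*: {Q_star:.1f}")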
By carefully choosing the cost parameters, constructing an accurate demand distribution, and solving for Q^* in the newsvendor-inspired framework, a utility like PG&E can balance the penalties of outages with the costs of wasted electricity.
Below are additional follow-up questions
How Can We Handle Slow-Varying Shifts in Demand Over Multiple Years?
A significant challenge emerges when demand patterns change gradually due to economic growth, demographic shifts, or technology adoption (for example, widespread electric vehicle usage). If the underlying demand distribution evolves slowly, a single stationary distribution F might no longer hold. In that case, one strategy is to employ a sliding-window approach, where you regularly update the distribution parameters (for instance, mean and variance for a normal model) based on the most recent data. This way, older data that may no longer reflect current usage patterns is down-weighted or discarded.
In practical deployments, you might also build a hierarchical time-series model (e.g., one that tracks multiple regions or consumer classes) and aggregate these sub-forecasts to capture underlying trends. By leveraging domain knowledge about factors like policy changes or infrastructure projects, you can proactively adjust your model to accommodate slow but consistent shifts.
Potential Pitfall: If your method updates too slowly, you remain behind the true demand distribution. If it updates too quickly, you risk “chasing noise” and overfitting to transient fluctuations. Tuning update frequency or discount factors becomes a balance between responsiveness to real shifts and resilience against short-term anomalies.
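A minimal sketch of both update styles on a hypothetical monthly series with a slow upward drift; the 36-month window and 12-month halflife below are exactly the tuning knobs the pitfall refers to.
import numpy as np
import pandas as pd
from scipy.stats import norm

rng = np.random.default_rng(3)
# Hypothetical monthly demand with a slow upward drift
demand = pd.Series(1000 + np.arange(120) * 2.0 + rng.normal(0, 60, 120))

C_u, C_o = 5, 1
q = C_u / (C_u + C_o)

# Option 1: sliding window, only the most recent 36 months inform the distribution
window = demand.tail(36)
Q_window = norm.ppf(q, loc=window.mean(), scale=window.std())

# Option 2: exponential down-weighting, older observations decay smoothly
ew_mean = demand.ewm(halflife=12).mean().iloc[-1]
ew_std = demand.ewm(halflife=12).std().iloc[-1]
Q_ewm = norm.ppf(q, loc=ew_mean, scale=ew_std)

print(f"Sliding-window Q*: {Q_window:.1f}, exponentially weighted Q*: {Q_ewm:.1f}")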
What Are the Implications of Demand Uncertainty in Real-Time Markets?
Utilities increasingly operate in deregulated environments where surplus electricity can be sold on real-time or day-ahead markets. If you over-supply relative to local demand, you may still recoup some costs by selling the excess power elsewhere. On the other hand, if you under-supply, you might cover shortages by buying additional electricity on the spot market, often at a higher price.
To integrate this dynamic into the newsvendor approach, you modify cost parameters (C_u and C_o) to factor in real-time market revenues or costs. Surplus no longer becomes a total loss if it can be sold, and shortfalls might be covered if the penalty of purchasing extra supply is not prohibitively large.
Potential Pitfall: Market prices fluctuate significantly. If you rely on uncertain market predictions for your cost offsets, you introduce another layer of forecasting complexity. Pricing volatility might warrant risk-averse strategies (like robust optimization) rather than a purely expected-cost-minimization approach.
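One way to fold these market interactions into the newsvendor costs, with every price below purely hypothetical, is to treat C_u as the expected net cost of covering a shortfall and C_o as the generation cost net of the salvage value of surplus:
# Illustrative market-adjusted cost parameters (all prices per kWh, purely hypothetical)
retail_price = 0.20      # revenue from each kWh actually sold to the town
generation_cost = 0.08   # marginal cost of producing one kWh
spot_buy_price = 0.35    # cost of covering a shortfall on the spot market
spot_sell_price = 0.05   # salvage value of surplus sold back to the market
outage_penalty = 0.50    # extra per-kWh penalty if a shortfall cannot be covered
p_cover = 0.9            # assumed probability that a shortfall can be covered on the spot market

# Expected under-supply cost: buy on the spot market when possible, otherwise take the outage hit
C_u = p_cover * (spot_buy_price - retail_price) + (1 - p_cover) * outage_penalty

# Over-supply cost: generation cost less what the surplus recovers on the market
C_o = generation_cost - spot_sell_price

critical_fractile = C_u / (C_u + C_o)
print(f"Market-adjusted critical fractile: {critical_fractile:.3f}")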
How Would You Deal with Large, Abrupt Shifts in Demand?
Beyond slow-varying trends, there may be sudden or structural changes—like the closure of a major factory, a pandemic-related shift to remote work, or an extreme weather event that causes a sudden surge in demand. In such cases, older data becomes partially irrelevant. You could implement a change-point detection method: whenever the model’s forecast errors exceed a certain threshold, check for a structural break. If a break is detected, re-estimate the demand distribution with data after that breakpoint.
Potential Pitfall: Overzealous change-point detection might misinterpret random fluctuations as structural changes, leading to frequent distribution resets and unstable supply decisions. You need robust statistical tests or domain-informed triggers to confidently declare an actual regime change.
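A minimal sketch of one such trigger, a two-sided CUSUM on standardized forecast errors; the allowance k and threshold h are illustrative and would need tuning to avoid the false-alarm pitfall above.
import numpy as np

def cusum_alarm(errors, k=0.5, h=5.0):
    """Two-sided CUSUM on standardized forecast errors.

    k is the allowance (drift we tolerate) and h the decision threshold,
    both in units of the error standard deviation. Returns the index of
    the first alarm, or None if no structural break is flagged.
    """
    baseline = errors[:20]                          # assume the first 20 errors are "in control"
    z = (errors - np.mean(baseline)) / (np.std(baseline) + 1e-9)
    s_pos = s_neg = 0.0
    for i, e in enumerate(z):
        s_pos = max(0.0, s_pos + e - k)
        s_neg = max(0.0, s_neg - e - k)
        if s_pos > h or s_neg > h:
            return i
    return None

# Hypothetical forecast errors: well behaved, then a persistent upward shift at t = 60
rng = np.random.default_rng(4)
errors = np.concatenate([rng.normal(0, 1, 60), rng.normal(2.5, 1, 40)])
print("Change point flagged at index:", cusum_alarm(errors))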
How Do We Handle Correlated Errors When Combining Multiple Forecasting Models?
In some organizations, multiple forecasting approaches may be combined for more accurate predictions. However, if these models all share similar structural assumptions or draw on overlapping data, their errors can be correlated. Simply taking an average or ensemble might incorrectly narrow uncertainty. You need to estimate not only each model’s bias and variance but also the pairwise correlations among forecast errors.
In practice, you might fit a meta-learner that weighs each model’s prediction differently based on historical performance. Alternatively, you can use a Bayesian model averaging framework that explicitly accounts for correlations in the likelihood function.
Potential Pitfall: Failing to incorporate error correlations can lead to an underestimation of overall demand variability, making Q^* appear more confident than it truly is. This could result in either frequent shortfalls or systematic over-supply.
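The pitfall can be made concrete with a minimum-variance combination of two hypothetical models whose errors share a common component; ignoring the off-diagonal covariance makes the ensemble look far more precise than it really is.
import numpy as np

# Hypothetical forecast errors from two models evaluated on the same historical periods
rng = np.random.default_rng(5)
shared = rng.normal(0, 40, 200)            # common error component (overlapping data/assumptions)
err_a = shared + rng.normal(0, 20, 200)
err_b = shared + rng.normal(0, 25, 200)

Sigma = np.cov(np.vstack([err_a, err_b]))  # 2x2 error covariance matrix

# Minimum-variance ensemble weights: w proportional to Sigma^{-1} * 1
ones = np.ones(2)
w = np.linalg.solve(Sigma, ones)
w /= w.sum()

var_with_corr = w @ Sigma @ w                             # honest combined error variance
var_if_independent = w @ np.diag(np.diag(Sigma)) @ w      # what you'd assume with no correlation
print(f"Weights: {w.round(3)}")
print(f"Combined error variance (with correlations): {var_with_corr:.0f}")
print(f"Assumed variance if errors were independent:  {var_if_independent:.0f}  # too optimistic")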
How Would You Approach the Risk of Very High Demand Spikes That Occur Infrequently?
Many real-world demand patterns have “fat tails,” meaning the probability of extremely high values is greater than under a simple normal distribution. A normal-based model might drastically underestimate the likelihood of rare but massive spikes, leading to repeated under-supply in those extreme situations. One response is to use heavy-tailed distributions (like Student’s t, Lognormal, or Generalized Pareto) for modeling demand.
For large spike risks, you might also employ Value at Risk (VaR) or Conditional Value at Risk (CVaR) techniques from financial risk management. These methods explicitly focus on tail behavior. By calibrating Q^* to protect against, say, the 95th percentile of demand, you reduce exposure to crippling outages.
Potential Pitfall: If you overestimate tail probabilities, you will carry excessive surplus costs. Balancing the need for reliable power against the cost of building or contracting for unused capacity can be politically and economically challenging.
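A short sketch of the tail metrics on simulated heavy-tailed demand (lognormal here purely for illustration): VaR is the 95th-percentile demand, and CVaR averages demand over the worst 5% of scenarios.
import numpy as np

rng = np.random.default_rng(6)
# Hypothetical demand samples with a heavy right tail (lognormal, for illustration only)
demand = rng.lognormal(mean=np.log(1000), sigma=0.2, size=100_000)

alpha = 0.95
var_95 = np.quantile(demand, alpha)          # VaR: demand level exceeded only 5% of the time
cvar_95 = demand[demand >= var_95].mean()    # CVaR: average demand within that worst 5%

print(f"VaR(95%):  {var_95:.1f}")
print(f"CVaR(95%): {cvar_95:.1f}  # sizing supply toward this level protects against tail events")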
How Do Regulatory or Environmental Constraints Impact the Optimal Solution?
Utility companies often have obligations or caps imposed by regulatory agencies. For instance, they may be subject to an emissions limit, constraints on purchasing certain fuel sources, or mandated usage of renewable energy. These constraints can shift your cost structure or availability of generation resources. You might need a constrained optimization approach that satisfies these additional conditions while still trying to minimize under-supply and over-supply costs.
Potential Pitfall: A single closed-form solution like Q^* = F^{-1}(C_u/(C_u + C_o)) might no longer be directly applicable under multiple constraints. You may need to formulate the problem as a more complex constrained optimization (possibly integer programming if discrete units of electricity production are involved). Overlooking regulatory constraints can lead to non-compliance penalties that significantly alter your effective cost parameters.
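As one hedged illustration of such a formulation, the newsvendor objective can be written as a scenario-based linear program with an explicit cap on Q standing in for an emissions or capacity constraint; all numbers are illustrative.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(12)
d = rng.normal(1000, 200, 200)   # demand scenarios
n = len(d)
C_u, C_o = 5, 1
Q_cap = 1150.0                   # illustrative cap on supply (emissions or capacity constraint)

# Decision variables: [Q, s_1..s_n (shortfall per scenario), o_1..o_n (surplus per scenario)]
c = np.concatenate([[0.0], np.full(n, C_u / n), np.full(n, C_o / n)])

# s_i >= d_i - Q   rewritten as   -Q - s_i <= -d_i
A1 = np.hstack([-np.ones((n, 1)), -np.eye(n), np.zeros((n, n))])
b1 = -d
# o_i >= Q - d_i   rewritten as    Q - o_i <= d_i
A2 = np.hstack([np.ones((n, 1)), np.zeros((n, n)), -np.eye(n)])
b2 = d

bounds = [(0, Q_cap)] + [(0, None)] * (2 * n)
res = linprog(c, A_ub=np.vstack([A1, A2]), b_ub=np.concatenate([b1, b2]), bounds=bounds)
print(f"Constrained Q*: {res.x[0]:.1f} (unconstrained critical fractile would give "
      f"{np.quantile(d, C_u / (C_u + C_o)):.1f})")
Additional rows in A_ub can encode further restrictions (renewable quotas, fuel limits) without changing the structure of the program.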
How Do You Incorporate Distributed Energy Resources (DERs) Into the Model?
Increasingly, end-users deploy rooftop solar, battery storage, or other DERs that can feed electricity back to the grid or shift demand away from peak hours. This adds complexity: the utility’s net demand is now the town’s total consumption minus the power generated and stored locally. The unpredictability of solar or wind resources can create greater volatility.
One approach is to incorporate probabilistic forecasts for DER output into the overall load model. The distribution F becomes a convolution of baseline consumption and net DER generation. Using scenario-based or simulation-based methods, you can approximate the overall demand distribution for the utility.
Potential Pitfall: DER forecasting can be more challenging than classic demand forecasting, given weather dependency, localized conditions, and user behavior (for instance, deciding when to charge or discharge home batteries). Ignoring DER volatility might lead to systematic mismatches in the final supply calculation.
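A simulation-style sketch of the convolution idea, with all component distributions hypothetical: sample gross consumption and local DER output, form net demand, and take the critical fractile of the resulting empirical distribution.
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# Hypothetical components of net demand (all in the same energy units)
gross_consumption = rng.normal(1000, 150, n)              # town's total consumption
solar_output = np.clip(rng.normal(120, 60, n), 0, None)   # behind-the-meter solar, never negative
battery_discharge = rng.uniform(0, 30, n)                 # stored energy released locally

net_demand = gross_consumption - solar_output - battery_discharge

C_u, C_o = 5, 1
q = C_u / (C_u + C_o)
print(f"Q* on net demand: {np.quantile(net_demand, q):.1f} "
      f"(vs. gross consumption alone: {np.quantile(gross_consumption, q):.1f})")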
Can We Integrate Predictive Maintenance or Infrastructure Reliability Concerns?
Power generation and transmission infrastructure require periodic maintenance. Unexpected breakdowns in power plants or transmission lines reduce available supply capacity or increase effective costs. You might model supply capacity as a random variable with its own probability distribution, reflecting the chance of a component failure. Then, instead of focusing solely on demand, you consider the joint distribution of demand and supply capacity to find an optimal Q^* that accounts for possible downtime.
Potential Pitfall: If you assume constant reliability (e.g., ignoring aging infrastructure or the possibility that multiple plants are down for maintenance simultaneously), you underestimate the risk of forced outages. This underestimation leads to an overly optimistic supply plan.
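A hedged sketch of that joint treatment: capacity is sampled as a random variable (here, a single plant with an illustrative 5% forced-outage probability), delivery is capped at available capacity, and the expected cost is minimized over candidate supply levels.
import numpy as np

rng = np.random.default_rng(8)
n = 50_000

demand = rng.normal(1000, 200, n)
# Available capacity is itself random: a 400-unit plant is down with 5% probability (illustrative)
plant_down = rng.random(n) < 0.05
capacity = np.where(plant_down, 1200.0, 1600.0)

C_u, C_o = 5, 1
candidates = np.linspace(900, 1600, 71)

def expected_cost(Q):
    delivered = np.minimum(Q, capacity)            # cannot deliver more than what is available
    short = np.maximum(demand - delivered, 0)
    surplus = np.maximum(delivered - demand, 0)
    return np.mean(C_u * short + C_o * surplus)

best_Q = candidates[int(np.argmin([expected_cost(Q) for Q in candidates]))]
print(f"Capacity-aware planned supply: {best_Q:.1f}")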
How Would You Verify the Model Is Not Biased Over the Long Term?
Even if the model performs well initially, systematic biases can emerge over time—perhaps from missing evolving social behaviors or technological changes. You can check bias by aggregating forecast errors over multiple periods to see if they consistently lean negative or positive. If a pattern emerges, it indicates a structural mismatch between your model assumptions and real behavior.
Potential Pitfall: A mild bias might remain undetected if you evaluate performance only month by month. Taking a rolling annual assessment can reveal that small but persistent overestimation (or underestimation) leads to significant cumulative cost or reliability issues.
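A tiny sketch of that rolling check, using hypothetical monthly errors (actual minus forecast) that carry a small persistent bias which is hard to see month by month:
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)
# Hypothetical monthly forecast errors (actual - forecast) with a small persistent bias of +8
errors = pd.Series(rng.normal(loc=8, scale=60, size=60),
                   index=pd.date_range("2019-01-01", periods=60, freq="MS"))

# Month by month, the bias hides inside the noise...
print(f"Mean monthly error: {errors.mean():.1f} (monthly std: {errors.std():.1f})")

# ...but a rolling 12-month cumulative error makes the persistent drift visible
rolling_bias = errors.rolling(12).sum()
print(rolling_bias.dropna().tail(3).round(1))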
How Do We Manage Communication and Stakeholder Buy-In for Probabilistic Decisions?
A major real-world challenge is that many stakeholders (municipal governments, regulators, or the general public) are uncomfortable with probabilistic forecasts or with accepting any risk of outages. They often demand certainty: “We should never have a blackout.”
In practice, absolute zero risk is impossible without allocating enormous resources. Utilities typically need to communicate that they’ve optimized supply based on an acceptable risk threshold—tying the cost of perfect reliability to the realities of resource constraints. Educating stakeholders on the newsvendor logic (the trade-off between the cost of under-supply and over-supply) and how Q^* is derived can help with transparency.
Potential Pitfall: Oversimplifying the explanation to stakeholders might create misunderstandings, especially if an adverse event occurs. Clear, data-driven justification and consistent reporting of outcomes can help maintain trust.
How Do We Prepare the Model for Extreme Climate Variability or Climate Change?
Climate change can significantly affect long-term demand for electricity (for heating, cooling, or new climate-related infrastructure). Extreme weather events, like heat waves or cold snaps, can also drive demand spikes. Traditional models might assume stationary historical patterns. Forward-looking models incorporate climate projections, scenario analysis, and robust optimization techniques that simulate multiple future climate pathways.
Potential Pitfall: Relying solely on historical climate data may lead to forecasts that are no longer valid under more frequent extreme weather scenarios. Without an updated or stress-tested approach, you risk being unprepared for unprecedented demand peaks.
How Do We Scale This Method for Multiple Geographic Regions With Shared Generation Assets?
Often, a utility manages several regions that draw power from a pool of generation assets. The utility must decide how much capacity each region receives in a coordinated manner. In this multi-region scenario, you look at the joint demand across regions, which might or might not be correlated. If one region’s demand is low while another’s is high, the surplus capacity from the former can help meet the latter’s shortfall.
This leads to a multi-dimensional optimization problem that extends the newsvendor logic across correlated random variables. A standard technique is to use linear or stochastic programming frameworks, imposing constraints on total generation capacity, transmission limits, and cost structures.
Potential Pitfall: If you treat each region in isolation, you might overbuild capacity in every region. Conversely, if you overly rely on correlation assumptions (like perfect negative correlation that rarely happens in reality), you could under-supply. You must carefully validate these correlation assumptions with historical data and stress testing.
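A simplified sketch of the multi-region extension, ignoring transmission limits and recourse between regions for clarity: demands are drawn from a joint normal with an illustrative negative correlation, and a grid search decides how to split a shared capacity pool.
import numpy as np

rng = np.random.default_rng(11)
n = 20_000

# Two regions with negatively correlated demand (illustrative means, variances, correlation)
mean = np.array([600.0, 500.0])
cov = np.array([[150.0**2, -0.4 * 150 * 120],
                [-0.4 * 150 * 120, 120.0**2]])
demand = rng.multivariate_normal(mean, cov, size=n)

C_u, C_o = 5, 1
total_capacity = 1300.0   # shared generation pool

def expected_cost(q1):
    q = np.array([q1, total_capacity - q1])     # fixed ex-ante split, no recourse between regions
    short = np.maximum(demand - q, 0)
    surplus = np.maximum(q - demand, 0)
    return np.mean((C_u * short + C_o * surplus).sum(axis=1))

grid = np.linspace(400, 900, 101)
best = grid[int(np.argmin([expected_cost(q1) for q1 in grid]))]
print(f"Commit {best:.0f} units to region 1 and {total_capacity - best:.0f} to region 2")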
How Do We Ensure Data Quality and Integrity in a Large-Scale Forecasting System?
Demand forecasts hinge on accurate measurement of electricity usage. Meters might fail, data feeds can have missing entries, or measurement intervals might differ across systems. In a large-scale environment, data might come from various sources—some real-time, others batched. Ensuring consistent, high-quality data is non-trivial.
Common strategies include data validation pipelines that flag outliers or missing values, interpolation methods for short gaps, and robust anomaly detection algorithms. For time-series forecasting models, it’s essential to maintain a clean, synchronized dataset.
Potential Pitfall: Systematic measurement errors (like a batch of faulty meters) can corrupt the demand distribution estimate. If not corrected, the model can drastically miscalculate Q^*. Ongoing auditing, hardware maintenance, and data reconciliation processes can mitigate these risks.
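A compact sketch of such a validation step using pandas, on synthetic hourly load with a simulated meter gap and one corrupted reading; outliers are flagged with a robust rolling median/MAD rule and only short gaps are interpolated.
import numpy as np
import pandas as pd

rng = np.random.default_rng(10)
idx = pd.date_range("2024-01-01", periods=24 * 14, freq="h")
load = pd.Series(1000 + 100 * np.sin(2 * np.pi * idx.hour / 24) + rng.normal(0, 20, len(idx)),
                 index=idx)
load.iloc[100:103] = np.nan   # simulate a short meter outage
load.iloc[200] = 5000.0       # simulate a corrupted reading

# Flag outliers with a robust rolling median / MAD rule, then interpolate short gaps only
median = load.rolling(24, center=True, min_periods=12).median()
mad = (load - median).abs().rolling(24, center=True, min_periods=12).median()
outliers = (load - median).abs() > 5 * mad
clean = load.mask(outliers).interpolate(limit=3)

print(f"Flagged {int(outliers.sum())} outliers; {int(clean.isna().sum())} gaps left unfilled")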