ML Interview Q Series: What formula would you use to calculate average lifetime value given $100/month, 10% churn, 3.5 months retention?
Comprehensive Explanation
A common way to compute the average lifetime value (LTV) in a subscription-based setting is to multiply the average monthly revenue per user by the expected number of months a typical user remains subscribed. Although a classic theoretical formula for LTV often involves using the inverse of churn, real-world metrics (such as an empirically observed average of 3.5 months) may differ due to variations in how churn is measured or due to data limitations. Below is how to structure the formula in two different ways and reconcile the difference.
1) Simple Empirical Approach
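If we trust the empirical data that customers, on average, stay for 3.5 months, then the straightforward way to calculate LTV is:
$$\text{LTV} = \text{Monthly Subscription Price} \times \text{Average Number of Months Subscribed}$$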
Where:
Monthly Subscription Price is the cost of the product each month (in this case, 100 dollars).
Average Number of Months Subscribed is the typical length (in months) that a user remains actively paying (3.5 months in the given scenario).
Using these numbers:
Monthly Subscription Price = 100
Average Number of Months Subscribed = 3.5
Hence:
LTV = 100 * 3.5 = 350
That means, on average, each new customer is expected to yield 350 dollars in revenue over their entire subscription period (based on the observed average of 3.5 months).
2) Theoretical Churn-Based Formula
Another well-known formula in subscription models links LTV to the churn rate. Assuming a constant churn rate r (in decimal form) and a steady monthly price P, one might use:
$$\text{LTV} = \frac{P}{r}$$
Where:
P is the average monthly revenue per user (here, 100 dollars).
r is the monthly churn rate (here, 0.10).
From this theoretical perspective:
LTV = 100 / 0.10 = 1000
This result implies that if 10% of subscribers leave each month (in a purely geometric or exponential survival sense), the expected duration for a single customer would be around 1/r = 10 months, leading to an LTV of 1000 dollars. This is higher than the empirically observed average of 3.5 months (which yields an LTV of 350 dollars).
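As a quick check of the two calculations above, the following minimal sketch computes both LTV figures and numerically confirms that a constant churn rate r implies an expected lifetime of about 1/r months. The function names are illustrative and not part of the original question.

```python
def empirical_ltv(monthly_price, avg_months_subscribed):
    """LTV from an observed average subscription length."""
    return monthly_price * avg_months_subscribed


def churn_based_ltv(monthly_price, monthly_churn):
    """LTV = P / r, valid only under a constant monthly churn rate r."""
    if not 0 < monthly_churn <= 1:
        raise ValueError("monthly_churn must be in (0, 1]")
    return monthly_price / monthly_churn


def expected_months_by_summation(monthly_churn, horizon=2000):
    """Sum the survival probabilities (1 - r)^(t - 1) for t = 1..horizon.

    Under constant churn this sum converges to 1 / r, which is where
    the P / r formula comes from.
    """
    return sum((1 - monthly_churn) ** (t - 1) for t in range(1, horizon + 1))


print(empirical_ltv(100, 3.5))             # empirical approach
print(churn_based_ltv(100, 0.10))          # theoretical approach
print(expected_months_by_summation(0.10))  # approximately 1 / r
```

The gap between the two printed LTV values is the same 350 vs. 1000 discrepancy discussed in the text.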
The discrepancy between these two approaches often arises because:
Churn can fluctuate over time or be measured in inconsistent ways.
Real user behavior might show that many customers churn faster at the beginning (onboarding period) and then churn more slowly afterward, or vice versa.
The “10% churn” figure might be averaged over certain cohorts or time periods that do not perfectly reflect the exact distribution of user lifetimes.
There may be non-recurring factors (like marketing promotions, seasonal usage, or external market conditions) that skew the simple theoretical formula.
3) Reconciling the Observed vs. Theoretical LTV
Interviewers often look for your ability to recognize real-world complications:
In practice, if the data consistently shows an average of 3.5 months, you might prefer the empirical approach to set near-term strategy.
For long-term projections or for analyzing potential improvements in retention, some companies use the churn-based formula and assume stable churn behavior across a broader time horizon.
Discrepancies raise important questions about measurement methods, the reliability of churn data, and whether the churn rate is truly constant over a user’s entire lifetime.
Potential Follow-Up Questions
Why might the churn-based formula and empirical data not match?
One reason is that churn might not remain constant each month. In reality, churn can vary significantly, especially early in the customer lifecycle when users might be more prone to cancel. If the 10% figure is simply an average or is measured incorrectly (for instance, including partial months, or ignoring reactivations), it will deviate from the idealized assumption of a perpetual, uniform churn. Moreover, the empirically observed 3.5 months might come from a limited timescale (e.g., the first year of the company’s operation), which may not reflect longer-term user behavior.
How do you handle discounting or the time value of money in LTV calculations?
If you want to account for the fact that revenue earned in later months is worth slightly less in today’s dollars, you can incorporate a discount factor (d) per month. A common formula for that would look like a discounted sum of the expected monthly cash flows:
$$\text{LTV} = \sum_{t=1}^{\infty} P \,(1 - r)^{t-1}\,(1 + d)^{-t}$$
Where:
(1 - r)^{t-1} is the probability of the user still being subscribed at month t (assuming constant churn r).
(1 + d)^{-t} is the discount factor for month t, with d representing the monthly discount rate.
In many short-term analyses (especially if the product has a relatively low monthly price and the expected lifetime is only a few months), the discount factor is often omitted for simplicity, because the net present value difference is small over such a short timeframe.
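The discounted sum can be sketched numerically as below. This is a minimal illustration that assumes constant churn r, a constant price P, and a monthly discount rate d; the 1% discount rate in the example is a made-up value. The closed form P / (r + d) is not stated in the original text; it follows from summing the geometric series and is included only as a cross-check.

```python
def discounted_ltv(price, churn, discount, horizon=5000):
    """Sum P * (1 - r)^(t - 1) * (1 + d)^(-t) for t = 1..horizon."""
    term = price / (1 + discount)          # value of the month-1 payment
    ratio = (1 - churn) / (1 + discount)   # survival times one more month of discounting
    total = 0.0
    for _ in range(horizon):
        total += term
        term *= ratio
    return total


def discounted_ltv_closed_form(price, churn, discount):
    """Geometric-series limit of the sum above: P / (r + d)."""
    return price / (churn + discount)


# Hypothetical 1% monthly discount rate, with the P = 100 and r = 0.10 from the question.
print(discounted_ltv(100, 0.10, 0.01))
print(discounted_ltv_closed_form(100, 0.10, 0.01))
```

Setting the discount rate to zero recovers the undiscounted P / r value.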
How do you approach forecasting if the SaaS product or pricing changes?
If the company plans to introduce new product tiers, promotional pricing, or major product changes that affect retention, you need a more flexible model for LTV. This might involve segmenting users based on new vs. existing pricing, usage level, or behavior, then applying churn and retention assumptions that vary by segment or by time period. The simple formula P / r might no longer hold if the churn rate or the subscription price P changes significantly over time.
Could the 10% churn indicate something different, such as a cohort-based churn rather than a product-level churn?
Yes. Sometimes “monthly churn” can be calculated by looking at specific cohorts (e.g., customers who joined within a particular month or quarter) and then taking an average. If that average is not representative of the entire user base, it could artificially inflate or deflate the overall churn figure. The difference between cohort-based analysis and a more aggregated approach can be large if the product is relatively new or if customer behavior changes drastically once the product matures.
Is there a benefit to using probability-based models to compute the LTV?
Advanced probability models (like Markov chains or survival analysis techniques) provide a more nuanced view of user behavior, especially if churn rates vary with tenure. Instead of a single average churn, such models can estimate the probability of a user remaining subscribed as a function of their time since sign-up. Although this typically requires more data, it can offer superior insights into LTV if the user lifecycle is highly dynamic.
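One simple way to see churn as a function of tenure is a discrete-time hazard estimate, a basic building block of survival analysis. The sketch below is an assumption-laden illustration, not a full survival model: each record is a hypothetical pair (months_observed, churned), where churned=False means the customer was still active when last observed (censored).

```python
def tenure_hazards(records, max_month):
    """Estimate the churn probability in each tenure month 1..max_month.

    A customer is "at risk" in month k if they were observed for at least
    k months; they churn in month k if that was their last paid month.
    """
    hazards = []
    for k in range(1, max_month + 1):
        at_risk = sum(1 for months, _ in records if months >= k)
        churned = sum(1 for months, did_churn in records if months == k and did_churn)
        hazards.append(churned / at_risk if at_risk else 0.0)
    return hazards


def survival_curve(hazards):
    """Probability of still being subscribed after each tenure month."""
    curve, alive = [], 1.0
    for h in hazards:
        alive *= 1 - h
        curve.append(alive)
    return curve


# Hypothetical observations: (months_observed, churned)
records = [(1, True), (1, True), (2, True), (3, False), (3, False)]
hazards = tenure_hazards(records, max_month=3)
print(hazards)
print(survival_curve(hazards))
```

With real data, a dedicated survival-analysis library would usually be preferable to this hand-rolled estimate.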
Key Takeaways
The “true” LTV depends heavily on how churn is measured and whether it remains consistent over time.
A commonly stated formula LTV = P / r can be misleading if the churn rate r is not truly constant or if the observational period is too limited.
An empirical measure (e.g., average length of subscription observed in real user data multiplied by the monthly price) is often a simpler and potentially more reliable representation for near-term decisions, as in this scenario with 3.5 months of average subscription duration.
In more advanced or long-term financial projections, incorporating discount factors and refined churn modeling can enhance accuracy.
Below are additional follow-up questions
How do you handle scenarios where churn is not uniform across different stages of the subscription (e.g., a steep drop-off after trial, then stabilization)?
A steep initial drop-off may cause the simple assumption of a flat churn rate to be inaccurate. If a significant portion of new users leaves immediately after a trial (or in the first month), the effective churn for that period could be dramatically different from later months. In such cases, one practical approach is to break down the customer journey into distinct phases:
Trial phase or initial month. Often associated with higher churn as new users decide whether they want to continue.
Stabilization phase. Customers who remain after the trial or first month may exhibit a more predictable churn pattern.
Long-term retention phase. After several billing cycles, churn rates might stabilize further, sometimes even dropping if the product becomes integral to the customer’s workflow.
By segmenting user lifetimes according to these phases, you can calculate separate churn rates for each stage and incorporate them into a multi-phase LTV model. This helps avoid overestimating or underestimating average lifetime due to a single averaged churn figure that masks high-churn and low-churn stages.
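A multi-phase model along these lines might be sketched as follows. The phase lengths and churn rates in the example are hypothetical, and the function assumes that after the listed phases churn settles at a constant tail rate, so the remaining lifetime is geometric.

```python
def phased_expected_months(phases, tail_churn):
    """Expected number of paid months under phase-specific churn.

    phases: list of (n_months, monthly_churn), applied in order.
    tail_churn: constant monthly churn after the last phase (must be > 0).
    """
    if not 0 < tail_churn <= 1:
        raise ValueError("tail_churn must be in (0, 1]")
    survival = 1.0        # probability the customer pays in the current month
    expected = 0.0
    for n_months, churn in phases:
        for _ in range(n_months):
            expected += survival
            survival *= 1 - churn
    # Geometric tail: survivors stay 1 / tail_churn more months on average.
    expected += survival / tail_churn
    return expected


def phased_ltv(price, phases, tail_churn):
    return price * phased_expected_months(phases, tail_churn)


# Hypothetical phases: 50% churn in the trial month, 20% for three months, then 5%.
phases = [(1, 0.50), (3, 0.20)]
print(phased_expected_months(phases, tail_churn=0.05))
print(phased_ltv(100, phases, tail_churn=0.05))
```

With an empty phase list the function reduces to the single-rate case, giving 1 / r expected months.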
How do you account for partial churn or seat-based cancellations when customers can reduce seats instead of fully canceling?
In many B2B SaaS models, a “churn event” does not always mean a complete subscription cancellation. Customers can reduce seat counts, remove certain modules, or move to a lower tier. This introduces partial churn, which is usually discussed through the distinction between logo churn and revenue churn:
Logo Churn. The customer cancels the subscription entirely (i.e., 100% churn for that account).
Revenue Churn. A reduction in spending, often due to seat decreases, tier downgrades, or partial cancellations.
For seat-based models, you may find that while a few logos fully churn, more accounts incrementally reduce seat counts over time. Hence, you track both the number of logos lost and the total monthly recurring revenue (MRR) lost. LTV calculations can become more nuanced:
Track MRR per seat (or per module) separately.
Aggregate LTV by estimating the average revenue per seat and multiplying by the expected number of seats over the account’s lifetime.
Factor in expansions (additional seats or modules purchased) as negative churn, potentially offsetting losses from partial cancellations.
This granular approach often yields more accurate forecasts for seat-based SaaS products.
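To make the logo vs. revenue distinction concrete, here is a minimal sketch that computes both from per-account MRR at the start and end of a period. The account names and amounts are invented, and the revenue figure is gross revenue churn: expansions are deliberately ignored rather than netted against losses.

```python
def logo_and_revenue_churn(start_mrr, end_mrr):
    """Return (logo_churn, gross_revenue_churn) for one period.

    start_mrr / end_mrr: dicts mapping account -> MRR. An account that is
    missing from end_mrr, or has MRR of 0 there, counts as a lost logo.
    """
    active = {acct: mrr for acct, mrr in start_mrr.items() if mrr > 0}
    if not active:
        raise ValueError("no paying accounts at the start of the period")
    lost_logos = sum(1 for acct in active if end_mrr.get(acct, 0) <= 0)
    # Only decreases count as lost MRR; increases are expansion, tracked separately.
    lost_mrr = sum(max(mrr - end_mrr.get(acct, 0), 0) for acct, mrr in active.items())
    return lost_logos / len(active), lost_mrr / sum(active.values())


# Hypothetical accounts: "a" reduces seats, "b" cancels, "c" expands, "d" is flat.
start = {"a": 1000, "b": 500, "c": 500, "d": 2000}
end = {"a": 600, "b": 0, "c": 700, "d": 2000}
print(logo_and_revenue_churn(start, end))
```

In this made-up example only one of four logos is lost, yet the MRR lost includes the seat reduction in account "a" as well.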
How would negative churn (net expansion) affect lifetime value calculations?
“Negative churn” occurs when upsells and cross-sells in existing accounts surpass the revenue lost through cancellations and downgrades. Instead of seeing a net monthly revenue reduction, you see revenue growth within your existing customer base. This complicates a standard LTV model that assumes a uniform churn rate, because the effective churn can be zero or even negative for some segments. A more accurate representation might involve explicitly modeling expansions:
Base subscription revenue. Expected monthly revenue if no upgrades or downgrades occur.
Expansion revenue. Expected additional purchases, seat expansions, or cross-sells within an account.
Churn. The portion of revenue lost from cancellations or downgrades.
In some cases, you can define a net retention rate (NRR). When NRR exceeds 100%, it indicates negative churn in terms of revenue. A possible formula for monthly net retention rate (in decimal form) is:
$$\text{NRR} = \frac{\text{MRR at Start} + \text{Expansion} - \text{Contraction} - \text{Churn}}{\text{MRR at Start}}$$
Where MRR at Start is the total monthly recurring revenue from existing customers at the beginning of the month, Expansion captures all upsells and cross-sells, Contraction is partial revenue loss (like seat reductions), and Churn is the MRR lost to full cancellations. If NRR is consistently above 1.0, LTV can be significantly higher than a naive P / r formula might suggest because expansions offset or exceed standard churn.
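A minimal sketch of the NRR calculation, taking NRR as (MRR at Start + Expansion - Contraction - Churn) / MRR at Start in line with the term definitions above. The dollar amounts in the example are hypothetical.

```python
def net_retention_rate(mrr_start, expansion, contraction, churn):
    """Monthly NRR in decimal form, computed on existing customers only."""
    if mrr_start <= 0:
        raise ValueError("mrr_start must be positive")
    return (mrr_start + expansion - contraction - churn) / mrr_start


# Hypothetical month: 100,000 starting MRR, 8,000 expansion,
# 2,000 contraction, 3,000 lost to full cancellations.
nrr = net_retention_rate(100_000, expansion=8_000, contraction=2_000, churn=3_000)
print(nrr)          # above 1.0 means expansion outweighs the losses
print(1 - nrr)      # implied net revenue churn, negative in this case
```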
How do you ensure accurate measurement of churn rate when subscriptions may expire at different points in the month or billing cycle?
One pitfall in measuring churn is dealing with mid-month cancellations, prorated refunds, or contracts that don’t follow a strict monthly cycle. If your churn metric is based on a simple end-of-month snapshot, you might misrepresent actual retention behavior. Potential strategies include:
Cohort-based churn. Tracking customers from the moment they subscribe and seeing how many remain after consecutive billing cycles. This reduces confusion around partial months.
Daily or weekly active subscription checks. Especially relevant if your billing is usage-based or if mid-cycle cancellations are common.
Pro-rating revenue. For mid-month cancellations or partial refunds, you might use daily or weekly pro-rations to track actual usage-based revenue, then convert it to a monthly equivalent for consistent churn calculations.
Accuracy in churn measurement is critical. Even small miscalculations can skew your LTV estimates substantially when multiplied across thousands of customers.
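Cohort-based tracking can be sketched with nothing more than the number of billing cycles each customer in a cohort paid for. The data below is invented, and the sketch assumes every customer in the cohort has been observed for at least max_cycle cycles, so no one is censored early.

```python
def retention_curve(paid_cycles, max_cycle):
    """Fraction of the cohort that paid for cycle k, for k = 1..max_cycle."""
    n = len(paid_cycles)
    return [sum(1 for c in paid_cycles if c >= k) / n for k in range(1, max_cycle + 1)]


def cycle_churn(retention):
    """Churn between consecutive cycles, measured on customers still active."""
    rates = []
    for prev, curr in zip(retention, retention[1:]):
        rates.append(1 - curr / prev if prev > 0 else 0.0)
    return rates


# Hypothetical cohort of ten customers who signed up in the same month.
cohort = [1, 1, 2, 3, 3, 3, 5, 6, 6, 6]
curve = retention_curve(cohort, max_cycle=4)
print(curve)
print(cycle_churn(curve))
```

Because each rate is measured from the customer's own sign-up date, mid-month cancellations do not distort the denominators the way a calendar snapshot can.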
When acquiring new customers is very expensive, should customer acquisition cost (CAC) be factored into the LTV formula directly?
Many SaaS businesses juxtapose LTV with CAC to gauge the viability of their business model. LTV alone tells you the revenue potential per user, but not the profit if you’re spending heavily on marketing or sales to get each user. Often, a ratio such as LTV:CAC is used to measure how much lifetime revenue each dollar of acquisition spending returns. A typical benchmark is that you want an LTV:CAC ratio of 3:1 or higher to demonstrate a healthy return on acquisition investments.
You can incorporate CAC either directly by subtracting acquisition costs from the revenue portion of the LTV calculation (to see net lifetime profitability) or by using it as a separate metric. Although subtracting CAC from LTV can be illustrative, many companies prefer the ratio approach for easier comparison across different acquisition channels and marketing campaigns.
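Both ways of bringing CAC into the picture are simple arithmetic, sketched below. The CAC of 100 dollars is a hypothetical figure chosen for illustration; the two LTV values are the ones computed earlier in this article.

```python
def ltv_to_cac_ratio(ltv, cac):
    """LTV:CAC expressed as a single number (3.0 corresponds to 3:1)."""
    if cac <= 0:
        raise ValueError("cac must be positive")
    return ltv / cac


def net_ltv(ltv, cac):
    """Lifetime revenue left after subtracting the cost of acquisition."""
    return ltv - cac


hypothetical_cac = 100
for label, ltv in [("empirical", 350), ("churn-based", 1000)]:
    print(label, ltv_to_cac_ratio(ltv, hypothetical_cac), net_ltv(ltv, hypothetical_cac))
```

Note how strongly the conclusion depends on which LTV estimate is used, which is another reason the measurement questions above matter.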
How do cyclical usage patterns (e.g., a seasonal product that spikes in certain months) impact the interpretation of churn and LTV?
In a seasonal business, churn might appear high in low-demand months but moderate or even zero in peak months. This skews naive monthly churn calculations. If your product is used predominantly in certain seasons (for example, a product used by retail businesses mostly in the holiday season), the effective subscription length could appear shorter or longer depending on how you measure. To adjust:
Seasonal normalization. Instead of a simple average monthly churn, measure churn across a full seasonal cycle (e.g., 12 months) to capture both peak and off-peak behaviors.
Segment by industry or use case. If certain customer segments are only active during specific seasons, your churn numbers become more complex. Some may “cancel” in off-peak months and re-subscribe later, which might be a “reactivation” rather than a net-new acquisition.
Adjust LTV. Either smooth out revenue across the year or use a more dynamic forecast model that accounts for seasonal peaks and troughs.
Accounting for seasonality prevents overestimating or underestimating churn when many subscriptions naturally pause or churn outside of the primary usage period.
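One possible way to normalize over a seasonal cycle is to compound monthly retention across the whole cycle and then convert back to an equivalent constant monthly rate. This is a sketch of that idea only, with invented monthly churn figures; it does not handle reactivations.

```python
def cycle_retention(monthly_churn_rates):
    """Fraction of customers retained across the whole cycle."""
    retained = 1.0
    for rate in monthly_churn_rates:
        retained *= 1 - rate
    return retained


def normalized_monthly_churn(monthly_churn_rates):
    """Constant monthly churn that yields the same full-cycle retention."""
    n = len(monthly_churn_rates)
    return 1 - cycle_retention(monthly_churn_rates) ** (1 / n)


# Hypothetical year: four peak months at 2% churn, eight off-peak months at 15%.
rates = [0.02] * 4 + [0.15] * 8
print(cycle_retention(rates))
print(normalized_monthly_churn(rates))
```

Measuring any single month in isolation would report either 2% or 15%, neither of which describes the full-year behavior.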
How do you incorporate overhead or operational costs (like customer support, platform maintenance) into the LTV framework?
Typically, LTV focuses on the revenue side per user (or per account). However, to assess profitability per user, you might allocate a portion of overhead and operating costs. This could involve:
Customer support cost. Estimating an average cost per user for onboarding and ongoing support.
Infrastructure or usage-based cost. If your SaaS includes hosting or data processing fees, you can estimate the average monthly cost allocated to each user.
Maintenance or update costs. Ongoing R&D and maintenance overhead can also be considered.
If these costs vary widely by usage level, you might adopt a more granular LTV approach, factoring in not just the revenue but the net contribution margin (revenue minus the direct variable costs) that each user brings over their lifetime.
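A contribution-margin version of LTV can be sketched by subtracting per-user variable costs from the monthly price before multiplying by the expected lifetime. The cost figures below are hypothetical placeholders, not values from the scenario.

```python
def contribution_margin_ltv(monthly_price, monthly_variable_costs, expected_months):
    """Lifetime contribution margin: (price - variable costs) * expected months."""
    monthly_margin = monthly_price - sum(monthly_variable_costs.values())
    return monthly_margin * expected_months


# Hypothetical per-user monthly costs, in dollars.
costs = {"support": 8, "infrastructure": 12}
print(contribution_margin_ltv(100, costs, expected_months=3.5))
```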
If a significant portion of revenue comes from expansions or cross-sells, do you still need to measure churn in the traditional sense?
Yes, because churn remains a vital signal of whether your core product is retaining users. However, expansions, cross-sells, and other add-ons can change the focus from user-level churn to account-level or revenue-level retention. Even if a user cancels one product line, they might still remain subscribed to another product or expand usage in other areas. You then have to:
Track multiple revenue streams (baseline subscription, expansions, cross-sells).
Separate “partial churn” (canceling one product line) from “full churn” (canceling all subscriptions).
Assess the net retention rate across all products to understand whether expansions outpace losses.
This comprehensive view often aligns better with how many SaaS enterprises are structured, especially when they offer broad product suites.
How might LTV calculations change if a usage-based billing model is layered on top of a fixed subscription fee?
Hybrid pricing models that combine a base subscription with usage-based (metered) charges can complicate LTV estimates. The variability in monthly revenue per user can render a static monthly fee assumption inaccurate. To deal with this:
Segment revenue into fixed base vs. variable usage. Model the churn and average usage patterns separately.
Analyze usage distributions. Some customers may pay only the base fee, while power users might incur multiple times that fee in usage charges, which can vastly alter their LTV.
Monitor usage-based churn. Even if a user doesn’t fully cancel, they could reduce usage significantly. That partially affects the variable revenue without reflecting in a typical churn formula.
As usage-based pricing gains popularity, especially in sectors like infrastructure (e.g., cloud computing), data analytics, or API-based services, a more sophisticated LTV model is necessary to accurately capture fluctuations in monthly revenue.
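As a rough sketch of the fixed-plus-variable split, the code below computes LTV per usage segment from a base fee and an average of observed monthly usage charges. The segment names and usage charges are made up, and the expected lifetime is assumed to be the same 3.5 months for every segment, which real data may well contradict.

```python
def average(values):
    return sum(values) / len(values)


def hybrid_ltv(base_fee, usage_charges, expected_months):
    """LTV with a fixed base fee plus the average observed usage charge."""
    return (base_fee + average(usage_charges)) * expected_months


# Hypothetical monthly usage charges observed for two segments.
usage_by_segment = {
    "base_only": [0, 0, 0],
    "power_user": [250, 300, 350],
}
for segment, charges in usage_by_segment.items():
    print(segment, hybrid_ltv(100, charges, expected_months=3.5))
```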
How do you handle wide variations in customer size (e.g., small businesses vs. large enterprises) when computing an “average” LTV?
Mixing small and large customers in a single LTV can obscure meaningful differences in churn, retention, and growth patterns:
Segment by customer type. Compute LTV separately for different segments like SMB, mid-market, or enterprise. Enterprise accounts might pay higher fees but also have longer sales cycles and different churn behaviors.
Cohort analysis. Even within one segment, you can examine cohorts based on join month or usage patterns, revealing differences in churn or expansion trends.
Weighted averages. If you must report a single LTV figure, weighting by revenue or by the proportion of each segment can give a more realistic view than a simple unweighted average.
This segmentation approach helps you avoid making strategic decisions (such as marketing spend or product feature prioritization) based on an “average” that doesn’t accurately reflect the underlying distribution of customer types.
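The effect of segmentation on a reported "average" can be illustrated with a small sketch. The segment sizes, prices, and churn rates are hypothetical, and each segment's LTV uses the constant-churn P / r formula purely for simplicity.

```python
def segment_ltv(price, churn):
    """Constant-churn LTV for one segment."""
    return price / churn


def weighted_average_ltv(segments):
    """Average LTV weighted by the number of customers in each segment."""
    total_customers = sum(s["customers"] for s in segments.values())
    weighted_sum = sum(
        s["customers"] * segment_ltv(s["price"], s["churn"]) for s in segments.values()
    )
    return weighted_sum / total_customers


# Hypothetical segments.
segments = {
    "smb": {"customers": 800, "price": 100, "churn": 0.10},
    "enterprise": {"customers": 200, "price": 2000, "churn": 0.02},
}
for name, s in segments.items():
    print(name, segment_ltv(s["price"], s["churn"]))
print("customer-weighted average:", weighted_average_ltv(segments))
```

The blended figure sits far from either segment's own LTV, which is the distortion the segmentation advice is meant to avoid.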