ML Interview Q Series: How can we measure how teens' engagement on Facebook shifts once their parents become members of the platform?
📚 Browse the full ML Interview series here.
Comprehensive Explanation
One way to interpret “evaluating the effect on engagement” is to measure, in a causal sense, how teens’ use of Facebook changes once their parents have accounts. The main challenge is to isolate whether the observed change in a teen’s behavior is truly due to their parents joining (and not due to other factors, such as evolving social trends or platform-wide changes in features).
Observational Approach: Before-and-After Comparison
A direct method is to look at teens before and after their parents join, measuring how certain engagement metrics change for these teens. For example, you can compute average daily active minutes (or number of posts, likes, shares, etc.) before the event (parents’ sign-up) and compare it to the same metrics after. However, a simple before-and-after comparison can be confounded by many time-related effects. Perhaps new platform features were launched simultaneously, or there was a seasonal effect (e.g., summer vacation), leading to changes in usage patterns independent of parental presence.
Difference-in-Differences (DiD) Approach
A more robust way is to employ a difference-in-differences framework. The goal here is to compare the difference in engagement for a group of teenagers whose parents joined Facebook to a matched or similar control group of teenagers whose parents have not joined, before and after the parents’ sign-up event.
If we define Y_treated_before as the average engagement of teens whose parents eventually join, measured before the parents join, and Y_treated_after as that same group’s average engagement after the parents have joined, and we measure the same quantities for a control group (Y_control_before, Y_control_after), the difference-in-differences estimate is:
DiD = (Y_treated_after - Y_treated_before) - (Y_control_after - Y_control_before)
In this formula:
Y_treated_after is the average engagement among teens whose parents joined, measured after the joining event.
Y_treated_before is the average engagement among the same group of teens before their parents joined.
Y_control_after is the average engagement among teens whose parents did not join, measured during the same post-event window.
Y_control_before is the average engagement among those same control teens in the period before the event.
A difference-in-differences approach attempts to subtract out any general time trends (captured by changes in the control group), thus isolating the effect that’s specifically attributable to parents joining.
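As a toy numerical illustration (all four averages are hypothetical), the DiD estimate subtracts the control group’s change from the treated group’s change:

```python
# Hypothetical average daily active minutes (all values illustrative)
y_treated_before, y_treated_after = 42.0, 35.0   # teens whose parents joined
y_control_before, y_control_after = 40.0, 38.0   # matched control teens

# Raw before/after change within each group
treated_change = y_treated_after - y_treated_before   # -7.0
control_change = y_control_after - y_control_before   # -2.0

# DiD: treated change net of the general time trend seen in controls
did_estimate = treated_change - control_change
print(did_estimate)  # -5.0
```

If the control group’s decline reflects a platform-wide trend, the remaining -5 minutes is the change attributable to the parents joining, under the usual parallel-trends assumption.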
Matching and Controlling for Confounding
Because parents’ decisions to join are not random, you should match on relevant teen characteristics. For instance, you can match teens on:
Historical engagement levels
Age or grade level
Geographic region
Device usage preferences
This ensures that the control group’s teens are as similar as possible to the teens who end up having parents join. If done appropriately, the main difference between these two groups is the parental sign-up event itself.
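As a sketch of how such matching might be implemented (column names and data are purely illustrative), one option is nearest-neighbor matching on standardized covariates:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical teen-level covariates (all names and values illustrative)
df = pd.DataFrame({
    'teen_id': range(10),
    'prior_minutes': rng.uniform(10, 60, 10),   # historical engagement level
    'age': rng.integers(13, 18, 10),
    'treated': [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],  # 1 = parents joined
})

covariates = ['prior_minutes', 'age']
# Standardize so each covariate contributes comparably to the distance
z = (df[covariates] - df[covariates].mean()) / df[covariates].std()

treated = df[df['treated'] == 1]
controls = df[df['treated'] == 0]

# For each treated teen, pick the nearest control teen (matching with replacement)
matches = {}
for i in treated.index:
    dist = ((z.loc[controls.index] - z.loc[i]) ** 2).sum(axis=1)
    matches[df.loc[i, 'teen_id']] = df.loc[dist.idxmin(), 'teen_id']
print(matches)
```

In practice you would match on many more covariates, often via a propensity score, but the principle of pairing each treated teen with the most similar control teen is the same.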
Potential Metrics for Engagement
Daily active minutes
Frequency of logins
Number of posts or comments
Reactions or “likes” given
Time spent watching videos on the platform
Each metric could be analyzed separately to see if the observed effect is consistent across different facets of engagement.
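A minimal sketch of this per-metric comparison, with hypothetical data and column names, might look like:

```python
import pandas as pd

# Hypothetical engagement table, one row per teen-period (numbers illustrative)
df = pd.DataFrame({
    'period': ['before', 'before', 'after', 'after'],
    'active_minutes': [50, 40, 35, 33],
    'logins': [6, 5, 5, 5],
    'posts': [4, 3, 1, 1],
})

metrics = ['active_minutes', 'logins', 'posts']
# Compare each facet of engagement separately across periods
summary = df.groupby('period')[metrics].mean().loc[['before', 'after']]
print(summary)
```

In this made-up example, posting drops sharply while login frequency barely moves, exactly the kind of divergence across facets the per-metric analysis is meant to surface.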
Statistical Modeling
You could also employ a regression-based approach. A simplified version might look like:
import pandas as pd
import statsmodels.api as sm
# Suppose we have data in a pandas DataFrame 'df' with one row per teen-period
# Columns might include: teen_id, time_period (1 if after the parents' sign-up, else 0),
# parent_joined (1 if the teen is in the treated group), engagement_metric,
# and any control variables, which can be appended to X
# Create a design matrix with an explicit DiD interaction term
X = df[['parent_joined', 'time_period']].copy()
X['did_interaction'] = X['parent_joined'] * X['time_period']
X = sm.add_constant(X)  # Adds an intercept to the model
y = df['engagement_metric']
model = sm.OLS(y, X).fit()
print(model.summary())  # The coefficient on did_interaction is the DiD estimate
In a difference-in-differences style regression:
parent_joined is an indicator for whether the teen’s parents have an account (i.e., whether the teen is in the treated group).
time_period is an indicator for after vs. before the parents joined.
The interaction term parent_joined * time_period estimates the effect of parental presence on the teen’s engagement.
By interpreting the coefficient on that interaction term, you can gauge how engagement shifts specifically because the parents joined.
Practical Implementation and Potential Pitfalls
Data Gaps: Some parents might sign up but use the account only briefly or never, or they might create multiple accounts. Ensuring data quality is key.
Drop-off from Both Teens and Parents: If a teen stops using the platform around the time the parent joins, you need to ensure this drop-off is measured correctly (and not just missing data).
Simultaneous Events: Other platform changes (e.g., new features) might coincide with parents joining. This can bias results unless carefully controlled.
Selection Bias: Parents might join because the teen invited them or because of a life event, leading to changes in teen usage for reasons unrelated to mere parental presence.
Follow-up Questions
How do you address confounding variables that are not captured in the data?
Many real-world factors might correlate with both a parent deciding to join Facebook and a teen’s changing usage behavior (for instance, if a teen moves to a new school). One approach is thorough feature engineering to proxy these factors (e.g., capturing big life changes from usage patterns). Another approach is using causal inference frameworks such as instrumental variables if an appropriate instrument can be found (though this is rare in such user-behavior contexts). The key principle is to gather as many relevant variables as possible and match or control for them to reduce omitted variable bias.
Could you conduct an A/B test where some parents are invited while others are not?
A formal A/B test might be challenging ethically and practically, because you cannot typically control who invites the parents or how parents decide to join. Additionally, turning away parents artificially would be an odd intervention. However, if there is an optional feature or invitation system that could be randomly rolled out, you could partially approximate a random experiment. For example, if the platform runs a feature that suggests to teens “Invite your parents,” one could randomize the prompt to certain teens and not others, then track the subsequent differences in parental sign-ups and teenage engagement. Still, this is not a pure experiment at the parent level, but at least partially introduces randomness in who might get a nudge to invite their parents.
How do you measure the significance of your results?
Statistical significance can be assessed by constructing confidence intervals or performing hypothesis tests around your estimated effect (for example, using a t-test for difference in means in a difference-in-differences setup, or examining the p-value for the interaction term in a regression). If the confidence interval does not include zero, it suggests a statistically significant effect. Beyond significance, you also want to assess practical significance, i.e., whether the magnitude of the effect is large enough to matter in business or user-experience terms.
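As a hedged illustration on simulated data (the true effect size, column names, and noise level are all made up), the interaction coefficient and its confidence interval come straight out of the fitted regression:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400

# Simulated panel: 200 treated and 200 control rows, before and after
# (all numbers illustrative; the true DiD effect is set to -5 minutes)
df = pd.DataFrame({
    'treated': np.repeat([0, 1], n // 2),
    'post': np.tile([0, 1], n // 2),
})
df['minutes'] = (30 - 2 * df['post'] - 5 * df['treated'] * df['post']
                 + rng.normal(0, 3, n))

# The interaction coefficient is the DiD estimate; its CI gives significance
fit = smf.ols('minutes ~ treated * post', data=df).fit()
effect = fit.params['treated:post']
low, high = fit.conf_int().loc['treated:post']
print(f"DiD = {effect:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

A confidence interval that excludes zero indicates statistical significance; whether an effect of that magnitude matters is the separate question of practical significance.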
What if the effect only appears in certain subgroups of teens?
You can segment the teens by attributes like age bracket, region, device usage, or prior engagement levels. Then you can rerun your analysis or incorporate interaction terms in a regression to see if there are differential effects. For instance, older teens might be less affected by their parents joining than younger teens, or vice versa.
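A simple sketch of segment-level analysis (segments, cell means, and column names are all hypothetical) is to compute the DiD estimate within each subgroup:

```python
import pandas as pd

# Hypothetical cell averages per segment (all numbers illustrative)
df = pd.DataFrame({
    'segment': ['13-15'] * 4 + ['16-17'] * 4,
    'treated': [1, 1, 0, 0] * 2,
    'post':    [0, 1, 0, 1] * 2,
    'minutes': [40, 30, 41, 39, 44, 42, 43, 42],
})

def did(g):
    # (treated after - treated before) - (control after - control before)
    m = g.set_index(['treated', 'post'])['minutes']
    return (m[1, 1] - m[1, 0]) - (m[0, 1] - m[0, 0])

per_segment = df.groupby('segment')[['treated', 'post', 'minutes']].apply(did)
print(per_segment)
```

In this made-up data the younger bracket shows a much larger drop, the kind of differential effect a pooled estimate would average away.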
How would you handle long-term vs. short-term effects?
Engagement might shift immediately upon the parent’s sign-up, but that effect could stabilize or even reverse later. One way is to analyze engagement over multiple time windows: short-term (weeks 1–4), medium-term (months 2–3), and long-term (beyond month 3). You then compare how the effect evolves. Difference-in-differences or regression frameworks can incorporate these windows explicitly as separate variables, or you can create repeated measures for multiple post-join intervals.
By structuring the analysis this way, you can see if the novelty of the parent joining eventually wears off or if it leads to lasting changes in how teens engage on the platform.
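A minimal sketch of the windowing step (the weeks, cut points, and values are illustrative) is to bucket observations by time relative to the join date:

```python
import pandas as pd

# Hypothetical weekly observations relative to the parent's join date
df = pd.DataFrame({
    'weeks_since_join': [-3, -1, 2, 6, 14, 20],
    'minutes': [44, 43, 35, 38, 41, 42],
})

# Bucket each observation into pre/short/medium/long analysis windows
bins = [-float('inf'), 0, 4, 12, float('inf')]
labels = ['pre', 'short (wk 1-4)', 'medium (wk 5-12)', 'long (wk 13+)']
df['window'] = pd.cut(df['weeks_since_join'], bins=bins, labels=labels)

window_means = df.groupby('window', observed=True)['minutes'].mean()
print(window_means)
```

Comparing these window means (here, a sharp short-term dip that partly recovers) is what distinguishes a temporary novelty effect from a lasting change.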
Below are additional follow-up questions
How do you account for teens who may reduce their public posts but switch to more private communication channels?
Some teens might react to parents joining by posting less publicly on Facebook’s main feed but instead resort to private messaging or other hidden avenues within the platform. This shift in behavior can make it appear as though engagement has dropped if you only measure public metrics like posts or likes. To address this:
Track both public and private engagement metrics if possible. If the data system allows, measure the frequency of direct messages, group messages, or private story views.
Understand platform privacy constraints. The ability to capture private messaging data accurately might be limited. If you only have aggregate-level metadata (e.g., the number of messages sent, not content), interpret those with caution.
Consider segmenting engagement into “public engagement” vs. “private engagement” to see whether a shift is simply a redistribution of activity from one channel to another.
A subtle edge case arises if the platform’s Terms of Service or privacy laws prevent collecting private data or even metadata. Another scenario is teens might adopt “dummy” accounts or alternative platforms. If you lose track of them entirely, your measured engagement may drop while actual usage just migrates to channels you’re not monitoring.
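A small sketch of the public/private split (teens, counts, and column names are all hypothetical) shows how a redistribution of activity can masquerade as a drop:

```python
import pandas as pd

# Hypothetical per-teen daily action counts by channel (values illustrative)
df = pd.DataFrame({
    'teen_id': [1, 1, 2, 2],
    'period': ['before', 'after'] * 2,
    'public_actions':  [12, 4, 10, 9],    # feed posts, comments, likes
    'private_actions': [20, 31, 18, 19],  # DMs, group messages, if measurable
})
df['total'] = df['public_actions'] + df['private_actions']

# Teen 1's public activity collapses while total activity holds steady:
# a redistribution toward private channels rather than a real drop
pivot = df.pivot_table(index='teen_id', columns='period',
                       values=['public_actions', 'private_actions', 'total'])
print(pivot)
```

A public-only metric would record teen 1 as disengaging, while the total shows overall activity is essentially unchanged.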
Could the impact vary depending on whether the teen actively invited the parent to join or if the parent joined independently?
If a teen proactively invites a parent, it might reflect a different kind of teen–parent relationship. Teens who are comfortable sharing their digital life with family might show minimal negative change. By contrast, teens whose parents joined on their own (perhaps due to peer influence from other parents) might feel more intruded upon and reduce or alter their activity. To handle this:
Classify parent-joining events based on whether the teen or other external factors prompted it. Some platforms log “invitations” or “suggest a friend” interactions.
Compare teen engagement changes in these two scenarios. If the teen invited the parent, the teen might be more accepting of parental presence.
Adjust your matching or controls to ensure that the difference in teen–parent relationship type (friendly vs. invasive) is not conflating the measured effect.
A tricky part is that not all invitations are explicitly logged. Parents might simply decide on their own to join. Additionally, the teen might not know the parent has joined for quite some time, thus introducing lag in the teen’s reaction.
What if teens respond by migrating to competitor platforms instead of just reducing Facebook usage?
In real-world settings, teens have multiple social media channels at their disposal. If they feel uncomfortable on Facebook once their parents arrive, they could remain physically present but severely reduce their activity, or they might switch to other platforms like Instagram, TikTok, or Snapchat for most of their social expression. Some considerations:
Identify whether overall time on Facebook remains stable while cross-platform usage changes. Though not always feasible, if you have multi-platform data (e.g., from an analytics panel), investigate whether usage on competing apps increases.
If competitor platform data is unavailable, look for indirect indicators such as longer inactivity gaps on Facebook or an abrupt, sustained drop in daily sessions.
Evaluate the effect from Facebook’s broader ecosystem perspective. If Facebook also owns other platforms, you could track engagement across all owned apps. If the teen is just moving to Instagram, that might still be relevant to the larger corporate picture, but it still indicates a shift away from Facebook’s core.
The subtlety here is that a pure “Facebook-based” measure might show dramatic drops even though the teen’s overall social media usage remains high, with the parent joining as the underlying cause. If the question is specifically about Facebook engagement, those external migrations matter greatly to your effect estimate.
How would you handle the scenario where a parent joins but never really engages with the teen’s content?
Parents might create accounts but not friend their teens or not actively view or comment on teen posts. In that case, the theoretical “invasive effect” might be minimal. Conversely, an actively engaging parent (commenting, tagging, etc.) could have a far more significant impact. Practical steps:
Classify parent accounts by activity level, not just existence. For instance, measure how often the parent interacts with the teen’s profile or content.
Separate “active parents” from “inactive parents” to see if there is a differential effect. In some analyses, you might find that parental presence alone doesn’t matter unless there is visible or frequent interaction from the parent.
Consider looking at the teen’s friendship circle. If the teen does not add the parent to their friend list, the direct effect might be limited. However, the teen might still adjust behavior simply knowing the parent can search for them.
Edge cases include parents who sign up briefly, do not friend the teen, or forget their credentials immediately. This can muddy your data if you only rely on “parent joined” as a single binary indicator without capturing the intensity or nature of parental interaction.
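One way to operationalize this, sketched with hypothetical thresholds and column names, is to replace the binary “parent joined” flag with a graded exposure tier:

```python
import pandas as pd

# Hypothetical parent-side interaction counts in the period after joining
parents = pd.DataFrame({
    'parent_id': [101, 102, 103, 104],
    'friended_teen': [True, True, False, True],
    'interactions_with_teen': [25, 2, 0, 0],  # comments, likes, tags (illustrative)
})

def activity_tier(row):
    # Treat parental presence as graded exposure, not a single binary flag
    if not row['friended_teen']:
        return 'joined only'
    if row['interactions_with_teen'] >= 10:
        return 'active'
    if row['interactions_with_teen'] >= 1:
        return 'occasional'
    return 'friended, silent'

parents['tier'] = parents.apply(activity_tier, axis=1)
print(parents[['parent_id', 'tier']])
```

The analysis can then estimate separate effects per tier, rather than lumping a dormant account together with a parent who comments daily.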
How do cultural or familial factors influence a teen’s response to a parent joining?
In many cultures, family dynamics vary widely. In some, having parents on social media is normal and might even encourage more open, communal sharing. In others, teens might be more sensitive about privacy and autonomy. Key considerations:
Segment teens by region or cultural background if data allows, and assess whether the effect is universal or more pronounced in particular demographic groups.
Understand that certain features might be used differently across cultures. For example, in some places, families create joint accounts or share devices; in others, each individual has a private profile.
If cultural data is not explicitly collected (for privacy or data-protection reasons), you might find proxies, such as language or region of sign-up, to glean patterns.
An edge case is that the effect might be masked if certain cultural groups are less likely to use Facebook at all. You could see minimal changes not because there is no effect, but because the baseline adoption within that cultural group is already low.
Could the visibility of teen content to the parent change the teen’s social graph?
When a parent joins, the teen might manually unfriend or restrict certain other individuals to avoid the parent seeing them, or the teen might proactively curate their friends list. Alternatively, the teen might block the parent or use privacy settings that hide certain posts. Analytical considerations:
Track teen friend network changes post parent-join. Sudden friend removals or changes in privacy settings might be a sign of the teen reacting specifically to parental presence.
Analyze the correlation between modifications to the teen’s social graph and subsequent engagement. Sometimes fewer but more “trusted” friends could lead to less overall activity but deeper interactions.
For difference-in-differences, ensure that teen-level random or systematic changes in network size are accounted for, or at least measured, to see whether they confound or mediate the effect of parent presence.
A potential pitfall is conflating normal friend list churn (which can be high for teens generally) with purposeful curation due to parents joining. Historical data on churn rates might help determine if there’s a meaningful deviation post parent arrival.
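A simple sketch of that baseline comparison (the monthly counts are invented) is to score the post-join month against the teen’s own historical churn:

```python
import pandas as pd

# Hypothetical monthly friend-removal counts for one teen (values illustrative)
removals = pd.Series([2, 1, 3, 2, 2, 9],
                     index=['m1', 'm2', 'm3', 'm4', 'm5', 'm6_parent_joined'])

baseline = removals[:-1]  # churn in the months before the parent joined
mean, sd = baseline.mean(), baseline.std()

# A large z-score suggests purposeful curation rather than normal churn
z = (removals['m6_parent_joined'] - mean) / sd
print(f"z-score of post-join churn: {z:.1f}")
```

A spike many standard deviations above the teen’s own baseline is evidence of deliberate curation; a value within the usual range is consistent with ordinary friend-list turnover.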
How can sentiment or emotional response to parents’ presence be measured and factored in?
Quantitative engagement metrics do not necessarily capture whether the teen is comfortable or stressed about parental presence. A teen might post more out of rebellion or post less out of discomfort. If the platform has sentiment analysis tools, you could try:
Analyzing text-based content for changes in sentiment. If the teen’s posts become less personal or more guarded, you may see a shift in language style and emotional tone.
Conducting targeted surveys or user research. These might ask teens directly how they feel about their parents joining, though response rates and honesty can be an issue.
Monitoring changes in certain sensitive keywords or topics posted.
The main risks are privacy and ethics. Collecting sentiment data, especially from minors, faces major ethical and legal restrictions in many jurisdictions. Additionally, teens might be reluctant to express negative views if they believe parents (or Facebook) are monitoring them.
Are there any dynamic interactions with sibling behavior?
If the teen has siblings on the platform, a parent joining might alter group or familial dynamics. One teen might reduce activity, another might not change, and a third might use it more to stay connected. If multiple teens are within the same family:
Model each teen’s engagement independently and see if siblings’ responses are correlated. Does one sibling’s usage pattern influence the other?
Check if the parent’s primary interactions are split among different siblings. The teen whose content is most viewed or commented on by the parent might have a stronger negative (or positive) reaction.
For difference-in-differences, carefully handle families with multiple teens to avoid double-counting or correlated errors across multiple siblings in the same treatment group.
An edge case is that sometimes parents join primarily to monitor a younger sibling, which could drive the older teen’s reaction differently. Without sibling data, you might incorrectly assume a uniform effect for any teen in the household.
How do you measure and interpret the spillover effect on the teen’s peer group?
A teen might reduce public Facebook interactions if they see other friends also cutting back when parents join. Peer influence is strong among teenagers. If peers are collectively shifting to other platforms or forming private groups, that can magnify the effect or create network-level feedback loops:
Map the social graph of each teen and look for changes in the group’s overall posting frequency. If a significant portion of the teen’s friend network faces the same “parental invasion,” the teen’s feed might become less appealing.
Distinguish between direct effect (the teen’s own parent joining) and indirect effect (the teen’s friend’s parent joining). Indirectly, seeing multiple parents across the friend network might accelerate a move to private or alternative channels.
If you only track direct teen–parent relationships, you might underestimate the cumulative effect. A small wave of parents across an entire friend cluster could drastically alter that teen cluster’s overall usage.
One subtlety is that friends might differ in how they respond—some might not care about parental presence while others leave the platform. Modeling these heterogeneous responses requires network-level analysis, which can get very complex very quickly.
Does platform design or new feature introduction confound the analysis?
Facebook or any social network often rolls out new features, design changes, or policy updates. These could coincide with the time period parents start joining en masse (for instance, a major marketing campaign targeting older demographics). If those features also affect teen engagement, you might wrongly attribute the change to parent presence. To mitigate:
Track the release dates of major features or interface changes and incorporate them as control variables in your difference-in-differences model or regression.
Segment your time analysis to exclude or specifically model periods of big changes. For example, if a major redesign was launched mid-study, run separate analyses for pre-redesign vs. post-redesign.
If possible, adopt a staggered rollout approach in the data. If some geographic areas or demographics received the new feature earlier, you can analyze that variation to partial out the effect of the feature from the effect of parental onboarding.
A subtle edge case is that parents might be more likely to join during or after certain feature rollouts (like new group functionalities or an easier sign-up flow), intensifying the correlation between platform changes and parental presence. That correlation can complicate any causal claims regarding teen engagement shifts.
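One way to exploit such a staggered rollout, sketched on simulated data (the regions, rollout months, and effect sizes are all hypothetical), is a two-way fixed-effects regression that includes both the feature indicator and a measure of parental presence:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Hypothetical staggered data: regions receive a new feature in different
# months while parents join gradually everywhere (all numbers illustrative)
rows = []
for region in range(6):
    rollout_month = 3 + region % 3          # feature arrives month 3, 4, or 5
    for month in range(8):
        has_feature = int(month >= rollout_month)
        parent_share = 0.1 + 0.05 * month + rng.uniform(0, 0.3)
        minutes = 40 + 1.5 * has_feature - 8 * parent_share + rng.normal(0, 1)
        rows.append((region, month, has_feature, parent_share, minutes))

df = pd.DataFrame(rows, columns=['region', 'month', 'has_feature',
                                 'parent_share', 'minutes'])

# Region and month fixed effects absorb shared trends; has_feature and
# parent_share enter separately so the two effects are not conflated
fit = smf.ols('minutes ~ has_feature + parent_share + C(region) + C(month)',
              data=df).fit()
print(fit.params[['has_feature', 'parent_share']])
```

Because different regions receive the feature at different times, its effect can be separated from the gradual rise in parental presence instead of being absorbed into it.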