ML Interview Q Series: How would you design and implement a "trending posts" feature to boost engagement at Reddit?
Comprehensive Explanation
One way to address the challenge of building a “trending posts” or “hot posts” feature is to combine factors such as voting, time decay, and user interactions (e.g., comments, views, and dwell time) into a dynamic ranking formula. The key idea is that recent and highly engaged posts should rise to the top in a balanced way, so neither purely fresh nor purely popular posts dominate indefinitely.
A common approach, inspired by real-world implementations, is to compute a score for each post that considers both popularity signals (likes, upvotes, comments) and a time-decay function that gradually demotes older posts. The broad form of such a function might include a logarithmic scale for upvotes, combined with a penalty factor for the post’s age in hours.
An illustrative core formula often seen in practice would look similar to the following:

hotness = log10(score) - t / alpha

where:
score is the net engagement metric for the post. It can be upvotes minus downvotes, or a more sophisticated measure that also includes comment count or other engagement features.
t is the age of the post in hours (or another time unit).
alpha is a tuning parameter that controls how quickly the hotness score decays over time. Smaller alpha values penalize age more aggressively, resulting in newer posts being favored more strongly.
This formula naturally pushes recent posts upward and prevents very old but once-popular posts from dominating the top of the feed. Adjusting alpha, or weighting different engagement signals differently, can be part of continuous experimentation.
Additional refinement might incorporate Bayesian ranking or a Wilson confidence interval for handling the uncertainty when the number of votes is low. This helps avoid awarding top positions too eagerly to posts with only a small initial burst of upvotes.
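As a concrete illustration, the Wilson lower bound on the upvote proportion can be computed with the standard library alone. This is a sketch of the well-known formula, not Reddit's actual implementation; the z value of 1.96 corresponds to roughly 95% confidence:

```python
import math

def wilson_lower_bound(upvotes, downvotes, z=1.96):
    """Lower bound of the Wilson score interval for the upvote proportion.

    Posts with few votes get a conservative (lower) estimate until more
    evidence accumulates, instead of ranking 5/5 above 90/100.
    """
    n = upvotes + downvotes
    if n == 0:
        return 0.0
    p_hat = upvotes / n
    denom = 1 + z * z / n
    centre = p_hat + z * z / (2 * n)
    margin = z * math.sqrt((p_hat * (1 - p_hat) + z * z / (4 * n)) / n)
    return (centre - margin) / denom

# A unanimous but tiny sample ranks below a large, mostly positive sample
print(wilson_lower_bound(5, 0))
print(wilson_lower_bound(90, 10))
```

Sorting by this lower bound rather than the raw ratio is what prevents a small initial burst of upvotes from immediately claiming a top slot.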
Real-World Factors and Practical Considerations
High-volume environments such as Reddit require efficient data pipelines to update these scores regularly, possibly in near real time. A pipeline would collect upvote/downvote counts, comment data, and time since the post was created, then compute the updated trending score for each post. Careful attention must be paid to caching, incremental updates, and the computational cost of regularly recalculating scores at scale.
When designing the final ranking approach, you might also integrate contextual signals such as the subreddit category or user-specific preferences. This can lead to a more personalized feed rather than a one-size-fits-all feed, though personalization can complicate both the algorithm and the infrastructure.
Below is a short Python snippet illustrating how one might calculate and store a simplified “hot” score:
```python
import math
from datetime import datetime, timedelta

def hot_score(upvotes, downvotes, post_time, alpha=45000):
    # Net engagement; clamp at 1 so the logarithm is defined
    net_score = max(upvotes - downvotes, 1)
    # Age of the post in hours since creation
    time_diff = (datetime.utcnow() - post_time).total_seconds() / 3600.0
    # Trending/hotness formula: log-scaled popularity minus a linear age penalty
    return math.log10(net_score) - (time_diff / alpha)

# Example usage:
# Suppose a post has 200 upvotes, 30 downvotes, and was created 10 hours ago
post_time_example = datetime.utcnow() - timedelta(hours=10)
score = hot_score(200, 30, post_time_example)
print("Calculated Hot Score:", score)
```
This sort of calculation, or a more advanced variant, would be performed for each post, and then the posts can be sorted descending by the resulting score.
Handling New Posts
Newly created posts tend to have an unfair disadvantage if the model relies heavily on votes or comments, as they have not had enough exposure to receive engagement. To remedy this, you can:
Use a short-term “boost” or set an initial weight, ensuring new posts have visibility for a while.
Define a cold-start mechanism that shows a selection of random new posts or ensures that new posts appear in the feed for some minimum period.
Continuously refine the time-decay parameter alpha to balance the trade-off between freshness and popularity.
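The first idea, a short-term boost, can be sketched as a decaying freshness bonus added on top of the base hot score; `boost` and `boost_window` are hypothetical tuning parameters, not values from any real deployment:

```python
def boosted_score(base_score, age_hours, boost=0.5, boost_window=2.0):
    """Add a linearly decaying freshness bonus during a post's first hours.

    A brand-new post gets up to `boost` extra points, fading to zero once
    the post is `boost_window` hours old.
    """
    freshness = max(0.0, 1.0 - age_hours / boost_window)
    return base_score + boost * freshness

print(boosted_score(1.0, 0.0))  # full bonus: 1.5
print(boosted_score(1.0, 3.0))  # past the window: 1.0
```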
Personalization
Depending on Reddit’s user base and the desire to customize individual feeds, you might incorporate user interests (subreddits they frequent, keywords they’ve upvoted in the past, communities they’ve joined) into the score. A simple approach could add a bonus factor if the user previously engaged with similar content. More advanced methods might employ machine learning models that learn user embeddings and content embeddings and then rank posts by similarity or predicted engagement likelihood.
Dealing with Noise and Manipulation
Any engagement-based system can be gamed if malicious actors coordinate votes or rely on bots to artificially inflate a post’s score. Common defenses include:
Rate-limiting or weighting factors for suspicious upvote bursts.
Integrating signals from spam detection, user reputation, and anomaly detection algorithms.
Applying more robust statistical methods that can adjust for suspicious patterns.
Scalability Concerns
As the platform grows, the number of posts, votes, and comments can become extremely large. Real-time ranking computations must be done in a highly efficient manner. Techniques to ensure scalability include:
Caching partial results and updating incrementally rather than recalculating everything from scratch.
Using message queues or streaming platforms (e.g., Kafka) to process engagement events in near real time.
Employing distributed data storage (like sharded or partitioned databases) so that updating and retrieving post information is efficient at high volumes.
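The caching-plus-incremental idea can be sketched in a few lines. This is an illustrative in-memory sketch (the class and field names are invented), not a production design; a real system would back this with a distributed store and a stream processor:

```python
import math
from datetime import datetime

class TrendingCache:
    """Keep cached scores and rescore only posts that receive new
    engagement events, instead of recomputing every post on every request."""

    def __init__(self, alpha=45000):
        self.alpha = alpha
        self.posts = {}  # post_id -> {"net": int, "created": datetime, "score": float}

    def _score(self, net, created, now):
        hours = (now - created).total_seconds() / 3600.0
        return math.log10(max(net, 1)) - hours / self.alpha

    def add_post(self, post_id, created, now=None):
        now = now or datetime.utcnow()
        self.posts[post_id] = {"net": 0, "created": created,
                               "score": self._score(0, created, now)}

    def on_vote(self, post_id, delta, now=None):
        # Only the affected post is rescored -- O(1) work per event
        now = now or datetime.utcnow()
        p = self.posts[post_id]
        p["net"] += delta
        p["score"] = self._score(p["net"], p["created"], now)

    def top(self, k):
        return sorted(self.posts, key=lambda pid: self.posts[pid]["score"],
                      reverse=True)[:k]
```

Reading the feed then reduces to sorting cached scores, which can itself be replaced by a heap or a pre-materialized ranking for very large post sets.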
Follow-up Questions
How would you incorporate comment counts, share counts, and dwell time?
One possibility is to add these as additional components to the net score. For example, you could define net_score as upvotes - downvotes + some_coefficient * number_of_comments + another_coefficient * share_count. Dwell time, or the average length of time people spend on a post, can be factored into an engagement metric that indicates deeper user interest. Alternatively, you could build a classifier that predicts the overall interest in the post and produce a single numeric score.
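A minimal sketch of such a blended score follows; the coefficients are hypothetical starting points that would be tuned via A/B tests, and dwell time is averaged per viewer and down-weighted so a few long sessions cannot dominate the vote-based signals:

```python
def engagement_score(upvotes, downvotes, comments, shares, avg_dwell_seconds,
                     w_comment=0.5, w_share=2.0, w_dwell=0.01):
    """Blend several engagement signals into one net score.

    All weights are illustrative; in practice they would be learned or
    tuned experimentally per surface and per content type.
    """
    return ((upvotes - downvotes)
            + w_comment * comments
            + w_share * shares
            + w_dwell * avg_dwell_seconds)

# 200 upvotes, 30 downvotes, 40 comments, 10 shares, 2-minute average dwell
print(engagement_score(200, 30, 40, 10, 120.0))
```

This blended value can be dropped into the `score` term of the hotness formula unchanged.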
What if you want a personalized trending feed?
In a personalized system, you might use collaborative filtering or content-based filtering. One approach is to learn user and item embeddings using matrix factorization or deep learning techniques. Another is to process data on each user’s historical behavior, matching them with posts that exhibit similar features to previously highly engaged content. This typically requires a serving infrastructure for real-time recommendations, which can be more complex than a simple global feed.
Could you explain how a Bayesian approach might help?
A Bayesian method can be used to estimate the true popularity of a post when the number of votes is relatively small. Instead of taking raw upvote-downvote counts, a Bayesian framework provides a posterior distribution that accounts for the possibility of random fluctuations from small sample sizes. For instance, you might assume a Beta prior over upvote probabilities, then update it with the observed upvotes/downvotes to obtain a posterior distribution. The post’s ranking could be based on a high percentile of this posterior, which is more stable and less prone to random spikes.
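A stdlib-only sketch of the Beta-prior idea uses the posterior mean with pseudo-counts; the prior parameters here are illustrative, and ranking by a lower posterior quantile (e.g., via scipy.stats.beta.ppf) would be even more conservative:

```python
def posterior_upvote_rate(upvotes, downvotes, prior_a=3.0, prior_b=3.0):
    """Posterior mean of the upvote probability under a Beta(a, b) prior.

    With few votes the estimate is pulled toward a / (a + b), so a 2-for-2
    post cannot outrank a well-established 90-of-100 post on ratio alone.
    """
    return (upvotes + prior_a) / (upvotes + downvotes + prior_a + prior_b)

print(posterior_upvote_rate(2, 0))    # small sample, shrunk toward 0.5
print(posterior_upvote_rate(90, 10))  # large sample, close to 0.9
```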
How do you address very fast viral growth?
If a post suddenly gains massive engagement, you want your feed to reflect that quickly. This could be handled by reducing the time window for recalculations, or updating the relevant post’s score more frequently when you detect sharp spikes in engagement. You could also design a dynamic alpha that becomes larger for posts experiencing rapid growth, which shrinks the age penalty so they decay more slowly and remain at the top for longer.
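Since a larger alpha shrinks the age penalty in the formula above, one hedged sketch of spike-aware decay is to temporarily stretch alpha for posts with unusually high recent vote velocity; the threshold, cap, and function signature here are all hypothetical:

```python
def dynamic_alpha(base_alpha, votes_last_hour,
                  spike_threshold=500, max_multiplier=3.0):
    """Stretch the decay constant for posts whose recent vote velocity
    exceeds a spike threshold, so fast-growing posts age more slowly.

    Below the threshold the base alpha is returned unchanged; above it,
    alpha scales with velocity up to a hard cap.
    """
    if votes_last_hour <= spike_threshold:
        return base_alpha
    multiplier = min(votes_last_hour / spike_threshold, max_multiplier)
    return base_alpha * multiplier
```

The cap matters: without it, an extreme bot-driven spike could make a post effectively immune to time decay.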
What are some potential pitfalls?
Posts with misleading or “clickbait” titles might inflate engagement yet disappoint users. Incorporating user signals on post quality (such as quick downvotes or short dwell times if users leave immediately) can mitigate the problem. Another pitfall is failing to handle spam or bot traffic. A robust moderation policy and bot detection systems are essential to preserving the integrity of the trending algorithm.
Below are additional follow-up questions
How would you handle cross-community (subreddit) interaction when ranking trending posts?
When users browse a combined “All” or “Popular” feed that aggregates posts from various communities, you need to ensure fairness and diversity across multiple topics. A naive approach might simply take the highest-scoring posts across all subreddits. However, popular subreddits with large user bases could crowd out smaller niche subreddits entirely.
A possible solution is to introduce weighting factors that normalize subreddit size or average activity. One approach is to compute a subreddit-level popularity baseline (for example, the average score or engagement metric of posts in that subreddit). Then you measure each post’s deviation from that baseline. This ensures smaller communities producing highly engaging posts can still appear on the aggregator feed. Another subtlety involves ensuring that communities with frequent or time-sensitive content (news-related subreddits) do not dominate. You might incorporate a specialized decay factor that changes depending on the typical lifespan of content in a given community.
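A small sketch of baseline normalization follows; the data layout and the choice of a ratio-to-baseline are illustrative, and a real system would compute the baselines over rolling windows rather than all-time averages:

```python
from statistics import mean

def normalized_scores(posts):
    """Score each post by its ratio to its subreddit's average engagement,
    so small communities with unusually engaging posts can still surface
    in an aggregated feed.

    `posts` maps post_id -> (subreddit, raw_engagement); field names are
    invented for illustration.
    """
    by_sub = {}
    for sub, raw in posts.values():
        by_sub.setdefault(sub, []).append(raw)
    # Clamp the baseline at 1 so near-zero-activity communities don't explode
    baselines = {sub: max(mean(vals), 1.0) for sub, vals in by_sub.items()}
    return {pid: raw / baselines[sub] for pid, (sub, raw) in posts.items()}

posts = {"p1": ("askhistorians", 50), "p2": ("askhistorians", 10),
         "p3": ("pics", 5000), "p4": ("pics", 5000)}
print(normalized_scores(posts))
```

Here the niche post "p1" outranks the large-subreddit posts despite having two orders of magnitude fewer raw points.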
Edge Cases and Pitfalls
Rapidly growing subreddits can lead to anomalies in the baseline calculation. The average engagement might shift abruptly, so you may need rolling windows for computing subreddits’ baselines.
Extremely small subreddits might have inflated relative engagement. A single post might overshadow or distort the community’s representation if the normalization is not robust.
What strategies can you use to mitigate potential echo chambers or filter bubbles?
Echo chambers arise when users only see content that aligns with their existing interests or beliefs, potentially limiting exposure to new or diverse posts. One potential approach is to add an “exploration” component that occasionally injects posts outside of a user’s standard preference domain into the feed. This exploration can be a fixed percentage (for example, 5% of displayed posts) or adaptive based on how much new content the user has been exposed to historically.
You can also limit how heavily personalization features weigh in the ranking formula, ensuring that popular or trending content from outside a user’s usual subreddits still appears occasionally. Furthermore, some platforms offer a “Random” or “Discover” feed mode that intentionally diversifies the content, letting users see a wider range of topics and communities.
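The exploration idea can be sketched as a simple feed blender; the 5% figure mirrors the example above, and the function signature is invented for illustration:

```python
import random

def blend_feed(personalized, exploration, explore_frac=0.05, seed=None):
    """Replace a fraction of a personalized feed with exploration posts
    drawn from outside the user's usual communities.

    Positions are shuffled so exploration items are not clustered at the
    bottom of the feed; `seed` allows reproducible output for testing.
    """
    rng = random.Random(seed)
    n_explore = max(1, int(len(personalized) * explore_frac)) if exploration else 0
    feed = personalized[:len(personalized) - n_explore]
    feed += rng.sample(exploration, min(n_explore, len(exploration)))
    rng.shuffle(feed)
    return feed
```

An adaptive variant would raise `explore_frac` for users whose recent history shows little topical diversity.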
Edge Cases and Pitfalls
Over-exposing users to irrelevant or undesired content might harm user engagement if too much “exploration” content is shown.
Extremely polarized communities may require additional moderation or community-driven curation to mitigate negative behavior triggered by seeing opposing viewpoints.
How do you incorporate user feedback about the relevance or quality of the trending posts?
User feedback can come in multiple forms, such as explicit ratings (like upvotes/downvotes), surveys, or even unsubscribing from certain subreddits. One strategy is to build a feedback loop that regularly adjusts the ranking model or weighting scheme. For instance, if you notice that posts with high initial upvotes often lead to subsequent user dissatisfaction signals (like quick downvotes or negative comments), you could reduce the weighting of early upvotes in the final ranking. Alternatively, you might promote content that garners positive comments rather than short or negative interactions.
You could also integrate direct user feedback about content relevance, such as letting users mark certain topics or subreddits as “not interested.” These signals can then be used to demote unwanted content in the future or remove it entirely from that user’s feed.
Edge Cases and Pitfalls
Users might be inconsistent or even contradictory in their feedback, leading to noisy signals. You need robust statistical methods to filter out anomalies.
High-level changes based on user feedback can take time to propagate, requiring real-time or near-real-time recalculations to keep up with shifting user preferences.
How would you manage performance constraints when recalculating trending scores at scale?
Large-scale platforms can see thousands of new posts and millions of engagement events every hour. Constantly recalculating trending scores in real time for each post can be prohibitively expensive, so a common approach is to run recalculations at fixed intervals or use a streaming architecture that updates only the affected posts when engagement events arrive. Caching can further minimize the need for on-demand score recomputation. For example, each post’s score might be updated every few minutes unless it has an unusually high volume of new votes or comments, in which case you update more frequently.
You might also consider partitioning or sharding the data by subreddit or topic. A distributed system like Apache Spark, Flink, or an event streaming platform could be used to handle the incremental updates efficiently. A specialized queue might triage updates: items with high engagement join a fast-lane queue that updates more often, whereas most posts see periodic but less frequent score recalculation.
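The fast-lane/slow-lane triage might reduce to a tiny scheduling helper like the following; the threshold and intervals are hypothetical:

```python
def next_update_interval(votes_last_interval, fast_threshold=100,
                         fast_seconds=30, slow_seconds=300):
    """Decide how soon a post's score should be recomputed.

    Posts with a burst of recent votes join a 'fast lane' refreshed every
    30 seconds; quiet posts are refreshed every 5 minutes.
    """
    return fast_seconds if votes_last_interval >= fast_threshold else slow_seconds
```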
Edge Cases and Pitfalls
Delaying updates too much might cause highly viral or fast-growing posts to appear stale, missing their peak.
Overly frequent updates can overload the system, resulting in slow response times or computational bottlenecks. Balancing these extremes is key.
How would you handle emergent content formats, such as polls, videos, or live streams?
Posts that contain rich media or interactive elements might have different engagement patterns. A video post might garner longer dwell time but fewer comments, whereas a poll might get a lot of quick interactions but fewer in-depth discussions. You could create a separate engagement metric for each content format, or you could unify these signals under a more generalized ranking mechanism that properly weights each form of engagement.
For instance, if dwell time is crucial for videos, you might factor that more heavily than raw upvote counts. Polls might rely on the ratio of poll participants to total views. For live streams, concurrency metrics (how many people are engaging at once) can be an important factor.
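One way to sketch format-aware weighting, with invented weights and signals assumed to be pre-normalized to comparable scales:

```python
# Hypothetical per-format signal weights; a real system would tune these
FORMAT_WEIGHTS = {
    "text":  {"votes": 1.0, "dwell": 0.2, "concurrency": 0.0},
    "video": {"votes": 0.6, "dwell": 1.0, "concurrency": 0.0},
    "live":  {"votes": 0.4, "dwell": 0.3, "concurrency": 1.0},
}

def format_aware_engagement(fmt, votes, norm_dwell, norm_concurrency):
    """Combine normalized signals with format-specific weights: a video's
    dwell time counts more than its raw votes, and a live stream is driven
    mostly by concurrent viewers."""
    w = FORMAT_WEIGHTS[fmt]
    return (w["votes"] * votes
            + w["dwell"] * norm_dwell
            + w["concurrency"] * norm_concurrency)
```

With identical inputs, a video with strong dwell time now scores higher than a text post with the same vote count, which is exactly the behavior the per-format weighting is meant to encode.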
Edge Cases and Pitfalls
Using a single aggregated formula for all content types might not capture the nuanced ways people engage with different formats, leading to skewed rankings.
Overcomplicating the ranking system with too many format-specific parameters might make it difficult to maintain or interpret.
How do you adjust your system in response to changing user behavior patterns over time?
Platforms evolve; new subreddits can surge in popularity, and user interests shift. A static formula might become outdated if it does not adapt to these trends. Periodic recalibration or parameter tuning, guided by usage data and A/B testing, can address gradual changes. In more automated systems, you could incorporate machine learning models that continuously retrain on fresh data. If your platform notices that older posts are receiving unexpected surges in engagement (e.g., due to a nostalgic revival trend), your decay function or alpha parameter might need adjustments to accommodate these shifts.
Edge Cases and Pitfalls
Overfitting to short-term fluctuations can lead to unstable or inconsistent feeds. A new meme might not justify a permanent model shift.
Reacting too slowly can fail to capture genuine shifts in user tastes, causing stale content to linger.
How do you deal with external traffic spikes or external linking?
A post might receive a large number of external visitors from other websites or social media platforms. If your ranking is based on on-site engagement alone, a sudden jump in pageviews could inflate a post’s score (if views or dwell time are included) without reflecting genuine interest from registered users. You might choose to separate external traffic signals from internal engagement signals or discount external bursts if they do not translate into real on-platform actions (like upvotes from authenticated users).
Edge Cases and Pitfalls
An orchestrated campaign from an external community could artificially inflate a post’s standing. You need to decide whether that qualifies as “trending” in a broader sense or if it’s mere brigading.
Over-discounting external traffic might cause you to miss out on legitimate trending topics that attract newcomers to the platform.
How would you approach testing and validation of the “trending posts” feature before a full launch?
You might run a series of controlled experiments. One approach is to conduct an A/B test with two versions of the trending ranking: the current formula versus your new algorithm. You can monitor key metrics like dwell time, upvote ratio, total comments, or even user retention. Another dimension is user satisfaction, measured via surveys or direct feedback prompts. If the new approach consistently shows higher engagement and satisfaction without introducing negative consequences (like spam or inappropriate content surfacing), it can be rolled out more broadly.
In addition, you might run small-scale tests in select subreddits or user segments. Observing the performance in these limited environments can help you identify potential issues with the ranking logic or data pipeline. If you detect anomalies (like older posts dominating or brand-new posts never surfacing), you can make adjustments before a platform-wide release.
Edge Cases and Pitfalls
If the sample pool for testing is not representative of the broader user base, conclusions might be skewed.
Deploying the new ranking to a large population too early might disrupt existing user behavior and lead to negative feedback that’s hard to undo.
How do you ensure transparency and user trust in the trending algorithm?
Many users are interested in why certain posts reach the top. Providing a general explanation about how the algorithm considers votes, recency, and engagement can foster trust. You might display a “Why this post is trending” tooltip that reveals simplified metrics (e.g., net upvotes, number of comments, post age, etc.) to help users understand the ranking rationale.
At the same time, you do not want to reveal so much detail that it’s easy to game the system. Striking a balance between transparency and security is key. Being too opaque can lead to speculation about favoritism or manipulation, while being overly transparent may encourage unscrupulous actors to exploit known ranking signals.
Edge Cases and Pitfalls
Partial transparency might still be enough for sophisticated spammers to reverse-engineer key aspects of the ranking system.
Overly complex or jargon-heavy explanations can confuse average users, defeating the purpose of transparency.
How would you handle multi-lingual or geo-specific trending content?
Platforms with a global user base must account for linguistic and geographical differences. Users might only care about trending posts within their language, region, or culture, leading you to maintain separate trending lists. One approach is to tag posts with language or region metadata (inferred from user profiles or post text) and compute region-specific or language-specific trending scores. If you want a global trending list, you can unify these scores while still applying normalization based on the volume of posts in each language or region.
Edge Cases and Pitfalls
Misdetection of language or region might place posts in the wrong feed, undermining the user experience.
A global feed might inadvertently boost content from large user bases, overshadowing content from smaller regions or languages. Balancing fairness and global relevance is challenging.