News Feed

Questions

What

  • What platforms does the app support? (Both mobile and web)
  • What are the important features? (User can publish a post and see friends’ posts on the news feed)
  • What content can the feed contain? (Images, videos, and text)

How

  • How is the news feed sorted? (Reverse chronological order)

Who

  • Who are the users and their connections? (Each user can have up to 5000 friends)

When

  • When is the feed content updated? (Not explicitly mentioned but implied to be in real-time or near real-time)

Overview

Design Overview: Feed Publishing and News Feed Building

Feed Publishing:

  • When a user publishes a post:
    • Data is written into the cache and database.
    • The post is propagated to the news feeds of the user’s friends.

News Feed Building:

  • News feed is built by aggregating friends’ posts in reverse chronological order.
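As an illustration of the aggregation step, here is a minimal sketch that merges per-friend post lists into one reverse-chronological feed. The `Post` tuple, the integer timestamps, and the `build_feed` helper are assumptions made for this example; the notes do not prescribe a data model.

```python
import heapq
from itertools import islice
from typing import NamedTuple


class Post(NamedTuple):
    created_at: int  # Unix timestamp in seconds (assumed representation)
    post_id: int
    author_id: int


def build_feed(posts_by_friend: dict, limit: int = 20) -> list:
    """Merge per-friend post lists (each already newest-first) into one feed."""
    merged = heapq.merge(
        *posts_by_friend.values(),
        key=lambda post: post.created_at,
        reverse=True,  # inputs and output are ordered newest to oldest
    )
    return list(islice(merged, limit))


posts_by_friend = {
    2: [Post(300, 12, 2), Post(100, 10, 2)],
    3: [Post(200, 11, 3)],
}
feed = build_feed(posts_by_friend)
print([post.post_id for post in feed])  # [12, 11, 10]
```

Because `heapq.merge` is lazy, only the first `limit` posts are pulled from the inputs, which suits a feed that is read one page at a time.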

Newsfeed APIs:

  • Primary methods for clients to interact with servers.
  • HTTP-based APIs for actions such as posting a status, retrieving the news feed, adding friends, etc.

Key APIs:

  1. Feed Publishing API:

    • Method: POST
    • Endpoint: /v1/me/feed
    • Parameters:
      • content: Text of the post.
      • auth_token: Used to authenticate API requests.

    Example:

    POST /v1/me/feed
    Content-Type: application/json

    {
      "content": "This is my new post!",
      "auth_token": "your_auth_token_here"
    }

  2. Newsfeed Retrieval API:

    • Method: GET
    • Endpoint: /v1/me/feed
    • Parameters:
      • auth_token: Used to authenticate API requests.

    Example:

    GET /v1/me/feed
    Authorization: Bearer your_auth_token_here
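For readers who want to try the two calls from a client, the following sketch builds both requests with Python's standard library without sending them. The host `https://api.example.com` and the helper names are hypothetical; only the method, the path, and the parameters come from the notes above.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, not from the notes


def publish_request(content: str, auth_token: str) -> urllib.request.Request:
    """Build (but do not send) the feed publishing request."""
    body = json.dumps({"content": content, "auth_token": auth_token}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/me/feed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def retrieve_request(auth_token: str) -> urllib.request.Request:
    """Build (but do not send) the news feed retrieval request."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/me/feed",
        headers={"Authorization": f"Bearer {auth_token}"},
        method="GET",
    )


post_req = publish_request("This is my new post!", "your_auth_token_here")
get_req = retrieve_request("your_auth_token_here")
print(post_req.get_method(), post_req.full_url)
print(get_req.get_method(), get_req.get_header("Authorization"))
```

Passing either object to `urllib.request.urlopen` would send it; that step is left out because the host is a placeholder.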

News feed publisher

Flow Summary:

  • Feed Publishing:

    1. User publishes a post.
    2. Data written to cache and database.
    3. Post is added to friends’ news feeds.
  • News Feed Retrieval:

    1. User requests their news feed.
    2. Server aggregates friends’ posts in reverse chronological order.
    3. News feed is returned to the user.

Components

Web Servers

  • Functions:
    • Communicate with clients.
    • Enforce authentication and rate-limiting.
    • Only allow users with valid auth_token to make posts.
    • Limit the number of posts a user can make within a certain period to prevent spam and abuse.
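The rate-limiting rule can be sketched as a sliding-window counter per user. This is only one possible approach; the class name, the in-memory storage, and the limits used below are assumptions for illustration, not values from the notes.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class PostRateLimiter:
    """Allow at most max_posts per user within a sliding time window."""

    def __init__(self, max_posts: int, window_seconds: float):
        self.max_posts = max_posts
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # user_id -> timestamps of accepted posts

    def allow(self, user_id: int, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        history = self._history[user_id]
        # Drop timestamps that have fallen out of the window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_posts:
            return False  # over the limit: reject the post
        history.append(now)
        return True


limiter = PostRateLimiter(max_posts=2, window_seconds=60)
results = [limiter.allow(user_id=1, now=t) for t in (0, 10, 20, 61)]
print(results)  # [True, True, False, True]
```

A production web tier would likely keep these counters in a shared store so that every web server sees the same history; the in-process dictionary here only demonstrates the logic.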

Fanout Service

  • Definition: The process of delivering a post to all friends.

  • Models:

    1. Fanout on Write (Push Model):
      • Workflow: News feed is pre-computed during write time, delivering new posts to friends’ cache immediately after publishing.
      • Pros:
        • Real-time news feed generation.
        • Fast fetching of news feed due to pre-computation.
      • Cons:
        • Fanout is slow for users with many friends, because a single post triggers a write to every friend’s feed (hotkey problem).
        • Wastes computing resources on inactive users.
    2. Fanout on Read (Pull Model):
      • Workflow: News feed is generated during read time, pulling recent posts when a user loads their home page.
      • Pros:
        • Efficient for inactive users.
        • Avoids hotkey problem.
      • Cons:
        • Slower news feed fetching due to on-demand generation.
  • Hybrid Approach:

    • Strategy: Combine benefits of both models and mitigate pitfalls.
    • Implementation:
      • Use push model for most users to ensure fast news feed fetching.
      • Use pull model for celebrities or users with many friends/followers to avoid system overload.
      • Use consistent hashing to distribute requests and data evenly and to mitigate the hotkey problem.
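The hybrid strategy can be sketched as follows: posts from ordinary users are pushed into followers' cached feeds at write time, while posts from high-follower accounts are left in place and pulled at read time. The threshold value, the dictionary-based stores, and the function names are all assumptions chosen to keep the example small.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff, kept tiny for the demo

followers = {1: [2, 3], 9: [2, 3, 4, 5]}  # author_id -> follower IDs
feed_cache = defaultdict(list)            # user_id -> pushed (created_at, post_id)
posts_by_author = defaultdict(list)       # author_id -> (created_at, post_id)


def publish(author_id: int, post_id: int, created_at: int) -> None:
    posts_by_author[author_id].append((created_at, post_id))
    audience = followers.get(author_id, [])
    if len(audience) > CELEBRITY_THRESHOLD:
        return  # pull model: followers fetch this author's posts at read time
    for follower_id in audience:  # push model: precompute at write time
        feed_cache[follower_id].append((created_at, post_id))


def read_feed(user_id: int, followed_celebrities: list) -> list:
    entries = list(feed_cache[user_id])
    for celebrity_id in followed_celebrities:
        entries.extend(posts_by_author[celebrity_id])
    # Newest first, returning only the post IDs.
    return [post_id for _, post_id in sorted(entries, reverse=True)]


publish(author_id=1, post_id=100, created_at=10)  # regular user: pushed
publish(author_id=9, post_id=200, created_at=20)  # celebrity: not pushed
print(read_feed(2, followed_celebrities=[9]))     # [200, 100]
```

The trade-off is visible in the two functions: `publish` stays cheap for celebrity accounts, and `read_feed` does a little extra work only for users who follow them.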

Fanout Service Workflow

  1. Fetch Friend IDs:

    • Retrieve from the graph database, which manages friend relationships and recommendations.
  2. Get Friends Info:

    • Fetch from the user cache.
    • Filter friends based on user settings (e.g., muted friends, selective sharing).
  3. Message Queue:

    • Send friends list and new post ID to the message queue.
  4. Fanout Workers:

    • Fetch data from the message queue.
    • Store news feed data in the news feed cache as <post_id, user_id> mappings.
  5. Cache Management:

    • Store only IDs in the cache to minimize memory consumption.
    • Set a configurable limit to keep the memory size manageable.
    • Focus on storing latest content due to low likelihood of users scrolling through thousands of posts.
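The workflow above can be sketched end to end with an in-process queue. Everything here is a stand-in: the `friend_graph` dictionary plays the graph database, the `muted` set plays the user settings, `queue.Queue` plays the message queue, and the per-user deque is an assumed representation of the news feed cache with a small, hypothetical size limit.

```python
import queue
from collections import defaultdict, deque

FEED_LIMIT = 3  # hypothetical per-user cap on cached post IDs

friend_graph = {1: [7, 8, 9]}  # author_id -> friend IDs (stand-in for the graph DB)
muted = {(9, 1)}               # (viewer_id, author_id): user 9 has muted user 1
fanout_queue = queue.Queue()   # stand-in for the message queue
news_feed_cache = defaultdict(lambda: deque(maxlen=FEED_LIMIT))


def publish(author_id: int, post_id: int) -> None:
    """Steps 1-3: fetch friends, filter by settings, enqueue the fanout job."""
    recipients = [
        friend_id
        for friend_id in friend_graph.get(author_id, [])
        if (friend_id, author_id) not in muted
    ]
    fanout_queue.put((post_id, recipients))


def run_worker() -> None:
    """Steps 4-5: drain the queue and store only post IDs, newest first."""
    while True:
        try:
            post_id, recipients = fanout_queue.get_nowait()
        except queue.Empty:
            break
        for user_id in recipients:
            # With maxlen set, appendleft evicts the oldest ID from the right.
            news_feed_cache[user_id].appendleft(post_id)
        fanout_queue.task_done()


for post_id in (1, 2, 3, 4):
    publish(author_id=1, post_id=post_id)
run_worker()
print(list(news_feed_cache[7]))  # [4, 3, 2]
```

The bounded deque reflects the cache-management point: once the limit is reached, the oldest post ID is dropped, on the assumption that users rarely scroll that far back.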

Example Structure of News Feed Cache:

  • News Feed Table:
    • Format: <post_id, user_id>
    • Only IDs stored to reduce memory usage.
    • Configurable limit to maintain manageable memory size and low cache miss rate for recent content.

News feed retriever/building

Workflow Summary:

  1. User sends a request to /v1/me/feed.
  2. Load balancer directs the request to an available web server.
  3. Web server requests the news feed from the news feed service.
  4. News feed service retrieves post IDs from the news feed cache.
  5. Service fetches additional data (user info, post content, media links) from user and post caches.
    • The news feed includes more than just post IDs; it also includes:
      • Username
      • Profile picture
      • Post content
      • Post images/videos, etc.
    • The service fetches complete user and post objects from the user cache and post cache to build the fully hydrated news feed.
    • Media content (images, videos, etc.) should be served from a CDN.
  6. Constructs the complete news feed with all necessary details.
  7. Returns the JSON-formatted news feed to the client.
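Steps 4 to 7 of the retrieval workflow (often called hydration) might look like the sketch below. The three dictionaries stand in for the news feed, post, and user caches; their field names, the CDN host, and the `get_news_feed` function are assumptions for illustration.

```python
import json

CDN_BASE = "https://cdn.example.com"  # hypothetical CDN host

# Stand-ins for the three caches (plain dicts with assumed shapes).
news_feed_cache = {42: [103, 101]}  # user_id -> post IDs, newest first
post_cache = {
    101: {"author_id": 7, "content": "Hello", "media": []},
    103: {"author_id": 8, "content": "Sunset", "media": ["img/sunset.jpg"]},
}
user_cache = {
    7: {"username": "alice", "avatar": "avatars/7.png"},
    8: {"username": "bob", "avatar": "avatars/8.png"},
}


def get_news_feed(user_id: int) -> str:
    """Turn cached post IDs into a fully hydrated, JSON-formatted feed."""
    items = []
    for post_id in news_feed_cache.get(user_id, []):
        post = post_cache.get(post_id)
        author = user_cache.get(post["author_id"]) if post else None
        if post is None or author is None:
            continue  # a real service would fall back to the database here
        items.append({
            "post_id": post_id,
            "username": author["username"],
            "profile_picture": f"{CDN_BASE}/{author['avatar']}",
            "content": post["content"],
            "media": [f"{CDN_BASE}/{path}" for path in post["media"]],
        })
    return json.dumps(items)


print(get_news_feed(42))
```

Only URLs are returned for media; the client then downloads images and videos from the CDN rather than from the feed service.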

Potential Cache Layers in a News Feed System

By dividing the cache tier into these specific layers, the news feed system can efficiently handle a high volume of requests, maintain fast access to critical data, and ensure that the user experience remains smooth and responsive.

  1. News Feed Layer:

    • Purpose: Stores IDs of news feeds.
    • Description: This layer holds the identifiers of posts that make up a user’s news feed.
  2. Content Layer:

    • Purpose: Stores every post’s data.
    • Description: Includes the full content of each post.
    • Hot Cache: Popular content is stored here for faster access.
  3. Social Graph Layer:

    • Purpose: Stores user relationship data.
    • Description: Manages and caches the relationships between users (e.g., friends, followers).
  4. Action Layer:

    • Purpose: Stores information about user interactions with posts.
    • Description: Includes data on whether a user liked a post, replied to it, or took other actions.
  5. Counters Layer:

    • Purpose: Stores counters for various metrics.
    • Description: Includes counts for likes, replies, followers, following, and other interactions.
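To make the division of responsibilities concrete, the five layers can be modelled as separate key spaces. The sketch below is an assumed in-memory layout, not a prescribed schema; it shows how a single "like" touches the action layer and the counters layer while leaving the others alone.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class CacheTier:
    # user_id -> list of post IDs in that user's feed
    news_feed: dict = field(default_factory=lambda: defaultdict(list))
    # post_id -> full post data
    content: dict = field(default_factory=dict)
    # user_id -> set of friend/follower IDs
    social_graph: dict = field(default_factory=lambda: defaultdict(set))
    # post_id -> set of user IDs who liked the post
    action: dict = field(default_factory=lambda: defaultdict(set))
    # (post_id, metric name) -> count
    counters: Counter = field(default_factory=Counter)

    def like(self, user_id: int, post_id: int) -> bool:
        """Record a like once per user and keep the counter in step."""
        if user_id in self.action[post_id]:
            return False  # already liked: leave the counter unchanged
        self.action[post_id].add(user_id)
        self.counters[(post_id, "likes")] += 1
        return True


tier = CacheTier()
tier.like(user_id=1, post_id=10)
tier.like(user_id=1, post_id=10)  # duplicate like is ignored
tier.like(user_id=2, post_id=10)
print(tier.counters[(10, "likes")])  # 2
```

Keeping counters separate from the action records means a feed can show a like count without loading the full set of users who liked the post.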