System Design Interview Questions

A comprehensive resource of the most frequently asked system design interview questions at Google, Meta, Amazon, Netflix, and other top tech companies. Each question includes difficulty level, approach strategies, key concepts, and what interviewers are looking for.

How to use this resource: Start with questions in your target difficulty level. Read the approach outline, understand the key concepts, then practice designing the system out loud. The best preparation comes from doing realistic mock interviews where you explain your design under time pressure.

Practice with AI Mock Interviews →

Social Networks

Design Twitter

Medium · Very High Frequency

Companies: Meta, Google, Netflix, Amazon, Uber, most FAANG companies

Key Concepts: Timeline generation, fanout strategies (push vs pull), caching, database sharding, celebrity problem, news feed ranking

Approach Outline

Start by clarifying functional requirements: posting tweets (280 chars), following users, timeline viewing, likes/retweets. For non-functional requirements, assume 300M daily active users, a high read-to-write ratio (100:1), and real-time timeline updates within seconds.

The core challenge is timeline generation. For posting, store tweets in a distributed database (Cassandra/DynamoDB) and decide between fanout-on-write (push) or fanout-on-read (pull). Most designs use hybrid: fanout-on-write for regular users (pre-compute timelines and push to followers' feeds) but fanout-on-read for celebrities (too many followers to fanout). This solves the "celebrity problem" where a user with 50M followers would trigger 50M writes per tweet.
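
A minimal sketch of the write path's hybrid decision, assuming a dict-backed timeline cache and an invented follower-count cutoff (real systems tune this threshold against fanout latency):

```python
# Hybrid fanout sketch. The threshold, cache shape, and function names are
# illustrative assumptions, not a production design.
CELEBRITY_THRESHOLD = 100_000

def on_tweet(tweet_id: str, follower_ids: list[str], timeline_cache: dict) -> None:
    if len(follower_ids) > CELEBRITY_THRESHOLD:
        # Celebrity: skip fanout; followers pull these tweets at read time.
        return
    # Regular user: fanout-on-write, prepend to each follower's cached timeline.
    for fid in follower_ids:
        timeline_cache.setdefault(fid, []).insert(0, tweet_id)
```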

For timeline retrieval, check Redis cache first for pre-computed timelines. Cache hit means instant response. Cache miss means aggregate recent tweets from followed users, merge by timestamp, and cache the result. Use a CDN for media content (images/videos). Shard the database by user ID to distribute load. Implement pagination for timelines to avoid loading millions of tweets at once.

The tweet posting flow: User posts → Store in tweets DB → Fanout service reads user's followers → Write to followers' timeline cache → Send push notifications. Timeline retrieval: User requests timeline → Check Redis cache → Return cached timeline or fetch and merge from followed users' recent tweets → Cache result.
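
On a cache miss, the merge step is a k-way merge of the followed users' tweet lists. A sketch, assuming each list is kept newest-first as (timestamp, tweet_id) pairs:

```python
import heapq
from itertools import islice

def build_timeline(user_id, follows, tweets_by_user, limit=50):
    """k-way merge of followed users' tweets, newest first."""
    streams = [tweets_by_user.get(f, []) for f in follows.get(user_id, [])]
    # heapq.merge assumes each stream is already sorted (descending here).
    merged = heapq.merge(*streams, key=lambda t: t[0], reverse=True)
    return list(islice(merged, limit))  # only the first page, per pagination
```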

Critical Trade-offs

  • Fanout-on-write vs fanout-on-read: Write amplification vs read complexity. Hybrid approach balances both.
  • Eventual consistency: Accept slight delays in timeline updates for better scalability and availability.
  • Cache eviction: LRU policy for timeline cache, but keep active users' timelines hot.

Common Mistakes

  • Not discussing the celebrity problem and using only fanout-on-write
  • Forgetting to mention caching strategy for timelines
  • Ignoring media storage and CDN for images/videos
  • Not considering database sharding at scale
Practice designing Twitter with AI →

Design Instagram

Medium · Very High Frequency

Companies: Meta, Pinterest, Snapchat, TikTok, ByteDance

Key Concepts: Image storage and processing, feed ranking algorithms, CDN strategy, news feed generation, engagement tracking

Approach Outline

Instagram is similar to Twitter but image-heavy with feed ranking. Start with requirements: posting photos (up to 10 per post), following users, feed viewing, likes/comments, stories (24-hour expiry), and explore page. For scale, assume 500M daily active users, heavily read-focused, with strict latency requirements for image loading.

The critical difference from Twitter is the image processing pipeline. When a user uploads a photo, store the original in blob storage (S3), then trigger an async job to generate multiple resolutions (thumbnail 150x150, medium 640x640, full 1080x1080) for different use cases (feed thumbnail vs full view). Use a CDN aggressively since images are static and cacheable. Consider serving images from the CDN edge closest to the user for minimal latency.
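
A sketch of the resolution fanout using Pillow; the sizes mirror the ones above, while the JPEG quality and output naming are assumptions:

```python
from PIL import Image  # Pillow

SIZES = {"thumb": (150, 150), "medium": (640, 640), "full": (1080, 1080)}

def make_renditions(src_path: str, out_prefix: str) -> dict[str, str]:
    paths = {}
    for name, size in SIZES.items():
        with Image.open(src_path) as img:
            img.thumbnail(size)  # shrinks in place, preserving aspect ratio
            out = f"{out_prefix}_{name}.jpg"
            img.convert("RGB").save(out, "JPEG", quality=85)
            paths[name] = out
    return paths  # in production, upload these to blob storage behind the CDN
```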

For the feed, implement a ranking algorithm instead of pure chronological order. Factors include: recency (newer posts rank higher), relationship strength (close friends rank higher), engagement likelihood (predict based on past behavior), and content type (user prefers photos vs videos). Use a machine learning model trained on user engagement data. Cache ranked feeds in Redis, pre-compute for active users. Separate feed generation service from image serving for independent scaling.
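
The ranking step ultimately reduces to sorting candidates by a model score. A toy stand-in using a hand-weighted linear blend (weights and feature names are invented; production uses the trained model described above):

```python
WEIGHTS = {"recency": 0.4, "affinity": 0.3, "p_engage": 0.3}  # illustrative

def feed_score(post: dict) -> float:
    # Each feature is assumed pre-normalized to [0, 1].
    return sum(WEIGHTS[f] * post[f] for f in WEIGHTS)

candidates = [
    {"id": 1, "recency": 0.9, "affinity": 0.2, "p_engage": 0.4},
    {"id": 2, "recency": 0.5, "affinity": 0.9, "p_engage": 0.7},
]
ranked_feed = sorted(candidates, key=feed_score, reverse=True)  # post 2 first
```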

Stories require different architecture: 24-hour TTL, priority on latest stories, separate storage with expiration. Explore page uses a recommendation engine analyzing user interests, trending content, and collaborative filtering. Track all engagement events (likes, comments, saves, shares) in a separate analytics pipeline (Kafka + data warehouse) for feed ranking and recommendations.

Critical Trade-offs

  • Storage costs vs image quality: Multiple resolutions increase storage but improve UX and reduce bandwidth.
  • Chronological vs ranked feed: Ranked feed improves engagement but adds complexity and may create filter bubbles.
  • CDN costs vs latency: Aggressive CDN caching costs more but provides better user experience globally.
Practice designing Instagram with AI →

Design Facebook News Feed

Medium-Hard · High Frequency

Companies: Meta, LinkedIn, Reddit, Twitter, Pinterest

Key Concepts: Feed ranking, personalization, ML models, content diversity, real-time updates, ad integration

Facebook News Feed is more complex than Twitter due to diverse content types (posts, photos, videos, shared articles, ads) and sophisticated ranking. Use hybrid fanout with heavy emphasis on personalized ranking. Feed generation service aggregates potential posts, then ranking service scores each post using ML model considering engagement probability, content freshness, relationship strength, content diversity, and ad placement. Prefetch and pre-rank feeds for active users. Handle ads separately with targeting criteria and frequency caps.

Practice designing Facebook Feed with AI →

Design TikTok

Hard · Medium Frequency

Companies: TikTok, Instagram (Reels), YouTube (Shorts), Snapchat

Key Concepts: Short video streaming, recommendation algorithm, content moderation, viral content detection, video transcoding

TikTok combines video streaming with highly personalized recommendations. Video upload triggers transcoding to multiple formats and resolutions. The "For You" feed is the core feature—ML-based recommendation using collaborative filtering, content-based filtering, and real-time signals (watch time, completion rate, likes, shares). Prefetch next 3-5 videos for seamless swipe experience. Track engagement metrics aggressively to feed recommendation model. Implement content moderation (ML + human review) to filter inappropriate content. Handle viral detection to scale infrastructure proactively when videos explode in popularity.
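
The prefetch behavior, sketched as a queue that refills from the recommender when it runs low; the recommender callable and watermark values are assumptions:

```python
from collections import deque

class PrefetchQueue:
    """Keeps the next few recommended videos ready for instant swipes."""

    def __init__(self, recommend, low_water: int = 3, batch: int = 5):
        self.recommend = recommend  # callable: user_id -> list of video ids
        self.low_water = low_water
        self.batch = batch
        self.queue: deque = deque()

    def next_video(self, user_id: str) -> str:
        if len(self.queue) <= self.low_water:
            self.queue.extend(self.recommend(user_id)[: self.batch])
        return self.queue.popleft()  # client buffers this video's first chunks
```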

Practice designing TikTok with AI →

Design Reddit

Medium · Medium Frequency

Companies: Reddit, Hacker News, Stack Overflow, Discord

Key Concepts: Voting system, ranking algorithm (hot/top/controversial), threaded comments, subreddit isolation, moderation

Reddit's unique challenge is the voting-based ranking system. Posts have upvotes/downvotes that determine visibility. Implement ranking algorithms: "Hot" (upvotes with time decay), "Top" (net upvotes in time period), "Controversial" (high upvotes AND downvotes). Store posts in database with vote counts, update rankings periodically (not real-time to prevent gaming). Threaded comments require parent-child relationships in data model—use nested sets or path enumeration for efficient retrieval. Isolate subreddits as separate namespaces. Cache hot posts per subreddit. Implement moderation tools and spam detection.
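
A simplified version of Reddit's open-sourced "Hot" formula: net votes contribute logarithmically while age contributes linearly, so a post needs roughly 10× the net votes to outrank one posted about 12.5 hours later:

```python
from math import log10
from datetime import datetime, timezone

def hot(ups: int, downs: int, posted: datetime) -> float:
    score = ups - downs
    order = log10(max(abs(score), 1))          # votes matter logarithmically
    sign = 1 if score > 0 else -1 if score < 0 else 0
    seconds = posted.timestamp() - 1134028003  # offset from the open-source code
    return round(sign * order + seconds / 45000, 7)

hot(120, 30, datetime(2024, 1, 1, tzinfo=timezone.utc))  # pass tz-aware datetimes
```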

Practice designing Reddit with AI →

Design LinkedIn Feed

Medium · Medium Frequency

Companies: LinkedIn, Twitter, Meta

Key Concepts: Professional network graph, connection recommendations, job recommendations, feed ranking by professional relevance

LinkedIn combines social feed with job search and professional networking. Feed ranking prioritizes professional content—industry news, career updates, thought leadership. Use graph database for connection network and "People You May Know" recommendations (mutual connections, shared employers/schools). Job recommendation engine matches user profile (skills, experience, location) with job postings. Search functionality for people, companies, and jobs using Elasticsearch. Feed includes posts from connections, company pages, and sponsored content. Track engagement differently than consumer social—article reads and thoughtful comments are weighted more heavily than likes.
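
The mutual-connection heuristic behind "People You May Know", sketched over an in-memory adjacency map; a real system would run this traversal on the graph database:

```python
from collections import Counter

def people_you_may_know(user: str, graph: dict[str, set[str]], limit: int = 5):
    direct = graph.get(user, set())
    candidates: Counter = Counter()
    for friend in direct:
        for fof in graph.get(friend, set()):
            if fof != user and fof not in direct:
                candidates[fof] += 1  # one shared mutual connection
    return candidates.most_common(limit)  # (candidate, mutual_count) pairs
```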

Practice designing LinkedIn with AI →

Infrastructure & Core Systems

Design a URL Shortener

Easy · Very High Frequency

Companies: Google, Meta, Amazon, Microsoft, most companies

Key Concepts: Hashing, base62 encoding, key-value storage, collision handling, analytics tracking

URL shortener is the perfect starter question testing fundamental concepts. Generate short URLs using base62 encoding (0-9, a-z, A-Z) of auto-incrementing IDs or hash of long URL. Auto-incrementing IDs are simpler (no collisions) but produce sequential, guessable short URLs. Hash-based needs collision handling but provides randomness. Store mappings in key-value store (Redis/DynamoDB) for fast lookups. Use load balancer and stateless API servers for scalability. Add analytics service to track clicks, referrers, geographic data. Consider URL expiration policy and cleanup job. For high scale, shard database and use distributed ID generator.
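
The base62 step itself is short: repeatedly divide the numeric ID by 62 and map each remainder onto the 62-character alphabet:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

print(encode_base62(125))    # "21"
print(encode_base62(11157))  # "2TX"
```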

Read complete walkthrough →

Design a Rate Limiter

Easy-Medium · Very High Frequency

Companies: Meta, Google, Stripe, Shopify, Twitter, AWS

Key Concepts: Token bucket algorithm, sliding window counter, Redis counters, distributed rate limiting

Choose between algorithms: Token Bucket (flexible, allows bursts), Fixed Window Counter (simple but allows burst at boundaries), Sliding Window Log (accurate but memory-intensive), or Sliding Window Counter (good balance). Implement in Redis for speed—store counters with TTL. For each request, check if user/IP has remaining quota. If yes, decrement and allow. If no, reject with 429 status. In distributed systems, use centralized Redis or accept slight inaccuracy with local counters + periodic sync. Support different rate limits per API endpoint and user tier (free vs paid). Return rate limit info in response headers (X-RateLimit-Remaining, X-RateLimit-Reset).
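
A single-process token bucket sketch; a distributed deployment would keep the bucket state in Redis (e.g., behind a Lua script) rather than in process memory:

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # request allowed
        return False      # caller should respond with HTTP 429
```

For example, TokenBucket(rate=10, capacity=20) sustains 10 requests/second while permitting bursts of up to 20.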

Read complete walkthrough →

Design a Distributed Cache

Medium-Hard · Medium Frequency

Companies: Meta, Amazon, Microsoft, Google

Key Concepts: Consistent hashing, LRU/LFU eviction, cache invalidation, replication, hot keys

Implement distributed hash table using consistent hashing to map keys to cache nodes—minimizes key redistribution when nodes added/removed. Each cache node stores subset of data with LRU or LFU eviction policy when memory full. Replicate data across multiple nodes for fault tolerance (typically 2-3 replicas). Client library implements hash ring logic to route requests to correct node. Handle cache invalidation with TTL (time-based expiry) or write-through/write-back strategies. Address hot key problem (a single key, like a celebrity's profile, overwhelming one node) by replicating that key across multiple nodes. Monitor cache hit ratio and adjust capacity accordingly.
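
A minimal consistent-hash ring with virtual nodes; MD5 and 100 vnodes per server are illustrative choices:

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring; virtual nodes smooth out key distribution."""

    def __init__(self, nodes: list[str], vnodes: int = 100):
        self.ring: list[tuple[int, str]] = []
        for node in nodes:
            for i in range(vnodes):
                bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key: str) -> str:
        # The first virtual node clockwise from the key's hash owns the key.
        idx = bisect.bisect(self.ring, (self._hash(key), "")) % len(self.ring)
        return self.ring[idx][1]
```

Adding or removing a node remaps only about 1/N of the keys—the property that motivates consistent hashing here.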

Practice designing distributed cache with AI →

Design a Key-Value Store

Hard · Medium Frequency

Companies: Amazon (DynamoDB), Google, Meta, Microsoft

Key Concepts: CAP theorem, eventual consistency, partitioning, replication, vector clocks, gossip protocol

Design inspired by Amazon's Dynamo. Use consistent hashing for partitioning data across nodes. Each key replicated to N nodes (typically 3). Accept writes even during network partitions (AP in CAP theorem)—use vector clocks to track causality and resolve conflicts. Gossip protocol for node discovery and failure detection. Merkle trees for efficient replica synchronization. Quorum-based reads/writes: W + R > N for strong consistency, or W=1, R=1 for high availability with eventual consistency. Handle read repairs and hinted handoff when nodes temporarily unavailable. This question tests deep understanding of distributed systems trade-offs.
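
The quorum arithmetic in miniature: with N=3, choosing W=2 and R=2 gives W + R > N, so every read quorum overlaps every write quorum. A sketch with a toy in-memory replica (class and method names are assumptions):

```python
class Replica:
    """Toy in-memory replica; real nodes acknowledge over the network."""
    def __init__(self):
        self.data = {}

    def put(self, key, value) -> bool:
        self.data[key] = value
        return True  # acknowledge the write

N, W, R = 3, 2, 2
assert W + R > N  # quorums intersect, so reads see the latest acked write

def quorum_write(replicas, key, value, w=W) -> bool:
    acks = 0
    for replica in replicas:        # production sends these in parallel
        if replica.put(key, value):
            acks += 1
            if acks >= w:
                return True         # success once w replicas acknowledge
    return False                    # too few acks: surface failure to client
```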

Practice designing key-value store with AI →

Design a Load Balancer

Medium · Medium Frequency

Companies: Netflix, Amazon, Google, Cloudflare

Key Concepts: Load balancing algorithms (round robin, least connections, consistent hashing), health checks, session persistence

Load balancer distributes incoming requests across multiple backend servers. Implement at Layer 4 (transport - TCP/UDP) or Layer 7 (application - HTTP). Algorithms include: Round Robin (simple, even distribution), Least Connections (route to server with fewest active connections), Weighted Round Robin (assign weights based on server capacity), or IP Hash (consistent routing per client). Health checks (HTTP polls or TCP checks) detect failed servers and remove from pool. Session persistence via sticky sessions (same client to same server) using cookies or IP hashing—needed for stateful applications. Support SSL termination to offload crypto from backends.
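
Two of the algorithms sketched; the connection counts for least-connections would come from the balancer's own bookkeeping:

```python
import itertools

class RoundRobinBalancer:
    def __init__(self, servers: list[str]):
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)  # even rotation across the pool

def least_connections(servers: list[str], active: dict[str, int]) -> str:
    # active maps server -> current open connection count
    return min(servers, key=lambda s: active[s])
```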

Practice designing load balancer with AI →

Design a Content Delivery Network (CDN)

Medium-Hard · Medium Frequency

Companies: Cloudflare, Akamai, AWS (CloudFront), Netflix

Key Concepts: Edge servers, origin servers, cache invalidation, geographic routing, pull vs push CDN

CDN caches static content at edge locations globally to reduce latency and origin load. When user requests content, DNS routes to nearest edge server. If cached (cache hit), serve immediately. If not (cache miss), fetch from origin, cache at edge, and serve. Use pull CDN (fetch on first request) for less frequently accessed content, or push CDN (proactively push content to edges) for popular content. Cache invalidation via TTL or purge API. Implement cache hierarchy: edge caches → regional caches → origin. Use consistent hashing to distribute content across edge servers. Monitor cache hit ratio, origin load, and latency metrics.
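
The pull-CDN read path as a sketch, with the origin modeled as a callable and TTL expiry standing in for the invalidation strategies above:

```python
import time

class EdgeCache:
    """Pull-CDN edge node: serve from cache, fall back to origin on miss."""

    def __init__(self, fetch_origin, ttl: int = 300):
        self.fetch_origin = fetch_origin  # callable: path -> content bytes
        self.ttl = ttl
        self.store: dict[str, tuple[float, bytes]] = {}

    def get(self, path: str) -> bytes:
        entry = self.store.get(path)
        if entry and entry[0] > time.time():
            return entry[1]                      # cache hit: serve from edge
        body = self.fetch_origin(path)           # cache miss: pull from origin
        self.store[path] = (time.time() + self.ttl, body)
        return body
```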

Practice designing CDN with AI →

Communication Systems

Design WhatsApp / Messenger

Medium-Hard · High Frequency

Companies: Meta, Slack, Discord, Telegram, Signal

Key Concepts: WebSocket for real-time, message queues, read receipts, end-to-end encryption, offline message delivery

Real-time chat requires persistent connections—use WebSocket for bidirectional communication. When user sends message, server receives via WebSocket, stores in Cassandra (sharded by conversation ID), and delivers to recipient's WebSocket if online. If recipient offline, queue message in Kafka for delivery when they reconnect. Store messages with metadata: timestamp, sender, conversation_id, status (sent/delivered/read). Read receipts via separate events through WebSocket. For group chats, fanout message to all participants. Handle media separately—upload to blob storage, send reference in message. Implement presence service (online/offline/typing) using Redis. Consider end-to-end encryption (keys never stored on server).
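
The online/offline fork in miniature, with the connection registry and offline queue modeled as plain dicts; a real deployment would hold WebSocket sessions and back the queue with Kafka:

```python
import json
from collections import defaultdict
from typing import Callable

connections: dict[str, Callable[[str], None]] = {}  # user_id -> send function
offline_queues: dict[str, list] = defaultdict(list)

def deliver(message: dict) -> str:
    send = connections.get(message["to"])
    if send is not None:
        send(json.dumps(message))   # recipient online: push over their WebSocket
        return "delivered"
    offline_queues[message["to"]].append(message)  # hold until reconnect
    return "queued"

def on_reconnect(user_id: str, send: Callable[[str], None]) -> None:
    connections[user_id] = send
    for message in offline_queues.pop(user_id, []):  # drain queued messages
        send(json.dumps(message))
```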

Practice designing WhatsApp with AI →

Design Slack

Hard · Medium Frequency

Companies: Slack, Discord, Microsoft Teams, Atlassian

Key Concepts: Workspace isolation, channels, message threading, search, integrations, presence at scale

Slack is WhatsApp + workspaces + channels + search + integrations. Use WebSocket for real-time messaging. Store messages in Cassandra partitioned by channel_id. Each workspace is isolated tenant—separate data, billing, permissions. Elasticsearch for message search across channels (full-text search with filters). Threaded conversations require parent-child message relationships. Presence service tracks online status per workspace. File uploads to blob storage with virus scanning. Integration platform for bots and apps—webhook endpoints, slash commands, OAuth for third-party apps. Very active channels (1000+ msg/min) need optimizations: batching, backpressure, fan-out limits.

Practice designing Slack with AI →

Design Zoom

Hard · Low Frequency

Companies: Zoom, Google Meet, Microsoft Teams, Amazon Chime

Key Concepts: WebRTC, SFU architecture, audio/video codecs, bandwidth adaptation, recording, screen sharing

Video conferencing uses WebRTC for peer-to-peer or Selective Forwarding Unit (SFU) architecture where server forwards streams without transcoding. Each participant sends video/audio to SFU, which forwards to other participants. Implement bandwidth adaptation—reduce quality when network poor. Audio mixing on server for large meetings. Support screen sharing as additional video stream. Recording requires capturing all streams and encoding to video file in cloud. Implement waiting room, breakout rooms, chat (like WhatsApp). Handle tens of thousands in webinar mode (few senders, many viewers—different architecture from meetings). This question tests real-time systems and media streaming knowledge.

Practice designing Zoom with AI →

Design an Email Service

Medium · Low Frequency

Companies: Gmail, Outlook, Yahoo, Superhuman

Key Concepts: SMTP/IMAP protocols, email storage, spam filtering, search, attachments, labels/folders

Email system has sending and receiving paths. Sending: User composes → SMTP server sends to recipient's domain → Recipient's SMTP server receives. Receiving: Store in user's mailbox, index for search. Use blob storage for attachments. Implement spam filter using ML (Bayesian classifier) on email content. Search via Elasticsearch on subject, sender, body. Support labels (Gmail) or folders (Outlook) as tags on messages. Handle read/unread status, starring, archiving. Email storage is write-once-read-many—optimize for storage efficiency with compression. Sync across devices using IMAP. This is less common but tests email protocols and storage optimization.

Practice designing email service with AI →

E-commerce & Marketplace

Design Amazon Product Search

Medium-Hard · Medium Frequency

Companies: Amazon, eBay, Shopify, Etsy, Walmart

Key Concepts: Elasticsearch, ranking algorithms, autocomplete, filters/facets, personalization

Index products in Elasticsearch with fields: title, description, category, brand, price, rating, reviews. Search query goes to Elasticsearch which returns matching products. Ranking algorithm combines relevance score (BM25, Elasticsearch's default), popularity (sales, reviews), price, and personalization (user history). Implement autocomplete using prefix matching in Elasticsearch or separate trie data structure. Faceted search filters (category, price range, brand, rating) using aggregations. Cache popular search queries and their results in Redis. Update search index in near real-time as products added/changed. Personalize results using user's browsing history and purchase patterns.
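
A representative Elasticsearch query body combining a full-text match, facet filters, and a facet-count aggregation (index and field names are illustrative):

```python
search_body = {
    "query": {
        "bool": {
            "must": {"match": {"title": "wireless headphones"}},  # relevance
            "filter": [                                           # facets
                {"term": {"brand": "Sony"}},
                {"range": {"price": {"gte": 50, "lte": 200}}},
                {"range": {"rating": {"gte": 4.0}}},
            ],
        }
    },
    "aggs": {"brands": {"terms": {"field": "brand"}}},  # facet counts for UI
    "sort": [{"_score": "desc"}, {"rating": "desc"}],
}
# e.g., client.search(index="products", body=search_body) with the Python client
```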

Practice designing product search with AI →

Design Netflix / YouTube

Medium-Hard · High Frequency

Companies: Netflix, YouTube, Amazon Prime, Hulu, Disney+

Key Concepts: Video transcoding, adaptive bitrate streaming (HLS/DASH), CDN, recommendation system, watch history

Video streaming combines storage, transcoding, CDN, and recommendations. Upload flow: Video uploaded to blob storage → Transcoding pipeline generates multiple resolutions (480p, 720p, 1080p, 4K) and formats → Chunks stored in blob storage. Streaming: User requests video → CDN serves video chunks → Player uses adaptive bitrate streaming (HLS/DASH) to adjust quality based on bandwidth. Recommendation engine uses collaborative filtering (users with similar tastes) and content-based filtering (genre, actors). Track watch history and use for both recommendations and "continue watching". Search via Elasticsearch. Handle massive bandwidth with aggressive CDN caching and optimal encoding.
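
The transcoding fanout, sketched as a loop over a resolution/bitrate ladder shelling out to ffmpeg; the ladder values are illustrative, and a real pipeline would also segment the output for HLS/DASH:

```python
import subprocess

LADDER = [  # (width, height, video bitrate) — illustrative values
    (854, 480, "1200k"),
    (1280, 720, "2800k"),
    (1920, 1080, "5000k"),
]

def transcode(src: str) -> list[str]:
    outputs = []
    for width, height, bitrate in LADDER:
        out = f"{src.rsplit('.', 1)[0]}_{height}p.mp4"
        subprocess.run(
            ["ffmpeg", "-y", "-i", src,
             "-vf", f"scale={width}:{height}",
             "-c:v", "libx264", "-b:v", bitrate,
             "-c:a", "aac", out],
            check=True,
        )
        outputs.append(out)
    return outputs  # chunking and packaging for HLS/DASH would follow
```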

Practice designing Netflix with AI →

Design Uber / Lyft

Hard · High Frequency

Companies: Uber, Lyft, DoorDash, Instacart, Grab

Key Concepts: Geospatial indexing (geohash/quadtree), matching algorithms, real-time location tracking, ETA calculation, surge pricing

Uber's core challenge is matching riders to nearby drivers in real-time. Use geospatial database (PostGIS or Redis Geo) with geohashing/quadtree to index driver locations. When rider requests ride, query for nearby drivers within radius. Matching algorithm considers: distance, driver rating, ETA, car type. Use WebSocket for real-time location updates from driver app (every 4 seconds). Calculate ETA using routing service (Google Maps API or internal). Implement surge pricing based on supply/demand ratio in each zone. Store trip data for analytics and billing. Handle payment processing integration. Notification service for ride status updates. This tests geospatial algorithms and real-time systems.
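
Stripped of the geo index, matching reduces to "drivers within a radius, ordered by distance". A brute-force haversine sketch (a production system would query the geohash/quadtree index rather than scan every driver):

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2) -> float:
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))  # Earth radius ~6371 km

def nearest_drivers(rider, drivers, radius_km=3.0, k=5):
    # rider: (lat, lon); drivers: dict driver_id -> (lat, lon)
    scored = ((haversine_km(*rider, *pos), d) for d, pos in drivers.items())
    return sorted((dist, d) for dist, d in scored if dist <= radius_km)[:k]
```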

Practice designing Uber with AI →

Design Food Delivery System

Hard · Medium Frequency

Companies: DoorDash, Uber Eats, Instacart, Postmates, Deliveroo

Key Concepts: Restaurant search, order management, driver assignment, multi-stop routing, real-time tracking

Similar to Uber but with restaurant and order complexity. Customer flow: Search restaurants (Elasticsearch with filters) → Browse menu → Place order → Payment. Restaurant receives order notification, accepts, and prepares food. Driver assignment: Find nearby available driver, assign pickup (restaurant) and dropoff (customer) locations. Driver picks up order, delivers to customer. Real-time tracking for customer and restaurant. Handle multi-stop routes (driver picking up multiple orders). Implement order state machine (placed → accepted → preparing → ready → picked up → delivered). Rating system for drivers and restaurants. This combines search, matching, routing, and real-time tracking challenges.
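
The order state machine as a transition table; the cancelled branches are an added assumption beyond the happy path listed above:

```python
TRANSITIONS = {  # legal next states for each order status
    "placed":    {"accepted", "cancelled"},
    "accepted":  {"preparing", "cancelled"},
    "preparing": {"ready"},
    "ready":     {"picked_up"},
    "picked_up": {"delivered"},
}

def advance(current: str, nxt: str) -> str:
    if nxt not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {nxt}")
    return nxt  # persist the new status and emit an event for tracking

advance("placed", "accepted")     # ok
# advance("placed", "delivered")  # raises ValueError
```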

Practice designing food delivery with AI →

Ready to Practice These Questions?

The best way to prepare for system design interviews is through realistic mock interviews. Practice these 20+ questions with our AI interviewer, get instant feedback, and build confidence before your real interview.

First interview is free. No credit card required.
