Caching is a critical system design concept that speeds up data access by reducing database reliance. Here's a detailed breakdown of caching strategies, how they work, and where they shine in real-world applications.
1. Cache-Aside (Lazy Loading)
How It Works:
- The application queries the cache first. If the data isn't there, it fetches it from the database, adds it to the cache, and returns it to the user. The cache is "passive" and relies on the app for updates.
Techniques:
- Manual caching logic to decide when and what to cache.
- Cache invalidation when data changes to avoid stale entries.
Use Cases:
- E-commerce Websites: Amazon fetching product details lazily and caching them for future requests.
- Social Media Analytics: Twitter caching user metrics (impressions, retweets) when requested.
Software Examples:
- Redis, Memcached
- Application frameworks like Django that support manual caching.
Real-Life Analogy:
- You want a snack from your kitchen (cache). If it's not there, you go to the store (database), buy it, and store it in your kitchen for next time.
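The cache-aside flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the `CacheAside` class, the `products` dictionary standing in for a database, and the `db_reads` counter are all hypothetical names chosen for this example, and a plain dictionary stands in for Redis or Memcached.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """The application owns the caching logic: check the cache, then the database."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._cache: Dict[str, str] = {}  # stand-in for Redis or Memcached
        self._load_from_db = load_from_db
        self.db_reads = 0  # counts trips to the database

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:  # cache hit
            return self._cache[key]
        self.db_reads += 1  # cache miss: fall back to the database
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value  # populate the cache for next time
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry after the underlying data changes."""
        self._cache.pop(key, None)


products = {"sku-1": "Blue mug"}  # hypothetical database table
store = CacheAside(products.get)
first = store.get("sku-1")   # miss: loaded from the database
second = store.get("sku-1")  # hit: served from the cache
```

Note that the application has to call `invalidate` itself when the data changes; until it does, the cache keeps serving the old value.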
2. Read-Through Cache
How It Works:
- The cache system automatically handles database queries for missing data, stores it in the cache, and then serves it to the user.
Techniques:
- Transparent caching abstracts fetching logic from developers.
- Works well with TTL (time-to-live) to refresh cache entries periodically.
Use Cases:
- Streaming Platforms: Netflix caching metadata for shows and movies.
- News Websites: BBC caching headlines and fetching them automatically when expired.
Software Examples:
- Amazon DynamoDB DAX, Spring Cache
- CDNs (Cloudflare, Akamai)
Real-Life Analogy:
- You ask your librarian (cache) for a book. If they don't have it, they fetch it from the main library (database) and keep it for future requests.
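To contrast read-through with cache-aside, here is a rough Python sketch in which the cache itself owns the loader, so callers only ever talk to the cache. `ReadThroughCache`, `load_show`, and `loader_calls` are hypothetical names for this example; real products such as DAX or Spring Cache expose their own APIs.

```python
from typing import Any, Callable, Dict, Hashable, List


class ReadThroughCache:
    """The cache fetches missing entries itself via a loader it was given."""

    def __init__(self, loader: Callable[[Hashable], Any]) -> None:
        self._loader = loader
        self._entries: Dict[Hashable, Any] = {}

    def get(self, key: Hashable) -> Any:
        if key not in self._entries:
            # Miss: the cache, not the caller, queries the backing store.
            self._entries[key] = self._loader(key)
        return self._entries[key]


loader_calls: List[int] = []  # records every trip to the "database"


def load_show(show_id: int) -> dict:
    """Hypothetical database lookup for show metadata."""
    loader_calls.append(show_id)
    return {"id": show_id, "title": f"Show {show_id}"}


shows = ReadThroughCache(load_show)
shows.get(7)  # miss: the cache calls load_show(7)
shows.get(7)  # hit: load_show is not called again
```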
3. Write-Through Cache
How It Works:
- Every write operation updates both the cache and the database as part of the same synchronous operation, so the two stay in sync.
Techniques:
- Used with transactional systems to guarantee consistency.
- Provides immediate availability of updated data in the cache.
Use Cases:
- Banking Systems: Account balances updated in both cache and database.
- E-commerce Checkouts: Product stock updates applied instantly in cache and database.
Software Examples:
- Redis, Hazelcast
- A natural fit for payment systems in the style of Stripe or PayPal.
Real-Life Analogy:
- When you deposit cash at the bank, the teller (cache) updates your bank account and immediately updates the central system (database) so both are in sync.
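A write-through cache can be sketched as below, assuming a plain dictionary as the database. `WriteThroughCache` and `accounts_db` are hypothetical names; the point is only that `put` does not return until both the store and the cache hold the new value.

```python
from typing import Any, Dict, Hashable, Optional


class WriteThroughCache:
    """Every write goes to the database and the cache in the same call."""

    def __init__(self, database: Dict[Hashable, Any]) -> None:
        self._db = database  # stand-in for a real database
        self._cache: Dict[Hashable, Any] = {}

    def put(self, key: Hashable, value: Any) -> None:
        # Persist first: if this step raised, the cache would stay untouched.
        self._db[key] = value
        self._cache[key] = value

    def get(self, key: Hashable) -> Optional[Any]:
        if key in self._cache:
            return self._cache[key]
        value = self._db.get(key)
        if value is not None:
            self._cache[key] = value
        return value


accounts_db: Dict[Hashable, Any] = {}
accounts = WriteThroughCache(accounts_db)
accounts.put("alice", 120)  # balance is now in both the cache and the database
```

The trade-off is write latency: each `put` pays for the database write before it returns.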
4. Write-Behind Cache (Write-Back)
How It Works:
- Data is written to the cache first, and the database is updated later in batches or asynchronously, prioritizing performance.
Techniques:
- Used for operations where high write performance is required.
- Works with eventual consistency as a trade-off.
Use Cases:
- Social Media Posts: Comments cached immediately but written to the database later.
- IoT Systems: Sensor data cached for immediate analytics, written to the database in batches.
Software Examples:
- Redis Streams for queueing updates.
- A natural fit for high-volume features such as likes and comments in Instagram-style apps.
Real-Life Analogy:
- You drop off a package (data) at a delivery center (cache). They immediately log it as accepted, but actually ship it (update the database) at the end of the day.
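The write-behind idea can be illustrated with a small Python sketch. It assumes an explicit `flush` call standing in for the background job or timer a real system would use; `WriteBehindCache` and `comments_db` are hypothetical names.

```python
from typing import Any, Dict, Hashable, Optional


class WriteBehindCache:
    """Writes land in the cache immediately and reach the database on flush."""

    def __init__(self, database: Dict[Hashable, Any]) -> None:
        self._db = database  # stand-in for a real database
        self._cache: Dict[Hashable, Any] = {}
        self._dirty: Dict[Hashable, Any] = {}  # latest pending value per key

    def put(self, key: Hashable, value: Any) -> None:
        self._cache[key] = value
        self._dirty[key] = value  # repeated writes to one key coalesce

    def get(self, key: Hashable) -> Optional[Any]:
        if key in self._cache:
            return self._cache[key]
        return self._db.get(key)

    def flush(self) -> int:
        """Write all pending changes in one batch; returns how many were written."""
        pending, self._dirty = self._dirty, {}
        self._db.update(pending)
        return len(pending)


comments_db: Dict[Hashable, Any] = {}
comments = WriteBehindCache(comments_db)
comments.put("post-1", "Nice!")
comments.put("post-1", "Nice photo!")  # overwrites the pending write
written = comments.flush()              # one batched database write
```

Anything still in the pending set is lost if the cache process dies before `flush` runs, which is the eventual-consistency trade-off mentioned above.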
5. Cache Eviction Policies
How It Works:
- When the cache is full, eviction policies decide which data to remove to make space for new entries.
Techniques:
- Least Recently Used (LRU): Remove the least recently accessed data.
- Least Frequently Used (LFU): Remove data accessed the least number of times.
- First In, First Out (FIFO): Remove the oldest data first.
Use Cases:
- E-commerce Apps: Evict old product listings to cache trending items.
- Gaming Apps: Remove leaderboard data from older sessions.
Software Examples:
- Redis, Memcached (support LRU and TTL policies).
Real-Life Analogy:
- Your fridge (cache) is full. To make space for new groceries, you throw out expired items or the least-used leftovers.
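As an illustration of one of the policies above, here is a minimal LRU cache built on Python's `collections.OrderedDict`. The `LRUCache` class name and the capacity of 2 are choices made for this sketch only; LFU and FIFO would need different bookkeeping.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._entries: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # drop the least recently used


listings = LRUCache(capacity=2)
listings.put("a", 1)
listings.put("b", 2)
listings.get("a")     # "a" is now the most recently used entry
listings.put("c", 3)  # cache is full, so "b" is evicted
```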
6. Time-Based Expiration (TTL)
How It Works:
- Data in the cache expires after a specified time-to-live (TTL), so the next request after expiry fetches a fresh copy from the database.
Techniques:
- Used with dynamic content that becomes stale quickly.
- Can be combined with refresh-ahead strategies.
Use Cases:
- Weather Apps: Forecast data expires after a day.
- Live Score Apps: Sports scores are cached for a short duration.
Software Examples:
- Memcached, Redis
Real-Life Analogy:
- Think of milk (data) in your fridge with an expiry date (TTL). Once expired, you buy a fresh carton (query the database).
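A TTL cache can be sketched by storing an expiry timestamp next to each value. In this example `TTLCache` is a hypothetical class, and the clock is injectable (a list-backed `fake_now` here) purely so expiry can be demonstrated without sleeping; by default it uses `time.monotonic`.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class TTLCache:
    """Entries expire a fixed number of seconds after they are written."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key: Hashable) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: caller should reload from the database
            return None
        return value


fake_now = [0.0]  # controllable clock for the demonstration
scores = TTLCache(ttl_seconds=30, clock=lambda: fake_now[0])
scores.put("match-1", "2-1")
```

Advancing `fake_now[0]` to 30 or beyond makes `scores.get("match-1")` return `None`, at which point the application would query the database again.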
7. Refresh-Ahead
How It Works:
- Proactively refreshes data in the cache before it expires, ensuring no delays in fetching data when needed.
Techniques:
- Combines well with predictive algorithms to determine refresh timing.
- Reduces cache misses during peak usage.
Use Cases:
- Streaming Platforms (e.g., Spotify): Refresh playlists ahead of expiration for fast access.
- Inventory Management Systems (e.g., Shopify): Stock levels updated ahead of expiry to prevent stale data.
Software Examples:
- Redis Lua Scripts, Hazelcast
Real-Life Analogy:
- Imagine your fridge restocks milk automatically a day before it expires, so you always have fresh milk ready.
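Refresh-ahead can be sketched as a TTL cache with a second, earlier threshold at which entries are reloaded. This is a simplified, single-threaded illustration: `RefreshAheadCache`, `refresh_due`, and the 100/80-second thresholds are assumptions for the example, and a real system would call `refresh_due` from a background worker or scheduler.

```python
import time
from typing import Any, Callable, Dict, Hashable, List, Tuple


class RefreshAheadCache:
    """Reloads entries once they pass a refresh threshold, before the TTL hits."""

    def __init__(self, loader: Callable[[Hashable], Any], ttl_seconds: float,
                 refresh_after_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._refresh_after = refresh_after_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[Any, float]] = {}  # value, loaded_at

    def _load(self, key: Hashable) -> Any:
        value = self._loader(key)
        self._entries[key] = (value, self._clock())
        return value

    def get(self, key: Hashable) -> Any:
        entry = self._entries.get(key)
        if entry is None or self._clock() - entry[1] >= self._ttl:
            return self._load(key)  # miss or expired: the caller waits for the load
        return entry[0]

    def refresh_due(self) -> List[Hashable]:
        """Reload entries older than the refresh threshold; returns their keys."""
        now = self._clock()
        due = [k for k, (_, loaded_at) in self._entries.items()
               if now - loaded_at >= self._refresh_after]
        for key in due:
            self._load(key)
        return due


fake_time = [0.0]              # controllable clock for the demonstration
stock = {"widget": 5}          # hypothetical inventory table
loads: List[Hashable] = []     # records every database load


def load_stock(item: Hashable) -> int:
    loads.append(item)
    return stock[item]


inventory = RefreshAheadCache(load_stock, ttl_seconds=100,
                              refresh_after_seconds=80,
                              clock=lambda: fake_time[0])
inventory.get("widget")            # initial load at t=0
fake_time[0] = 85.0
stock["widget"] = 3
refreshed = inventory.refresh_due()  # reloaded early, before the TTL at t=100
```

Because the entry was reloaded at t=85, a read at t=120 is still a hit; without the early refresh it would have expired at t=100 and forced the caller to wait for a load.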
8. Distributed Caching
How It Works:
- The cache is distributed across multiple servers, providing scalability, reliability, and faster access in large-scale systems.
Techniques:
- Use consistent hashing to balance cache load across servers.
- Replicate data across regions for fault tolerance.
Use Cases:
- Global Platforms (e.g., YouTube): Cache video metadata across regions for low latency.
- Online Games (e.g., Fortnite): Distribute leaderboard data across servers to handle millions of players.
Software Examples:
- AWS ElastiCache, Azure Cache for Redis
Real-Life Analogy:
- Think of a library chain (cache) spread across cities. You visit the nearest branch to get the book you need instead of waiting for the main library to deliver it.
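The consistent hashing technique mentioned above can be sketched with a sorted ring of hash points. This is a minimal illustration: `HashRing`, the node names `cache-a` to `cache-c`, the choice of SHA-256, and 100 virtual points per node are all assumptions made for the example rather than what any particular product does.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class HashRing:
    """Maps keys to cache nodes so that removing a node only moves its own keys."""

    def __init__(self, nodes: Iterable[str], replicas: int = 100) -> None:
        self._replicas = replicas  # virtual points per node, for smoother balance
        self._ring: List[Tuple[int, str]] = []  # sorted (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self._replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First point clockwise from the key's hash, wrapping around at the end.
        index = bisect.bisect_left(self._ring, (self._hash(key), ""))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"video:{i}" for i in range(200)]
before = {k: ring.node_for(k) for k in keys}
ring.remove_node("cache-b")  # simulate a server leaving the cluster
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only keys that lived on `cache-b` appear in `moved`; keys on the other two nodes keep their placement, which is what keeps cache misses low when the cluster changes size.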
Choosing the Right Strategy:
- Use Cache-Aside if you need full control over cache logic (e.g., analytics dashboards).
- Use Read-Through for automated data fetching in high-traffic systems (e.g., Netflix metadata).
- Use Write-Through for real-time consistency (e.g., banking apps).
- Use Write-Behind for high write performance (e.g., social media posts).
- Use TTL for dynamic data with a short lifespan (e.g., weather apps).
- Use Refresh-Ahead when data must always be ready and fresh (e.g., inventory systems).
- Use Distributed Caching for global scalability and fault tolerance (e.g., gaming leaderboards).