API Design Part 5: Caching Strategies
Master multi-layer caching architecture, HTTP cache headers, ETags, and cache invalidation patterns. Build fast, scalable APIs with proper caching.
Moshiour Rahman
API Design Mastery Series
This is Part 5 of our comprehensive API Design series.
| Part | Topic | Level |
|---|---|---|
| 1 | HTTP & REST Fundamentals | Beginner |
| 2 | Security & Authentication | Beginner |
| 3 | Rate Limiting & Pagination | Intermediate |
| 4 | Versioning & Idempotency | Intermediate |
| 5 | Caching Strategies | Intermediate |
| 6 | GraphQL & gRPC | Intermediate |
| 7 | Resilience & Observability | Advanced |
| 8 | Production Mastery | Advanced |
The Caching Decision Framework
| Question | If YES | If NO |
|---|---|---|
| Is data static? | CDN cache (1 year) | Continue checking… |
| Is data user-specific? | Private cache (short TTL) | Continue checking… |
| Is staleness acceptable? | stale-while-revalidate | no-cache, must-revalidate |
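The table can be read as a top-to-bottom decision function. The following is a minimal sketch of that reading; the `CacheQuestions` shape, the `chooseCacheControl` name, and the specific TTL values are illustrative assumptions rather than fixed recommendations.

```typescript
// Hypothetical input shape mirroring the three questions in the table.
interface CacheQuestions {
  isStatic: boolean;
  isUserSpecific: boolean;
  stalenessAcceptable: boolean;
}

// Walks the questions in table order and returns a Cache-Control value.
// The TTL numbers are placeholders chosen for illustration.
function chooseCacheControl(q: CacheQuestions): string {
  if (q.isStatic) {
    // Fingerprinted static data: cache for one year
    return 'public, max-age=31536000, immutable';
  }
  if (q.isUserSpecific) {
    // Per-user data: browser-only cache with a short TTL
    return 'private, max-age=60';
  }
  if (q.stalenessAcceptable) {
    // Shared data that may be briefly stale
    return 'public, max-age=300, stale-while-revalidate=3600';
  }
  // Must be fresh: store, but revalidate on every use
  return 'no-cache, must-revalidate';
}
```

Ordering matters here: a user-specific response is marked `private` before staleness is even considered, so it never reaches a shared cache.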
HTTP Cache Headers Deep Dive
// cache-headers.ts - Complete cache header management
import { createHash } from 'node:crypto';
interface CacheConfig {
type: 'public' | 'private' | 'no-store';
maxAge?: number;
sMaxAge?: number; // CDN/proxy cache time
staleWhileRevalidate?: number; // Serve stale while fetching
staleIfError?: number; // Serve stale on origin error
mustRevalidate?: boolean;
noCache?: boolean; // Always revalidate
immutable?: boolean; // Never changes (with fingerprint)
}
export function buildCacheControl(config: CacheConfig): string {
const directives: string[] = [];
if (config.type === 'no-store') {
return 'no-store';
}
directives.push(config.type);
if (config.maxAge !== undefined) {
directives.push(`max-age=${config.maxAge}`);
}
if (config.sMaxAge !== undefined) {
directives.push(`s-maxage=${config.sMaxAge}`);
}
if (config.staleWhileRevalidate !== undefined) {
directives.push(`stale-while-revalidate=${config.staleWhileRevalidate}`);
}
if (config.staleIfError !== undefined) {
directives.push(`stale-if-error=${config.staleIfError}`);
}
if (config.mustRevalidate) {
directives.push('must-revalidate');
}
if (config.noCache) {
directives.push('no-cache');
}
if (config.immutable) {
directives.push('immutable');
}
return directives.join(', ');
}
// Common patterns
export const cachePatterns = {
// Static assets with fingerprinting (1 year)
staticAsset: buildCacheControl({
type: 'public',
maxAge: 31536000,
immutable: true
}),
// API response for public data (5 min in browsers, 10 min at the CDN, serve stale for up to 1 hour while revalidating)
publicApi: buildCacheControl({
type: 'public',
maxAge: 300,
sMaxAge: 600,
staleWhileRevalidate: 3600
}),
// User-specific data (1 min, private)
privateApi: buildCacheControl({
type: 'private',
maxAge: 60
}),
// Sensitive data (never cache)
sensitive: 'no-store',
// Data that must be fresh but can use conditional requests
alwaysValidate: buildCacheControl({
type: 'public',
noCache: true,
mustRevalidate: true
})
};
// ETag generation
export function generateETag(content: string | Buffer): string {
const hash = createHash('sha256')
.update(content)
.digest('hex')
.slice(0, 16);
return `"${hash}"`;
}
// Weak ETag for semantic equivalence
export function generateWeakETag(version: number, lastModified: Date): string {
return `W/"${version}-${lastModified.getTime()}"`;
}
// Conditional request handling
export function handleConditionalRequest(
req: Request,
currentETag: string,
lastModified: Date
): Response | null {
const ifNoneMatch = req.headers.get('If-None-Match');
const ifModifiedSince = req.headers.get('If-Modified-Since');
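// If-None-Match takes precedence: when it is present, If-Modified-Since is ignored (RFC 9110)
const notModified = () =>
new Response(null, {
status: 304,
headers: {
'ETag': currentETag,
'Last-Modified': lastModified.toUTCString()
}
});
if (ifNoneMatch) {
// Weak comparison: ignore the W/ prefix when matching validators
const stripWeak = (tag: string) => tag.trim().replace(/^W\//, '');
const clientETags = ifNoneMatch.split(',').map(stripWeak);
if (clientETags.includes('*') || clientETags.includes(stripWeak(currentETag))) {
return notModified();
}
return null; // ETag mismatch: send the full response
}
if (ifModifiedSince) {
const clientTime = Date.parse(ifModifiedSince);
// HTTP dates have one-second resolution, so drop milliseconds before comparing
const modifiedTime = Math.floor(lastModified.getTime() / 1000) * 1000;
if (!Number.isNaN(clientTime) && modifiedTime <= clientTime) {
return notModified();
}
}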
return null; // Proceed with full response
}
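The strong and weak ETags generated above are compared under different rules: `If-None-Match` uses weak comparison, while `If-Match` and range requests use strong comparison. A minimal sketch of both rules follows; the helper names `parseETag`, `strongMatch`, and `weakMatch` are hypothetical and not part of the listing above.

```typescript
// Hypothetical helpers illustrating the two ETag comparison rules.
interface ParsedETag {
  weak: boolean;
  opaque: string; // the quoted part, e.g. "abc123"
}

function parseETag(raw: string): ParsedETag {
  const trimmed = raw.trim();
  const weak = trimmed.startsWith('W/');
  return { weak, opaque: weak ? trimmed.slice(2) : trimmed };
}

// Strong comparison: both validators are strong and their opaque tags are identical.
function strongMatch(a: string, b: string): boolean {
  const x = parseETag(a);
  const y = parseETag(b);
  return !x.weak && !y.weak && x.opaque === y.opaque;
}

// Weak comparison: opaque tags are identical; the W/ flag is ignored.
function weakMatch(a: string, b: string): boolean {
  return parseETag(a).opaque === parseETag(b).opaque;
}
```

For example, `weakMatch('W/"3-1700000000000"', '"3-1700000000000"')` is true, while `strongMatch` on the same pair is false because one side is weak.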
Multi-Layer Caching Architecture
The Five Caching Layers
| Layer | Technology | Caches | TTL Range |
|---|---|---|---|
| 1. Browser | Service Worker, HTTP Cache | Private, per-user | 60s - 1 year |
| 2. CDN Edge | Cloudflare, Fastly | Public, geographic | s-maxage based |
| 3. API Gateway | Rate limiting, coalescing | Request dedup | Seconds |
| 4. Application | Redis | Computed results, sessions | Minutes to hours |
| 5. Database | Query cache, buffer pool | Query results | Query-specific |
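A read typically walks these layers from fastest to slowest and backfills the faster layers on a hit further down. The sketch below models that flow with in-process maps standing in for real layers; `CacheLayer`, `TtlMapLayer`, and `layeredGet` are hypothetical names, and a real deployment would replace the maps with the browser cache, CDN, or Redis.

```typescript
// Hypothetical layer abstraction; each layer has its own TTL.
interface CacheLayer {
  name: string;
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

class TtlMapLayer implements CacheLayer {
  name: string;
  private ttlMs: number;
  private now: () => number;
  private store = new Map<string, { value: string; expiresAt: number }>();

  constructor(name: string, ttlMs: number, now: () => number = () => Date.now()) {
    this.name = name;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): string | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.store.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: string): void {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Checks layers in order, backfills faster layers on a hit, and falls
// back to the origin loader when every layer misses.
function layeredGet(
  layers: CacheLayer[],
  key: string,
  origin: (key: string) => string | undefined
): { value: string | undefined; servedBy: string } {
  for (let i = 0; i < layers.length; i++) {
    const hit = layers[i].get(key);
    if (hit !== undefined) {
      for (let j = 0; j < i; j++) layers[j].set(key, hit);
      return { value: hit, servedBy: layers[i].name };
    }
  }
  const value = origin(key);
  if (value !== undefined) {
    for (const layer of layers) layer.set(key, value);
  }
  return { value, servedBy: 'origin' };
}
```

Giving the faster layer the shorter TTL keeps it small and fresh, while the slower layer absorbs most of the traffic that would otherwise reach the origin.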
Cache Invalidation Patterns
| Pattern | Use Case | Pros | Cons |
|---|---|---|---|
| TTL expiry | General caching | Simple | Stale data until expiry |
| Event-driven | Data changes | Immediate | Complex |
| Cache-aside | Read-heavy | Simple reads | Cache miss penalty |
| Write-through | Write-heavy | Consistent | Slower writes |
| Write-behind | Async writes | Fast writes | Data loss risk |
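The read and write patterns in the table differ mainly in when the cache and the source of truth are updated. The sketch below contrasts them using two in-memory maps as stand-ins; `database`, `cache`, `pendingWrites`, and the function names are assumptions made for illustration.

```typescript
// In-memory stand-ins for the source of truth and the cache (assumptions).
const database = new Map<string, string>();
const cache = new Map<string, string>();
const pendingWrites = new Map<string, string>();

// Cache-aside read: check the cache, fall back to the database, fill on a miss.
function readCacheAside(key: string): string | undefined {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = database.get(key);
  if (fresh !== undefined) cache.set(key, fresh);
  return fresh;
}

// Cache-aside write: update the database, then drop the cached copy
// so the next read pays the miss penalty and reloads.
function writeCacheAside(key: string, value: string): void {
  database.set(key, value);
  cache.delete(key);
}

// Write-through: update the database and the cache in the same operation.
// Writes are slower, but reads never see an older value than the database.
function writeThrough(key: string, value: string): void {
  database.set(key, value);
  cache.set(key, value);
}

// Write-behind: update the cache immediately and queue the database write.
// Anything still queued is lost if the process dies before a flush.
function writeBehind(key: string, value: string): void {
  cache.set(key, value);
  pendingWrites.set(key, value);
}

// Flushes queued writes to the database and returns how many were written.
function flushPendingWrites(): number {
  let flushed = 0;
  pendingWrites.forEach((value, key) => {
    database.set(key, value);
    flushed++;
  });
  pendingWrites.clear();
  return flushed;
}
```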
// cache-invalidation.ts - Event-driven invalidation
import Redis from 'ioredis'; // assumed client: the calls below follow the ioredis API
export class CacheInvalidator {
private redis: Redis;
private pubsub: Redis;
constructor(redis: Redis) {
this.redis = redis;
this.pubsub = redis.duplicate();
}
// Tag-based invalidation (like Cloudflare cache tags)
async invalidateByTags(tags: string[]): Promise<void> {
for (const tag of tags) {
const keys = await this.redis.smembers(`tag:${tag}`);
if (keys.length > 0) {
await this.redis.del(...keys);
await this.redis.del(`tag:${tag}`);
}
}
// Notify other instances
await this.redis.publish('cache:invalidate', JSON.stringify({ tags }));
}
// Set cache with tags
async setWithTags(
key: string,
value: string,
tags: string[],
ttl: number
): Promise<void> {
const multi = this.redis.multi();
multi.set(key, value, 'EX', ttl);
for (const tag of tags) {
multi.sadd(`tag:${tag}`, key);
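// Caveat: this resets the tag set's TTL to this entry's TTL, so a shorter-lived entry can expire the set while longer-lived entries still exist
multi.expire(`tag:${tag}`, ttl);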
}
await multi.exec();
}
}
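The class above publishes a `{ tags }` payload on the `cache:invalidate` channel but does not show the receiving side. One possible shape for it, assuming each instance keeps a small in-process cache, is sketched below; `LocalTaggedCache` and `handleInvalidationMessage` are hypothetical names, and the Redis subscription wiring is deliberately left out.

```typescript
// Hypothetical in-process cache with a tag index, one per application instance.
class LocalTaggedCache {
  private entries = new Map<string, string>();
  private keysByTag = new Map<string, Set<string>>();

  set(key: string, value: string, tags: string[]): void {
    this.entries.set(key, value);
    for (const tag of tags) {
      let keys = this.keysByTag.get(tag);
      if (!keys) {
        keys = new Set<string>();
        this.keysByTag.set(tag, keys);
      }
      keys.add(key);
    }
  }

  get(key: string): string | undefined {
    return this.entries.get(key);
  }

  // Removes every entry associated with any of the tags; returns the count removed.
  evictByTags(tags: string[]): number {
    let removed = 0;
    for (const tag of tags) {
      const keys = this.keysByTag.get(tag);
      if (!keys) continue;
      keys.forEach((key) => {
        if (this.entries.delete(key)) removed++;
      });
      this.keysByTag.delete(tag);
    }
    return removed;
  }
}

// Handles one message body from the 'cache:invalidate' channel.
// Malformed payloads are ignored rather than thrown.
function handleInvalidationMessage(cache: LocalTaggedCache, message: string): number {
  let payload: unknown;
  try {
    payload = JSON.parse(message);
  } catch {
    return 0;
  }
  if (typeof payload !== 'object' || payload === null) return 0;
  const tags = (payload as { tags?: unknown }).tags;
  if (!Array.isArray(tags)) return 0;
  return cache.evictByTags(tags.filter((t): t is string => typeof t === 'string'));
}
```

With ioredis, a handler like this would typically be attached to the duplicated connection through `subscribe('cache:invalidate')` and an `on('message', ...)` listener.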
Interview Question: “What’s the hardest problem in caching?”
Strong Answer: “Cache invalidation, as in Phil Karlton’s famous quote. The challenge is keeping cached data consistent with the source of truth. Strategies depend on consistency requirements:
- TTL-based: Simplest, but data stays stale until expiry. Good for data that can be eventually consistent.
- Event-driven: Publish an invalidation whenever data changes. Best when changes must be visible almost immediately, but adds complexity and points of failure.
- Cache tags: Associate entries with tags, then invalidate all entries with a tag. Great for related data (invalidate user:123’s posts when the user changes).
- Versioned keys: Include a version in the cache key and increment it on write. Reads always get fresh data, but old versions linger as orphaned entries until they expire or are evicted.
The right answer depends on business requirements. Stock prices need real-time accuracy; a product catalog can be minutes stale.”
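Of the four strategies, versioned keys are the least obvious in code. A minimal sketch, assuming an in-memory version counter (in practice the counter would usually live in a shared store); the names `versions`, `versionedKey`, and `bumpVersion` are hypothetical.

```typescript
// In-memory stand-ins for a shared version counter and a cache (assumptions).
const versions = new Map<string, number>();
const versionedCache = new Map<string, string>();

function currentVersion(entity: string): number {
  return versions.get(entity) ?? 1;
}

// Builds a key that embeds the entity's current version.
function versionedKey(entity: string, suffix: string): string {
  return `${entity}:v${currentVersion(entity)}:${suffix}`;
}

// Called on write: readers immediately start using a new key,
// so entries under the old version are never read again.
function bumpVersion(entity: string): number {
  const next = currentVersion(entity) + 1;
  versions.set(entity, next);
  return next;
}
```

After `bumpVersion('user:123')`, `versionedKey('user:123', 'posts')` changes from `user:123:v1:posts` to `user:123:v2:posts`. The old entry is not deleted; it stays in the cache until its TTL or the eviction policy removes it, which is why versioned keys should still carry an expiration.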
Cache Key Design Patterns
Proper cache keys prevent collisions and enable efficient invalidation:
// cache-keys.ts - Structured cache key generation
import { createHash } from 'node:crypto';
const CACHE_KEYS = {
// User-specific data
user: (id: string) => `user:${id}`,
userProfile: (id: string) => `user:${id}:profile`,
userPreferences: (id: string) => `user:${id}:preferences`,
// Tenant-scoped data
tenantUsers: (tenantId: string) => `tenant:${tenantId}:users`,
tenantSettings: (tenantId: string) => `tenant:${tenantId}:settings`,
// Query results with parameters
userSearch: (tenantId: string, query: string, page: number) =>
`tenant:${tenantId}:search:users:${hashQuery(query)}:p${page}`,
// Version-tagged (for cache busting)
config: (version: string) => `config:v${version}`,
// Time-bucketed (auto-expires by design)
metrics: (date: string) => `metrics:${date}`,
};
// Hash long query strings to keep keys short
function hashQuery(query: string): string {
return createHash('md5').update(query).digest('hex').slice(0, 8);
}
| Pattern | Example | Use Case |
|---|---|---|
| Hierarchical | user:123:profile | Related data, wildcard invalidation |
| Tenant-scoped | tenant:abc:users | Multi-tenant isolation |
| Version-tagged | config:v3 | Config updates |
| Hash-based | search:a1b2c3d4 | Long query parameters |
| Time-bucketed | metrics:2025-01-01 | Time-series data |
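Hash-based keys only help if equivalent queries hash to the same value, so inputs are usually normalized first. The sketch below is one way to do that; `normalizeQuery` and `searchKey` are hypothetical helpers, and lowercasing the values assumes the search is case-insensitive.

```typescript
import { createHash } from 'node:crypto';

type QueryParams = Record<string, string | number | boolean | undefined>;

// Produces a canonical string: undefined values dropped, names sorted,
// values trimmed, lowercased (assumes case-insensitive search), and URL-encoded.
function normalizeQuery(params: QueryParams): string {
  return Object.keys(params)
    .filter((name) => params[name] !== undefined)
    .sort()
    .map((name) => {
      const value = String(params[name]).trim().toLowerCase();
      return `${name}=${encodeURIComponent(value)}`;
    })
    .join('&');
}

// Builds a tenant-scoped, hash-based key from the canonical query string.
function searchKey(tenantId: string, params: QueryParams): string {
  const digest = createHash('sha256')
    .update(normalizeQuery(params))
    .digest('hex')
    .slice(0, 16);
  return `tenant:${tenantId}:search:${digest}`;
}
```

With this normalization, `{ q: ' Alice ', page: 2 }` and `{ page: 2, q: 'alice' }` map to the same key, while the tenant prefix keeps identical queries from different tenants apart.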
Common Caching Mistakes
| Mistake | Problem | Fix |
|---|---|---|
| Cache everything | Memory exhaustion | Cache hot data, let cold data fall through |
| No TTL | Stale data forever | Always set expiration |
| Cache user data publicly | Privacy violation | Use private for user-specific data |
| Ignoring Vary header | Wrong cached response | Vary: Authorization for user data |
| Cache validation errors | 4xx responses cached | Only cache successful responses |
| No cache metrics | Can’t measure effectiveness | Track hit/miss ratio |
| String keys with spaces | Cache key collisions | Normalize and hash keys |
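Several of these fixes can be enforced in code rather than by convention. The sketch below shows a deliberately conservative guard for a shared cache plus a hit/miss counter; `ResponseInfo`, `isStorableInSharedCache`, and `CacheMetrics` are hypothetical names, and the guard is a simplified policy rather than a full implementation of the HTTP caching rules.

```typescript
// Hypothetical summary of the request/response pair being considered.
interface ResponseInfo {
  status: number;
  cacheControl?: string;     // value of the response Cache-Control header
  hasAuthorization: boolean; // whether the request carried an Authorization header
}

// Conservative policy: only successful, non-private responses are stored,
// and authenticated requests require an explicit opt-in from the response.
function isStorableInSharedCache(res: ResponseInfo): boolean {
  if (res.status < 200 || res.status >= 300) return false;
  const directives = (res.cacheControl ?? '')
    .toLowerCase()
    .split(',')
    .map((d) => d.trim());
  if (directives.includes('no-store') || directives.includes('private')) return false;
  if (res.hasAuthorization) {
    const optedIn = directives.some(
      (d) => d === 'public' || d === 'must-revalidate' || d.startsWith('s-maxage=')
    );
    if (!optedIn) return false;
  }
  return true;
}

// Minimal hit/miss tracking for measuring cache effectiveness.
class CacheMetrics {
  hits = 0;
  misses = 0;

  record(hit: boolean): void {
    if (hit) this.hits++;
    else this.misses++;
  }

  hitRatio(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```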
Cache Warming Strategies
Pre-populate cache before traffic hits:
// cache-warmer.ts - Proactive cache warming
// Assumes `db`, `cache`, and the warm*/computeProductData helper methods are defined elsewhere
export class CacheWarmer {
async warmOnDeploy(): Promise<void> {
// Warm frequently accessed, expensive-to-compute data
const tasks = [
this.warmPopularProducts(),
this.warmCategoryTrees(),
this.warmConfigData(),
];
await Promise.all(tasks);
}
async warmOnSchedule(): Promise<void> {
// Run every hour - refresh data before TTL expires
const products = await db.product.findMany({
where: { views: { gte: 1000 } },
take: 100,
});
for (const product of products) {
const data = await this.computeProductData(product.id);
await cache.set(`product:${product.id}`, data, 3600);
}
}
async warmOnWrite(productId: string): Promise<void> {
// Immediately cache after database write
const data = await this.computeProductData(productId);
await cache.set(`product:${productId}`, data, 3600);
}
}
| Strategy | When | Benefit |
|---|---|---|
| On-deploy | After deployment | Prevents cold start latency |
| Scheduled | Before TTL expires | Always-warm cache |
| On-write | After data changes | Instant consistency |
| On-miss | Cache aside pattern | Simple, lazy loading |
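For the scheduled strategy, a common refinement is to refresh only the entries that are close to expiring instead of recomputing everything on every run. The following sketch selects those entries; the `TrackedEntry` shape, the `keysDueForRefresh` name, and the 80% threshold are assumptions for illustration.

```typescript
// Hypothetical bookkeeping record for a cached entry.
interface TrackedEntry {
  key: string;
  storedAt: number; // epoch milliseconds when the entry was written
  ttlMs: number;
}

// Returns the keys whose age has reached the given fraction of their TTL.
// With refreshFraction = 0.8, a 1-hour entry becomes due after 48 minutes.
function keysDueForRefresh(
  entries: TrackedEntry[],
  now: number,
  refreshFraction: number = 0.8
): string[] {
  return entries
    .filter((entry) => now - entry.storedAt >= entry.ttlMs * refreshFraction)
    .map((entry) => entry.key);
}
```

A scheduled job could pass the returned keys to the same recompute-and-set logic used for on-write warming, so hot entries are replaced before readers ever see a miss.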
What’s Next?
Now that you understand caching, Part 6: GraphQL & gRPC covers modern API protocols beyond REST.