A practical guide to caching on the web

Web caching stores responses so later requests can be answered faster, with less bandwidth and less load on the origin. The hard part is not turning caching on. The hard part is deciding which responses may be reused, for how long, and by whom. This guide covers the controls that decide that, sensible defaults by response type, and how to debug what a cache is actually doing.

What can cache a response

A response can be cached by the browser, a shared proxy, a CDN, or a service worker. Browser caches are private to one user. Shared caches can serve many users from a single stored copy. Service worker caches are controlled by application code within an origin.

Shared caches need stricter rules because one stored response can be reused for another user. Personalised responses should be marked private, or not stored at all, unless the application has a safe and deliberate design for sharing them.

Freshness and validation

A cached response is fresh when its caching metadata says it can be reused without contacting the origin. Freshness is commonly controlled with Cache-Control max-age or s-maxage.
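As a rough illustration of the freshness rule, the sketch below compares a stored response's age with its explicit lifetime. The function names and parameters are hypothetical, and the sketch deliberately ignores Expires and heuristic freshness, which real caches may also use.

```python
from typing import Optional


def freshness_lifetime(max_age: Optional[int], s_maxage: Optional[int],
                       shared: bool) -> Optional[int]:
    """Pick the explicit lifetime in seconds, or None if none was given."""
    # In a shared cache, s-maxage takes precedence over max-age.
    if shared and s_maxage is not None:
        return s_maxage
    return max_age


def is_fresh(age_seconds: int, max_age: Optional[int],
             s_maxage: Optional[int], shared: bool) -> bool:
    """A response is fresh while its age is below its freshness lifetime."""
    lifetime = freshness_lifetime(max_age, s_maxage, shared)
    return lifetime is not None and age_seconds < lifetime
```

With max-age=60 and s-maxage=600, a response that is 100 seconds old would count as fresh in a shared cache but stale in a browser cache.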

A stale response may still be useful if it can be validated. Validation asks the origin whether the stored response is still current. ETag with If-None-Match is the usual and more precise tool. Last-Modified with If-Modified-Since is also common, but its timestamps have one-second resolution, so it can miss changes made within the same second.

When the origin returns 304 Not Modified, the cache reuses the stored body and updates the stored header fields from the 304 response.
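The origin side of validation can be sketched as follows. This is a minimal illustration, not a production handler: make_etag, etag_matches, and respond are hypothetical names, the ETag is assumed to be derived from a hash of the body, and the comma split does not handle every legal ETag value.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body (one possible scheme, assumed here)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def etag_matches(if_none_match: str, etag: str) -> bool:
    """Weak comparison: ignore a W/ prefix and accept any listed tag or '*'."""
    if if_none_match.strip() == "*":
        return True

    def opaque(tag: str) -> str:
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    # Naive split: assumes no commas inside the quoted tags.
    return opaque(etag) in [opaque(t) for t in if_none_match.split(",")]


def respond(body: bytes,
            if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None and etag_matches(if_none_match, etag):
        return 304, headers, b""  # validator matched: send no body
    return 200, headers, body
```

A first request gets a 200 with an ETag. A later request that echoes the ETag in If-None-Match gets a 304 with an empty body.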

Cache-Control is the main control surface

Cache-Control carries directives for caches.

  • max-age defines how long a response stays fresh, in seconds.
  • s-maxage applies to shared caches and overrides max-age there. Private caches ignore it.
  • no-store tells caches of any kind not to store the response.
  • no-cache means the response may be stored, but it must be validated with the origin before each reuse. It does not mean do not cache.
  • private means the response is intended for a single user and may be stored only in a private cache, never a shared one.
  • public means the response may be stored by a shared cache, subject to the rest of the rules.

Use these directives deliberately. no-cache and no-store are often confused, but they are not equivalent. One allows storage with mandatory revalidation, the other forbids storage entirely.
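The difference between the storage directives can be made concrete with a small sketch. The helper names are hypothetical, the parser is naive (it does not handle commas inside quoted arguments), and may_store covers only the directives listed above, not the other conditions a real cache checks.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def may_store(directives: Dict[str, Optional[str]], shared: bool) -> bool:
    """no-store forbids all storage; private forbids shared storage only."""
    if "no-store" in directives:
        return False
    if shared and "private" in directives:
        return False
    return True


def must_validate_before_reuse(directives: Dict[str, Optional[str]]) -> bool:
    """no-cache allows storage but requires validation on every reuse."""
    return "no-cache" in directives
```

Under this sketch, a no-cache response may be stored but must be validated before each reuse, while a no-store response is never stored at all.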

Good defaults by response type

Static assets with content-hashed filenames can usually be cached for a long time, because a content change produces a new URL. The common pattern is a long max-age plus immutable for hashed assets. The immutable directive tells caches the response will not change while it is fresh, so they can skip revalidation during that time.

HTML documents usually need shorter freshness or validation, because they point to the current asset graph. APIs vary. Public, read-only data may be cacheable. User-specific data should normally be private or no-store, depending on sensitivity.

Authentication pages, account pages, and responses containing secrets should use no-store unless there is a specific, reviewed reason not to. Note that the public directive permits a shared cache to store and reuse a response to an authenticated request, so use it with care.
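One way to keep these defaults reviewable is a single lookup table at the origin. The category names and lifetimes below are illustrative assumptions, not recommendations for any particular site, and falling back to no-store for unknown categories is a deliberately conservative choice.

```python
# Hypothetical response categories mapped to Cache-Control values.
DEFAULT_POLICIES = {
    "hashed_asset": "public, max-age=31536000, immutable",  # one year
    "html": "no-cache",                # store, but validate on every reuse
    "public_api": "public, max-age=60",
    "user_api": "private, no-cache",   # browser only, always validated
    "sensitive": "no-store",           # login, account, secrets
}


def cache_control_for(kind: str) -> str:
    """Return the Cache-Control value for a category, defaulting to no-store."""
    return DEFAULT_POLICIES.get(kind, "no-store")
```

A handler would then set the header from cache_control_for("html") or similar, so that every policy lives in one place.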

Vary changes the cache key

Vary tells caches which request header fields affect the selected response. For example, Vary: Accept-Encoding is used when compressed and uncompressed variants exist. Vary: Origin matters when CORS policy depends on the Origin request header.

Use Vary carefully. A broad Vary value can wreck cache effectiveness by creating many variants. A missing Vary value can make a cache reuse the wrong response.
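The effect of Vary on the cache key can be sketched like this. It is a simplification: real caches usually look up by URL first and then match the stored Vary fields, and they may normalise header values more carefully. The function name and the example.test URL are hypothetical.

```python
from typing import Dict, Optional, Tuple


def cache_key(method: str, url: str, request_headers: Dict[str, str],
              vary: str) -> Optional[Tuple]:
    """Build a key from the method, URL, and the request headers named in Vary."""
    # Header field names are case-insensitive.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    if "*" in names:
        return None  # Vary: * never matches a later request
    variant = tuple((n, lowered.get(n, "").strip()) for n in names)
    return (method.upper(), url, variant)
```

Two requests that differ only in a header not listed in Vary produce the same key. Two requests that differ in a listed header produce different keys, and therefore separate stored variants.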

CDNs and shared caches

CDNs are shared caches placed close to users. They can cache static assets, public pages, and selected API responses, hide origin latency, and absorb traffic spikes.

Be explicit about shared cache behaviour. s-maxage can let a CDN hold a response for longer than the browser. private prevents shared caching when a response is user-specific. no-store covers responses that must not be stored anywhere.

Do not rely on a CDN default for sensitive content. Set headers at the origin so behaviour is portable and reviewable.

Revalidation is not a failure

A 304 response is a successful optimisation, not a miss. The client still contacts a cache or the origin, but it avoids downloading the full body when the stored copy is current.

This is useful for HTML, API responses, and assets whose URL does not change with content. It gives correctness without forcing a full download on every request.
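On the cache side, handling a 304 amounts to keeping the stored body and refreshing the stored header fields. The sketch below assumes a hypothetical StoredResponse type and skips Content-Length, which describes the stored body rather than the empty 304. A real cache excludes a few other fields as well.

```python
from dataclasses import dataclass
from typing import Dict

# Fields the sketch refuses to overwrite from a 304 (simplified list).
NOT_UPDATED = {"content-length"}


@dataclass
class StoredResponse:
    status: int
    headers: Dict[str, str]
    body: bytes


def apply_not_modified(stored: StoredResponse,
                       headers_304: Dict[str, str]) -> StoredResponse:
    """Return a copy of the stored response with headers refreshed by a 304."""
    merged = {name.lower(): value for name, value in stored.headers.items()}
    for name, value in headers_304.items():
        if name.lower() not in NOT_UPDATED:
            merged[name.lower()] = value
    # The status and body stay as stored; only the metadata changes.
    return StoredResponse(stored.status, merged, stored.body)
```

The refreshed Cache-Control and Date values restart the freshness window without the body being sent again.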

Invalidation is harder than expiry

Expiry lets cached content age out naturally. Invalidation tries to remove or replace cached content before its normal lifetime ends. CDNs often provide purge APIs, but invalidation across browsers is harder, because browser caches are distributed across user devices.

For assets, prefer versioned or hashed URLs. For documents and APIs, use shorter freshness windows and validation. Where a stale response would be harmful, design URLs and headers so clients are never stuck with a long-lived incorrect response.
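A hashed URL can be produced by embedding a digest of the content in the filename. Build tools normally do this for you. The sketch below only illustrates the idea, and the function name, the SHA-256 choice, and the eight-character digest length are assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content digest before the file extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    # "static/app.js" becomes "static/app.<digest>.js"
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Because any change to the content changes the digest, the new version gets a new URL, and copies cached under the old URL never need to be invalidated.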

Debugging cache behaviour

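Inspect response headers first. Check Cache-Control, ETag, Last-Modified, Age, Expires, and Vary. In browser developer tools, confirm whether the response came from memory cache, disk cache, a service worker, or the network. Whether a CDN answered from its own cache usually shows up only in response headers, such as Age or a provider-specific cache status header.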

Also inspect request headers. Conditional requests carry If-None-Match or If-Modified-Since. Cache bypasses may include Cache-Control request directives sent by the browser or developer tools.
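A first pass over response headers can be automated. The checker below is a hypothetical helper that works on a plain dictionary of headers, however they were captured. Its notes are hints for a human reader, not a full model of cache behaviour.

```python
from typing import Dict, List


def describe_caching(headers: Dict[str, str]) -> List[str]:
    """Return short notes about what the response headers imply for caches."""
    h = {name.lower(): value for name, value in headers.items()}
    notes: List[str] = []
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in h.get("cache-control", "").split(",")
        if part.strip()
    }
    if not directives:
        notes.append("no Cache-Control: a cache may use Expires or heuristic freshness")
    if "no-store" in directives:
        notes.append("no-store: must not be stored")
    if "no-cache" in directives:
        notes.append("no-cache: stored copies must be validated before reuse")
    if "private" in directives:
        notes.append("private: shared caches must not store this")
    if "etag" not in h and "last-modified" not in h:
        notes.append("no validator: a stale copy cannot be revalidated")
    if "age" in h:
        notes.append("Age present: served by a cache, not freshly by the origin")
    if "vary" in h:
        notes.append("Vary: " + h["vary"] + " is part of the cache key")
    return notes
```

Running it against the headers of a surprising response gives a quick checklist of what to look at next.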

Conclusion

Caching works best when the policy matches the data. Cache hashed static assets for a long time, validate HTML and changeable API responses, keep personalised data private, and avoid storing secrets. Use Cache-Control, validators, and Vary as the explicit contract between origin, browser, and shared caches.