Caching Strategy
Caching makes things fast by keeping a copy of data closer to where it is needed. It is also where two of the worst kinds of bug live: stale data and, worse for us, one tenant seeing another's data through a shared cache key. Cache on purpose, key correctly (including the tenant), and always know how the cache gets invalidated.
A cache trades freshness for speed. That trade is often worth it, but it is still a trade. So every cache needs three decisions made on purpose: what to cache, how it is keyed, and when it expires or is invalidated. There is an old saying that the two hardest things in computer science are cache invalidation and naming things, and it is largely true.
For a multi-tenant platform the cache key is a security concern, not just a correctness one. A key that leaves out the tenant can serve one customer's data to another straight from memory, bypassing every database check (see Multi-Tenancy). Get the key right first, then worry about speed.
Cache correctly and safely
- Always: Include the tenant (and the user or role where relevant) in the cache key for any tenant- or user-scoped data. A tenant-blind key can leak data across customers.
- Do: Decide expiry and invalidation up front (a TTL, an explicit invalidation on change, or both) so stale data has a limited life.
- Do: Cache the right things: expensive, frequently read, rarely changing data. Do not cache cheap lookups or fast-changing values where staleness causes bugs.
- Do: Protect cached sensitive data as you protect the source. The cache is another copy with the same classification, so the same encryption, access, and retention rules apply (see Data Classification).
- Avoid: Caching highly sensitive data (secrets, special-category data) unless you genuinely must, and then only with extra care and short lifetimes.
- Never: Cache personalised or tenant-scoped responses under a shared key, or at a shared layer (CDN or proxy) that could serve them to the wrong user.
// Unsafe: the key identifies the customer but not the tenant.
var key = $"customer:{id}";
return cache.GetOrAdd(key, () => Load(id));
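If ids are not globally unique across tenants, two tenants' customers can share a key, and one tenant can be served another's cached customer. Even where ids are unique, a cache hit returns the entry without running the database's tenant checks at all. Either way, the result is a silent cross-tenant data leak.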
// Safer: tenant in the key, a five-minute TTL, and a tenant-scoped loader.
var key = $"{ctx.TenantId}:customer:{id}";
return cache.GetOrAdd(key, TimeSpan.FromMinutes(5),
    () => Load(ctx.TenantId, id));
The tenant is part of the key, the entry expires, and the loader is itself tenant-scoped. Fast and isolated.
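One way to make the rules above hard to break is to put them behind a small wrapper, so callers cannot build a key without a tenant or store an entry without a TTL. The sketch below is illustrative only: TenantCache, its method names, and the injectable clock are hypothetical and not part of any existing library, and it assumes tenant ids never contain the ':' separator.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical in-memory wrapper: every entry is tenant-keyed, expires, and can be invalidated.
public sealed class TenantCache<TValue>
{
    private readonly ConcurrentDictionary<string, (TValue Value, DateTimeOffset ExpiresAt)> _entries = new();
    private readonly Func<DateTimeOffset> _now;

    public TenantCache() : this(() => DateTimeOffset.UtcNow) { }

    // The clock is injectable so expiry can be exercised without waiting.
    public TenantCache(Func<DateTimeOffset> now) => _now = now;

    // The tenant is a required argument, so a tenant-blind key cannot be built by accident.
    // Assumption: tenant ids do not contain ':'.
    private static string Key(string tenantId, string kind, string id)
    {
        if (string.IsNullOrWhiteSpace(tenantId))
            throw new ArgumentException("Tenant id is required.", nameof(tenantId));
        return $"{tenantId}:{kind}:{id}";
    }

    public TValue GetOrAdd(string tenantId, string kind, string id, TimeSpan ttl, Func<TValue> load)
    {
        var key = Key(tenantId, kind, id);
        if (_entries.TryGetValue(key, out var hit) && hit.ExpiresAt > _now())
            return hit.Value; // fresh entry for this tenant

        // Miss or expired: reload from the (tenant-scoped) source and store with a TTL.
        var value = load();
        _entries[key] = (value, _now() + ttl);
        return value;
    }

    // Explicit invalidation, to be called when the underlying record changes.
    public void Invalidate(string tenantId, string kind, string id) =>
        _entries.TryRemove(Key(tenantId, kind, id), out _);
}
```

A caller would write something like cache.GetOrAdd(ctx.TenantId, "customer", id, TimeSpan.FromMinutes(5), () => Load(ctx.TenantId, id)) and call Invalidate from the code path that updates the customer. This sketch does not coalesce concurrent loads for the same key.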
Keep it from causing bugs
- Do: Make the system correct without the cache. Treat caching as an optimisation, so a cache miss or flush hurts performance, not correctness.
- Do: Handle the cache being unavailable gracefully (fall back to the source), and avoid stampedes when a popular key expires.
- Consider: Caching at the layer that gives the best hit rate for the least staleness risk, and keeping TTLs short where freshness matters.
- Avoid: Caching something whose staleness has security or compliance impact (for example, permissions or a customer's risk status) without very careful, short-lived invalidation.
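The first two points in this list can be sketched as a cache-aside reader that treats a cache failure as a miss and lets concurrent misses for the same key share one load. This is a minimal sketch under assumptions, not a definitive implementation: ICacheStore and ResilientReader are hypothetical names, the key passed in is assumed to be tenant-scoped already, and the conditional TryRemove overload requires .NET 5 or later.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical minimal cache abstraction; a real one might wrap a distributed cache.
public interface ICacheStore<T>
{
    Task<(bool Found, T Value)> TryGetAsync(string key);
    Task SetAsync(string key, T value, TimeSpan ttl);
}

public sealed class ResilientReader<T>
{
    private readonly ICacheStore<T> _cache;

    // One in-flight load per key, so concurrent misses share a single source call.
    private readonly ConcurrentDictionary<string, Lazy<Task<T>>> _inFlight = new();

    public ResilientReader(ICacheStore<T> cache) => _cache = cache;

    public async Task<T> GetAsync(string key, TimeSpan ttl, Func<Task<T>> loadFromSource)
    {
        try
        {
            var (found, value) = await _cache.TryGetAsync(key);
            if (found) return value;
        }
        catch (Exception)
        {
            // Cache unavailable: treat it as a miss and fall through to the source.
        }

        var lazy = _inFlight.GetOrAdd(key,
            _ => new Lazy<Task<T>>(() => LoadAndStoreAsync(key, ttl, loadFromSource)));
        try
        {
            return await lazy.Value;
        }
        finally
        {
            // Remove only this load, never a newer one registered under the same key.
            _inFlight.TryRemove(new KeyValuePair<string, Lazy<Task<T>>>(key, lazy));
        }
    }

    private async Task<T> LoadAndStoreAsync(string key, TimeSpan ttl, Func<Task<T>> loadFromSource)
    {
        var value = await loadFromSource();
        try
        {
            await _cache.SetAsync(key, value, ttl);
        }
        catch (Exception)
        {
            // A failed cache write must not fail the read.
        }
        return value;
    }
}
```

The coalescing here is per process only. With several application instances, each can still issue one load per expired key, so a very hot key may also need a distributed lock or jittered TTLs, which this sketch does not attempt.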
Self-review checklist
- Ask: Does the cache key include the tenant (and user) for anything scoped to them?
- Ask: How and when does this entry get invalidated or expire?
- Ask: Is the system still correct if the cache is empty or down?
- Ask: Does staleness here cause only a delay, or a security or compliance problem?