stale data, served with confidence
Cache Invalidation
Invalidation isn't a footnote. It's business logic: a work order that needs a process and has to happen now. It is also one of the proverbial two hard problems in computer science.
A write that does not invalidate is a bug waiting for the next read.
01 in the wild
In the wild
Stale Reads After a Write
Cache forever, never invalidate, and every reader sees the pre-update world.
example.py
cache = {}

# SMELL: cache forever, never invalidate -> stale reads
def get_user(uid):
    if uid not in cache:
        cache[uid] = db.fetch(uid)
    return cache[uid]  # update in db? caller never sees it

# RIGHT: invalidate on write; treat it as part of the work order
def update_user(uid, data):
    db.write(uid, data)
    cache.pop(uid, None)  # the recall happens now

A write that doesn't invalidate the cache is a bug waiting for the next read. Invalidation is a process, and it happens right now.
// observed
no invalidation: reads show pre-update data forever
on-write pop: next read is fresh from db
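To see both behaviours side by side, here is a minimal runnable sketch. `FakeDB` is a hypothetical in-memory stand-in for the `db` object used above, and `update_user_stale` is a name invented here for the version that forgets the cache. Neither comes from a real library.

```python
class FakeDB:
    """Hypothetical in-memory stand-in for the real database."""

    def __init__(self):
        self.rows = {}

    def fetch(self, uid):
        return self.rows.get(uid)

    def write(self, uid, data):
        self.rows[uid] = data


db = FakeDB()
cache = {}


def get_user(uid):
    # Read-through: fill the cache on a miss, serve from it afterwards.
    if uid not in cache:
        cache[uid] = db.fetch(uid)
    return cache[uid]


def update_user_stale(uid, data):
    # SMELL: the write never touches the cache.
    db.write(uid, data)


def update_user(uid, data):
    # RIGHT: drop the cached entry as part of the write.
    db.write(uid, data)
    cache.pop(uid, None)


db.write(1, {"name": "Ada"})
first_read = get_user(1)          # warms the cache

update_user_stale(1, {"name": "Grace"})
stale_read = get_user(1)          # still the cached, pre-update row

update_user(1, {"name": "Linus"})
fresh_read = get_user(1)          # cache miss, reloaded from db
```

After the stale update, the database holds the new row while `get_user` keeps returning the old one. Only the invalidating write brings the two back in line.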
example.go
// SMELL: write the DB, forget the cache -> stale forever
func Update(id string, v Value) {
	db.Write(id, v) // cache still holds the old value
}

// RIGHT: invalidate (or update) the cache as part of the write
func Update(id string, v Value) {
	db.Write(id, v)
	cache.Delete(id) // next read repopulates from db
}

The cache is part of your write path, not a side concern. If a write doesn't touch it, the cache lies until eviction.
// observed
no delete: cached read stays stale indefinitely
with delete: next read reflects the write
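One way to keep the cache on the write path is to route every write through a single wrapper that owns both the store and the cache. The sketch below is an assumption-laden illustration, not an established API: `CachedStore`, `set_invalidate`, and `set_write_through` are hypothetical names, and a plain dict stands in for the database. It shows both options named above: deleting the entry, or updating it in place.

```python
class CachedStore:
    """Hypothetical wrapper: every write also touches the cache."""

    def __init__(self, backing):
        self._backing = backing   # dict-like stand-in for the database
        self._cache = {}

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._backing[key]
        return self._cache[key]

    def set_invalidate(self, key, value):
        # Write, then delete: the next get() reloads from the backing store.
        self._backing[key] = value
        self._cache.pop(key, None)

    def set_write_through(self, key, value):
        # Write, then overwrite the cached copy in the same call.
        self._backing[key] = value
        self._cache[key] = value


backing = {"a": 1}
store = CachedStore(backing)

warm = store.get("a")                 # loads 1 into the cache
store.set_invalidate("a", 2)
after_invalidate = store.get("a")     # reloaded from backing
store.set_write_through("a", 3)
after_write_through = store.get("a")  # served from the updated cache

backing["a"] = 4                      # a write that bypasses the wrapper
after_bypass = store.get("a")         # stale: the cache was never told
```

The last two lines show the failure mode the wrapper is meant to prevent: any write that goes around it leaves the cache holding the old value. With concurrent writers, updating in place can also leave the cache and the store disagreeing if the two steps interleave, which is one reason deleting is often the more conservative choice.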
Thundering Herd on Expiry
Everything expires at the same instant and every miss stampedes the database.
example.go
// SMELL: TTL with no jitter -> thundering herd on expiry
cache.Set(key, val, 60*time.Second) // 10k keys expire at once

// RIGHT: add jitter so expirations spread out
ttl := 60*time.Second + time.Duration(rand.Intn(15))*time.Second
cache.Set(key, val, ttl)

If everything expires at the same instant, every miss hits the database simultaneously. Jittered TTLs smear the load.
// observed
fixed TTL: synchronized stampede on the DB
jitter: misses spread across a 15s window
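The effect of jitter can be checked with a small simulation. The sketch below assumes 10,000 keys written at the same moment with the 60-second base TTL and 15-second jitter from the snippet above. `jittered_ttl` and `peak_per_second` are hypothetical helper names, and the random generator is seeded only so the run is repeatable.

```python
import random

BASE_TTL = 60.0     # seconds, matching the fixed TTL above
MAX_JITTER = 15.0   # seconds, matching the jitter window above


def jittered_ttl(base=BASE_TTL, max_jitter=MAX_JITTER, rng=random):
    """Return base plus a uniform random offset of up to max_jitter seconds."""
    return base + rng.uniform(0.0, max_jitter)


def peak_per_second(expiries):
    """Largest number of keys that expire within any one-second bucket."""
    buckets = {}
    for t in expiries:
        second = int(t)
        buckets[second] = buckets.get(second, 0) + 1
    return max(buckets.values())


rng = random.Random(42)  # seeded for repeatability only

# All keys are written at t = 0, so the expiry time equals the TTL.
fixed_expiries = [BASE_TTL for _ in range(10_000)]
jittered_expiries = [jittered_ttl(rng=rng) for _ in range(10_000)]

fixed_peak = peak_per_second(fixed_expiries)        # every key in one bucket
jittered_peak = peak_per_second(jittered_expiries)  # spread over the window
```

With a fixed TTL, all 10,000 expirations land in the same second. With jitter, the worst second sees only a fraction of them, which is the load the database has to absorb at once.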
example.py
# SMELL: every concurrent miss recomputes the same hot key
def get(key):
    v = cache.get(key)
    if v is None:
        v = expensive(key)  # 1000 misses -> 1000 calls
        cache[key] = v
    return v

# RIGHT: single-flight -- one loader, the rest wait on it
def get(key):
    with inflight.lock(key):  # per-key lock
        v = cache.get(key)
        if v is None:
            v = expensive(key)
            cache[key] = v
        return v

When a hot key goes cold, every concurrent reader recomputes it. A per-key lock collapses the herd to a single load.
// observed
naive: N concurrent misses -> N expensive() calls
single-flight: 1 call, the rest reuse the result
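The `inflight.lock(key)` helper above is not defined in the snippet. One possible way to build it with the standard `threading` module is sketched below. The names `_lock_for`, `_key_locks`, `calls`, and `results` are hypothetical, and `expensive` is a stand-in loader that sleeps briefly so the threads actually overlap.

```python
import threading
import time

cache = {}
_key_locks = {}
_key_locks_guard = threading.Lock()
calls = 0
results = []


def expensive(key):
    # Hypothetical slow loader; counts how many times it runs.
    global calls
    calls += 1
    time.sleep(0.05)
    return f"value-for-{key}"


def _lock_for(key):
    # One lock per key, created on first use under a guard lock.
    with _key_locks_guard:
        return _key_locks.setdefault(key, threading.Lock())


def get(key):
    v = cache.get(key)
    if v is not None:
        return v                  # fast path: a hit takes no lock
    with _lock_for(key):
        v = cache.get(key)        # re-check: another thread may have loaded it
        if v is None:
            v = expensive(key)
            cache[key] = v
        return v


def worker():
    results.append(get("hot"))


threads = [threading.Thread(target=worker) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Fifty threads ask for the same cold key, one of them runs `expensive`, and the rest wait on the per-key lock and then read the cached result. This sketch never removes entries from `_key_locks`, so a long-running service with many distinct keys would need some cleanup strategy on top of it.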
02 cross-pollination
Where this compounds
Nondeterminism
- Stale-Cache Heisenbug × Impure Functions
- Hash-Seed Serialization Divergence × Impure Functions
Data Corruption
- Web Cache Poisoning × Unconstrained Inputs
03 weakness catalog
Mapped weaknesses (CWE)
On its own, this defect is catalogued by MITRE as one or more of these weaknesses. The exploitable vulnerability usually appears only when it chains or combines with another.