Cache Adapters¶

Advanced caching patterns: consistent hashing, stampede protection, two-tier caching, and full-page HTTP caching. All adapters use PostgreSQL UNLOGGED tables or in-memory storage.

Overview¶

HyperDjango's cache adapter system builds on top of the core cache framework (LocMemCache and DatabaseCache) with production-ready distributed caching patterns:

Adapter	Purpose
`ConsistentHashRing`	Distribute keys across multiple cache nodes
`StampedeProtection`	Prevent thundering herd on cache miss (XFetch algorithm)
`TwoTierCache`	L1 in-memory + L2 database layered cache
`CacheMiddleware`	Full-page HTTP response caching

All adapters implement the CacheAdapter protocol:

from typing import Any, Protocol

class CacheAdapter(Protocol):
    def get(self, key: str, default: Any = None) -> Any: ...
    def set(self, key: str, value: Any, ttl: int | None = None) -> None: ...
    def delete(self, key: str) -> bool: ...
    def clear(self) -> None: ...
    def has(self, key: str) -> bool: ...

ConsistentHashRing¶

Native Zig ketama-compatible hash ring for distributing cache keys across multiple nodes. Ring points are stored in a contiguous sorted array, so lookups are a cache-friendly binary search, with SIMD-optimized scanning on large rings. It benchmarks 3x faster than the Python uhashring package.

Constructor¶

from hyperdjango.cache_adapters import ConsistentHashRing

# Basic: equal-weight nodes
ring = ConsistentHashRing(nodes={
    "cache1": cache_backend_1,
    "cache2": cache_backend_2,
    "cache3": cache_backend_3,
})

# With custom replicas and vnodes
ring = ConsistentHashRing(
    nodes={"shard1": cache1, "shard2": cache2},
    replicas=4,     # Number of hash function replicas per node
    vnodes=40,      # Virtual nodes per real node (higher = better distribution)
)

# With weight function (e.g., more capacity = higher weight)
ring = ConsistentHashRing(
    nodes={"big": cache_big, "small": cache_small},
    weight_fn=lambda name: 3 if name == "big" else 1,
)

Constructor Parameters¶

Parameter	Type	Default	Description
`nodes`	`dict[str, Any]`	`None`	Map of node name to cache backend instance
`replicas`	`int`	`4`	Hash function replicas per node
`vnodes`	`int`	`40`	Virtual nodes per real node
`weight_fn`	`Callable`	`None`	Function returning weight for a node name

Key Methods¶

get_node(key) -- returns the cache backend responsible for a key:

node = ring.get_node("user:42")    # Returns the cache backend object
await node.set("user:42", user_data, ttl=300)

get_node_name(key) -- returns the name of the responsible node:

name = ring.get_node_name("user:42")  # "cache2"

add_node(name, backend, weight=1) -- add a node to the ring:

ring.add_node("cache4", new_cache_backend, weight=2)
# Ring is automatically rebuilt after adding

remove_node(name) -- remove a node from the ring:

ring.remove_node("cache2")
# Keys previously on cache2 are redistributed to remaining nodes

get_stats() -- ring statistics including per-node point distribution:

stats = ring.get_stats()
# {
#     "total_points": 480,
#     "node_count": 3,
#     "distribution": {"cache1": 160, "cache2": 160, "cache3": 160}
# }

hash_key(key) -- hash a key to a ketama-compatible 32-bit integer (static method):

h = ConsistentHashRing.hash_key("user:42")  # 2847361052

Node Distribution¶

The hash ring uses ketama-compatible MD5 hashing. Each node gets vnodes * replicas * weight points on the ring, which is 160 points per node with the defaults (vnodes=40, replicas=4) and a weight of 1. When a key is looked up, it is hashed and the nearest point clockwise determines the owning node.

With default settings (vnodes=40, equal weights), key distribution across 3 nodes is within 2% of perfectly uniform. Increasing vnodes improves uniformity at the cost of slightly more memory.

How It Works¶

Each node is hashed to multiple points on a 32-bit integer ring
Points are stored in a contiguous sorted array (batch sort, not insort)
Key lookup uses binary search on the sorted array (cache-friendly)
The Zig implementation uses SIMD for scanning when the ring is large
Adding/removing nodes triggers a rebuild of the sorted array
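
The steps above can be sketched in pure Python. This is an illustration only, not the library's Zig implementation: `SketchRing` is a hypothetical name, the `"{name}-{i}"` point label is an assumption, and no claim is made that the resulting points are bit-compatible with `ConsistentHashRing`. It assumes each MD5 digest is split into `replicas` 4-byte points, which gives 40 × 4 = 160 points per node and matches the `get_stats()` example.

```python
import bisect
import hashlib
import struct


def hash_key(key: str) -> int:
    """First 4 bytes of the MD5 digest as a little-endian unsigned 32-bit int."""
    return struct.unpack("<I", hashlib.md5(key.encode("utf-8")).digest()[:4])[0]


class SketchRing:
    """Illustrative ketama-style ring mapping keys to node names."""

    def __init__(self, weights: dict[str, int], vnodes: int = 40, replicas: int = 4):
        if not 1 <= replicas <= 4:
            raise ValueError("a 16-byte MD5 digest yields at most four 4-byte points")
        self.weights = dict(weights)
        self.vnodes = vnodes
        self.replicas = replicas
        self._rebuild()

    def _rebuild(self) -> None:
        entries = []
        for name, weight in self.weights.items():
            for i in range(self.vnodes * weight):
                digest = hashlib.md5(f"{name}-{i}".encode("utf-8")).digest()
                for r in range(self.replicas):
                    point = struct.unpack_from("<I", digest, 4 * r)[0]
                    entries.append((point, name))
        entries.sort()  # one batch sort instead of repeated insort calls
        self._points = [point for point, _ in entries]
        self._owners = [name for _, name in entries]

    def get_node_name(self, key: str) -> str:
        # First point at or after the key's hash, wrapping to the start.
        idx = bisect.bisect_left(self._points, hash_key(key))
        if idx == len(self._points):
            idx = 0
        return self._owners[idx]

    def remove_node(self, name: str) -> None:
        del self.weights[name]
        self._rebuild()


ring = SketchRing({"cache1": 1, "cache2": 1, "cache3": 1})
owner = ring.get_node_name("user:42")
```

Removing a node deletes only that node's points, so keys owned by the remaining nodes keep their owner; only the removed node's keys move.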

StampedeProtection (XFetch)¶

Prevents thundering herd on cache miss using the XFetch algorithm. When a cached value approaches expiry, individual requests have an increasing probability of recomputing it -- spreading regeneration across multiple requests instead of all hitting at once.

How XFetch Works¶

The standard cache stampede problem: when a popular key expires, all concurrent requests see a cache miss and simultaneously recompute the value, overwhelming the backend.

XFetch solves this by storing metadata alongside each cached value:

expires_at -- when the value actually expires
compute_time -- how long it took to compute the value

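On each get(), XFetch draws a random number and decides whether to recompute early. The value is treated as expired when:

now - beta * compute_time * ln(random()) >= expires_at

Because ln(random()) is negative for random() in (0, 1), the subtracted term pushes the effective time forward by a random amount. The chance of an early recompute therefore rises as expiry approaches, and it rises faster for values that are expensive to compute.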

The beta parameter controls aggressiveness:

Beta	Behavior
`0.5`	Conservative -- recompute very close to expiry
`1.0`	Default -- balanced early refresh
`2.0`	Aggressive -- start refreshing well before expiry
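
The effect of beta can be checked with a small simulation. The sketch below implements the XFetch check as a standalone function; `should_recompute` and `early_refresh_rate` are hypothetical helpers, not part of `StampedeProtection`, and all times are assumed to be in the same unit (seconds here).

```python
import math
import random


def should_recompute(now, expires_at, compute_time, beta=1.0, rng=random):
    """XFetch check: True when this request should refresh the value early."""
    # 1.0 - random() lies in (0, 1], which avoids log(0).
    r = 1.0 - rng.random()
    # log(r) <= 0, so subtracting it moves "now" forward by a random amount
    # that scales with beta and with how long the value takes to compute.
    return now - beta * compute_time * math.log(r) >= expires_at


def early_refresh_rate(beta, seconds_left, compute_time, trials=20_000, seed=0):
    """Fraction of simulated requests that would recompute early."""
    rng = random.Random(seed)
    hits = sum(
        should_recompute(0.0, seconds_left, compute_time, beta, rng)
        for _ in range(trials)
    )
    return hits / trials


for beta in (0.5, 1.0, 2.0):
    rate = early_refresh_rate(beta, seconds_left=1.0, compute_time=0.5)
    print(f"beta={beta}: {rate:.1%} of requests refresh with 1s left")
```

Under this formulation the probability of an early refresh is exp(-seconds_left / (beta * compute_time)), so doubling beta has the same effect as doubling the compute time.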

Usage¶

import time

from hyperdjango.cache import LocMemCache
from hyperdjango.cache_adapters import StampedeProtection

backend = LocMemCache(max_entries=10000)
cache = StampedeProtection(backend=backend, beta=1.0)

# The awaits below assume this code runs inside an async function.
# Set a value with compute time metadata
start = time.perf_counter()
value = await expensive_database_query()
compute_ms = (time.perf_counter() - start) * 1000

cache.set("dashboard:stats", value, ttl=300, compute_time_ms=compute_ms)

# Get: may return None before expiry so that this request recomputes early
result = cache.get("dashboard:stats")
if result is None:
    # Miss or early expiry: recompute and record how long it took
    start = time.perf_counter()
    result = await expensive_database_query()
    compute_ms = (time.perf_counter() - start) * 1000
    cache.set("dashboard:stats", result, ttl=300, compute_time_ms=compute_ms)

Get-or-Set Pattern¶

For the common pattern of fetching from cache or computing on miss:

async def get_dashboard_stats():
    result = cache.get("dashboard:stats")
    if result is not None:
        return result

    # Cache miss (or XFetch early expiry)
    start = time.perf_counter()  # monotonic timer, unaffected by clock changes
    stats = await compute_stats()
    compute_ms = (time.perf_counter() - start) * 1000
    cache.set("dashboard:stats", stats, ttl=300, compute_time_ms=compute_ms)
    return stats

API Reference¶

Method	Description
`get(key, default=None)`	Get value with probabilistic early expiry
`set(key, value, ttl=300, compute_time_ms=0)`	Set value with stampede metadata
`delete(key)`	Delete a cached value
`clear()`	Clear all cached values
`has(key)`	Check if a key exists (subject to early expiry)

TwoTierCache¶

Layered cache with L1 (in-process LocMemCache) and L2 (shared DatabaseCache). L1 provides sub-microsecond access for hot keys, while L2 provides shared storage visible to all server processes.

Architecture¶

Request
  │
  ├── L1 Hit (LocMemCache) ──→ Return immediately (~0.1 μs)
  │
  ├── L1 Miss, L2 Hit (DatabaseCache) ──→ Promote to L1, return (~1 ms)
  │
  └── L1 Miss, L2 Miss ──→ Return default / compute value

Configuration¶

from hyperdjango.cache import LocMemCache, DatabaseCache
from hyperdjango.cache_adapters import TwoTierCache

l1 = LocMemCache(max_entries=1000)
l2 = DatabaseCache(db, table="hyper_cache")

cache = TwoTierCache(
    l1=l1,
    l2=l2,
    l1_ttl=10,  # L1 entries expire after 10 seconds
)

Parameter	Type	Default	Description
`l1`	`CacheAdapter`	required	Fast local cache (typically `LocMemCache`)
`l2`	`CacheAdapter`	required	Shared cache (typically `DatabaseCache`)
`l1_ttl`	`int`	`10`	L1 TTL in seconds (shorter = more consistent)

Operations¶

Sync API (when L2 is sync):

# Get: checks L1 first, then L2, promotes on L2 hit
value = cache.get("user:42")

# Set: writes to both L1 and L2
cache.set("user:42", user_data, ttl=300)

# Delete: removes from both tiers
cache.delete("user:42")

# Clear: clears both tiers and resets stats
cache.clear()

Async API (when L2 is async, like DatabaseCache):

value = await cache.aget("user:42")
await cache.aset("user:42", user_data, ttl=300)

Write-Through Behavior¶

On set(), the value is written to both L1 and L2. L1 uses the shorter l1_ttl, while L2 uses the full ttl. This means:

L1 entries expire quickly, limiting staleness across processes
L2 entries persist longer, reducing database recomputation
On L1 miss + L2 hit, the value is promoted back to L1
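
The read and write paths described above can be sketched with two small TTL stores. `TtlDict` and `TwoTierSketch` are hypothetical stand-ins for `LocMemCache`/`DatabaseCache` and `TwoTierCache`, and capping the L1 TTL at `min(ttl, l1_ttl)` is an assumption about the edge case where `ttl` is shorter than `l1_ttl`.

```python
import time

_MISSING = object()  # sentinel so a cached None still counts as a hit


class TtlDict:
    """Tiny TTL store standing in for a real cache tier."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None or self._clock() >= entry[1]:
            self._data.pop(key, None)
            return default
        return entry[0]

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


class TwoTierSketch:
    """Write-through to both tiers; read L1 first, then L2 with promotion."""

    def __init__(self, l1, l2, l1_ttl=10):
        self.l1, self.l2, self.l1_ttl = l1, l2, l1_ttl
        self.stats = {"l1_hits": 0, "l2_hits": 0, "misses": 0}

    def set(self, key, value, ttl=300):
        self.l1.set(key, value, min(ttl, self.l1_ttl))  # short-lived local copy
        self.l2.set(key, value, ttl)  # full TTL in the shared tier

    def get(self, key, default=None):
        value = self.l1.get(key, _MISSING)
        if value is not _MISSING:
            self.stats["l1_hits"] += 1
            return value
        value = self.l2.get(key, _MISSING)
        if value is not _MISSING:
            self.stats["l2_hits"] += 1
            self.l1.set(key, value, self.l1_ttl)  # promote back into L1
            return value
        self.stats["misses"] += 1
        return default
```

Passing a fake `clock` to `TtlDict` makes the expiry behavior easy to exercise in tests without sleeping.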

Statistics¶

stats = cache.get_stats()
# {
#     "l1_hits": 8432,
#     "l2_hits": 1204,
#     "misses": 364,
#     "total_requests": 10000,
#     "l1_hit_rate": 0.8432,
#     "l2_hit_rate": 0.1204,
#     "overall_hit_rate": 0.9636,
# }

A healthy two-tier cache should show L1 hit rate > 80% for frequently accessed keys. If L2 hit rate is high but L1 is low, consider increasing l1_ttl or max_entries.
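
The rates in the example are consistent with each counter being divided by the total number of requests. A minimal sketch of that arithmetic, where `hit_rates` is a hypothetical helper rather than a HyperDjango function:

```python
def hit_rates(l1_hits: int, l2_hits: int, misses: int) -> dict[str, float]:
    """Derive hit rates from raw counters, relative to all requests."""
    total = l1_hits + l2_hits + misses
    if total == 0:
        # No traffic yet: report zeros instead of dividing by zero.
        return {"l1_hit_rate": 0.0, "l2_hit_rate": 0.0, "overall_hit_rate": 0.0}
    return {
        "l1_hit_rate": l1_hits / total,
        "l2_hit_rate": l2_hits / total,
        "overall_hit_rate": (l1_hits + l2_hits) / total,
    }


rates = hit_rates(l1_hits=8432, l2_hits=1204, misses=364)
```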

CacheMiddleware¶

Full-page HTTP response caching middleware. Caches GET responses and serves them with X-Cache: HIT/MISS headers.

Configuration¶

from hyperdjango.cache_adapters import CacheMiddleware

app.use(CacheMiddleware(
    cache=my_cache,                   # Any CacheAdapter
    ttl=60,                           # Cache responses for 60 seconds
    exclude=["/admin", "/api/auth"],  # Don't cache these path prefixes
    cache_authenticated=False,        # Skip caching for logged-in users
    vary_headers=["Accept-Language"], # Include these headers in cache key
))

Parameter	Type	Default	Description
`cache`	`CacheAdapter`	required	Cache backend to use
`ttl`	`int`	`60`	Response cache TTL in seconds
`exclude`	`list[str]`	`[]`	Path prefixes to exclude from caching
`cache_authenticated`	`bool`	`False`	Whether to cache responses for authenticated users
`vary_headers`	`list[str]`	`[]`	Headers to include in the cache key

Cache Key Generation¶

Cache keys are built from:

Request path (/api/users)
Query string (?page=2&sort=name)
Vary headers (e.g., Accept-Language=en)
User identity (when cache_authenticated=True)

If the combined key exceeds 200 characters, it is hashed with MD5 to keep keys compact.

Example keys:

page:/api/users                           # Simple path
page:/api/users|page=2&sort=name          # With query string
page:/api/users|Accept-Language=en        # With vary header
page:a1b2c3d4e5f6...                      # MD5 hash for long keys
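
A key builder that reproduces the example keys might look like the sketch below. `build_cache_key` is a hypothetical function, not the middleware's internals: the order of the parts, the `user=` segment, and applying the 200-character limit to the full key including the `page:` prefix are all assumptions.

```python
import hashlib

MAX_KEY_LENGTH = 200  # threshold taken from the description above


def build_cache_key(path, query_string="", vary=None, user_id=None):
    """Join the request parts with '|' and hash the result if it is too long."""
    parts = [path]
    if query_string:
        parts.append(query_string.lstrip("?"))
    for header, value in (vary or {}).items():
        parts.append(f"{header}={value}")
    if user_id is not None:
        parts.append(f"user={user_id}")  # hypothetical format for user identity
    key = "page:" + "|".join(parts)
    if len(key) > MAX_KEY_LENGTH:
        # Keep long keys compact by replacing them with an MD5 hex digest.
        key = "page:" + hashlib.md5(key.encode("utf-8")).hexdigest()
    return key
```

Including the vary headers in the key means a request with a different `Accept-Language` value gets its own cached response.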

X-Cache Headers¶

The middleware adds X-Cache headers to indicate cache status:

# Cache hit
X-Cache: HIT

# Cache miss (response was cached for next request)
X-Cache: MISS

What Gets Cached¶

Only GET requests
Only 2xx responses
Skips paths matching any exclude prefix
Skips authenticated users unless cache_authenticated=True
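
The four rules above amount to a single predicate. The sketch below is illustrative only; `is_cacheable` and its parameters are hypothetical names, not the middleware's API.

```python
def is_cacheable(
    method: str,
    status: int,
    path: str,
    exclude: tuple[str, ...] = (),
    authenticated: bool = False,
    cache_authenticated: bool = False,
) -> bool:
    """Return True when a response may be stored in the page cache."""
    if method != "GET":
        return False
    if not 200 <= status < 300:
        return False
    if any(path.startswith(prefix) for prefix in exclude):
        return False
    if authenticated and not cache_authenticated:
        return False
    return True
```

For example, `is_cacheable("GET", 200, "/admin/users", exclude=("/admin",))` is rejected by the prefix rule even though the method and status qualify.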

Adapter Registry¶

Register custom cache adapters by name for configuration-driven cache selection:

from hyperdjango.cache_adapters import register_adapter, get_adapter, list_adapters

# Register a custom adapter
register_adapter("custom", CustomAdapter)

# Retrieve by name
adapter_class = get_adapter("custom")
cache = adapter_class(**adapter_config)

# List all registered adapters
names = list_adapters()  # ["custom"]

This pattern is useful when cache backend selection is driven by configuration files or environment variables:

import os

cache_type = os.environ.get("CACHE_BACKEND", "locmem")
adapter_class = get_adapter(cache_type)
if adapter_class is None:
    raise ValueError(f"Unknown cache backend: {cache_type}")
cache = adapter_class(**cache_config)

Architecture: How the Adapters Compose¶

The adapters are designed to layer on top of each other. Here's how they compose for different scale levels:

Single Server (< 500 rps)¶

Request → LocMemCache (in-process LRU, ~0.1μs)
              ↓ miss
          Database query (~1-5ms)

Just use LocMemCache directly. No adapters needed.

Multi-Server (500-5K rps)¶

Request → TwoTierCache
              ├─ L1: LocMemCache (per-process, ~0.1μs) — 95%+ hit rate
              └─ L2: DatabaseCache (shared PostgreSQL UNLOGGED, ~1-5ms)
                         ↓ miss
                     Database query

TwoTierCache gives each server process a fast local cache (L1) backed by a shared cache (L2) that all servers can read/write. L1 entries expire quickly (10s), triggering L2 lookups that re-promote hot keys. Result: most requests never leave the process.

High Traffic (5K+ rps)¶

Request → StampedeProtection (XFetch early recompute)
              ↓
          TwoTierCache
              ├─ L1: LocMemCache
              └─ L2: DatabaseCache
                         ↓ miss
                     Database query (spread across time, no thundering herd)

Add StampedeProtection when cache misses cause expensive queries that many users hit simultaneously. XFetch spreads recomputation across requests rather than all hitting at expiry.

Sharded (horizontal scaling beyond single DB)¶

Request → ConsistentHashRing (routes to correct shard)
              ├─ Shard 1: TwoTierCache + StampedeProtection
              ├─ Shard 2: TwoTierCache + StampedeProtection
              └─ Shard 3: TwoTierCache + StampedeProtection

ConsistentHashRing distributes keys deterministically across cache shards. Each shard can be a TwoTierCache with its own L1/L2. Adding/removing shards only reroutes ~1/N keys.
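
The "~1/N keys" claim can be checked against naive modulo sharding with a short simulation. This is a sketch under stated assumptions: the helper names, the `"{node}#{i}"` point label, and the 160 points per node are illustrative and do not reproduce the library's exact ring.

```python
import bisect
import hashlib


def h32(text: str) -> int:
    """32-bit hash from the first four bytes of an MD5 digest."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest()[:4], "little")


def build_ring(nodes, points_per_node=160):
    """Sorted list of (point, node) pairs."""
    return sorted((h32(f"{n}#{i}"), n) for n in nodes for i in range(points_per_node))


def ring_owner(ring, key):
    # (hash, "") sorts before any (hash, name), so this finds the first
    # point at or after the key's hash; the modulo wraps past the end.
    idx = bisect.bisect_left(ring, (h32(key), ""))
    return ring[idx % len(ring)][1]


def modulo_owner(nodes, key):
    return nodes[h32(key) % len(nodes)]


def moved_fraction(owner_before, owner_after, keys):
    moved = sum(1 for k in keys if owner_before(k) != owner_after(k))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
old_nodes = ["shard-1", "shard-2", "shard-3"]
new_nodes = old_nodes + ["shard-4"]
old_ring, new_ring = build_ring(old_nodes), build_ring(new_nodes)

ring_moved = moved_fraction(
    lambda k: ring_owner(old_ring, k), lambda k: ring_owner(new_ring, k), keys
)
mod_moved = moved_fraction(
    lambda k: modulo_owner(old_nodes, k), lambda k: modulo_owner(new_nodes, k), keys
)
print(f"consistent hashing moved {ring_moved:.1%}, modulo moved {mod_moved:.1%}")
```

With the ring, every key that changes owner lands on the new shard, so existing shards keep their warm entries; with modulo sharding most keys change owner when the shard count changes.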

Writing Custom Cache Backends¶

Any object implementing the CacheAdapter protocol works with all adapters. The protocol is small: get, set, delete, has, and clear. The example below wraps an async key-value client, so its methods are coroutines:

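from hyperdjango.cache import LocMemCache
from hyperdjango.cache_adapters import ConsistentHashRing, TwoTierCache
from hyperdjango.native import fast_json_dumps, fast_json_loads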

class CustomCache:
    """Custom cache backend compatible with all HyperDjango cache adapters."""

    def __init__(self, client, default_ttl: int = 300):
        self.client = client
        self.default_ttl = default_ttl

    async def get(self, key: str, default=None):
        raw = await self.client.get(key)
        if raw is None:
            return default
        return fast_json_loads(raw)

    async def set(self, key: str, value, ttl: int | None = None):
        ttl = ttl or self.default_ttl
        await self.client.setex(key, ttl, fast_json_dumps(value))

    async def delete(self, key: str) -> bool:
        return bool(await self.client.delete(key))

    async def has(self, key: str) -> bool:
        return bool(await self.client.exists(key))

    async def clear(self):
        await self.client.flushdb()

# Use it as L2 in TwoTierCache (L2 is async here, so call cache.aget / cache.aset):
cache = TwoTierCache(
    l1=LocMemCache(max_entries=5000),
    l2=CustomCache(client, default_ttl=300),
    l1_ttl=10,
)

# Or in a ConsistentHashRing:
ring = ConsistentHashRing(nodes={
    "shard-1": CustomCache(client_1),
    "shard-2": CustomCache(client_2),
})

Why PostgreSQL UNLOGGED Is Often Enough¶

HyperDjango's DatabaseCache uses PostgreSQL UNLOGGED tables. The advantages:

Zero additional infrastructure — uses the database you already have, no separate service to deploy, monitor, or secure.
Atomic operations — INSERT ... ON CONFLICT, UPDATE ... RETURNING, race-safe get_or_set.
Shared across servers — every app server reads/writes the same UNLOGGED table.
Connection pooling — pg.zig's pool is already in the request path; no second pool to size.
Write throughput — no WAL = 2-3x faster than regular tables.
Persistence — survives app restarts; cleared on DB crash (by design — this is a cache, not a database).

DatabaseCache with a TwoTierCache L1 (LocMemCache, ~0.1 μs hit) gives sub-millisecond reads for 95%+ of requests with no additional infrastructure.

Failure Modes¶

Scenario	What happens	Recovery
L1 (LocMemCache) full	LRU eviction of least-recently-used entries	Automatic — hot entries stay, cold entries evict
L2 (DatabaseCache) down	`TwoTierCache(fail_silently=True)` logs warning, serves from L1	Fix DB connection; L2 auto-resumes on next successful query
L2 (DatabaseCache) slow	L1 absorbs 95%+ of reads; only cache misses hit slow L2	Monitor `l2_errors` in cache stats
Hash ring node removed	~1/N keys reroute to other nodes; those keys are cache misses until recomputed	Automatic — consistent hashing limits blast radius
Hash ring node added	~1/N keys reroute to new node; slight bump in cache misses	Automatic — new node warms up quickly from promoted L2 hits
Cache stampede	StampedeProtection spreads recomputation over time; requests that draw an early refresh recompute, while the rest keep being served the cached value	Automatic via XFetch algorithm
UNLOGGED table truncated (DB crash recovery)	All L2 cache entries lost	Automatic — L1 continues serving; L2 repopulates from cache misses