Redis Objects: The Art of Memory Polymorphism
In C, a string and a list are fundamentally different types. But in Redis, you can run EXPIRE or OBJECT ENCODING on any key, regardless of its type. This is possible because of the Redis Object (robj) system.
1. Intuition: The Wrapper Pattern
Why this exists in Redis
If Redis stored raw data, it couldn't track "Last Access Time" for LRU or "Reference Counts" for memory sharing. Every value needs metadata.
Real-world problem it solves: Consistent memory management and polymorphism. It allows the database to "change its mind" about how data is stored (Encoding) without the user ever knowing.
The Analogy (The Shipping Container)
A shipping container (the robj) always looks the same from the outside. You can stack it, move it, and track its weight. But inside, it might contain a car (a raw string) or 1,000 toy bricks (a ziplist).
2. Internal Architecture
Components Involved (The robj structure)
Every object in the Redis source code (server.h) is defined as:
- Type: String, List, Set, etc.
- Encoding: How it's stored (Raw, Int, Ziplist; Redis 7 and later use listpack in place of ziplist).
- LRU: Last access time for eviction.
- Refcount: For memory sharing.
- Ptr: Pointer to the actual data.
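To make the field list concrete, here is a minimal sketch that packs the same five fields into a 16-byte buffer. The widths (4-bit type, 4-bit encoding, 24-bit LRU clock, 32-bit refcount, 64-bit pointer) follow the struct on 64-bit builds. The byte order, the numeric codes, and the helper names `packHeader` and `unpackHeader` are assumptions made for illustration; real C bit-field layout depends on the compiler.

```javascript
// Hypothetical helpers: pack the robj fields into 16 bytes, little-endian.
function packHeader({ type, encoding, lru, refcount, ptr }) {
  const buf = new ArrayBuffer(16);
  const view = new DataView(buf);
  // First 32-bit word: 4 bits type | 4 bits encoding | 24 bits LRU clock.
  const word =
    ((type & 0xf) | ((encoding & 0xf) << 4) | ((lru & 0xffffff) << 8)) >>> 0;
  view.setUint32(0, word, true);
  view.setInt32(4, refcount, true);        // 32-bit reference count
  view.setBigUint64(8, BigInt(ptr), true); // 64-bit pointer (or inline integer)
  return buf;
}

function unpackHeader(buf) {
  const view = new DataView(buf);
  const word = view.getUint32(0, true);
  return {
    type: word & 0xf,
    encoding: (word >>> 4) & 0xf,
    lru: word >>> 8,
    refcount: view.getInt32(4, true),
    ptr: view.getBigUint64(8, true),
  };
}

// Illustrative values only: type 0 and encoding 8 stand in for "string" and "embstr".
const header = packHeader({ type: 0, encoding: 8, lru: 0xabcdef, refcount: 1, ptr: 4096 });
const fields = unpackHeader(header);
```

The point of the exercise is the size: all the bookkeeping for a value fits in `header.byteLength` bytes, and the type and encoding together take a single byte.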
How Redis designs this feature
It uses the Wrapper Pattern. Every value has a dedicated metadata header that allows for polymorphism and intelligent memory management.
Trade-offs
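- Overhead: Every single value carries at least 16 bytes of metadata on 64-bit builds, even when the value itself is only a few bytes long.
- Flexibility: This overhead allows for Encoding Switching (e.g., in older Redis versions, a small List stored as a ziplist becoming a Linked List as it grows) without user interaction.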
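Encoding switching can be sketched in a few lines. The class below is a toy, not how Redis implements lists: the `compact` and `linked` encoding names and the `MAX_COMPACT_ENTRIES` threshold are hypothetical, chosen only to show that callers keep using `push` and `toArray` while the storage changes underneath.

```javascript
// Hypothetical threshold; Redis exposes its real limits as configuration.
const MAX_COMPACT_ENTRIES = 4;

class ListObject {
  constructor() {
    this.type = 'list';
    this.encoding = 'compact'; // stand-in for a ziplist-style flat layout
    this.value = [];
  }

  push(item) {
    if (this.encoding === 'compact') {
      this.value.push(item);
      if (this.value.length > MAX_COMPACT_ENTRIES) this.convertToLinked();
      return;
    }
    // Linked encoding: append a node at the tail.
    const node = { item, next: null };
    this.value.tail.next = node;
    this.value.tail = node;
    this.value.length++;
  }

  // One-way conversion: copy every element into a linked list.
  convertToLinked() {
    const items = this.value;
    const head = { item: items[0], next: null };
    let tail = head;
    for (let i = 1; i < items.length; i++) {
      tail.next = { item: items[i], next: null };
      tail = tail.next;
    }
    this.value = { head, tail, length: items.length };
    this.encoding = 'linked';
  }

  toArray() {
    if (this.encoding === 'compact') return [...this.value];
    const out = [];
    for (let n = this.value.head; n !== null; n = n.next) out.push(n.item);
    return out;
  }
}

const list = new ListObject();
for (let i = 1; i <= 4; i++) list.push(i);
const encodingBefore = list.encoding; // still the compact form
list.push(5);                         // crosses the threshold
list.push(6);
const encodingAfter = list.encoding;
```

The type stays `list` throughout; only the `encoding` field and the shape of `value` change, which is the separation the wrapper buys.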
3. End-to-End Flow (VERY IMPORTANT)
Client → TCP → Event Loop → Command Queue → Parser → Execution → Data Store → Response
- Client: Sends `SET counter 100`.
- Parser: Extracts the string `"100"`.
- Object Factory:
  - Tries to parse `"100"` as a long integer.
  - Success! It creates an `robj` with `type: OBJ_STRING` and `encoding: OBJ_ENCODING_INT`.
  - Instead of a pointer to a string, the `ptr` field actually stores the integer 100 directly (optimization).
- Dictionary: The `robj` is stored in the main Hash Map.
- Later (INCR): When `INCR` is called, Redis sees the `INT` encoding and performs math directly on the pointer value.
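The flow above can be traced with a small sketch, starting from the bytes a client would put on the wire for `SET counter 100`. The parser is deliberately naive, and `parseCommand`, `executeSet`, and `keyspace` are hypothetical names for this example rather than anything from the Redis source.

```javascript
// Naive RESP parser: assumes the whole command arrived in one buffer and that
// no argument contains "\r\n". A real parser would honor the "$<len>" prefixes.
function parseCommand(buffer) {
  const lines = buffer.toString('utf8').split('\r\n');
  if (!lines[0].startsWith('*')) throw new Error('expected a RESP array');
  const argc = Number(lines[0].slice(1));
  const args = [];
  for (let i = 0; i < argc; i++) {
    args.push(lines[2 + 2 * i]); // skip each "$<len>" line, keep the payload
  }
  return args;
}

const keyspace = new Map(); // hypothetical stand-in for the main dictionary

function executeSet(args) {
  const [name, key, raw] = args;
  if (name.toUpperCase() !== 'SET') return '-ERR unknown command\r\n';
  // Object factory step: canonical decimal integers get the 'int' encoding.
  const looksLikeInt =
    /^(0|-?[1-9][0-9]*)$/.test(raw) && Number.isSafeInteger(Number(raw));
  const obj = looksLikeInt
    ? { type: 'string', encoding: 'int', value: Number(raw) }
    : { type: 'string', encoding: 'raw', value: raw };
  keyspace.set(key, obj);
  return '+OK\r\n';
}

const wire = Buffer.from('*3\r\n$3\r\nSET\r\n$7\r\ncounter\r\n$3\r\n100\r\n');
const reply = executeSet(parseCommand(wire));
```

After this runs, `keyspace.get('counter')` holds a string-typed object whose value is the number 100 rather than the three characters the client sent.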
4. Internals Breakdown
Integer Sharing
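Redis pre-allocates objects for integers from 0 to 9999 on startup. If you have 10,000 keys all set to 1, they all point to the exact same memory address. This saves one 16-byte robj per key, which adds up quickly across millions of small counters. One caveat: a shared object cannot carry a per-key access time, so when a maxmemory policy relies on LRU or LFU data, Redis does not use the shared integers for stored values.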
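A minimal sketch of the sharing idea, assuming a pool built once at startup. The names `sharedIntegers` and `createIntegerObject`, the frozen objects, and the `SHARED_REFCOUNT` sentinel are choices made for this example.

```javascript
const SHARED_INTEGERS = 10000;       // pool covers 0..9999
const SHARED_REFCOUNT = 2 ** 31 - 1; // sentinel meaning "never free this object"

// Built once; every lookup afterwards returns the same object.
const sharedIntegers = Array.from({ length: SHARED_INTEGERS }, (_, i) =>
  Object.freeze({ type: 'string', encoding: 'int', value: i, refcount: SHARED_REFCOUNT })
);

function createIntegerObject(n) {
  if (Number.isInteger(n) && n >= 0 && n < SHARED_INTEGERS) {
    return sharedIntegers[n]; // shared: no allocation
  }
  return { type: 'string', encoding: 'int', value: n, refcount: 1 }; // private copy
}

// 10,000 keys set to 1 all reference one object.
const db = new Map();
for (let i = 0; i < 10000; i++) db.set(`key:${i}`, createIntegerObject(1));
const distinctObjects = new Set(db.values()).size;
```

Freezing the pooled objects is a safety net for the sketch: a shared value must never be mutated in place, so a command like INCR has to swap in a different object instead of changing the shared one.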
Encoding: embstr vs raw
- embstr: For strings of up to 44 bytes, the `robj` and the string data are allocated in a single memory block. This is faster (1 allocation instead of 2).
- raw: For larger strings, they get their own allocation.
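The 44-byte cutoff falls out of simple arithmetic: an embstr is sized to fit a 64-byte allocation, and whatever the object header, the string header, and the terminating null byte leave over is available for data. The sketch below spells that out; the constant names and `chooseStringEncoding` are hypothetical, and the check uses byte length because Redis counts bytes, not characters.

```javascript
const ALLOCATION_SIZE = 64; // allocator size class an embstr is meant to fit
const ROBJ_HEADER = 16;     // type + encoding + lru + refcount + ptr
const SDS_HEADER = 3;       // smallest string header: len, alloc, flags
const NULL_TERMINATOR = 1;
const EMBSTR_LIMIT = ALLOCATION_SIZE - ROBJ_HEADER - SDS_HEADER - NULL_TERMINATOR;

function chooseStringEncoding(value) {
  const bytes = Buffer.byteLength(value, 'utf8');
  return bytes <= EMBSTR_LIMIT ? 'embstr' : 'raw';
}

const atLimit = chooseStringEncoding('a'.repeat(44));   // fits in one block
const overLimit = chooseStringEncoding('a'.repeat(45)); // needs its own allocation
const multibyte = chooseStringEncoding('é'.repeat(23)); // 23 characters, 46 bytes
```

The last case is the one `value.length` would get wrong: the string has 23 characters but occupies 46 bytes in UTF-8, so it lands on the raw side of the limit.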
5. Node.js Reimplementation (Hands-on)
Step 1: The RedisObject Class
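```javascript
// Minimal stand-in for the C robj: a metadata header wrapped around a value.
class RedisObject {
  constructor(type, encoding, value) {
    this.type = type;         // logical type: 'string', 'list', ...
    this.encoding = encoding; // physical representation: 'int', 'embstr', 'raw'
    this.value = value;       // plays the role of ptr
    this.lru = Math.floor(Date.now() / 1000); // last access time, in seconds
    this.refcount = 1;
  }
}
```
Step 2: The Object Factory (Encoding Logic)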
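```javascript
function createStringObject(value) {
  // Optimization: store canonical decimal integers as numbers.
  // Number(value) alone would also accept "", " 7 ", "1e3" and "0x10".
  // Redis accepts the full signed 64-bit range; this sketch stops at safe integers.
  if (/^(0|-?[1-9][0-9]*)$/.test(value) && Number.isSafeInteger(Number(value))) {
    return new RedisObject('string', 'int', Number(value));
  }
  // Optimization: simulated embstr for small strings (bytes, not UTF-16 units)
  if (Buffer.byteLength(value, 'utf8') <= 44) {
    return new RedisObject('string', 'embstr', value);
  }
  return new RedisObject('string', 'raw', value);
}
```
Step 3: Implementing INCR (Polymorphic Math)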
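```javascript
const dictionary = new Map(); // keyspace: key -> RedisObject

function handleIncr(key) {
  let obj = dictionary.get(key);
  if (obj === undefined) {
    // Like Redis, treat a missing key as 0 before incrementing
    obj = new RedisObject('string', 'int', 0);
    dictionary.set(key, obj);
  }
  if (obj.type !== 'string') {
    return "-WRONGTYPE Operation against a key holding the wrong kind of value\r\n";
  }
  if (obj.encoding !== 'int') {
    // Try to promote 'embstr'/'raw' to 'int'; reject "1.5", "", "abc", ...
    const num = Number(obj.value);
    if (!/^(0|-?[1-9][0-9]*)$/.test(obj.value) || !Number.isSafeInteger(num)) {
      return "-ERR value is not an integer or out of range\r\n";
    }
    obj.value = num;
    obj.encoding = 'int';
  }
  if (!Number.isSafeInteger(obj.value + 1)) {
    return "-ERR increment or decrement would overflow\r\n";
  }
  obj.value++;
  return `:${obj.value}\r\n`;
}
```
6. Performance, High Concurrency & Backpressure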
High Concurrency Behavior
Redis executes commands on a single thread, so refcount updates need no locks. Because every object has a refcount, common objects (like the small integers 0-9999) can be shared across thousands of keys without allocating a new robj for each one.
Memory Bottlenecks & Scaling
- Bottleneck: Pointer Chasing. If a Hash Table grows too large, the CPU has to jump to many different memory addresses to find a value, causing "Cache Misses".
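- Scaling: As the keyspace grows, Redis resizes its hash tables with incremental rehashing: entries move to the larger table a few buckets at a time, so no single command blocks on a full-table copy. The `activerehashing` option additionally spends idle server time on the migration.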
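Incremental rehashing is easier to follow in code. The sketch below is a simplified model, not the Redis dict implementation: `IncrementalDict`, its string hash, and the "grow when entries outnumber buckets" rule are assumptions made for this example. The essential idea is that two tables coexist during a resize and every operation migrates one bucket.

```javascript
class IncrementalDict {
  constructor(size = 4) {
    this.tables = [Array.from({ length: size }, () => []), null];
    this.rehashIndex = -1; // -1 means no rehash in progress
    this.count = 0;
  }

  hash(key, size) {
    let h = 0;
    for (const ch of String(key)) h = (h * 31 + ch.codePointAt(0)) >>> 0;
    return h % size;
  }

  // Migrate exactly one bucket from the old table to the new one.
  rehashStep() {
    if (this.rehashIndex === -1) return;
    const [oldTable, newTable] = this.tables;
    for (const entry of oldTable[this.rehashIndex]) {
      newTable[this.hash(entry[0], newTable.length)].push(entry);
    }
    oldTable[this.rehashIndex] = [];
    this.rehashIndex++;
    if (this.rehashIndex === oldTable.length) {
      this.tables = [newTable, null]; // migration finished: drop the old table
      this.rehashIndex = -1;
    }
  }

  // During a rehash a key may live in either table, so check both.
  findEntry(key) {
    for (const table of this.tables) {
      if (!table) continue;
      const entry = table[this.hash(key, table.length)].find(([k]) => k === key);
      if (entry) return entry;
    }
    return undefined;
  }

  set(key, value) {
    this.rehashStep();
    const existing = this.findEntry(key);
    if (existing) { existing[1] = value; return; }
    // While rehashing, new keys go straight into the new table.
    const target = this.rehashIndex === -1 ? this.tables[0] : this.tables[1];
    target[this.hash(key, target.length)].push([key, value]);
    this.count++;
    if (this.rehashIndex === -1 && this.count > this.tables[0].length) {
      this.tables[1] = Array.from({ length: this.tables[0].length * 2 }, () => []);
      this.rehashIndex = 0; // start migrating on the next operation
    }
  }

  get(key) {
    this.rehashStep();
    const entry = this.findEntry(key);
    return entry ? entry[1] : undefined;
  }
}

const dict = new IncrementalDict();
for (let i = 0; i < 100; i++) dict.set(`key:${i}`, i);
```

No single `set` call ever copies the whole table; the cost of growing is spread across the operations that follow the resize decision.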
7. Redis vs. Our Implementation: What we Simplified
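- Manual Memory Control: Redis manages memory by hand through an allocator (jemalloc by default on Linux) and knows the size of every allocation it makes. We rely on the V8 engine, which often allocates much more memory than needed to simplify garbage collection.
- Small-Int Sharing: Redis pre-allocates the first 10,000 integers to save memory. In Node, every number is semantically a 64-bit float, and our `RedisObject` allocates a fresh wrapper for each value instead of sharing one.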
8. Why Redis is Optimized
Redis is optimized for Packing. By using Bit-fields and Unions in C, it packs metadata into the smallest possible space, ensuring that the "Metadata-to-Data" ratio remains lean.
- Bitfields: Redis uses C bitfields (`unsigned type:4`) to pack the metadata into exactly 16 bytes. JavaScript objects are much heavier (often 40-80 bytes per object).
- Pointers: Redis uses raw pointers. We use JavaScript references, which V8 manages for us (Garbage Collection).
9. Edge Cases & Failure Scenarios
- Memory Fragmentation: `embstr` reduces fragmentation because it's a single allocation, but large `raw` strings that frequently resize can create "pockets" of unusable RAM.
- Refcount Overflow: In theory, `refcount` could overflow a 32-bit integer. Redis reserves the top of the range instead: a refcount of INT_MAX marks a shared object as "permanent", and the increment and decrement paths leave it untouched.
- Integer Promotion Overhead: Constantly switching between `int` and `raw` (e.g., `APPEND` to an integer key, followed by `INCR`) forces a conversion and a re-allocation each time, which costs extra CPU on hot keys.
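The integer promotion case can be sketched as follows. `appendToObject` is a hypothetical helper for this example; it assumes objects shaped like the ones in the Node.js reimplementation (`type`, `encoding`, `value`) and mirrors the observable Redis behavior that a value becomes `raw` after an `APPEND`, even if it still looks like a number.

```javascript
function appendToObject(obj, suffix) {
  if (obj.type !== 'string') throw new TypeError('WRONGTYPE');
  if (obj.encoding !== 'raw') {
    // 'int' and 'embstr' are not modified in place: convert to a raw string first.
    obj.value = String(obj.value);
    obj.encoding = 'raw';
  }
  obj.value += suffix;
  return Buffer.byteLength(obj.value, 'utf8'); // APPEND replies with the new length
}

const counter = { type: 'string', encoding: 'int', value: 10 };
const newLength = appendToObject(counter, '5');
// counter is now { type: 'string', encoding: 'raw', value: '105' }
```

A following INCR would have to parse `'105'` and switch the encoding back to `int`, which is the back-and-forth cost the bullet describes.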
10. Summary & Key Takeaways
- Encodings are dynamic: Redis selects the most efficient format based on data size and type.
- Metadata is the cost of features: Those 16 bytes enable type checks, encoding switches, LRU, and Refcounting.
- Polymorphism starts here: The `robj` is why Redis feels so flexible.
Next Step: Memory isn't infinite. How does Redis decide what to kill when it's full? Let's explore LRU Eviction & Memory Management.