Understanding Redis Key Expiration

1/14/2025

Redis has a clever way of cleaning up expired keys in the background without checking every single one. Instead of a full scan, it samples keys at random and uses the result to decide how much more cleanup is worth doing.

The 20 Keys Algorithm

Here's how Redis handles key expiration:

Randomly picks 20 keys from the set of keys that have an expiration set
Checks if these keys have expired
Deletes any expired keys it found
Keeps track of how many were expired

The magic happens in the decision making:

If more than 25% of the sampled keys were expired (6 or more out of 20), Redis repeats the process
It continues this cycle until a sample shows 25% or fewer expired keys
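
The loop described above can be sketched in a few lines of Python. This is a toy model and not Redis's actual implementation: the `expires` dict (key to expiry timestamp), the `expire_cycle` name, and the two constants are assumptions made for illustration, and the sketch leaves out any time limit a real server would need to stay responsive.

```python
import random

SAMPLE_SIZE = 20          # keys tested per round, as described above
REPEAT_THRESHOLD = 0.25   # repeat while more than 25% of the sample was expired

def expire_cycle(expires, now, rng=random):
    """Run sampling rounds over `expires`, a dict of key -> expiry timestamp.

    Deletes expired keys in place and returns how many were removed.
    (Hypothetical helper for illustration only.)
    """
    removed = 0
    while expires:
        # Pick up to 20 random keys among those that carry an expiry.
        sample = rng.sample(list(expires), min(SAMPLE_SIZE, len(expires)))
        expired = [k for k in sample if expires[k] <= now]
        for k in expired:
            del expires[k]
        removed += len(expired)
        # Stop once 25% or fewer of the sampled keys were expired.
        if len(expired) <= REPEAT_THRESHOLD * len(sample):
            break
    return removed
```

Calling `expire_cycle(expires, now=time.time())` on a dict where most keys are stale keeps looping; on a mostly fresh dict it usually stops after one round.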

Why This Works

The beauty of this approach lies in probability. If you're finding that more than 5 out of 20 random keys are expired, it's a strong indicator that:

There's significant cleanup work to be done
The database likely has many more expired keys
It's worth spending CPU cycles on more cleanup

When the sample shows fewer expired keys:

It suggests most expired keys have been cleared
Further scanning would yield diminishing returns
System resources are better spent elsewhere
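
To put rough numbers on this intuition, the sketch below computes how likely a 20-key sample is to trigger another round for a given true fraction of expired keys. It assumes each sampled key is expired independently with probability `p` (a simple binomial model); `prob_repeat` is a hypothetical helper, not part of Redis.

```python
from math import comb

def prob_repeat(p, n=20, threshold=5):
    """Probability that more than `threshold` of `n` sampled keys are expired,
    assuming each sampled key is independently expired with probability p."""
    at_most = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(threshold + 1))
    return 1.0 - at_most

for p in (0.05, 0.10, 0.25, 0.40, 0.60):
    print(f"expired fraction {p:.0%}: repeat probability {prob_repeat(p):.3f}")
```

Under this model the repeat probability is close to zero when few keys are expired and close to one when most are, which is the behavior the lists above describe.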

The Numbers Make Sense

The choice of 20 keys isn't random:

It's large enough to be statistically meaningful
Small enough to not impact performance
Provides a good balance between accuracy and speed

The 25% threshold is equally well-considered:

High enough to justify continued cleanup
Low enough to avoid excessive CPU usage
Provides a clear, binary decision point
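
One way to see the sample-size trade-off is to look at how noisy the measured expired fraction is for different sample sizes. The sketch below uses the standard error of a proportion, again assuming independent samples; `sample_std_error` is a hypothetical helper introduced only for this illustration.

```python
from math import sqrt

def sample_std_error(p, n):
    """Standard error of the expired fraction measured from n independent samples,
    when the true expired fraction is p."""
    return sqrt(p * (1 - p) / n)

for n in (5, 20, 100, 1000):
    print(f"n={n:>4}: standard error at p=0.25 is {sample_std_error(0.25, n):.3f}")
```

Larger samples give a steadier estimate, but each round costs more lookups, so a small fixed sample is a reasonable compromise for a check that runs repeatedly.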

In Practice

This sampling approach means Redis can:

Keep memory usage in check
Avoid performance bottlenecks
Handle millions of keys efficiently
Clean up expired keys gradually
Maintain responsiveness under load

Without checking every key, Redis keeps the share of expired keys still sitting in memory low while keeping CPU usage reasonable. It's a good example of how probability can solve real-world engineering challenges.

tagarwal.pro