Understanding Redis Key Expiration
1/14/2025
Redis has a clever way of managing expired keys without checking every single one. Instead of a comprehensive scan, it uses a probabilistic approach that's both efficient and effective.
The 20 Keys Algorithm
Here's how Redis handles key expiration:
- Randomly picks 20 keys from the database
- Checks if these keys have expired
- Deletes any expired keys it found
- Keeps track of how many were expired
The magic happens in the decision making:
- If more than 25% of the sampled keys were expired (so 5+ keys out of 20), Redis repeats the process
- It continues this cycle until the sample shows less than 25% expired keys
Why This Works
The beauty of this approach lies in probability. If you're finding that more than 5 out of 20 random keys are expired, it's a strong indicator that:
- There's significant cleanup work to be done
- The database likely has many more expired keys
- It's worth spending CPU cycles on more cleanup
When the sample shows fewer expired keys:
- It suggests most expired keys have been cleared
- Further scanning would yield diminishing returns
- System resources are better spent elsewhere
The Numbers Make Sense
The choice of 20 keys isn't random:
- It's large enough to be statistically meaningful
- Small enough to not impact performance
- Provides a good balance between accuracy and speed
The 25% threshold is equally well-considered:
- High enough to justify continued cleanup
- Low enough to avoid excessive CPU usage
- Provides a clear, binary decision point
In Practice
This sampling approach means Redis can:
- Keep memory usage in check
- Avoid performance bottlenecks
- Handle millions of keys efficiently
- Clean up expired keys gradually
- Maintain responsiveness under load
Without checking every key, Redis maintains a clean dataset while keeping CPU usage reasonable - a perfect example of how probability can solve real-world engineering challenges.