Implementing Distributed Messaging with Redis Streams

Redis is often recognized for its blazing-fast in-memory and distributed caching capabilities, but its utility extends far beyond being just a cache. As modern applications demand real-time data processing and seamless communication between components, Redis emerges as a versatile tool for building scalable systems. That's all well and good, but what problems can we actually solve with Streams?

Use Cases:

  • Distributing tasks in an event-driven architecture

  • Decoupling microservice communication

  • Building a lightweight but robust message queueing system

  • Keeping Pub/Sub semantics intact as Kubernetes scales your Pods

  • Skipping the operational overhead of running Kafka or RabbitMQ

We’re going to explore the de facto practices you need to follow to build a scalable Redis Pub/Sub mechanism. We’ll go beyond theory and look at practical use cases and their implementation. Before that, let's summarize what the article has to offer:

  1. Streams management

  2. Consumer group utilization

  3. Fault tolerance & Retry mechanism

  4. Idempotency

  5. Cleanup & resource utilization

  6. Monitoring & DevOps checklist


What Pub/Sub Problem Do We Need to Solve?

Let’s break down some of the big hurdles first. Imagine you’re at a party (aka your Kubernetes cluster) where everyone is shouting at once (that’s Pub/Sub). Here’s what can go wrong:

i. Redis Pub/Sub vs. Queue

In a queue, messages are like emails that each consumer reads once and then forgets. In classic Redis Pub/Sub, everyone gets the message at once, and that’s not always great when each Pod only needs to hear it once. You end up duplicating work, and that quickly becomes a problem.

ii. Scaling with Kubernetes: Only One Pod Should Process Each Message

K8s likes to scale out, and before you know it, you have 10 Pods, each fighting to process the same message like it’s the last slice of pizza. What we want here is for one Pod to get that message, process it, and tell the others, “I got this!” Easier said than done.

How Redis Streams Comes to the Rescue 🎉

Redis Streams turns our Pub/Sub into a VIP system, where each Pod knows exactly what it should handle and when. Here’s what makes Redis Streams the hero we need:

i. Pub/Sub Model

Unlike classic Pub/Sub, which sprays messages everywhere, Redis Streams keeps things orderly. It lets you create consumer groups, which means messages are organized and delivered to only one Pod in each group. No more “everyone gets everything” madness.

ii. Consumer Group

Think of consumer groups as the friend group you message when you want something specific done. You send a message to the group, but only one person (aka Pod) picks it up and handles it. No duplicates, no confusion.

iii. Message Ownership

In Redis Streams, message ownership is tracked via consumer groups: a consumer claims a message by reading it with XREADGROUP.

  • Unacknowledged messages are stored in the Pending Entries List (PEL), indicating they are awaiting processing confirmation.

  • If a message remains unacknowledged for too long, it can be reclaimed by another consumer using the XCLAIM or XAUTOCLAIM command.

  • The XPENDING command helps monitor pending messages and enables automatic reassignment, so failed or idle consumers don't cause message loss. Regular cleanup with XACK and XDEL (or XTRIM) keeps message processing efficient in distributed systems; a short monitoring sketch follows.
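To make this concrete, here's a minimal sketch of inspecting the PEL with redis-py. The stream name "tasks" and group name "workers" are placeholder values used in the examples throughout this article.

```python
import redis

r = redis.Redis(decode_responses=True)

# Summary of the Pending Entries List for the "workers" group:
# total pending count, lowest/highest pending IDs, and per-consumer counts.
summary = r.xpending("tasks", "workers")
print(summary["pending"], summary["min"], summary["max"], summary["consumers"])
```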


Understanding Message States in a Redis Stream

Redis Streams provides detailed control over message states in a consumer group, ensuring reliable message delivery, tracking, and error handling. In a consumer group, each message passes through several states, each with a purpose:

1. New (Unprocessed)

When a message is first published to a stream, it’s considered New. It’s waiting to be claimed by any consumer in the group. During this phase:

  • The message is available for all consumers in the group.

  • It remains in this state until a consumer reads it and claims ownership.

  • New messages are essentially queued, waiting for their turn to be processed (a quick publishing sketch follows this list).
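A quick publishing sketch, again assuming redis-py and the placeholder stream "tasks":

```python
import redis

r = redis.Redis(decode_responses=True)

# "*" asks Redis to auto-generate the entry ID. The entry sits in the New
# state until a member of a consumer group reads it.
message_id = r.xadd("tasks", {"type": "send_email", "to": "user@example.com"})
print(f"Published {message_id}")
```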

2. Pending

Once a consumer reads and starts working on a message, that message enters the Pending state. This means:

  • The message is “locked” to that consumer (it’s claimed).

  • It’s now in the Pending Entries List (PEL), where Redis tracks which consumer is processing it.

  • A message remains in this state until it’s either acknowledged by the consumer as processed or retried if processing fails.

Tip: You can inspect the Pending Entries List (PEL) with the XPENDING command to see which messages are currently in this state. It shows which messages are being worked on and which might need a retry.
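Roughly, that looks like this with redis-py: XREADGROUP moves entries into the consumer's PEL, and XPENDING shows who owns what and for how long. The names are the same placeholders as before.

```python
import redis

r = redis.Redis(decode_responses=True)

# Reading with ">" delivers only never-delivered entries and records them in
# this consumer's Pending Entries List (New -> Pending).
for _, messages in r.xreadgroup("workers", "worker-1", {"tasks": ">"}, count=10, block=5000) or []:
    for message_id, fields in messages:
        print("claimed", message_id, fields)

# Detailed view of up to 10 pending entries: owner and idle time per entry.
for entry in r.xpending_range("tasks", "workers", min="-", max="+", count=10):
    print(entry["message_id"], entry["consumer"], entry["time_since_delivered"])
```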

3. Acknowledged (Processed)

After a consumer finishes processing the message, it sends an acknowledgment back to Redis, moving the message to the Acknowledged state:

  • Redis marks the message as processed by removing it from the PEL.

  • Acknowledged messages can be deleted from the stream to free up memory, or they can be retained for a certain period, depending on your retention policy.

Best Practice: Always acknowledge messages to keep the PEL from growing indefinitely. Unacknowledged messages can clutter the system and impact performance.
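Acknowledging is a single call; the entry ID below is a placeholder for one returned earlier by XREADGROUP.

```python
import redis

r = redis.Redis(decode_responses=True)

# Remove the entry from the group's PEL once processing has succeeded.
acked = r.xack("tasks", "workers", "1692632086370-0")
print(f"Acknowledged {acked} message(s)")
```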

4. Failed (Unacknowledged or Timeout)

If a consumer fails to acknowledge a message due to an error or a timeout, the message is flagged as Failed:

  • Redis keeps the message in the PEL and marks it as unacknowledged.

  • You can detect failed messages by checking the PEL periodically, using a scheduled task to identify unprocessed messages based on their age.

  • Failed messages need to be reassigned to other consumers if the original consumer is unable to retry.

5. Reclaimed (Retried)

For failed or long-pending messages, Redis allows another consumer to claim ownership, which puts the message in the Reclaimed state:

  • Consumers can use the XAUTOCLAIM (or XCLAIM) command to take over messages that have been sitting in the PEL beyond a specified duration (e.g., 10 minutes); see the sketch below this list.

  • Reclaimed messages re-enter the processing flow, giving other consumers a chance to complete processing.
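Here's one way the detect-and-reclaim step might look with redis-py, assuming Redis 6.2+ for XAUTOCLAIM and the same placeholder names. The 10-minute threshold is illustrative.

```python
import redis

r = redis.Redis(decode_responses=True)

STALE_MS = 10 * 60 * 1000  # treat entries idle for over 10 minutes as stuck

# Spot entries that have been pending too long (Failed) ...
for entry in r.xpending_range("tasks", "workers", min="-", max="+", count=100):
    if entry["time_since_delivered"] > STALE_MS:
        print("stale:", entry["message_id"], "owned by", entry["consumer"])

# ... and hand them over to a healthy consumer (Reclaimed). XAUTOCLAIM walks
# the PEL from the given cursor and claims anything idle longer than the threshold.
result = r.xautoclaim("tasks", "workers", "worker-2", min_idle_time=STALE_MS, count=50)
for message_id, fields in result[1]:
    print("reclaimed", message_id, fields)
```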

Message Lifecycle in a Consumer Group

To help visualize the above states, here’s a common lifecycle flow of a message through Redis Streams in a consumer group:

  1. Message Publish: A new message is added to the stream.

  2. Consumer Claim (New → Pending): A consumer reads and claims the message, moving it from New to Pending.

  3. Processing & Acknowledgment (Pending → Acknowledged):

  • If processed successfully, the consumer acknowledges the message.

  • Redis removes it from the PEL, and it’s considered fully processed.

  4. Error or Timeout (Pending → Failed):

  • If the message isn’t acknowledged within a certain time (e.g., due to a consumer error or crash), it stays in Pending but is flagged as “Failed” or “Timeout.”

  5. Retry (Failed → Reclaimed → Pending):

  • A different (or the same) consumer can reclaim the message, retry processing, and acknowledge it.

  • Once acknowledged, the message lifecycle ends.


Managing Redis Streams like a Pro

Now that we’ve got Redis Streams in our toolkit, let’s roll up our sleeves and dive into some code. Here are the steps to keep Redis Pub/Sub sane and reliable in a scalable system.

Managing Message States Effectively in a Distributed Environment

  • Set Idle Timeouts for Consumers: Define reasonable processing time thresholds to identify stuck or failed messages.

  • Automate Retries for Pending Messages: Use XPENDING and XAUTOCLAIM to periodically check for long-pending messages and retry them if needed.

  • Implement Robust Error Handling in Consumers: Each consumer should log errors or retry messages as appropriate, keeping the stream from backing up.

  • Clean Up Processed Messages Regularly: Define a retention policy for acknowledged messages, clearing old data to optimize Redis memory use.

Redis Streams offer powerful controls over message lifecycle states, enabling fault tolerance, retries, and efficient resource use — especially valuable in a containerized, distributed world.

1. Keep It Idempotent: Don’t Process Twice

Make your consumers idempotent. In plain English? Make sure that if a message gets processed twice, it doesn’t mess anything up. Here’s how to check whether we’ve already handled a message.
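One way to do that with redis-py is to use the stream entry ID as a deduplication key. The key prefix and the 24-hour TTL are arbitrary choices for this sketch.

```python
import redis

r = redis.Redis(decode_responses=True)

def already_processed(message_id: str) -> bool:
    # SET ... NX succeeds only for the first writer, so a second delivery of
    # the same entry ID is detected and can be skipped.
    first_time = r.set(f"processed:{message_id}", 1, nx=True, ex=86400)
    return not first_time

# Inside a consumer, a duplicate can simply be acknowledged and skipped:
# if already_processed(message_id):
#     r.xack("tasks", "workers", message_id)
```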

2. Publish to Consumer Groups Like a Pro

Publishing is easy, but first, make sure the consumer group exists! Otherwise, Redis throws a fit. Here’s how to keep things smooth.
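A sketch of that flow: create the group idempotently (MKSTREAM creates the stream too), then publish. The BUSYGROUP error just means the group already exists.

```python
import redis
from redis.exceptions import ResponseError

r = redis.Redis(decode_responses=True)

def ensure_group(stream: str, group: str) -> None:
    try:
        # id="0" lets the group see entries already in the stream; use "$" to
        # deliver only entries added after the group was created.
        r.xgroup_create(stream, group, id="0", mkstream=True)
    except ResponseError as err:
        if "BUSYGROUP" not in str(err):
            raise

ensure_group("tasks", "workers")
r.xadd("tasks", {"type": "resize_image", "path": "/uploads/42.png"})
```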

3. Acknowledge: “Yeah, It’s Taken Care Of”

Locks are the unsung heroes here. Without them, you’ll end up with that Groundhog Day effect, where each Pod keeps grabbing and processing the same message. With Redis Streams, we solve this by setting up consumer groups and assigning ownership, so each message is taken care of exactly once.

When a Pod finishes processing, it needs to acknowledge the message. This tells Redis, “I’m done here, you can move on.”
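Put together, a consumer loop might look like the sketch below, where handle() stands in for your business logic and the names are the usual placeholders.

```python
import redis

r = redis.Redis(decode_responses=True)

def handle(fields: dict) -> None:
    print("processing", fields)  # replace with real work

def consume(stream: str, group: str, consumer: str) -> None:
    while True:
        # Block for up to 5 seconds waiting for new entries for this group.
        for _, messages in r.xreadgroup(group, consumer, {stream: ">"}, count=10, block=5000) or []:
            for message_id, fields in messages:
                handle(fields)
                r.xack(stream, group, message_id)  # "I'm done here, you can move on."

# consume("tasks", "workers", "worker-1")
```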

4. Delete Processed Messages

Processed messages are like old receipts — good to have, but eventually, they need to go. Clean up by deleting them from Redis.
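Two complementary options, sketched with redis-py: delete acknowledged entries by ID, or cap the stream length so old entries age out on their own. The IDs and limits are placeholders.

```python
import redis

r = redis.Redis(decode_responses=True)

# Delete a specific, already-acknowledged entry.
r.xdel("tasks", "1692632086370-0")

# Or keep the stream bounded; approximate=True ("~") lets Redis trim lazily,
# which is cheaper than trimming to an exact length.
r.xtrim("tasks", maxlen=10000, approximate=True)
```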

5. Retry Failed or Stuck Messages

Sometimes a message fails, or a Pod drops the ball (or dies 💀). No problem: we need a retry setup that knows when to check on pending messages and reassign them if needed. This way, even if things go sideways, the job will get done.
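One possible retry worker, meant to run on a schedule (cron, a Kubernetes CronJob, and so on). It uses XPENDING plus XCLAIM, which also works on Redis versions older than 6.2; the threshold and names are placeholders.

```python
import redis

r = redis.Redis(decode_responses=True)

STALE_MS = 10 * 60 * 1000  # retry anything pending longer than 10 minutes

def retry_stuck_messages(stream: str, group: str, rescuer: str) -> None:
    # Collect entries that have been pending too long ...
    stuck = [
        e["message_id"]
        for e in r.xpending_range(stream, group, min="-", max="+", count=100)
        if e["time_since_delivered"] > STALE_MS
    ]
    if not stuck:
        return
    # ... and transfer their ownership to a healthy consumer for reprocessing.
    for message_id, fields in r.xclaim(stream, group, rescuer, min_idle_time=STALE_MS, message_ids=stuck):
        print("retrying", message_id, fields)

# retry_stuck_messages("tasks", "workers", "worker-1")
```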


DevOps Practices for Managing Redis Streams in Production

As a DevOps manager, you know it’s not just about getting code to run but making sure it does so reliably and at scale. Here’s how to get Redis Pub/Sub production-ready in a way that’ll make your on-call life a whole lot easier.

1. Set Up Monitoring and Alerting for Redis Streams

If there’s one thing DevOps loves, it’s visibility. Redis Streams introduce concepts like pending messages, unacknowledged states, and consumer groups — all things you’ll want to keep an eye on. Here are a few best practices:

  • Use Redis metrics to monitor stream sizes, pending message counts, and consumer health. Tools like Prometheus or Grafana can hook into Redis to give you real-time metrics on your stream’s status.

  • Set up alerts for specific thresholds, like when a pending message queue grows too long or when consumers are failing to acknowledge messages. This lets you catch issues before they escalate.

Sample Redis Metric Setup (Prometheus)

Tip: Check out Redis Exporter for Prometheus, which makes this setup a breeze.
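The exporter covers instance-level metrics out of the box. If you also want stream-level gauges (stream length, pending count per group), here's a rough sketch using redis-py and the prometheus_client library; the metric names, port, and 15-second interval are arbitrary.

```python
import time

import redis
from prometheus_client import Gauge, start_http_server

r = redis.Redis(decode_responses=True)

stream_length = Gauge("redis_stream_length", "Entries in the stream", ["stream"])
pending_count = Gauge("redis_stream_pending", "Pending entries per consumer group", ["stream", "group"])

def collect(stream: str, group: str) -> None:
    stream_length.labels(stream=stream).set(r.xlen(stream))
    pending_count.labels(stream=stream, group=group).set(r.xpending(stream, group)["pending"])

if __name__ == "__main__":
    start_http_server(8000)  # Prometheus scrapes http://<pod>:8000/metrics
    while True:
        collect("tasks", "workers")
        time.sleep(15)
```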

2. Fine-Tune Redis Configuration for High Availability

Redis’s configuration can make or break your setup in high-load, containerized environments. Here’s what you should consider:

  • Enable Redis persistence if message loss isn’t acceptable. Redis uses RDB and AOF files for persistence, so you can decide between them based on your latency tolerance.

  • Optimize memory settings: Configure maxmemory and an eviction policy carefully. For a high-load app, a policy such as allkeys-lru keeps Redis from crashing by discarding less relevant data first (a runtime sketch follows this list).

  • Use Redis Sentinel or Cluster mode for high availability. Sentinel helps detect failures and trigger failovers, while Redis Cluster is ideal for horizontal scaling.
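For completeness, here's how those memory settings could be applied at runtime with redis-py, though in production they normally live in redis.conf or your Helm values. The 2gb limit and allkeys-lru policy are illustrative, not recommendations.

```python
import redis

r = redis.Redis(decode_responses=True)

# Illustrative only: bake these into redis.conf / your chart values for real deployments.
r.config_set("maxmemory", "2gb")
r.config_set("maxmemory-policy", "allkeys-lru")
print(r.config_get("maxmemory*"))
```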

3. Optimize Message Processing with Autoscaling

For dynamic environments, set up Kubernetes to auto-scale Pods based on metrics like CPU, memory, and Redis Stream length. You can use KEDA (Kubernetes Event-Driven Autoscaling) to autoscale on custom metrics, including the Redis pending message count.

This way, Kubernetes scales Pods only when there’s actual work to do, optimizing resource use and reducing costs.

4. Use Network Policies to Protect Redis

Redis often becomes a core dependency, so its security is critical. Set up network policies in Kubernetes to restrict access to Redis Pods, allowing only specific trusted services to connect.

Conclusion

Redis Streams provides an efficient and scalable foundation for distributed messaging. It offers powerful features like message persistence, acknowledgments, and consumer groups, which help developers build robust, fault-tolerant, and high-performance messaging architectures.

Redis Streams is well-suited for handling real-time data, event-driven architectures, and decoupled microservices. Its simplicity, low latency, and ease of integration make it an excellent choice for distributed messaging, and with careful planning and proper implementation it can make communication across your distributed systems both more reliable and more efficient.
