When and Where to Use Message Queues — [Notes]
- Handle Traffic Spikes: Message queues are effective at handling traffic spikes. Without one, a sudden surge can overwhelm your system and messages may be lost outright. With a queue, the server processes messages at its own pace and never has to absorb the surge directly. In other words, the queue ‘smooths out’ the peak load by quickly and efficiently accepting new requests, and we can dynamically scale our front-end and back-end as needed (e.g., if the line goes out the door, we can add more cashiers; if people are ordering lots of fancy drinks, we can add more workers). Uber, for example, can scale servers down at night and back up during the day. A minimal sketch of this smoothing effect follows.
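Here is a minimal sketch of the smoothing effect, using Python’s in-process `queue.Queue` as a stand-in for a real message broker; the order names, counts, and timings are all made up:

```python
import queue
import threading
import time

orders = queue.Queue()

def cashier():
    # Front-end: accept a burst of 20 orders almost instantly.
    for i in range(20):
        orders.put(f"order-{i}")
    print("burst of 20 orders accepted")

def barista():
    # Back-end: drain the queue at its own steady pace.
    while True:
        order = orders.get()
        time.sleep(0.1)  # simulate the time to make one drink
        print(f"served {order}")
        orders.task_done()

threading.Thread(target=barista, daemon=True).start()
cashier()       # the burst is absorbed immediately by the queue
orders.join()   # the barista works through it at a steady rate
```

The front-end finishes accepting the burst long before the back-end has served it, which is exactly the buffering described above.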
- Workers can ‘look ahead’ at the queue to find simple optimizations (e.g., steam milk for multiple orders at once).
- Workers can ‘specialize’ in specific tasks (e.g., a worker especially good at making Frappuccinos can selectively service these orders).
- Protection from service outages and failures — As traditional monolithic services get split into more and more microservices, each of which can (and will) fail, queues provide a critical buffer that mitigates temporary service outages. Without queues, a failure in one component can quickly cascade and bring down the whole service.
- Security Isolation — Front-end servers can be allowed to issue read requests directly, while write operations are forced through a message queue to limit the blast radius of an attack.
- Reduce coupling — Finally, message queues like Google Pub/Sub are excellent tools for reducing coupling between components. They let producers signal that an event has occurred without worrying about who consumes it. The solution to the classic ‘Design Twitter’ system design problem leverages this: it uses a message queue to notify interested parties (including services that build indexes, send emails, etc.) that a new tweet has been created. Subscribers can be added or removed at any time without rewriting the core service, as the sketch below shows.
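To make the decoupling concrete, here is a toy in-process publish/subscribe sketch. It stands in for a real broker such as Google Pub/Sub, and every name in it (`subscribe`, `publish`, the `tweet.created` topic) is illustrative rather than any real API:

```python
from collections import defaultdict
from typing import Callable

# topic -> list of handler callbacks
subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

def subscribe(topic: str, handler: Callable[[dict], None]) -> None:
    subscribers[topic].append(handler)

def publish(topic: str, event: dict) -> None:
    # The producer fires and forgets; it never knows who is listening.
    for handler in subscribers[topic]:
        handler(event)

# Consumers can be added or removed without touching the tweet service.
subscribe("tweet.created", lambda e: print("indexer saw", e["id"]))
subscribe("tweet.created", lambda e: print("mailer saw", e["id"]))

publish("tweet.created", {"id": 42, "text": "hello"})
```

The tweet service only ever calls `publish`; the indexer and mailer were attached without touching it, which is the decoupling being described.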
- In most cases, the use of a queue inside a service is not directly exposed to the caller — it tends to be hidden behind an asynchronous API or a long-running operation (submit, then wait for the response). And while languages and frameworks have gotten pretty good at abstracting this, it is still a lot easier to get the actual response than a ticket we need to poll. There are many more hidden complexities on the producer side of things.
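As a rough illustration of the ‘submit and poll’ shape this takes, here is a hypothetical sketch; the `submit` function, the ticket format, and the in-memory job table are all invented for the example:

```python
import threading
import time
import uuid

jobs: dict[str, str] = {}  # ticket -> status

def submit(task: str) -> str:
    ticket = str(uuid.uuid4())
    jobs[ticket] = "pending"

    def worker() -> None:
        time.sleep(0.5)        # simulate a long-running task
        jobs[ticket] = "done"

    threading.Thread(target=worker, daemon=True).start()
    return ticket              # the caller gets a ticket, not the result

ticket = submit("resize-image")
while jobs[ticket] != "done":  # the hidden complexity: callers must poll
    time.sleep(0.1)
print("job finished")
```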
- Measuring Quality of Service (QoS) — Let’s go back to the Starbucks example. If all we measure is how long it takes the cashier to ring up the transaction or the time it takes for a barista to make a drink, we have lost sight of how our customers experience the system: the total time it takes them to get their drink. Unfortunately, though, all too often, service owners ignore this fact. When you introduce queues into your system’s design, you must also invest in a system to track and report on the end-to-end performance. Doing this well is not simple — and made even more complex when it is not considered upfront and needs to be retrofitted to an existing design.
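One way to approach this, sketched below with made-up field names and the assumption that producer and consumer share a clock, is to stamp each message at enqueue time so the consumer can report the latency the customer actually experienced, not just the processing time:

```python
import queue
import time

q: queue.Queue = queue.Queue()

def enqueue(payload: str) -> None:
    q.put({"payload": payload, "enqueued_at": time.monotonic()})

def consume() -> None:
    msg = q.get()
    time.sleep(0.05)  # simulate the work itself
    total = time.monotonic() - msg["enqueued_at"]
    # Report what the customer experienced: queue wait + work.
    print(f"end-to-end latency: {total:.3f}s")

enqueue("latte")
consume()
```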
- Queue backlogs — One of the worst live-site incidents I was involved with was due to an improperly bounded queue. As discussed above, queues allow us to smooth out peak load — but they aren’t a silver bullet for scale issues. A large, sustained load beyond what the back-end can handle will result in a queue growing and growing. When designing a system with queues, we need to put a bound on the queue size, ensure our front-ends gracefully handle ‘queue full’ messages, and test our back-ends to know how long it will take to service a full queue. If you don’t do this, make sure you are good at apologizing — like me, you may be sent on an apology tour and be asked to explain to each of your customers why the system went down.
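A minimal sketch of a bounded queue whose front-end degrades gracefully rather than blocking; the maxsize of 1000 and the retry message are arbitrary placeholders:

```python
import queue

orders = queue.Queue(maxsize=1000)  # the bound is the whole point

def accept_order(order: str) -> bool:
    try:
        orders.put_nowait(order)   # never block the front-end
        return True
    except queue.Full:
        # Degrade gracefully: reject or ask the caller to retry later.
        print("system busy, please retry")
        return False
```

Testing how long a full queue of this size takes to drain is the other half of the exercise.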
- Messages get lost and/or duplicated — This subject probably deserves its own article. All too often, I find engineers believe message queues are infallible. They point to documentation stating that order is guaranteed and that messages are never lost. But even if this were true in principle, it rarely is in practice. While the queue itself may not mess up, you can bet your bonus that something else will go wrong (a coding bug in the producer/consumer, a disaster recovery drill, or an overly tired SRE manually deleting messages from the queue at 2 am in a desperate attempt to get the service back online). These practical realities shouldn’t be ignored — we need to build functionality into our systems to detect and repair inconsistencies.
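A standard mitigation is to make consumers idempotent and deduplicate on a message id. A toy sketch, assuming every message carries a unique `id` field; a production version would keep the `seen` set in a durable store, not in memory:

```python
seen: set[str] = set()

def apply_change(message: dict) -> None:
    print("applied", message["id"])

def handle(message: dict) -> None:
    if message["id"] in seen:
        return              # duplicate delivery: safe to ignore
    apply_change(message)   # must itself be safe to run exactly once
    seen.add(message["id"])

handle({"id": "m1"})
handle({"id": "m1"})  # redelivered: ignored
```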
- Increase Reliability: Message queues can also persist messages, which protects against crashes. If the client that was supposed to receive a message goes down for any reason, the message simply stays in the queue; once the client’s machine is back up, the queue can replay the message, and the client can process everything that accumulated while it was down.
- Ordering guarantees: Message queues can provide ordering guarantees, such as FIFO (first-in-first-out), meaning a message enqueued earlier is delivered earlier.
- Batch Processing: You also have the option to process messages in batches. For instance, you might configure a consumer to pull 100 messages at a time, which lets you bulk-update the database once rather than issuing a separate write for each of the 100 messages.
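A sketch of batch consumption under those assumptions (each message is a small dict, and 100 is simply the batch size from the example):

```python
import queue

q: queue.Queue = queue.Queue()
for i in range(250):
    q.put({"user_id": i, "delta": 1})

def next_batch(max_size: int = 100) -> list[dict]:
    batch = []
    while len(batch) < max_size:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch

while batch := next_batch():
    # e.g. one bulk UPDATE here instead of len(batch) round trips
    print(f"bulk-updating {len(batch)} rows")
```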
- Fan-out: The payment service sends data to three downstream services for different purposes: payment channels, notifications, and analytics. This fan-out method is like someone shouting a message across a room; whoever needs to hear it, does. The producer simply drops the message on the queue, and the consumers process the message at their own pace.
- Rate Limiting — In a flash sale, there can be tens of thousands of concurrent users placing orders simultaneously. It is crucial to strike a balance between accommodating eager customers and maintaining system stability. A common approach is to cap the number of incoming requests within a specific time frame to match the capacity of the system. Excess requests might be rejected or asked to retry after a short delay. This approach ensures the system remains stable and doesn’t get overwhelmed. For requests that make it through, message queues ensure they’re processed efficiently and in order. If one part of the system is momentarily lagging, the order isn’t lost. It’s held in the queue until it can be processed. This ensures a smooth flow even under pressure.
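As a rough illustration, here is a toy fixed-window rate limiter in the spirit described above; the window length and per-window cap are placeholders, and real deployments often use token buckets or a shared store instead:

```python
import time

WINDOW_SECONDS = 1.0
MAX_PER_WINDOW = 100  # tuned to match back-end capacity (assumption)

window_start = time.monotonic()
count = 0

def admit() -> bool:
    global window_start, count
    now = time.monotonic()
    if now - window_start >= WINDOW_SECONDS:
        window_start, count = now, 0   # start a fresh window
    if count < MAX_PER_WINDOW:
        count += 1
        return True                    # enqueue the order
    return False                       # reject or ask to retry shortly
```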
- Horizontal Scalability — Since the services are decoupled, we can scale them independently based on demand. Each service may run at a different capacity, so we can scale each one based on its planned QPS (queries per second) or TPS (transactions per second).
- Message Persistence — Message queues can also be used as middleware that stores messages. If the upstream service crashes, the downstream service can always pick up the messages from the message queue to process. In this way, the recovery function is moved out of each service and becomes the responsibility of the message queue.
- Topic Log compaction — https://medium.com/swlh/introduction-to-topic-log-compaction-in-apache-kafka-3e4d4afd2262
→ or compaction can be done in memory at the application level, as sketched below
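A sketch of what application-level compaction amounts to: keep only the latest value per key, which is the same effect Kafka’s topic log compaction achieves on disk (keys and payloads here are invented):

```python
events = [
    ("user-1", {"balance": 10}),
    ("user-2", {"balance": 5}),
    ("user-1", {"balance": 7}),   # supersedes the earlier user-1 record
]

compacted: dict[str, dict] = {}
for key, value in events:
    compacted[key] = value        # later writes overwrite earlier ones

print(compacted)  # {'user-1': {'balance': 7}, 'user-2': {'balance': 5}}
```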
- 💡 Request-response architectures can become overwhelmed with multiple requests, causing delays and poor user experience.
- 💡 Message queues provide a way to offload and distribute processing tasks, improving scalability and response times.
- 💡 Queues are especially beneficial for long-running or resource-intensive processes, allowing other services to handle the workload.
- 💡 Separating concerns and not burdening web servers with heavy processing tasks can improve overall system performance.
- 💡 Using queues can enhance user experience by providing immediate feedback and reducing waiting times.
- 💡 Queue usage is recommended when requests are unpredictable or expected to increase significantly in the future.
What are some scenarios where message queues are used?
Message Queues are used in a variety of scenarios, some of which are:
- Inter-system communication: Message Queues are used to facilitate inter-system communication between various systems, applications, and services. They give systems a means to communicate asynchronously without having to wait for a reply.
- Event-driven architecture: Message Queues are used to implement event-driven architecture patterns, in which systems respond to events by producing and consuming messages.
- Distributed systems: Message queues are used to create distributed systems, where numerous systems coordinate their communications with one another.
- Microservices architecture: Message Queues are used to promote communication between microservices, allowing them to collaborate to serve a broader requirement.
- Background processing: Long-running tasks are processed via message queues so that the main application can keep processing requests while the background jobs are being completed.
- Data integration: Message Queues are used to integrate different systems, allowing them to exchange data in a reliable and scalable manner.
- Publish-subscribe pattern: Message Queues are used to implement the publish-subscribe pattern. Publishers send messages to a message queue, and subscribers receive messages from the queue.