So you have learned (see my previous posts/videos) about the pros and cons of event-driven architecture, and you are still looking to start your journey by either incrementally strangling an existing system or using EDA for a greenfield project.
In this post, I will introduce you to the concept of a broker and run you through different types of brokers you could choose to be the middleware for your event-driven systems.
I won’t go into too much detail about the different implementations or service offerings. Still, hopefully, it will arm you with more knowledge than I had early in my EDA journey.
Now, before I go through some of the types of brokers, here are four points to consider. Please comment if you feel I have left any out and, for the community’s benefit, explain why you feel they are necessary.
(Disclosure: Some of the links below are affiliate links. This means that, at zero cost to you, I will earn an affiliate commission if you click through the link and finalize a purchase.)
Points to consider
Types of brokers
Before we dive into the types of brokers, let’s answer a question: why do we need a broker in the first place?
A broker allows services to communicate in a loosely coupled manner.
Standard Message Brokers
When using a standard message broker:
A publisher publishes messages to the broker, and then the broker assigns the message to a queue or multiple queues based on an identifier that’s published alongside the message.
Then based on the subscription configuration that consumers provide to the broker when they subscribe to a particular queue or topic, the broker assigns/pushes messages to the consumers either in a load-balanced or a fan-out style. Finally, once a consumer consumes a message, they acknowledge the message.
With standard brokers, messages are deleted from the broker once the services have acknowledged them.
- Event routing
- Consumers can acknowledge specific messages
- No replayability
- No message retention after acknowledgement
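To make the flow above concrete, here is a minimal in-memory sketch of a standard message broker. It is not how RabbitMQ or any real broker is implemented; the class `SimpleMessageBroker` and its methods (`bind`, `publish`, `deliver`, `ack`, `pending`) are names I made up for illustration. It only shows the three ideas from this section: routing by an identifier, handing each queued message to one consumer, and forgetting a message once it is acknowledged.

```python
from collections import defaultdict, deque
from itertools import count


class SimpleMessageBroker:
    """Toy broker (hypothetical): routes by key, forgets messages once acked."""

    def __init__(self):
        self._bindings = defaultdict(list)  # routing key -> bound queue names
        self._queues = defaultdict(deque)   # queue name -> pending (id, body)
        self._unacked = {}                  # message id -> (queue name, body)
        self._ids = count(1)

    def bind(self, routing_key, queue_name):
        self._bindings[routing_key].append(queue_name)

    def publish(self, routing_key, body):
        # Fan-out: every queue bound to the routing key gets its own copy.
        for queue_name in self._bindings[routing_key]:
            self._queues[queue_name].append((next(self._ids), body))

    def deliver(self, queue_name):
        # Load balancing: each message goes to whichever consumer asks first.
        if not self._queues[queue_name]:
            return None
        message_id, body = self._queues[queue_name].popleft()
        self._unacked[message_id] = (queue_name, body)
        return message_id, body

    def ack(self, message_id):
        # Once acknowledged, the message is gone, so it cannot be replayed.
        del self._unacked[message_id]

    def pending(self, queue_name):
        return len(self._queues[queue_name])


broker = SimpleMessageBroker()
broker.bind("order.created", "billing")
broker.bind("order.created", "shipping")
broker.publish("order.created", {"order_id": 1})

message_id, body = broker.deliver("billing")
broker.ack(message_id)
```

After the acknowledgement, the billing queue is empty and the broker holds no copy of that message, while the shipping queue still has its own copy waiting.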
Log Based Brokers
Now with standard brokers, the messages are deleted. However, what happens if you don’t want to delete messages but instead want to retain messages?
Well, this is where log-based brokers come into play.
In Designing Data-Intensive Applications, the author introduces log-based message brokers as a hybrid. He describes logs as combining the low-latency notification facilities of messaging with the storage capabilities of a database.
And what are these log-based brokers? Well, they are logs of immutable events. The best example I can think of is a finance ledger or your bank account statement, where each record is an immutable fact, whether money has gone in or out.
Unlike a standard message broker, when a publisher publishes events, a log-based broker appends each event to the end of the log. The log retains events for the length of the configured retention period.
This history of events in an immutable ordered ledger allows consumers to consume events at their own pace, and consumers can even replay messages by resetting their consumer offsets.
Being able to replay events by resetting a consumer offset is an advantageous ability.
- Replay messages
- Full history (depending on the retention period)
- Consumers need to acknowledge messages before moving to the next one or batch.
- Can’t selectively acknowledge messages
- Retention can become costly (but some platforms can perform compaction of events)
- Load-balanced consumption: the partitioning of data requires thought and due diligence
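Here is a minimal sketch of the log idea, assuming a single in-memory log with no partitions, no retention and no compaction. The `EventLog` class and its methods are hypothetical names for illustration, not the Kafka API. It shows the append-only write, the batch-style acknowledgement by advancing an offset, and the replay by resetting that offset.

```python
class EventLog:
    """Toy append-only log (hypothetical) with one offset per consumer."""

    def __init__(self):
        self._events = []   # ordered history, oldest first, never modified
        self._offsets = {}  # consumer name -> index of the next event to read

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the event just written

    def poll(self, consumer, max_events=10):
        start = self._offsets.get(consumer, 0)
        return self._events[start:start + max_events]

    def commit(self, consumer, processed):
        # Acknowledge a whole batch by moving the offset forward.
        # The events themselves stay in the log.
        self._offsets[consumer] = self._offsets.get(consumer, 0) + processed

    def reset(self, consumer, offset=0):
        # Replay: move the consumer back so it reads the same events again.
        self._offsets[consumer] = offset


log = EventLog()
for amount in (100, -30, 50):
    log.append({"account": "alice", "amount": amount})

batch = log.poll("ledger-service")
log.commit("ledger-service", len(batch))

log.reset("ledger-service")           # e.g. after fixing a bug
replayed = log.poll("ledger-service")  # the same three events, in order
```

Note that the consumer can only move its offset forwards or backwards; it cannot skip one event in the middle and acknowledge the ones around it, which is the selective-acknowledgement trade-off listed above.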
Event-Brokers? Event Streams?
Often people don’t separate brokers, but in the book Building Event-Driven Microservices, the author divides the two into two different camps:
- Message brokers
- Event brokers
In my view, it is much better to think of log-based brokers as event brokers or, better yet, as streaming platforms, which is what Confluent, the company behind the commercial offering of Kafka, calls Kafka.
AWS likewise groups both its managed version of Apache Kafka and its native streaming service, Kinesis, under its analytics section, not under the application integration section.
Why? Because using event streaming middleware is so much more than a message-broker or an event bus.
Take Kafka Streams, a framework/library on top of Kafka. It treats a log the same way you would look at the changelog of a database. Kafka Streams provides the infrastructure to create state tables easily and to do real-time stream processing that groups, joins or aggregates events into new topics, which can then trigger processes or feed state tables.
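To illustrate the "log as a changelog" idea, here is a small sketch in plain Python. It does not use the Kafka Streams API; the function `build_balance_table` and the event fields `account` and `amount` are assumptions made up for this example. It folds an ordered stream of events into a state table, which is conceptually what a stream processor does when it aggregates by key.

```python
from collections import defaultdict


def build_balance_table(events):
    """Fold an ordered event stream into a state table: account -> balance."""
    balances = defaultdict(int)
    for event in events:
        # Group by account and aggregate by summing the amounts.
        balances[event["account"]] += event["amount"]
    return dict(balances)


events = [
    {"account": "alice", "amount": 100},
    {"account": "bob", "amount": 40},
    {"account": "alice", "amount": -30},
]

table = build_balance_table(events)
print(table)  # {'alice': 70, 'bob': 40}
```

Because the table is derived entirely from the log, you can throw it away and rebuild it at any time by reading the events again from the start.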
However tempted you are to write it yourself, infrastructure code should come from a vendor or the community. Common problems will already have solutions. Platforms like Kafka allow developers to stand on the shoulders of giants and focus on delivering value to the customer rather than on infrastructure.
Platforms like Kafka have lowered the cost of entry to stream processing because the infrastructure now exists to create stream processors and state tables from an event stream quickly. This capability has empowered developers in the last few years to implement microservices quickly. Developers can now focus on the application logic rather than building the infrastructure logic around keeping a datastore up to date with event sourcing.
Furthermore, it takes less effort to ensure that distributed data is consistent. For example, you can now run temporal (time-related) queries and create ephemeral (short-lived, temporary) materialised views within the same middleware as the event stream. This removes some of the data inconsistency, latency and application development headaches that would occur if you were trying to event-source or keep data up to date at the application layer. Instead, you can do it in the middleware.
One of the key contributors to Kafka discusses the benefits and enablers of stream processing brilliantly in a fantastic presentation called “The database inside out.”
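As a rough idea of what a temporal query over an event stream looks like, here is a sketch of a tumbling-window count in plain Python. It is not a streaming platform feature being called; the function `count_per_window` and the `timestamp` field (in seconds) are assumptions for this example. The result is a small, short-lived view you could rebuild from the stream whenever you need it.

```python
from collections import Counter


def count_per_window(events, window_seconds):
    """Count events per tumbling window, keyed by each window's start time."""
    counts = Counter()
    for event in events:
        # Round the timestamp down to the start of its window.
        window_start = event["timestamp"] - event["timestamp"] % window_seconds
        counts[window_start] += 1
    return dict(counts)


events = [
    {"timestamp": 0, "type": "order.created"},
    {"timestamp": 15, "type": "order.created"},
    {"timestamp": 59, "type": "order.created"},
    {"timestamp": 60, "type": "order.created"},
    {"timestamp": 125, "type": "order.created"},
]

view = count_per_window(events, window_seconds=60)
print(view)  # {0: 3, 60: 1, 120: 1}
```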
Before we land the plane, let’s be transparent: if you are going to embark on the event-driven architecture journey, events will be an essential part of your strategy. If you don’t choose the event-streaming platform or bus for your middleware backbone wisely, you will kick yourself later, just like I did a few years ago, while I was still a developer, when I implemented a solution using a standard message broker to trigger processing on events.
Everything was excellent in terms of performance and scalability until we found a bug in the business logic two months after go-live. Once we fixed the bug, we had to reprocess all the events, which were no longer in the broker’s queue. Getting the events back into the queue was not a straightforward endeavour, and republishing became quite costly in development and time.
Choosing one or the other?
Now, log-based or stream-based brokers are not a silver bullet for every solution that requires loosely coupled asynchronous communication.
I have seen Kafka being used to send time-sensitive events that needed to be consumed by all running subscribers, when a non-log-based broker such as RabbitMQ, AWS SQS, AWS SNS or even Redis should have been used.
I want to point out that you need to choose the right tool for the job. You do not want to be a purist. There is nothing wrong with having a hybrid system that uses message brokers and an event streaming platform.