Pub/Sub is a great mechanism to establish microservice communication. By focusing on defining good, reliable topics of interest, microservices can come and go without disrupting the entire ecosystem. AWS offer two great services that help build a highly scalable, reliable and recoverable pub/sub system: Simple Notification Service and Simple Queue Service.
Introduction to microservice communication
When service A communicate with B via direct call, we can infer that there is a dependency in A coupled to B. If B is temporarily unavailable or under heavy load, it could disrupt the stability of A.
Writing coupled microservices usually means dealing with the worst of both worlds. If instead the system was built like a monolith, it would still be coupled to each other, but without the problems of a distributed system. On a monolith, a component is never unavailable or unreachable and the communication is just one function call or object instantiation away. By splitting two services apart, they become independent of each other at the cost of creating a distributed system.
With a Publisher / Subscriber system, there’s the possibility of system B to track (or subscribe) to events that happen in A without a direct coupling. The Pub/Sub system becomes a broker and serves as a contract between both parties. If system A is starting to decay, it can be fully replaced by a newer, smarter, better version without affecting B if the messages that A publishes can still be respected or kept backward compatible.
AWS Simple Notification Service is where publishers will notify about events that happens. That notification happens on a topic. As of this moment, SNS can notify subscribers by http, https, email, sms, sqs and lambda. with the HTTP(S) protocol, two services could communicate with each other via SNS without the need of SQS at all. So why bring SQS to the mix?
When using HTTP(S) as the subscription protocol for SNS, AWS will offer retries in case the subscriber cannot be reached or responds with an error. However, there’s quite some limitation on what can be achieved with this model. Particularly attention if the subscriber has a bug and cannot process the notification at all. There could be a lot of notifications lost until a patch is released. However, when subscribing an SQS to an SNS topic, AWS will guarantee the delivery to SQS. Perhaps if the worker/consumer of SQS has a bug, it will fail to process the message, but the message can stay in the queue up to 14 days and can still be delivered to a Dead Letter Queue. This mechanism offers a much safer option to never lose messages. Some microservices work with the premise that messages can always be replayed, but what if some subscribers already processed those events? Of course, replaying systems are a valid option so long the subscribers can be idempotent. For this article, I will focus primarily on avoiding message loss instead.
Laravel has a Queue system out of the box and offers a native driver that can communicate with Amazon SQS. When pushing a message to SQS, Laravel will store enough information inside the message that allows the worker to understand and process the Job (message) correctly. However, when fetching a message from SQS that was inserted by SNS, the message will lack the
job attribute which is used by the Worker system to know which Job to process. If only two systems are communicating through SNS, one could argue that the publisher system could specify the
job key so the Laravel subscriber would know what to do. Unfortunately that is a strong violation of the Pub / Sub system because the publisher would be required to know an implementation detail of the subscriber. It also doesn’t work with two or more subscribers, unless we want all our subscribers to have the same Job namespace to process a topic event.
To get around this, one interesting option is to tie the SNS
Topic ARN to a specific Job class. Whenever a new message from topic user-authenticated is sent to SQS, the Laravel worker system should use class
ProcessUserAuthenticatedJob. To achieve this, it’s necessary to create a custom Laravel Queue driver. I decided to call this custom driver
sns to give the sense of working messages from AWS SNS (through SQS).
In a Service Provider we can hook an
afterResolving event on Laravel’s container. That way whenever the
QueueManager class is resolved out of the container for the first time we can make sure to add our custom driver to it.
JobMap class is responsible for mapping a Topic ARN to a Job Handler.
The callback given to the
QueueManager with the name of
sns should return an implementation of
Illuminate\Queue\Connectors\ConnectorInterface. In here I decided to extend from
SqsConnector in order to take advantage of the
getDefaultConfiguration() method already provided by Laravel Worker system. There’s no actual need for this inheritance except to borrow that method.
As per Laravel’s interface, a Queue Connector should return a Queue object. Since we wish to work messages from Sqs, it is a lot easier to just inherit from Laravel’s
SqsQueue object and override the
Everything else to make the Queue object work already come out of the box from the
SqsQueue class. Note that we’ve loaded the
queue.map from the Service Provider, sent it to the
SnsQueue and now we’re injecting it into the
Note: we could have easily used a Service Locator / Facade strategy to automatically resolve the mapping from inside the job itself and personally I don’t have any problems with that. However, if you’re working on a diverse team that dislike Facade, the explicitly dependency injection allows the Job class to not be dependent on / coupled to an external resource.
Lastly, with the
SnsJob class we can extend from
SqsJob provided by Laravel to default to the normal workflow of working messages from Sqs while overriding the
payload method to inject the
job attribute. That’s where the
queue.map configuration comes in play. Whenever a specific Sns Topic injects a message to the Queue, we can instruct Laravel which job to run.
To make sure everything works as intended, we can leverage localstack with an
SQS container locally. Configuring Localstack is beyond the scope of this article and will not be covered. The test I would like to run looks a bit like the following class.
setUp method is pushing a message to localstack SQS container. The test blueprint can be translated as:
– Arrange: Add SNS topic to the
queue.map on Laravel’s config system
– Act: Invoke Laravel’s Queue Worker system.
– Assert: Check that the message was correctly obtained.
The content of
message.json can be seen below.
The last missing piece is the
ProvidesSqs trait that helps setup the localstack integration for as many test class as we need.
sendMessageToSqs is invoked, it will:
- Make an AWS Sqs Client.
- Create a new Sqs on localstack.
- Purge the Queue in case there’s messages from previous tests.
- Send a message to the Queue
- Configure Laravel Queue system to use
snsdriver by default.
- Configure Laravel Queue system to work messages from this queue.
Compiling everything together, we have:
- An external system that publishes messages to AWS SNS.
- A Queue that subscribes to the SNS Topic
- A Laravel project that is interested in working those messages on the background
- A Service Provider that creates a new
snsdriver for Laravel Worker
- A SnsConnector, SnsQueue and SnsJob that assembles the worker system
- A ProvidesSqs trait that uses localstack to write tests.
One caveat to pay attention: You cannot use the
sns driver to push messages to the queue. It is a read-only process that will rely on the
TopicARN key to figure out what job to run.
Hope you enjoy the article and the technical details. If you have any questions find me on Twitter at DeleuGyN.