REST Request-Response Gateway

This post outlines how you might create a Request-Response Gateway in Kafka using the good old correlation ID trick and a shared response topic. It’s just a sketch. I haven’t tried it out.

A Rest Gateway provides an efficient Request-Response bridge to Kafka. This is in some ways a logical extension of the REST Proxy, wrapping the concepts of both a request and a response.

What problem does it solve?

So you may wonder: Why not simply expose a REST interface on a Service directly? The gateway lets you access many different services, and the topic abstraction provides a level of indirection in much the same way that service discovery does in a traditional request-response architecture. So backend services can be scaled out, instances taken down for maintenance etc, all behind the topic abstraction. In addition the Gateway can provide observability metrics etc in much the same way as a service mesh does.

You may also wonder: Do I really want to do request response in Kafka? For commands, which are typically business events that have a return value, there is a good argument for doing this in Kafka. The command is a business event and is typically something you want a record of. For queries it is different as there is little benefit to using a broker, there is no need for broadcast and there is no need for retention, so this offers little value over a point-to-point interface like a HTTP request. So the latter case we wouldn’t recommend this approach over say HTTP, but it is still useful for advocates who want a single transport and value that over the redundancy of using a broker for request response (and yes these people exist).

This pattern can be extended to be a sidecar rather than a gateway also (although the number of response topics could potentially become an issue in an architecture with many sidecars).



Above we have a gateway running three instances, there are three services: Orders, Customer and Basket. Each service has a dedicated request topic that maps to that entity. There is a single response topic dedicated to the Gateway.

The gateway is configured to support different services, each taking 1 request topic and 1 response topic.

Imagine we POST and Order and expect confirmation back from the Orders service that it was saved. This work as follows:

Exactly the same process can be used for GET requests, although providing streaming GETs will require some form of batch markers or similar, which would be awkward for services to implement probably necessitating a client-side API.

If partitions move, whist requests are outstanding, they will timeout. We could work around this but it is likely acceptable for an initial version.

This is very similar to the way the OrdersService works in the Microservice Examples

Event-Driven Variant

When using an event driven architecture via event collaboration, responses aren’t based on a correlation id they are based on the event state, so for example we might submit orders, then respond once they are in a state of VALIDATED. The most common way to implement this is with CQRS.

Websocket Variant

Some users might prefer a websocket so that the response can trigger action rather than polling the gateway. Implementing a websocket interface is slightly more complex as you can’t use the queryable state API to redirect requests in the same way that you can with REST. There needs to be some table that maps (RequestId->Websocket(Client-Server)) which is used to ‘discover’ which node in the gateway has the websocket connection for some particular response.

Posted on June 7th, 2018 in Blog, Kafka/Confluent

  1. Ben Howes September 10th, 2018
    10:35 GMT

    The particular issue which I’ve been attempting to solve over the past few days is when you have multiple HTTP containers running behind a load balancer, with a request/reply message pattern in kafka (i.e. a produced message expects a response with a correlation id). I’m trying to work out how to make that work with multiple identical HTTP serving containers running and making sure response messages get back to the correct container.

    Adding a unique id to the reply topic works (of course), but I am concerned about having an unbound number of topics as a result. Is that a problem to be concerned with? Is there a better way to do this?

  2. ben September 10th, 2018
    14:14 GMT

    Hey Ben.

    Yes that’s a problem you should be concerned with. You need to route each message to the correct broker for your request, and route back the response. So if you have three nodes in your gateway 1/3 of the requests would be owned by each one (this is a Kafka thing: all three are configured to be in the same consumer group and Kafka will assign 1/3 of the partitions to each consumer / gateway instance).

    The Kafka API will let you work out where to route requests to:

    So just to underline this – the idea of this approach is there is only one response topic shared by all services.

  3. Srikanth October 4th, 2018
    19:53 GMT

    Ben, the image doesnt get loaded in this article as the image seems to be restricted in access to only the owner of the image.

    Can you please repost the image so that it can be viewed by public.

  4. ben October 26th, 2018
    8:18 GMT

    Thanks – fixed.

  5. Sundar February 10th, 2019
    13:41 GMT

    Hi ben
    I have a similar issue not finding suitable eg for request reply using kafka.
    I have a camel microservices which acts as proxy router validate and fwd the xml payload here to different topic. Eg provide req, cease req, change req topic. Provide microservices written in camel listen provide req topic and send the response to proxy router microservices via provide resp topic. I need eg using correlation Id usability can advise with example code? . Or I need springboot kafka correlation Id eg

  6. Nick October 29th, 2019
    1:19 GMT

    Hi Ben,

    thanks for this post.

    There is one aspect though that you have glossed over a bit but which I consider pretty important.

    What happens if the gateway nodes need to be scaled out, hence all partitions get rebalanced, so the waiting web requests and the responses don’t match up.

    In this case, you are looking at a global outage as ALL requests at this point of time will time out.

    On the other hand, to assume that the number of gateway nodes always stays the same would be reckless and not really feasible in a cloud native world.

    any advice ?

  7. Ivan Mushketyk April 30th, 2020
    21:36 GMT

    Hi Ben,

    Thank you for sharing your ideas!

    Instead of using Kafka for sending a response, would it be better to use pub/sub with Redis to get a confirmation? This would allow avoiding the problem with rebalancing mentioned by Nick.

    Also, is there a reason to keep responses persisted in Kafka? Surely avoiding other services makes it easier from the operational perspective, but I feel like a simpler pub/sub solution would be easier to work with.

  8. ⭕ SЕNDING 0,7576 ВТС. Receive > ⭕ April 7th, 2024
    22:39 GMT


Have your say

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

Safari hates me
IMPORTANT! To be able to proceed, you need to solve the following simple problem (so we know that you are a human) :-)

Add the numbers ( 9 + 13 ) and SUBTRACT two ?
Please leave these two fields as-is:

Talks (View on YouTube)