Do not use AEM as a proxy for backend calls

Since I am working with AEM CS customers, I came a few time across the architecture pattern, that requests made to a site to passed all the way through to the AEM instance (bypassing all caches), and then AEM does an outbound request to a backend system (for example a PIM system or other API service, sometimes public, sometimes via VPN), collects the result and sends back the response.

This architectural pattern is problematic in a few ways:

  1. AEM handles requests with a threadpool, which has an upper limit of requests it will handle (by default 200). That means that at any time the number of such backend requests is limited by the amount of AEM instances. In AEM CS this number is variable (auto-scaling), but even in an auto-scaling world there is an upper limit.
  2. The most important factor in the number of such requests AEM can handle per second is the latency of the backend system call. For example if your backend system responds always in less than 100ms, your AEM can handle up to 2000 of such proxy requests per second. If the latency is more likely 1 second, it’s only up to 200 proxy requests per second. This can be enough, this can be way too small.
  3. To achieve such a throughput consistently, you need to have agressive timeouts; if you configure your timeouts with 2 seconds, your guaranteed throughput can only be up to 100 proxy requests/seconds.
  4. And next to all those proxy requests your AEM instances also need to handle the other duties of AEM, most importantly rendering pages and delivering assets. That will reduce the number of threads you can utilize for such backend calls.

The most common issue I have seen with this pattern is that in case of backend performance problems the AEM threadpool of all AEM instances are consumed within seconds, leading almost immediately to an outage of the AEM service. That means, that a problem on the backend or on the connection between AEM and the backend takes down your page rendering abilities, leaving you with what is cached at the CDN level.

The common recommendation we make in these cases is quite obvious: introduce more agressive timeouts. But the actual solution to this problem is a different one:

Do not use AEM as a proxy.

This is a perfect example for a case, where the client (browser) itself can do the integration. Instead of proxy-ing (=tunneling) all backend traffic through AEM, the client could approach the backend service directly. Because then the constraints AEM has (for example the number of concurrent requests) do no longer apply for the calls to the backend. Instead the backend is exposed directly to the endusers, and uses whatever technology is suitable for that; typically it is exposed via an API gateway.

If the backend gets slow, AEM is not affected. If AEM has issues, the backend is not directly impacted because of it. AEM does not even need to know that there is a backend at all. Both systems are entirely decoupled.

As you see, I pretty much prefer this approach of “integration at the frontend layer” and exposing the backend to the endusers over any type of “AEM calls the backend systems”. Mostly because such architectures are less complex and easier to debug and analyze. And that should be your default and preferred approach, whenever this required.

Disclaimer: Yes, there are cases where the application logic requires AEM to do backend calls; but in these cases it’s questionable if such requests need to be done synchronously in requests, meaning that an AEM request needs to do a backend call to consume its result. If these request can be done async, then the whole problem vector I outlined above simply does not exist.

Note: In my opinion hiding the hostnames of your backend system is also not a good reason for such an backend integration. Also “the service is just available from within our company network and AEM accesses it via VPN” is not a good reason, too. In both cases you can achieve the same with an publicly accessible API gateway, which is specifically designed to handle such usecases and all security-relevant implications of it.

So, do not use AEM as a simple proxy!

3 thoughts on “Do not use AEM as a proxy for backend calls

  1. Hi Jörg

    Thanks for this blog. I have a similar situation right now in my project. The team already has a system which has all the API calls as a proxy servlet. There is no microservice available at this point of time. We want to move this to cloud , but we might have to migrate this approach as is. What potential performance issues can we face when we move these servlets to AEM as Cloud as is ?

    1. As outlined in the post, a pure proxy servlet in AEM can lead to severe performance problems of the overall site, up to unavailability of the AEM instances. To provide better statements about the impact, I would need more detailled information about this usecase.

Comments are closed.