How to deal with the “TooManyCallsException”

I randomly see the question “We get the TooManyCallsException while rendering pages, and we need to increase the threshold for the number of inclusions to 5000. Is this a problem? What can we do so we don’t run into this issue at all?”

Before I answer this question, I want to explain the background of this setting, why it was introduced and when such a “Call” is made.

Sling rendering is based on Servlets; and while a single servlet can handle the rendering of the complete response body, that is not that common in AEM. AEM pages normally consistent of a variety of different components, which internally can consist of distinct subcomponents as well. This depends on the design approach the development has choosen.
(It should be mentioned that all JSPs and all HTL scripts are compiled into regular Java servlets.)

That means that the rendering process can be considered as tree of servlets, and servlets calling other servlets (with the DefaultGetServlet being the root of such a tree when rendering pages). This tree is structured along the resource tree of the page, but it can include servlets which are rendering content from different areas of the repository; for example when dealing with content fragments or including images, which require their metadata to be respected.

It is possible to turn this tree into a cyclic graph; and that means that the process of traversing this tree of servlets will turn into a recursion. In that case request processing will never terminate, the Jetty thread pool will quickly fill up to its limit, and the system will get unavailable. To avoid this situation only a limited number of servlet-calls per request is allowed. And that’s this magic number of 1000 allowed calls (which is configured in the Sling Main Servlet).

Knowing this let me try to answer the question “Is it safe to increase this value of 1000 to 5000?“. Yes, it is safe. In case your page rendering process goes recursive it terminates later, which will increase a bit the risk of your AEM instance getting unavailable.

Are there any drawbacks? Why is the default 1000 and not 5000 (or 10000 or any higher value)?” From experience 1000 is sufficient for the majority of applications. It might be too low for applications where the components are designed very granular which in turn require a lot of servlet calls to properly render a page.
And every servlet call comes with a small overhead (mostly for running the component-level filters); and even if this overhead is just 100 microseconds, 1000 invocations are 100 ms just for the invocation overhead. That means you should find a good balance between a clean application modularization and the runtime performance overhead of it.

Which leads to the next question: “What are the problematic calls we should think of?“. Good one.
From a high-level view of AEM page renderings, you cannot avoid the servlet-calls which render the components. That means that you as an AEM application developer cannot influence the overall page rendering process, but you can only try to optimise the rendering of individual (custom) components.
To optimise these, you should be aware, that the following things trigger the invocation of a servlet during page rendering:

  • the <cq:include>, <sling:include> and <sling:forward> JSP tags
  • the data-sly-include statement of HTL
  • and every method which invokes directly or indirectly the service() method of a servlet.

A good way to check this for some pages is the “Recent requests” functionality of the OSGI Webconsole.