I sometimes hear the question how to implement cache invalidation for the CDN. Or the question is why AEM CS still operates with a dispatcher layer when it now has a more powerful CDN in front of it.
The questions are very different, but the answer is in both cases: the CDN is no replacement for the dispatcher, and the dispatcher does not replace the CDN. They serve different purposes, and they combination of these two can be a really good package. Let me explain this.
The dispatcher is very traditional cache. It’s fronting the AEM systems and the cache status is actively maintained by cache invalidation so it always delivers current data. But from an end-user perspective this cache is often far away in terms of network latency. If my AEM systems are hosted in Europe, and end-users from Australia are reaching it, the latency can get huge.
The CDN is the contrary, it serves the content from many locations across the world, being as close to the end-user as possible. But the CDN cache invalidation is cumbersome, and for that reason most often TTL-based expiration is used. That means, you have to accept that there is a chance, that new content is already available, but the CDN can still deliver old content.
Not everyone is happy with that; and if that’s a real concern, short TTLs (in the range of a few minutes) are the norm. That means, that many files on the CDN will get stale every few minutes, which results in cache misses; and a cache miss on the CDN goes back to origin. But of course the reality is, that not many pages change every 10 minutes; actually very few. But customers want to have that low TTL just in case a page was changed, and that change needs to get visible to all endusers as soon as possible. .
So you have a lot of cache misses on the CDN, which trigger a re-fetch of the file from origin, and and because many of the files have not changed, you refetch the exactly same binary which got stale seconds ago. Actually a waste of resources, because your origin system delivers the same content over and over again to the CDN a consequence of these misses. So you could keep your AEM instances busy all the time, re-rendering the same requests over and over, always creating the same response.
Introducing the dispatcher caching, fronting the actual AEM instance. If the file has not changed, the dispatcher will deliver the same file (or just HTTP 304 not modified, which even avoids sending the content again). And it’s fast, much faster than letting AEM rendering the same content again. And if the file has actually changed, it’s rendered once and then reused for all the future CDN cache misses.
The combination of these 2 types of caching approaches help you to deliver content from the edge while at the same time having a reasonable latency for content updates (that means the time between replicating a change to the publish instances until all users across the world can see it) without the need to have a huge number of AEM instances in the background.
So as a conclusion, using the CDN and the dispatcher cache is a good combination, if setup properly.


You must be logged in to post a comment.