Basic performance tuning: Caching

Many CQ installations I’ve seen start with the default configuration of CQ. This is in fact a good decision, because the default configuration can handle small and middle installations very well. And additionally you don’t have to maintain a bunch of configuration files and settings; and finally most CQ hotfixes (which are delivered without the QA) are only tested with default installations.

So when you start with your project and you have a pristine CQ installation, the performance of both publishing and authoring instances are usually very good, the UI is responsive, page load times in the 2-digit miliseconds. Great. Excellent.

When your site grows, when the content authors start their work, you need to do your first performance and stress tests using numbers provided by the requirements (“the site must be able to handle 10000 concurrent requests per second with a maximal response time of 2 seconds”). You either can overcome such requirements by throwing hardware on the problem (“we must use 6 publishers each on a 4-core machine”) or you just try to optimize your site. Okay, let’s try it with optimization first.

Caching is a thing which comes to mind first. You can cache on several layers of the application, be it application level (caches builtin into the application, like the outputcache of CQ 3 and 4), the dispatcher cache (as described here in this blog), or on the users system (using the browser cache). Each cache layer should decrease the number of requests in the remaining caches, so that in the end only the requests get through, which cannot be handled in a cache, but must be processed in CQ. Our goal is to move the files into a cache which is nearest to the enduser; then loading of these files is faster than if the load is performed from a location which is 20 000 kilometers away.

(A system engineer may also be interested in that solution, because it will offload data traffic from the internet connection. Leaves more capacity for other interesting things …)

If you start from scratch with performance tuning, grasping for the low-hanging fruits is the way to go. So you start into an iterative process, which contains of the following steps:

Identify requests which can be handled by a caching layer which is placed nearer to the enduser.
Identify actions, which allows to cache these requests in a cache next to the user.
Perform these actions
Measure the results using appropriate tools
Start over from (1)

(For a more broader view to performance tuning, see David Nueschelers post on the Day developer site)

As an example I will go through this cycle on the authoring system. I start with a random look at the request.log, which may look like this:

09/Oct/2009:09:08:03 +0200 [8] -> GET /libs/wcm/content/welcome.html HTTP/1.1
09/Oct/2009:09:08:06 +0200 [8] <- 200 text/html; charset=utf-8 3016ms
09/Oct/2009:09:08:12 +0200 [9] -> GET / HTTP/1.1
09/Oct/2009:09:08:12 +0200 [9] <- 302 - 29ms
09/Oct/2009:09:08:12 +0200 [10] -> GET /index.html HTTP/1.1
09/Oct/2009:09:08:12 +0200 [10] <- 302 - 2ms
09/Oct/2009:09:08:12 +0200 [11] -> GET /libs/wcm/content/welcome.html HTTP/1.1
09/Oct/2009:09:08:13 +0200 [11] <- 200 text/html; charset=utf-8 826ms
09/Oct/2009:09:08:13 +0200 [12] -> GET /libs/wcm/welcome/resources/welcome.css HTTP/1.1
09/Oct/2009:09:08:13 +0200 [12] <- 200 text/css 4ms
09/Oct/2009:09:08:13 +0200 [13] -> GET /libs/wcm/welcome/resources/ico_siteadmin.png HTTP/1.1
09/Oct/2009:09:08:13 +0200 [14] -> GET /libs/wcm/welcome/resources/ico_misc.png HTTP/1.1
09/Oct/2009:09:08:13 +0200 [15] -> GET /libs/wcm/welcome/resources/ico_useradmin.png HTTP/1.1
09/Oct/2009:09:08:13 +0200 [15] <- 200 image/png 8ms
09/Oct/2009:09:08:13 +0200 [16] -> GET /libs/wcm/welcome/resources/ico_damadmin.png HTTP/1.1
09/Oct/2009:09:08:13 +0200 [16] <- 200 image/png 5ms
09/Oct/2009:09:08:13 +0200 [13] <- 200 image/png 17ms
09/Oct/2009:09:08:13 +0200 [14] <- 200 image/png 17ms
09/Oct/2009:09:08:13 +0200 [17] -> GET /libs/wcm/welcome/resources/welcome_bground.gif HTTP/1.1
09/Oct/2009:09:08:13 +0200 [17] <- 200 image/gif 3ms

Ok, it looks like that some of such requests must not be handled by CQ: the PNG files and the CSS files. These files usually never change (or at least change very seldom, maybe on a deployment or when a hotfix is deployed). But for the usual daily work of an content author they can be assumed to be static, but we must of course provide a way that we enable the authors to fetch a new one, when an update to one them occurs. Ok, that was step 1: We want to cache the PNG and the CSS files which are placed below /libs.

Step 2: How can we cache these files? We don’t want to cache them within CQ (that wouldn’t bring any improvement), so remains dispatcher and browser cache. In this case I recommend to cache them in the browser cache for 2 reasons:

These files are requested more than once during a typical authoring session, so it makes sense to cache directly in the browser cache.
Latency of the browser cache is ways lower than the latency of any load from the network.

As an additional restriction which speaks against the dispatcher:

There are no flusing agents for authoring mode, so we cannot use the dispatcher that easily. So in the case of tuning an authoring instance we cannot use the dispatcher cache.

And to make any changes to these files made on the server visible to the user, we can use the expiration feature of HTTP. This allows us to specify a time-to-live, which basically tells any interested party, how long we consider this file up-to-date. When this time is reached, every party, which cached it, should remove it from cache and refetch.
This isn’t the perfect solution, because a browser will drop the file from its cache and refetch it from time to time, although the file is still valid and up-to-date.
But there’s still an improvement, if the browser fetches this files every hour instead of twice a minute (when a page load occurs).

Our prognose is, that the browser of an authoring user won’t perform that much requests on files anymore; this will increase the rendering performance of the page (the files are fetched from the fast browsercache instead from the server), and additionally the load on the CQ will decrease, because it doesn’t need to handle that much requests. Good for all parties.

Step 3: We implement this feature in the apache webserver, which we have placed in front of our CQ authoring system and add the following statements:

<LocationMatch /libs>
ExpiresByType image/png "access plus 1 hour"
ExpiresByType text/css "access plus 1 hour"
</LocationMatch>

Instead of relying on file extensions we specify here the expiration by the MIME-type in these rules. The files are considered to be up-to-date for an hour, so the browser will reload these files every hour. This value should be ok also in case these files are changed once. And if everything fails, the authoring users can drop their browser cache.

Step 4: We measure the effect of our changes using 2 different strategies: First we observe the request.log again and check if these requests appear further on. If the server is already heavy loaded, we can additionally check for a decreasing load and an improved response times for the remaining requests. As a second option we take a simple use case of an authoring user and run it with Firefox’ Firebug extension enabled. This plugin can visualize how and when the load of the parts of a page happen, and display the response times quite exactly. You should see now, that the number of files requested over the network has decreased and the load of a page and all its emnbedded objects is faster than before.

So with an quick and easy-to-perform action you have decreased the page load times. When I added expiration headers to a number of static images, javascripts and css files on a publishing instance, the number of requests which went over the wire went down to 50%, the pageload times also decreased, so that even during a stress test the site still had a good performance. Of course, dynamic parts must be handled by their respective systems, but if we can offload requests from CQ, we should do this.

So as a conclusion: Some very basic changes to the system (some configuration adjustments to the apache config) may increase the speed of your site (publishing and authoring) dramatically. Such changes as described are not invasive to the system and are highly adjustible to the specific needs and requirements of your application.

One thought on “Basic performance tuning: Caching”

Chris Pilsworth says:

November 17, 2009 at 18:00

Hi Jörg,

The technique you describe is one we have used with great success on some of our projects. However, we use longer expiration times and use the CQ4 Mapper to rewrite the URLs when new versions of static application resources such as CSS, JS, PNG are published.

I have written a little bit more about it on my blog:

http://cqstuff.blogspot.com/2009/11/improving-day-cq-performance-using.html

Regards,

Chris

Comments are closed.

Share this:

Published by Jörg

One thought on “Basic performance tuning: Caching”