Is CRX 1.4.2 production ready?

The contentbus technology was the standard storage backend up to the CQ 4.x series. Although file-based storage wasn’t state of the art even in the late 1990s (MySQL had already been invented, PostgreSQL existed, plus at least half a dozen enterprise database systems), Day chose to store the content objects in individual files, hidden behind an abstraction layer. Of course it took some time of tuning and gaining experience, but the contentbus proved to be a reliable storage with one big advantage: with an editor on the filesystem you can solve nearly all problems (more than once we used sed to fix our default.map).
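
Just for illustration, such a fix was typically a one-liner like this (stop CQ first; the handle paths here are made up, and the exact expression of course depended on the actual corruption):

    # replace a broken mapping entry in default.map (paths are hypothetical)
    sed -i.bak 's|/content/old-handle|/content/new-handle|' default.map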

But some points were still open:

  • Online backup isn’t possible. The documentation simply states: “Shut down CQ, copy your files, and start it up again.” You can speed up the copy by replacing it with a snapshot on the filesystem layer (see the sketch after this list), but the need to restart doesn’t make it enterprise-ready. (Databases have offered online backup for at least a decade.)
  • The contentbus requires a lot of filesystem I/O (mainly the open and close system calls), and the sheer number of these operations slows down processing. A small number of larger files would reduce this administrative overhead in the filesystem.
  • Memory usage of contentbus artifacts: some artifacts like the default.map and zombie.map have in-memory data structures which grow as the underlying files grow (or vice versa). The more content you have, the more memory is used, even if only a small part of that content is in active use. This doesn’t scale well.
  • The contentbus offers cluster support, but only with 2 nodes; with more nodes the overall performance actually degrades! According to the cluster documentation for CRX 1.4, Day tested CRX in a clustered setup with 6 nodes. If the performance loss is acceptable (that is, 6 nodes offer more performance than 5 nodes), this would be a really good solution to scale your authoring systems.
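
For the backup point above, a minimal sketch of the snapshot variant, assuming the content lives on an LVM volume (volume names, mount point and init script are hypothetical):

    #!/bin/sh
    # downtime is reduced to the few seconds the snapshot takes
    /etc/init.d/cq42 stop
    lvcreate --snapshot --size 5G --name cq-snap /dev/vg0/cq-data
    /etc/init.d/cq42 start
    # copy the consistent state from the snapshot while CQ is already running again
    mount -o ro /dev/vg0/cq-snap /mnt/cq-snap
    rsync -a /mnt/cq-snap/ /backup/cq-$(date +%Y%m%d)/
    umount /mnt/cq-snap
    lvremove -f /dev/vg0/cq-snap

The restart is still there, which is exactly the problem; but at least the offline window no longer depends on the size of your content.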

So we decided it’s time to evaluate whether CRX is at least as good as the contentbus. The TAR persistence manager mainly addresses the backup issue; we hope to get some performance improvements as well.

So currently I’m doing a test setup of CQ 4.2.0 and CRX 1.4.2, for which Day offered (just in time :-)) a knowledge base article.

Everything is content

This is the philosophy behind Communique: everything is content. Templates are content, user data are content, user ACLs are content and — of course — content is content. Interestingly, the compiled JSPs are also content, so you can remove them easily by installing a single package and force the JVM to recompile them.

If all these elements can be handled the same way, you can use a single mechanism to install, update and also remove them. The CQ package tool is a great way to deploy new template code (Java code) plus the additional ESP files and other static snippets such as images. You can access parts of the system which are not reachable via the authoring GUI. But behind the scenes it looks quite different:

  • Content (the things you have Communique for) is versioned and can be added and removed online.
  • Code isn’t versioned. If the installation of a package were an atomic operation, one could easily switch between different template versions. Going back to an older template version would be quite easy: just undo the template installation and restore the version which was live before. Sadly that’s not possible; one solves this by cleaning all template locations and re-installing the older template version.
  • Day hotfixes and services: a weird construct. Because you cannot exchange them at runtime, these are extracted into a directory system/bin.new; on restart the content of system/bin is zipped into a file system/bin.$TIMESTAMP.zip and then the content of system/bin.new is copied over to system/bin. A stable mechanism (updates are performed before the CQ core and the services actually start), but it’s really hard to undo hotfixes. There’s no GUI for it; you need to find the right zip files and manually unzip them to system/bin, as sketched below.
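
Undoing a hotfix by hand then looks roughly like this (installation path and timestamp are examples; depending on how the archive was created you may need to adjust the unzip target):

    # stop CQ first, then restore the pre-hotfix state of system/bin
    cd /opt/cq42/server
    ls system/bin.*.zip            # pick the backup written when the hotfix was installed
    rm -rf system/bin
    unzip system/bin.20080501.zip -d system/bin
    # start CQ again afterwards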

Oh, another thing: older versions of a handle are not content. There is no way to create a package of handles including their history (i.e. the versions); only the most recent version of each handle is included.

Dispatcher caching and content structure

The dispatcher behaviour as described before has one major drawback which may have implications for the design of your content structure.

Invalidation is performed on a whole directory and all its subdirectories!

So if you invalidate a file A in a directory, file B in the same directory is also invalidated, as is file C in the subdirectory D. All these invalidated files need to be refetched from your CQ instance before the cache is effective again. This happens although you invalidated only file A: files B and C haven’t changed at all, but they are refetched! So a single invalidation can create a lot of load on your CQ instance(s).

But this may be a side effect you actually want: a change to handle A may also affect handles B and C (maybe you changed the title of handle A, which handles B and C need to render a correct navigation). This is the easiest way to invalidate all dependent handles when you change a single one.

So the point is to find a balance: avoid flushing your whole cache whenever you activate a single handle, while still making sure that all files which depend on the changed handle are marked invalid when an update happens.

And here comes the already mentioned parameter statfileslevel. Assuming a well-designed content structure, you set this parameter as high as possible to minimize the number of files which are invalidated along with a single file. On the other hand, you have to keep it low enough that all files which depend on the invalidated file are still invalidated.
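
In the dispatcher farm configuration this is a single entry in the /cache section; a sketch with example values (the docroot is hypothetical):

    /cache
      {
      /docroot "/var/www/html"
      # .stat files are maintained down to this depth below the docroot;
      # an invalidation only affects the subtree below the deepest matching .stat file
      /statfileslevel "2"
      }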

Knowing this you should arrange your content as follows:

  • Group your content and use hierarchies. This allows you to increase your statfileslevel.
  • Minimize the number of dependencies between your handles so you can keep the statfileslevel high. Keep dependent handles in the same directory and try to avoid dependencies on handles in your parent directory (or even higher), so you don’t need to decrease the statfileslevel. (A concrete example follows below.)
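
To make this concrete: with statfileslevel 2, activating /content/site/news/article touches the .stat file in /content/site/news, so only the documents cached below /content/site/news are refetched; everything below /content/site/press stays valid. With statfileslevel 0 the same activation would flush the entire docroot. (The paths are just an example.)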

Note: The parameter statfileslevel has global scope.