About a year ago I wrote an improved version of the backup for CRX 2.1 and CRX 2.2. The approach was to reduce the amount of data that the online backup mechanism has to consider. With CRX 2.3 this approach can still be used, but an even better way is now available.
A feature of the online backup, the blocking and unblocking of the repository for write operations, is no longer reserved for the online backup mechanism itself; it can now also be reached via JMX.
With this mechanism you can prevent the repository from updating its disk structures. While the block is in place you can back up the whole repository, and then unblock it afterwards.
This allows you to create a backup mechanism like this:
1. Call the blockRepositoryWrites() method of the “com.adobe.granite (Repository)” MBean.
2. Take a filesystem snapshot of the volume where the CRX repository is stored.
3. Call unblockRepositoryWrites().
4. Mount the snapshot created in step 2.
5. Run your backup client on the mounted snapshot.
6. Unmount and delete the snapshot.
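The ordering of these steps, and the fact that the unblock has to happen even if the snapshot fails, can be sketched as follows. This is only an illustration: every function passed in (`block`, `unblock`, `take_snapshot`, `back_up`, `drop_snapshot`) is a hypothetical placeholder for whatever you use to reach JMX and your snapshot tooling, not part of CRX.

```python
import time
from typing import Callable


def snapshot_backup(block: Callable[[], None],
                    unblock: Callable[[], None],
                    take_snapshot: Callable[[], str],
                    back_up: Callable[[str], None],
                    drop_snapshot: Callable[[str], None]) -> float:
    """Run the backup steps in order; return how long writes were blocked (seconds)."""
    started = time.monotonic()
    block()                          # step 1: blockRepositoryWrites() via JMX
    try:
        snapshot = take_snapshot()   # step 2: filesystem snapshot
    finally:
        unblock()                    # step 3: always unblock, even if step 2 failed
        blocked_for = time.monotonic() - started
    try:
        back_up(snapshot)            # steps 4 and 5: mount and run the backup client
    finally:
        drop_snapshot(snapshot)      # step 6: unmount and delete the snapshot
    return blocked_for
```

The returned duration is worth logging, so that you notice when the blocked window starts to grow.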
And that’s it. Using a filesystem snapshot instead of the online backup speeds up the whole process, and the CQ5 application is affected only during steps 1 to 3, which is a very short timeframe (it depends on your system, but it should be done in less than a minute).
Some notes:
- I recommend snapshots here because they are much faster than a copy or rsync, but of course you can use those as well.
- While repository writes are blocked, every thread that wants to write to the repository is blocked as well; read operations continue to work. But every blocked write operation leaves you with one thread less, so in the end you might run into a situation where no threads are available any more.
- Currently the unblockRepositoryWrites() call can be made only via JMX and not via HTTP (to the Felix Console). That should be fixed in one of the next updates of CQ 5.5. Generally speaking, I would recommend using JMX directly rather than HTTP calls to the Felix Console via curl.
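Because blocked writers pile up while the block is held, it can help to put an upper bound on how long the block may last. Here is a minimal sketch of such a safety net, assuming you already have two callables (the hypothetical `block` and `unblock`) that trigger the JMX operations; the context manager and its `max_seconds` parameter are my own illustration and not something CRX provides.

```python
import threading
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def writes_blocked(block: Callable[[], None],
                   unblock: Callable[[], None],
                   max_seconds: float) -> Iterator[threading.Event]:
    """Block writes for the body of the `with`, but never longer than max_seconds.

    Yields an Event that is set if the watchdog had to unblock early.
    """
    lock = threading.Lock()
    done = False
    expired = threading.Event()

    def unblock_once() -> None:
        # Make sure unblock() runs exactly once, whoever gets here first.
        nonlocal done
        with lock:
            if not done:
                done = True
                unblock()

    def on_timeout() -> None:
        expired.set()
        unblock_once()

    watchdog = threading.Timer(max_seconds, on_timeout)
    watchdog.daemon = True
    block()
    watchdog.start()
    try:
        yield expired
    finally:
        watchdog.cancel()
        unblock_once()
```

If `expired.is_set()` is true after the `with` block, the repository was unblocked while your snapshot or copy was still running, so that snapshot should be discarded and taken again.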
Please note that this JMX method is intended for cases where you need to roll your own backup procedure. In most cases, the built-in backup will be the better choice.
If you feel the need to create your own backup, you should also note that blocking write access to the repository is intended only for very short periods of time (sub-second if possible). You should perform the majority of the backup without blocking write access and use the block only at the very end, in order to catch the latest writes.
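For a copy-based backup, this "bulk first, block only for the catch-up" idea can be sketched roughly like this. It is the same principle as running rsync twice, written in plain Python so that it is self-contained. The helper names `sync_tree` and `two_phase_backup` are made up for this example, `block` and `unblock` again stand for your JMX calls, and the sketch deliberately ignores files that were deleted between the two passes.

```python
import os
import shutil
from typing import Callable


def sync_tree(src: str, dst: str) -> int:
    """Copy files that are missing in dst or differ in size/mtime; return the count."""
    copied = 0
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(target_dir, name)
            s = os.stat(source)
            if os.path.exists(target):
                t = os.stat(target)
                if (t.st_size, t.st_mtime_ns) == (s.st_size, s.st_mtime_ns):
                    continue          # unchanged since the last pass
            shutil.copy2(source, target)  # copy2 keeps the modification time
            copied += 1
    return copied


def two_phase_backup(src: str, dst: str,
                     block: Callable[[], None],
                     unblock: Callable[[], None]) -> int:
    """Bulk copy unblocked, then a short blocked pass; return files copied in that pass."""
    sync_tree(src, dst)               # long pass, repository stays writable
    block()
    try:
        return sync_tree(src, dst)    # short pass, only what changed meanwhile
    finally:
        unblock()
```

The second pass only has to touch what changed during the first one, so the time the repository spends blocked depends on the write rate and not on the size of the repository.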