Managing repository growth

On almost every project there comes a time when, seemingly all of a sudden, the disk space assigned to the AEM instances fills up. In most (!) cases the situation is detected early enough that there is time to react, which often just means adding more disk capacity.

But after that the questions arise: Why is our repository so large? Why does it consume more than the initially estimated disk space? What content exactly is causing this problem? What went wrong that got us into this situation?

From my point of view there are 3 different views on this situation, each of which can provide an answer:

  • Disk space is not managed well, meaning that unnecessary stuff outside of the repository is consuming a lot of space.
  • The maintenance jobs are not executed.
  • The estimation for content and content growth was not realistic.

And in most cases you need to check all 3 of these views to answer the question “Why are we running out of space?”

Disk space is not managed well
This is an operations problem in the first place, because non-repository data is consuming disk space that was planned for the repository. Often seen:

  • Heapdumps and log files are not rotated or removed.
  • Manually created backup files were copied once to a folder next to the original, but in the meantime they have become useless because they are totally out of date.
  • The regular backup process creates temporary files which are not cleaned up; or the backup process itself consumes temporary disk space, which causes disk consumption to spike.

So this can be handled simply by working carefully and purging old data in time.
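To make the purging less dependent on someone remembering it, a small report of stale files can help. The following is a minimal sketch, not a finished tool: the function name `find_stale_files`, the file patterns, and the 30-day threshold are all assumptions chosen for illustration, and the script only lists candidates instead of deleting anything.

```python
import fnmatch
import time
from pathlib import Path


def find_stale_files(root, patterns=("*.hprof", "*.log.*", "*.zip"),
                     max_age_days=30, now=None):
    """Return (path, size_in_bytes) for old files matching the patterns.

    The patterns and the age threshold are illustrative assumptions;
    adjust them to whatever your instances actually leave behind.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if not any(fnmatch.fnmatch(path.name, p) for p in patterns):
            continue
        stat = path.stat()
        # Only report files that have not been modified since the cutoff.
        if stat.st_mtime < cutoff:
            hits.append((str(path), stat.st_size))
    # Largest files first, because they free the most space.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Pointing it at the directory that holds the instance (for example `find_stale_files("/path/to/instance")`, where the path is a placeholder) gives a list that can be reviewed before anything is removed.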

The maintenance jobs are not executed
Maintenance jobs are an essential part of the ongoing work of removing unnecessary fat from your AEM instance, be it on the content level or on the repository level. They include

  • workflow purge
  • audit log purge
  • repository compaction (if you use TarMK)
  • datastore GC (if you use a datastore)

You should always keep an eye on these; the Maintenance Dashboard is a great help here. But do not rely on it blindly!

Your estimation for content and content growth was not realistic
That’s a common problem; you have to give an initial hardware sizing, which also includes the amount of disk space used by the repository. You do your best to include all relevant parameters, and you add some buffer on top. But that is an estimation made at the beginning of the project, when you don’t know all the requirements and their impact on disk consumption in detail. Still, those are the numbers you committed to, and changing them afterwards is always problematic.

Or AEM is used differently than initially anticipated, and the assumptions on which you based your initial hardware sizing are no longer true. Or you just forgot to add the versioning of the assets to your calculation. Or…
There are a lot of cases where, in retrospect, the initial sizing of the required disk space was just incorrect. In that case you have only one option: Redo the calculation right now! Take your new insights and create a new hardware sizing. And then implement it and add more hardware.

And the only way I see to avoid such situations is: Do not make estimations for the next 3 years! Be agile and review your current numbers every 3 months; during this review you can also determine the needs for the next months and plan accordingly. Of course this assumes that you are flexible in terms of disk sizing, so for any non-trivial setup the use of SAN as storage technology (plus a volume manager) is my preferred choice!
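For such a regular review, a rough projection from a few measurements is often enough. The sketch below fits a straight line through recorded disk usage and estimates how many days remain until the volume is full. It assumes roughly linear growth, which is a simplification; the function name `days_until_full` and the sample numbers are made up for illustration.

```python
def days_until_full(samples, capacity_gb):
    """Estimate the days left until the disk is full.

    samples: list of (day_number, used_gb) measurements, oldest first.
    Returns None if usage is flat or shrinking.
    Assumes linear growth, which is only a rough approximation.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_day = sum(day for day, _ in samples) / n
    mean_used = sum(used for _, used in samples) / n
    spread = sum((day - mean_day) ** 2 for day, _ in samples)
    if spread == 0:
        raise ValueError("measurements must be taken on different days")
    # Least-squares slope: average growth in GB per day.
    growth_per_day = sum(
        (day - mean_day) * (used - mean_used) for day, used in samples
    ) / spread
    if growth_per_day <= 0:
        return None
    _, last_used = samples[-1]
    return (capacity_gb - last_used) / growth_per_day


# Hypothetical measurements taken every 30 days, on a 400 GB volume.
history = [(0, 100), (30, 130), (60, 160), (90, 190)]
print(days_until_full(history, capacity_gb=400))  # prints 210.0
```

If the result is shorter than the time you need to get more storage provisioned, that is the signal to act now rather than at the next review.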

Of course this does not free you from cleaning up the systems and running all required maintenance jobs. But it makes reviewing the used disk space a regular task, so you should see deviations from your anticipated plan much earlier.