(Note: This post is not about getting content from environment A to B or from your AEM 6.5 to AEM CS.)
The requirements for content and component structure evolve over time; the components you initially started with might not be sufficient anymore. For that reason the components will evolve: they need new properties, or components need to be added, removed or merged, and that must be reflected in the content as well. This is possible to do manually, but it takes too much work and is too error-prone. Automation to the rescue.
I have already come across a few of these “automated content migrations”, and I have found a few patterns which don’t work. But before I start with them, let me briefly cover the one pattern that works very well.
The working approach
The only working approach is a workflow, which is invoked on small-ish subtrees of your content. It silently skips content which does not need to be migrated, and reports every piece of content it has migrated. It might even have a dry-run mode, which just reports everything it would change. This approach has a few advantages:
- It is invoked intentionally on author only, and only operates on a single, well-defined subtree of content. It logs all changes it makes.
- It does not automatically activate every change it has made, but requires activation as a dedicated second step. This allows you to validate the changes first and activate them only afterwards.
- If it fails, it can be invoked repeatedly on the same content, and continue from where it left off.
- It’s a workflow, with the guarantees of a workflow. It cannot time out as a request can, but will complete eventually. You can either log the migration output or store it as dedicated content/node/binary data somewhere. You know when a subtree is migrated and you can prove that it’s completed.
Of course this is not something you can simply do on the side; it requires some planning in the design, the coding and the execution of the content migration.
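To make the properties listed above a bit more concrete, here is a minimal sketch of the core logic such a workflow step could wrap. It is not AEM code: `ContentNode` is a hypothetical in-memory stand-in for repository content, and the property names `headline` and `title` are invented for illustration. What it tries to show is the shape of the migration: it walks one subtree, skips nodes that are already fine, reports every change, supports a dry run, and can be run again on the same subtree without doing harm.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a content node (assumption, not an AEM API).
final class ContentNode {
    final String path;
    final Map<String, String> properties = new LinkedHashMap<>();
    final List<ContentNode> children = new ArrayList<>();

    ContentNode(String path) {
        this.path = path;
    }

    // Creates a child node below this node and returns it.
    ContentNode child(String name) {
        ContentNode c = new ContentNode(path + "/" + name);
        children.add(c);
        return c;
    }
}

final class SubtreeMigration {
    // Invented property names, for illustration only.
    static final String OLD_PROP = "headline";
    static final String NEW_PROP = "title";

    // Walks the subtree below root and returns one report line per node
    // that was migrated (or, in dry-run mode, would be migrated).
    static List<String> migrate(ContentNode root, boolean dryRun) {
        List<String> report = new ArrayList<>();
        visit(root, dryRun, report);
        return report;
    }

    private static void visit(ContentNode node, boolean dryRun, List<String> report) {
        // Nodes without the old property are skipped silently.
        if (node.properties.containsKey(OLD_PROP)) {
            report.add((dryRun ? "WOULD MIGRATE " : "MIGRATED ") + node.path);
            if (!dryRun) {
                String value = node.properties.remove(OLD_PROP);
                // If the new property already exists, keep it and only drop the old one.
                node.properties.putIfAbsent(NEW_PROP, value);
            }
        }
        for (ContentNode child : node.children) {
            visit(child, dryRun, report);
        }
    }
}
```

Because a migrated node no longer carries the old property, a second run over the same subtree returns an empty report. That is what makes re-invocation after a failure safe, and an empty dry-run report can serve as the proof that a subtree is done. Activation is deliberately not part of this sketch; it stays a separate step.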
Now, let’s face the few things which don’t work.
Non-working approach 1: Changing content on the fly
I have seen page rendering code which tries to modify the content it is operating on, removing old properties and adding new properties, either with default values or other values.
This approach can work, but only if the user has write permissions on the content. The migration happens the first time the rendering is initiated with write permissions (normally by a regular editor on the authoring system), and it will fail in every other situation (e.g. on publish, if the same conditions exist there as well). And you will have a non-cool mix of page rendering and content-fixup code in your components.
This is a very optimistic approach, over which you don’t have any control, and for that reason you probably can never remove that fixup code, because you never know if all content has already been changed.
Non-working approach 2: Let’s do it on startup
Admittedly, I have seen this only once. But it was a weird thing, because a migration OSGi service was created which executed the content migration in its activate() method. We came across it because this activate() delayed the entire startup so much that our automation ran into a timeout, because we don’t expect the startup of an AEM instance to take 30+ minutes.
That is also its biggest problem, and it makes this approach unusable: you don’t have any control over this process, it can be problematic in the case of clustered repositories (in AEM CS authoring), and even if the migration has already been completed, the check whether there’s something to do can take quite long.
But hey, when you have it already implemented as a service, it’s quite easy to migrate it to a workflow and then use the above recommended approach.
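A rough sketch of what that move could look like, again in plain Java without any AEM or OSGi dependencies. All names here (`LegacyMigrator`, `MigrationStep`, `migrateSubtree`) are hypothetical; in a real project the step would be a custom workflow process, and the payload path would come from the workflow payload. The point is only the restructuring: activation does nothing with content anymore, and the migration runs when somebody explicitly invokes it on a path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical service holding the migration logic that used to run in activate().
final class LegacyMigrator {
    // Records which subtrees were migrated; a real implementation would change content instead.
    final List<String> migratedRoots = new ArrayList<>();

    // Stand-in for the component activation: cheap, and it no longer touches content.
    void activate() {
        // intentionally empty
    }

    // The former activate() body, now callable on demand for a single subtree.
    String migrateSubtree(String rootPath) {
        migratedRoots.add(rootPath);
        return "migrated " + rootPath;
    }
}

// Hypothetical stand-in for a workflow step that is started on a payload path.
final class MigrationStep {
    private final LegacyMigrator migrator;

    MigrationStep(LegacyMigrator migrator) {
        this.migrator = migrator;
    }

    // Runs the migration for exactly the subtree the step was invoked on.
    String execute(String payloadPath) {
        // Assumption for this sketch: only paths below /content are valid migration roots.
        if (payloadPath == null || !payloadPath.startsWith("/content/")) {
            throw new IllegalArgumentException("Not a content subtree: " + payloadPath);
        }
        return migrator.migrateSubtree(payloadPath);
    }
}
```

With this split the startup time no longer depends on the amount of content, and the decision of when and where to migrate is back in your hands.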
Let me know if you have found other cases of working or non-working approaches for content migration; but in my experience it’s always the best way to make this an explicit task, which can be planned, managed and properly executed. Everything else can work sometimes, but definitely with a less predictable outcome.




