AEM 6.0 and Apache Oak: What has changed?

One of the key technical features of AEM 6.0 is the use of Apache Oak as a much more scalable repository. It supports the full semantics of JCR 2.0, so all CQ 5.x applications should continue to work. And as an extension of this feature there is, of course, MongoDB, which you can use together with Oak.

But, as with every major reimplementation, some things have changed. Things that worked well on Jackrabbit 2.x and CRX 2.x might behave differently. Or to put it in other words: Jackrabbit 2.x allowed you to do some things which are not mandated by the JCR 2.0 specification.

One of the most prominent examples of this is the visibility of changed nodes. In CRX 2.x, when you have an open JCR session A and some nodes are changed in a different session B, you will see these changes immediately in session A. That’s not mandated by the specification, but Jackrabbit supports it.

Oak introduced the concept of MVCC (multi-version concurrency control), which means that each session only sees a view of the repository that was the most recent one at the time the session was created; it is not updated on the fly with the changes performed by other sessions. So this is a static view. If you want to get the most recent view of the repository, you need to explicitly call “session.refresh(keepChanges)”, where the boolean argument decides whether your own unsaved changes are kept.
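
To make the difference more tangible, here is a small toy model in plain Java. It is only a sketch of the snapshot behaviour described above and not how Oak is implemented; the classes ToyRepository and ToySession are made up for this illustration and are not part of the JCR or Oak API. Only the refresh(boolean keepChanges) method is modelled on the signature of the real javax.jcr.Session.refresh().

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of snapshot-style session views. NOT Oak's implementation;
// ToyRepository and ToySession are hypothetical names for illustration only.
public class SnapshotSessionDemo {

    /** Stand-in for the repository: holds the latest committed state. */
    static final class ToyRepository {
        private Map<String, String> head = new HashMap<>();

        /** Returns a private copy of the current committed state. */
        synchronized Map<String, String> snapshot() {
            return new HashMap<>(head);
        }

        /** Publishes a set of changes as the new committed state. */
        synchronized void commit(Map<String, String> changes) {
            Map<String, String> next = new HashMap<>(head);
            next.putAll(changes);
            head = next;
        }

        ToySession login() {
            return new ToySession(this);
        }
    }

    /** Stand-in for a session: works on the snapshot taken at login. */
    static final class ToySession {
        private final ToyRepository repo;
        private Map<String, String> view;
        private final Map<String, String> pending = new HashMap<>();

        ToySession(ToyRepository repo) {
            this.repo = repo;
            this.view = repo.snapshot();
        }

        /** Checks own unsaved changes first, then the session's snapshot. */
        boolean nodeExists(String path) {
            return pending.containsKey(path) || view.containsKey(path);
        }

        void addNode(String path, String primaryType) {
            pending.put(path, primaryType);
        }

        /** Commits the pending changes and moves this session to the new state. */
        void save() {
            repo.commit(pending);
            pending.clear();
            view = repo.snapshot();
        }

        /** Same signature idea as javax.jcr.Session.refresh(boolean keepChanges). */
        void refresh(boolean keepChanges) {
            if (!keepChanges) {
                pending.clear();
            }
            view = repo.snapshot();
        }
    }

    public static void main(String[] args) {
        ToyRepository repo = new ToyRepository();
        ToySession a = repo.login(); // long-running session A
        ToySession b = repo.login(); // session B changes a node

        b.addNode("/content/page", "nt:unstructured");
        b.save();

        // A still works on the snapshot taken at login
        System.out.println("before refresh: " + a.nodeExists("/content/page"));
        a.refresh(false);
        // after the explicit refresh A sees the change made by B
        System.out.println("after refresh: " + a.nodeExists("/content/page"));
    }
}
```

In this model session A behaves the way a session does on Oak: the node created by B only becomes visible after the explicit refresh. On Jackrabbit 2.x the first check would already have seen it.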

So, what’s the effect of this?
You may run into subtle inconsistencies, because you don’t see changes performed by others in your session. In most cases only long-running sessions are really affected, because they are often meant to react to changes from the outside, for example changes made by other threads (e.g. checking whether a certain node has already been created by another session). So if you have already followed the best practices established in the last 1-2 years, you should be fine, as long-running sessions have been discouraged. I have also already shown how such a long-running session might affect performance when used in a service context.

Oak supports you with some more “features” to spot such problems more easily. First, it prints a warning to the log when a session has been open for more than 1 minute. You can check the log and review the use of these sessions. A session being open for more than 1 minute is normally a clear sign that something’s wrong and that you should think about creating sessions with a shorter lifespan. On the other hand, you can also imagine cases where keeping a session open for a longer time is the right solution. So you need to evaluate each warning carefully.
As a second “feature”, Oak is able to propagate changes between sessions if these changes are performed by a single thread (and only by a single thread).
But consider these features (especially the change propagation) as temporary features, which won’t be supported forever.

This is one of the interesting changes in Apache Oak compared to Jackrabbit 2.x; you can find some more in the Jackrabbit/Oak wiki. It’s really worth having a look at it when you start your first AEM 6.0 project.