The biggest change in AEM 6.0 compared to its prior versions is the use of Apache Oak as the repository implementation instead of Apache Jackrabbit 2.x. Although both implement the JCR 2.0 API (Oak not completely yet, but the "important" parts are there), there are a number of differences between them.
In the area of scalability the most notable change is the use of MVCC (multi-version concurrency control, a proven approach taken from the relational database world) in Oak. It decouples sessions from the global repository state and is the basis for the scalability of the repository. But it comes at a price: a session should be used by a single thread only. It is only a "should", because Oak detects when multiple threads access a single session and then serializes the access to it.
(For the record: the same recommendation already applied to Apache Jackrabbit 2.x, but the impact was never that high, mostly because Jackrabbit wasn't as scalable as Oak is now.)
This isn't a real limitation, but it requires careful application design. In the context of AEM it normally isn't a problem at all, because every incoming HTTP request uses a dedicated session of its own. While this is true for request processing, there is often functionality which doesn't follow this pattern.
I put an example of such a case on Github, including a recommended implementation and a discouraged implementation. The problem in the discouraged example is that the repository session (in the example hidden behind the resource resolver abstraction) is opened once at the startup of the service, by the thread which activates all services. Resources from that session are then handed out to every other thread that calls the getConfiguration() method. If every request makes this call, they all get synchronized at this point, which limits scalability.
In the recommended example this problem is avoided: each call to getConfiguration() opens a new session, reads the required resource and then closes the session. Here the session and its data are held completely inside one thread, and there's no need for synchronization anymore.
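The difference between the two variants can be sketched without any AEM dependency. The following is a minimal model and not the code from the Github example: ModelSession, SharedSessionService, PerCallSessionService and the path /etc/myconfig are hypothetical stand-ins for a JCR session (or resource resolver), the two service implementations and a configuration resource. The model only counts how often a session is read by a thread other than the one that opened it, which is the situation Oak has to serialize.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a JCR session; it only remembers which thread opened it. */
class ModelSession implements AutoCloseable {
    /** Counts reads made by a thread other than the one that opened the session. */
    static final AtomicInteger foreignThreadReads = new AtomicInteger();

    private final Thread owner = Thread.currentThread();

    String read(String path) {
        if (Thread.currentThread() != owner) {
            foreignThreadReads.incrementAndGet();
        }
        return "value-of:" + path;
    }

    @Override
    public void close() {
        // A real session would release its resources here.
    }
}

/** Discouraged: one session is opened at activation and shared by all callers. */
class SharedSessionService {
    private final ModelSession session = new ModelSession();

    String getConfiguration() {
        return session.read("/etc/myconfig"); // hypothetical path
    }
}

/** Recommended: each call opens, uses and closes its own session. */
class PerCallSessionService {
    String getConfiguration() {
        try (ModelSession session = new ModelSession()) {
            return session.read("/etc/myconfig"); // hypothetical path
        }
    }
}

class SessionPatternDemo {
    /** Runs the call on a separate thread and returns the number of foreign-thread reads. */
    static int foreignReadsFor(Runnable call) {
        ModelSession.foreignThreadReads.set(0);
        Thread worker = new Thread(call);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ModelSession.foreignThreadReads.get();
    }

    public static void main(String[] args) {
        SharedSessionService shared = new SharedSessionService(); // session opened by the main thread
        PerCallSessionService perCall = new PerCallSessionService();
        System.out.println("shared:   " + foreignReadsFor(shared::getConfiguration));
        System.out.println("per call: " + foreignReadsFor(perCall::getConfiguration));
    }
}
```

In the shared variant the worker thread reads from a session that the main thread opened; in the per-call variant the session never leaves the thread that created it. The trade-off is the cost of opening a session on every call.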
That's the theory part, but how can you easily detect whether you have this problem as well? The easiest way is to set the log level for the class org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate to DEBUG. Every time Oak detects that a session is used by multiple threads, it prints a stack trace to the log. If this happens on write access, it uses the WARN level; in case of reads, the DEBUG level.
23.02.2015 09:21:56.916 *WARN* [0:0:0:0:0:0:0:1 [1424679716845] GET /content/geometrixx/en/services.html HTTP/1.0] org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate Attempt to perform hasProperty while another thread is concurrently reading from session-494. Blocking until the other thread is finished using this session. Please review your code to avoid concurrent use of a session.
java.lang.Exception: Stack trace of concurrent access to session-494
at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:276)
at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:113)
at org.apache.jackrabbit.oak.jcr.session.NodeImpl.hasProperty(NodeImpl.java:812)
at org.apache.sling.jcr.resource.JcrPropertyMap.read(JcrPropertyMap.java:350)
...
If you want to have a scalable AEM application, you should watch carefully for these log messages and reduce the use of shared sessions wherever they appear.
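To get an overview of how often this happens, the log can be scanned for the first line of the stack trace shown in the excerpt above. This is only a rough sketch: the class name SessionWarningScan is made up, the log file path is assumed to be passed as the first argument, and the pattern relies on the message text staying exactly as it appears in the excerpt.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SessionWarningScan {
    // Matches the first line of the stack trace as it appears in the log excerpt.
    private static final Pattern MARKER =
            Pattern.compile("Stack trace of concurrent access to (session-\\d+)");

    /** Returns the number of reported concurrent accesses per session id. */
    static Map<String, Integer> countBySession(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = MARKER.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java SessionWarningScan <logfile>");
            return;
        }
        // Assumption: the log file is small enough to be read into memory at once.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        countBySession(lines).forEach(
                (session, count) -> System.out.println(session + ": " + count));
    }
}
```

A session id that shows up many times is a good candidate for a closer look at the stack traces logged with it.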




