Writing backwards compatible software

At last week’s adaptTo() conference I discussed the topic of AEM content migration with a few attendees, and why it’s a tough topic. I learned that the majority of these ad-hoc migrations are done because they are mandated by changes in the components themselves, and therefore migrations are required to adhere to the new expectations of the component. My remark “can’t you write your components in a way that they are backwards compatible?” was not that well received … it seems that this is a hard topic for many.

And yes, writing backwards compatible components is not easy, because it comes with a few prerequisites:

  • The awareness that you are making a change which breaks compatibility with existing software and content. While breakages in “software” can be detected easily, the often very loose contract between code and content is much harder to enforce. With some experience in that area you will develop a feeling for it, but especially less experienced folks can make such changes inadvertently, and you will detect the problem way too late.
  • You need a strategy for how to handle such a situation. While the AEM WCM Core Components introduced a versioning model which seems to work quite nicely, an existing codebase might not be prepared for this. It forces more structure and thought into how you design your codebase, especially when it comes to Sling Models and OSGi services, and where to put logic so you don’t duplicate it.
  • And even if you are prepared for this situation, it’s not free: you will end up with new versions of components which you need to maintain. Just breaking compatibility is much easier, because then you still have just 1 component.

So I totally get it if you don’t care about backwards compatibility at all, because in the end you are the only consumer of your code, and you can control everything. You are not a product developer, for whom backwards compatibility needs to have a much higher priority.

But backwards compatibility gives you one massive benefit, which I consider quite important: It gives you the flexibility to perform a migration at a time which is a good fit. You don’t need to perform this migration before, in the midst of, or immediately after a deployment. You deploy the necessary code, and then migrate the content when it’s convenient. And if that migration date is pushed further out for whatever reason, it’s not a problem at all, because backwards compatibility allows you to decouple the technical aspect (the deployment) from the actual execution of the content migration. And for that you don’t need to re-scope deployments and schedules.

So maybe this is just me with the hat of a product developer, who is so focused on backwards compatibility. And maybe in the wild the cost of backwards compatibility is much higher than the flexibility it allows. I don’t know. Leave me a comment if you want to share your opinion.

This was 2024

Wow, another year has passed. Time for a recap.

My personal goal for 2024 in this blog was to post more often and more consistently, and I think that I was successful at that. If I counted correctly, it was 20 posts in 2024. The consistency of the intervals could be better (a few just days apart, others multiple weeks), but unlike in some other years I never really felt that I was lagging way behind. So I am quite happy with it and will try to do the same in 2025.

This year I adopted 2 ideas from other blogs:

  • A blog post series which is planned as such. In January and February I posted 5 posts on Modeling Performance Tests (starting here). This approach worked quite well, mostly because I spent enough time to write them before I made the first post public. If I know upfront that a topic is large enough, I will continue with this type.
  • The “top N things …” type of post. I don’t particularly like this type of posting, because very often they just scream for attention and clicks without adding much value. I used that approach 2 times (The new AEM CS feature in 2024 I love most and My top 3 reasons why page rendering is slow), and then mostly to share links to other pages. It can work that way, but it will never be my favorite type of blog post.

The most successful blog post of 2024: As I did not add any page analytics to this page (I would need a cookie banner then), I have only some basic statistics from WordPress. The top 3 requested pages besides the start page in 2024 were:

  1. CQ development patterns – Sling ResourceResolver and JCR Sessions (written in 2013)
  2. Do not use AEM as proxy for backend calls (of 2024)
  3. How to analyze “Authentication support missing” (of 2023)

Interesting that a 10-year-old article was requested most often. Also, WordPress showed me that LinkedIn was a significant source of traffic, so I should probably continue to announce blog posts there. (If you think I should also do announcements elsewhere, let me know.)

And just today I saw the latest video from Tad Reeves, where he mentioned my article on performance testing in AEM CS. Thank you Tad, I really appreciate your feedback and the recognition!

That’s it for 2024! I wish you all a relaxing break and a successful year 2025!

Sling model performance (part 4)

I think it’s time for another chapter on the topic of Sling Model performance, just to document some interesting findings I recently made in the context of a customer project. If you haven’t read them, I recommend you check out the first 3 parts of this series here:

In this blog post I want to show the impact of inheritance in combination with Sling Models.

Sling Models are simple Java POJOs, and for that reason all features of Java can be used safely. I have seen many projects where these POJOs inherit from a more or less sophisticated class hierarchy, which often reflects the component hierarchy. These parent classes also often consolidate generic functionality used in many or all Sling Models.

For example, many Sling Models need to know the site-root page, because from there they build links and the navigation, read global properties, etc. For that reason I have seen code like this in many parent classes:

public class AbstractModel {

  Page siteRoot;

  public void init() {
    siteRoot = getSiteRoot(); // potentially expensive lookup
    // and many more initializations
  }
}

And then this is used like this by a Sling Model called ComponentModel:

public class ComponentModel extends AbstractModel {

  @PostConstruct
  public void init() {
    super.init();
  }
  ...
}

That’s all straightforward and good. But only until 10 other Sling Models also inherit from the AbstractModel, and all of them also invoke the getSiteRoot() method, which in all cases returns a page object representing the same resource in the repository. Feels redundant, and it is. And it’s especially redundant if a model invokes the init() method of its parent and does not really need all of the values calculated there.

While in this case the overhead is probably small, I have seen cases where the removal of this redundant code brought down the rendering time from 15 seconds to less than 1 second! That’s significant!

For this reason I want to make some recommendations on how you can speed up your Sling Models when you use inheritance.

  • If you want or need to use inheritance, make sure that the parent class has a small and fast init method, so that it does not add too much overhead to each construction of a Sling Model.
  • I love Java lambdas in this case, because you can pass them around and only invoke them when you really need their value. That’s ideal for lazy evaluation (see the sketch after this list).
  • And if you need a calculated value more than once, store it for later reuse.
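
To illustrate the last two points, here is a minimal sketch of a lambda-based, memoizing lazy value; the LazyValue helper class and the computeSiteRoot() method are made up for this example:

import java.util.function.Supplier;

import com.day.cq.wcm.api.Page;

// A tiny memoizing wrapper: the delegate is only invoked on first access,
// and the result is stored for later reuse.
class LazyValue<T> implements Supplier<T> {
  private final Supplier<T> delegate;
  private T value;

  LazyValue(Supplier<T> delegate) {
    this.delegate = delegate;
  }

  @Override
  public T get() {
    if (value == null) {
      value = delegate.get();
    }
    return value;
  }
}

public abstract class AbstractModel {

  // Nothing is computed here; the expensive lookup only runs if and when
  // a subclass actually asks for the site root, and then only once.
  private final Supplier<Page> siteRoot = new LazyValue<>(this::computeSiteRoot);

  protected Page getSiteRoot() {
    return siteRoot.get();
  }

  // the actual (expensive) lookup, implemented as before
  protected abstract Page computeSiteRoot();
}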

Java interfaces, OSGi and package versions

TL;DR: Be cautious when implementing interfaces provided by libraries; you can get problems when these libraries are updated. Check for the @ProviderType and @ConsumerType annotations on the Java interfaces you are using to make sure that you don’t limit yourself to a specific version of a package, as sooner or later this will cause problems.

One of the principles of object-oriented programming is encapsulation, which hides all implementation details. Java uses interfaces as a language feature to implement this principle.

OSGi uses a similar approach to implement services. An OSGi service offers its public API via a Java interface. This Java interface is exported and therefore visible to your Java code. And then you can use it as it is taught in every AEM (and modern OSGi) class, like this:

@Reference
UserNotificationService service;

With the magic of Declarative Services a reference to an implementation of UserNotificationService is injected and you are ready to use it.

But if that interface is visible, and with the power of Java at hand, you can create an implementation of that interface on your own:

public class MyUserNotificationService implements UserNotificationService {
...
}

Yes, this is possible and nothing prevents you from doing it. But …

Unlike plain object-oriented programming, OSGi has some higher aspirations. It focuses on modular software: dedicated bundles which can have an independent lifecycle. You should be able to extend functionality in a bundle without all code in other bundles needing to be recompiled. So binary compatibility is important.

Assume that the framework you are using comes with the UserNotificationService, which looks like this:

package org.framework.user;
public interface UserNotificationService {
  void notifyUserViaPopup (User user, NotificationContent notification);
}

Now you decide to implement this interface in your own codebase (hey, it’s public and Java does not prevent me from doing it) and start using it in your codebase:

public class MyUserNotificationService implements UserNotificationService {
  @Override
  public void notifyUserViaPopup (User user, NotificationContent notification) {
    ...
  }
}

All is working fine. But then the framework is adjusted and now the UserNotificationService looks like this:

package org.framework.user;
public interface UserNotificationService { // version 1.1
  void notifyUserViaPopup (User user, NotificationContent notification);
  void notifyUserViaEMail (User user, NotificationContent notification);
}

Now you have a problem, because MyUserNotificationService is no longer compatible with the UserNotificationService (version 1.1): MyUserNotificationService does not implement the method notifyUserViaEMail. Most likely you can’t load your class anymore, triggering interesting exceptions. You would need to adjust MyUserNotificationService and implement the missing method to make it run again, even if you never need the notifyUserViaEMail functionality.

So we have 2 problems with that approach:

  1. It will only be detected at runtime, which is too late.
  2. You should not be required to adapt your code to changes in the code of someone else, especially if it is just an extension of the API you are not interested in at all.

OSGi has a solution for (1), but only some helpers for (2). Let’s check the solution for (1) first.

Package versions and references

OSGi has the notion of “package versions”, and it’s best practice to provide version numbers for API packages. That means you start with a version “1.0” and people start to use it (using service references). When you make a compatible change (like in the example above, where a new method is added to the service interface), you increase the package version by a minor version to 1.1, and all existing users can still reference this service, even if their code was never compiled against version 1.1 of the UserNotificationService. This is a backwards-compatible change. If you are making a backwards-incompatible change (e.g. removing a method from the service interface), you have to increase the major version to 2.0.
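
In practice such a package version is typically declared in a package-info.java file of the exported API package; a minimal sketch for the example above (the concrete version number is of course just illustrative):

// package-info.java of the exported API package
@org.osgi.annotation.versioning.Version("1.1.0")
package org.framework.user;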

When you build your code with the bnd-maven-plugin (or the maven-bundle-plugin), the plugin will automatically calculate the import range based on these versions and store that information in target/classes/META-INF/MANIFEST.MF. If you just reference services, the import range can be wide, like this:

org.framework.user;version="[1.0,2)"

which translates to: this bundle has a dependency on the package org.framework.user with a version equal to or higher than 1.0, but lower than (excluding) 2. That means that a bundle with this import statement will resolve with package org.framework.user 1.1. If your OSGi environment only exports org.framework.user in version 2.0, your bundle will not resolve.

(Much more could be written about this aspect, and I simplified a lot here. But the above part is the important one when you are working with AEM as a consumer of the APIs provided to you.)

Package versions and implementing interfaces

The situation gets tricky when you are implementing exported interfaces, because that will lock you to a specific version of the package. If you implement the MyUserNotificationService as listed above, the plugins will calculate the import range like this:

org.framework.user;version="[1.0,1.1)"

This basically locks you to that specific version 1.0 of the package. While it does not prevent changes to implementations of the UserNotificationService in your framework libraries, it does prevent any change to its API. And not only for the UserNotificationService, but also for all other classes in the org.framework.user package.

But sometimes the framework requires you to implement interfaces, and these interfaces are “guaranteed” by its developers not to change. In that case the above behavior does not make sense, as a change to a different class in the same package would not break any binary compatibility for these “you need to implement this interface” classes.

To handle this situation, OSGi introduced 2 Java annotations which can be added to such interfaces and which clearly express the intent of the developers. They also influence the import range calculation.

  • The @ProviderType annotation: This annotation expresses that the developer does not want you to implement this interface. The interface is purely meant to reference existing functionality (most likely provided by the same bundle as the API); if you implement such an interface, the plugin will calculate a narrow import range.
  • The @ConsumerType annotation: This annotation shows the intention of the developer of the library that this interface can be implemented by other parties as well. Even if the library ships an implementation of that service on its own (so you can @Reference it), you are free to implement this interface on your own and register it as a service. If you implement an interface with this annotation, the version import range will be wide. (See the sketch after this list for how both annotations look in code.)
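
As a sketch of what both annotations look like in code (the UserNotificationListener interface is made up for this example, and in a real codebase each interface would of course live in its own file):

package org.framework.user;

import org.osgi.annotation.versioning.ConsumerType;
import org.osgi.annotation.versioning.ProviderType;

// only meant to be referenced, never implemented by consumers
@ProviderType
public interface UserNotificationService {
  void notifyUserViaPopup (User user, NotificationContent notification);
}

// explicitly designed to be implemented by other bundles as well
@ConsumerType
public interface UserNotificationListener {
  void onUserNotified (User user, NotificationContent notification);
}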

In the end, your goal should be not to have a narrow import version range for any library. You should allow your friendly framework developers (and AEM) to extend existing interfaces without breaking binary compatibility. And that also means that you should not implement interfaces you are not supposed to implement.

Performance tests modelling (part 2)

This is the second blog post in the series about performance test modelling. You can find the overview of this series and links to all its articles in the post “Performance tests modelling (part 1)”.

In this blog post I want to cover the aspect of “concurrent users”, what it means in the context of a performance test and why it’s important to clearly understand its impact.

Concurrent users is an often-used measure to indicate the load put on a system: how many users are using that system at the same time. For that reason many performance tests state as a quantitative requirement: “The system should be able to handle 200 concurrent users”. While that seems to be a good definition at first sight, it leaves many questions open:

  • What does “concurrent” mean?
  • And what does “user” mean?
  • Are “200 concurrent users” enough?
  • Do we always have “200 concurrent users”?

Definition of concurrent

Let’s start with the first question: What does “concurrent” really mean on a technical level? How can we measure that our test indeed simulates “200 concurrent users” and not just 20 or 1000?

  • Are there any server-side sessions which we can count and which directly give us this number? And do we set up our test in a way to hit that number?
  • Or do we have to rely on vaguer definitions like “users are considered concurrent when they do page loads less than 5 minutes apart”? And do we design our test that way?

Actually, it does not matter much which definition you choose. It’s just important that you explicitly define which definition you use, and what metric you choose to verify that you hit that number. This is an important definition when it comes to implementing your test.

And as a side note: Many commercial tools have their own definition of concurrent, and here the exact definition does not matter either, as long as you are able to articulate it.

What is a user?

The next question is about “the user” which is modeled in the test; to simplify the test and the test executions, one or more “typical” user personas are created which visit the site and perform some actions. That is definitely helpful, but it’s just that: a simplification, because otherwise our model would explode because of the sheer complexity and variety of user behavior. Also, sometimes we don’t even know what a typical “user” does on our site, because the system is brand-new.

So this is a case where we have a huge variance in the behavior of the users, which we should outline in our model as a risk: The model is only valid if the majority of the users behave more or less as we assumed.

But is this all? Do really all users perform at least 10% of the actions we assume they do?

Let’s brainstorm a bit and try to find answers for these questions:

  • Does the Google bot behave like that? And all the other search engine bots?
  • What about malware scanners which try to hit a huge list of WordPress/Drupal/… URLs on your site?
  • Other systems performing (random?) requests towards your site?

You could argue that this traffic has less or no business value, and that for that reason we don’t test for it. It could also be assumed that this is just a small fraction of the overall user traffic and can be ignored. But that is just an assumption, and nothing more. You just assume that it is irrelevant. But often these requests are not irrelevant, not at all.

I encountered cases where it was not the “normal users” who brought down a system, but rather this non-normal type of “user”. An example are cases where the custom 404 handler was very slow, so the basic undocumented assumption “We don’t need to care about 404s, as they are very fast” was violated and brought down the site. All performance tests passed, but the production system failed nevertheless.

So you need to think about “user” in a very broad sense. And even if you don’t implement the constant background noise of the internet in your performance test, you should list it as a factor. If you know that a lot of this background noise will trigger an HTTP status code 404, you are more likely to check that this 404 handler is fast.

Are “200 concurrent users” enough?

One piece of information every performance test has is the number of concurrent users the system must be able to handle. But even if we assume that both “concurrent” and “users” are well defined, is this enough?

First, what data is this number based on? Is it based on data derived from another system which the new system should replace? That’s probably the best data you can get. Or, when you build a new system, is it based on good marketing data (which would be okay-ish), on assumptions about the expected usage, or just on numbers we would like to see (because we assume that a huge number of concurrent users means a large audience and a high business value)?

So this is probably the topic which will be discussed the most. But the number and the way that number is determined should be challenged and vetted, because it’s one of the cornerstones of the whole performance test model. It does not make sense to build a high-performance and scalable system when afterwards you find out that the business numbers were grossly overrated, and a smaller and cheaper solution would have delivered the same results.

What about time?

A more important aspect, which is often overlooked, is timing: how many users are working on the site at any given moment? Do you need to expect the maximum number 8 hours every day, or just during the peak days of the year? Do you have a more or less constant usage, or usage only during business hours in Europe?

This heavily depends on the type of your application and the distribution of your audience. If you build an intranet site for a company only located in Europe, the usage during the night is pretty much zero, and it will start to increase at 06:00 in the morning (probably the Germans going to work early :-)), hitting the maximum usage between 09:00 and 16:00 and going to zero at the latest at 22:00. The contrast to it is a site visited world-wide by customers, where we can expect a higher and almost flat line; of course with variations depending on the number of people being awake.

This influences your tests as well, because in both cases you don’t need to simulate spikes, meaning a 500% increase of users within 5 minutes. On the other hand, if you plan for large marketing campaigns addressing millions of users, this might be exactly the situation you need to plan and test for. Not to mention if you book a slot during the Super Bowl break.

Why is this important? Because you only need to test scenarios which you expect to see in production, and you can ignore scenarios which don’t have any value for you. For example, it’s a waste of time and investment to test for a sudden spike in the above-mentioned intranet case for the European company, while it’s essential for marketing campaigns to test a scenario where such a spike comes on top of the normal traffic.

Summary

“N concurrent users” by itself is not much information; while it can serve as input, your performance test model should contain a more detailed understanding of that definition and what it means for the performance test. Otherwise you will focus just on a given number of users of this idealistic type and ignore every other scenario and case.

In part 3 I will cover how the system and the test data itself influence the result of the performance test.

Sling Model Exporter & exposing ResourceResolver information

Welcome to 2024. I will start this new year with a small piece of advice regarding Sling Models, which I hope you can implement very easily on your side.

The Sling Model Exporter is based on the Jackson framework, and it can serialize an object graph, with the root being the requested Sling Model. For that it recursively serializes all public & protected members and the return values of all simple getters. Properly modeled, this works quite well, but small errors can have large consequences. While missing data is often quite obvious (if the JSON powers an SPA, you will find it not working properly), too much data being serialized is spotted less frequently (normally not at all).

I am currently exploring options to improve performance, and I am a big fan of the ResourceResolver.getPropertyMap() API to implement a per-ResourceResolver cache. While testing such a potential improvement I found customer code in which the ResourceResolver is serialized via the Sling Model Exporter into JSON. In that case the code looked like this:

@Model(adaptables = Resource.class)
public class MyModel {

  @Self
  Resource resource;

  ResourceResolver resolver;

  @PostConstruct
  public void init() {
    resolver = resource.getResourceResolver();
  }
}

(see this good overview at Baeldung of the default serialization rules of Jackson.)

And that’s bad in 2 different respects:

  • Security: The serialized ResourceResolver object contains, next to the data returned by the public getters (e.g. the search paths, the userId and potentially other interesting data), also the complete propertyMap. And this serialized cache is probably nothing you want to expose to the consumers of this JSON.
  • Exceptions: If the getPropertyMap() cache contains instances of classes which are not publicly exposed (that means these class definitions are hidden within some implementation packages), you will encounter ClassNotFound exceptions during serialization, which will break the export. And instead of a JSON you get an internal server error or a partially serialized object graph.

In short: It is not a good idea to serialize a ResourceResolver. And honestly, I have not found a reason why this should be possible at all. So right now I am a bit hesitant to use the propertyMap as a cache, especially in contexts where the Sling Model Exporter might be used. And that blocks me from working on some interesting performance improvements 😦

To unblock this situation, we have introduced a 2-step mechanism which should help to overcome it:

  1. In the latest AEM as a Cloud Service release 14697 (both in the cloud as well as in the SDK) a new WARN message has been added when your model definition causes a ResourceResolver to be serialized. Search the logs for this message: “org.apache.sling.models.jacksonexporter.impl.JacksonExporter A ResourceResolver is serialized with all its private fields containing implementation details you should not disclose. Please review your Sling Model implementation(s) and remove all public accessors to a ResourceResolver.”
    It should also contain a reference to the request path where this is happening, so it should be easily possible to identify the Sling Model class which triggers this serialization and to change that piece of code so the ResourceResolver is not serialized anymore. Note that the above message is just a warning; the behavior remains unchanged.
  2. As a second measure, functionality has been implemented which allows blocking the serialization of a ResourceResolver via the Sling Model Exporter completely. Enabling this is a breaking change for all AEM as a Cloud Service customers (even if I am 99.999% sure that it won’t break any functionality), and for that reason we cannot enable this change on the spot. But at some point this step is necessary to guarantee that the 2 problems listed above can never happen.

Right now the first step is enabled, and you will see this log message. If you do, I encourage you to adapt your code (the Core Components should be safe) so ResourceResolvers are no longer serialized.
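
As a minimal sketch of such an adaptation of the model from above, assuming Jackson’s @JsonIgnore annotation (which is available in AEM): keep the field private and exclude it explicitly from serialization.

import javax.annotation.PostConstruct;

import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import org.apache.sling.models.annotations.Model;
import org.apache.sling.models.annotations.injectorspecific.Self;

import com.fasterxml.jackson.annotation.JsonIgnore;

@Model(adaptables = Resource.class)
public class MyModel {

  @Self
  Resource resource;

  // private and explicitly ignored: the resolver never ends up in the JSON
  @JsonIgnore
  private ResourceResolver resolver;

  @PostConstruct
  protected void init() {
    resolver = resource.getResourceResolver();
  }
}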

In parallel we need to implement step 2; right now the planning is not done yet, but I hope to activate step 2 some time later in 2024 (not before the middle of the year). Before this is done, there will be formal announcements in the AEM release notes. And I hope that with this blog post and the release notes all customers will have adapted their implementations, so that flipping this switch will not change anything.

Update (January 19, 2024): There is now a piece of official AEM documentation covering this situation as well.

3 rules for using an HttpClient in AEM

Many AEM applications consume data from other systems, and in the last decade the protocol of choice turned out to be HTTP(S). There are a number of very mature HTTP clients out there which can be used together with AEM. The most frequently used one is the Apache HttpClient, which is shipped with AEM.

But although the HttpClient is quite easy to use, I came across a number of problems, many of them resulting in service outages. In this post I want to list the 3 biggest mistakes you can make when you use the Apache HttpClient. While I observed the results in AEM as a Cloud Service, the underlying causes are the same on-prem and in AMS, even if the resulting effects can be a bit different.

Reuse the HttpClient instance

I often see that an HttpClient instance is created for a single HTTP request, and in many cases it’s not even closed properly afterwards. This can have these consequences:

  • If you don’t close the HttpClient instance properly, the underlying network connection(s) will not be closed properly but will eventually time out. Until then the network connections stay open. If you are using a proxy with a connection limit (many proxies have one), this proxy can reject new requests.
  • If you re-create an HttpClient for every request, the underlying network connection will be re-established every time, with the latency of the 3-way handshake.

The reuse of the HttpClient object and its state is also recommended by its documentation.

The best way to make that happen is to wrap the HttpClient into an OSGi service, create it on activation and close it when the service is deactivated, as shown in the sketch below.
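
A minimal sketch of such a wrapper service, assuming the Apache HttpClient 4.x API (the BackendClient class name is made up); it also already applies the aggressive timeouts discussed in the next rule:

import java.io.IOException;

import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Deactivate;

@Component(service = BackendClient.class)
public class BackendClient {

  private CloseableHttpClient httpClient;

  @Activate
  protected void activate() {
    RequestConfig config = RequestConfig.custom()
        .setConnectTimeout(1000)  // connection timeout in milliseconds
        .setSocketTimeout(2000)   // read timeout in milliseconds
        .build();
    httpClient = HttpClients.custom()
        .setDefaultRequestConfig(config)
        .useSystemProperties()    // honor preconfigured proxy settings
        .build();
  }

  @Deactivate
  protected void deactivate() throws IOException {
    httpClient.close();           // release the pooled connections
  }

  public CloseableHttpClient getHttpClient() {
    return httpClient;
  }
}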

Set aggressive connection- and read-timeouts

Especially when an outbound HTTP request is executed within the context of an AEM request, performance really matters. Every millisecond spent in that external call makes the AEM request slower. This increases the risk of exhausting the Jetty thread pool, which then leads to non-availability of that instance, because it cannot accept any new requests. I have often seen AEM CS outages because a backend was responding slowly or not at all. All requests should finish quickly, and in case of errors they must also return fast.

That means timeouts should not exceed 2 seconds (personally I would prefer even 1 second). And if your backend cannot respond that fast, you should reconsider its fitness for interactive traffic, and try not to connect to it in a synchronous request.

Implement a degraded mode

When your backend application responds slowly, returns errors or is not available at all, your AEM application should react accordingly. I have seen it a number of times that any problem on the backend had an immediate effect on the AEM application, often resulting in downtimes, because either the application was not able to handle the results of the HttpClient (so the response rendering failed with an exception), or because the Jetty thread pool was totally consumed by those requests.

Instead, your AEM application should be able to fall back into a degraded mode which allows you to display at least a message that something is not working. In the best case the rest of the site continues to work as usual.
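
A sketch of what this can look like on the calling side, using the HttpClient service from above (BackendData and parse() are made up): every backend problem is mapped to an empty result instead of an exception bubbling up into the response rendering.

import java.io.IOException;
import java.util.Optional;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;

public Optional<BackendData> fetchData(String url) {
  try (CloseableHttpResponse response = httpClient.execute(new HttpGet(url))) {
    if (response.getStatusLine().getStatusCode() == 200) {
      return Optional.of(parse(response.getEntity()));
    }
    // backend error: let the caller render the degraded mode
    return Optional.empty();
  } catch (IOException e) {
    // timeout or connection problem: degrade gracefully as well
    return Optional.empty();
  }
}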

If you implement these 3 rules for your backend connections, and especially if you test the degraded mode, your AEM application will be much more resilient against network or backend hiccups, resulting in fewer service outages. And isn’t that something we all want?

Recap: adaptTo() 2023

It was adaptTo() time again, for the first time in an in-person format since 2019. And it’s definitely much different from the virtual formats we experienced during the pandemic. More personal, and it allowed me to get away from the daily work routine; I remember that in 2020 and 2021 I constantly had work-related topics (mostly Slack) on the other screen while I was attending the virtual conference. That’s definitely different when you are at the venue 🙂

And it was great to see all the people again. Many of them have been part of the community for years, but there were also many new faces. Nice to see that the community can still attract new people, although I think that the golden time of backend-heavy web development is over. And that was reflected on stage as well, with Edge Delivery Services being quite a topic.

As in past years, the conference itself isn’t that large (this year maybe 200 attendees), and it gives you plenty of chances to get in touch and chat about projects, new features, bugs and everything else you can imagine. The location is nice, and Berlin gives you plenty of opportunities to go out for dinner. So while 3 days of conference can definitely be exhausting, I would have liked to spend many more dinners with attendees.

I got the chance to come on stage again with one of my favorite topics: Performance improvement in AEM, a classic backend topic. According to the talk feedback, people liked it 🙂
Also, the adaptTo() folks recorded all the talks, and you can find both the recording and the slide deck on the talk’s page.

The next call for papers is already announced (starting in February ’24), and I will definitely submit a talk again. Maybe you as well?

AEM CS & dedicated egress IP

Many customers of AEM as a Cloud Service are used to performing a first level of access control by allowing just a certain set of IP addresses to access a system. For that reason they want their AEM instances to use a static IP address or network range to access their backend systems. AEM CS supports this with the feature called “dedicated egress IP address“.

But when testing that feature, there is often the feedback that it is not working, and that the incoming requests on the backend systems come from a different network range. This is expected, because this feature does not change the default routing for outgoing traffic of the AEM instances.

The documentation also says

Http or https traffic will go through a preconfigured proxy, provided they use standard Java system properties for proxy configurations.

The thing is: if traffic is supposed to use this dedicated egress IP, you have to explicitly make it use this proxy. This is important, because by default not all HTTP clients do this.

For example, in the Apache HttpClient library 4.x, the HttpClients.createDefault() method does not read the system properties related to proxying, but HttpClients.createSystem() does. The same applies to the java.net.http.HttpClient, for which you need to configure the Builder to use a proxy. Also, okhttp requires you to configure the proxy explicitly.
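
A short sketch of the difference, assuming the standard proxy system properties (http.proxyHost etc.) have been preconfigured by the platform:

import java.net.ProxySelector;

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

// Apache HttpClient 4.x: createDefault() ignores the proxy system
// properties, while createSystem() honors them.
CloseableHttpClient noProxy  = HttpClients.createDefault();
CloseableHttpClient viaProxy = HttpClients.createSystem();

// java.net.http.HttpClient: the builder must be told explicitly to use
// the default ProxySelector, which is backed by the system properties.
java.net.http.HttpClient jdkClient = java.net.http.HttpClient.newBuilder()
    .proxy(ProxySelector.getDefault())
    .build();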

So if the requests from your AEM instance are coming from the wrong IP address, check that your code is actually using the configured proxy.

AEM article review December 2022

I have been writing this blog for quite some time now (the first article dates back to December 2008! That was the time of CQ 5.0! OMG), and of course I am not the only one writing on AEM. Actually, the number of articles which are produced every month is quite large, but I am often a bit disappointed, because many just reproduce some very basic aspects of AEM which can be found in many places. The amount of new content describing aspects which have barely been covered by other blog posts or the official product documentation is small.

For myself, I try to focus on such topics, offer unique views on the product and provide recommendations how things can be done (better), all based on my personal experiences. I think that this type of content is appreciated by the community, and I get good feedback on it. To encourage the broader community to come up with more content covering new aspects, I will do a little experiment and promote a few selected articles of others. I think that these articles show new aspects or offer a unique view on certain aspects of AEM.

Depending on the feedback I will decide if I continue with this experiment. If you think that your content also offers new views, uncovers hidden features or suggests best practices, please let me know (see my contact data here). I will judge these proposals on the above-mentioned criteria. But of course it will still be my personal decision.

Let’s start with Theo Pendle, who has written an article on how to write your own custom injector for Sling Models. The example he uses is a really good one, and he walks you through all the steps and explains very well why all of that is necessary. I like the general approach of Theo’s writing, and I consider the case of safely injecting cookie values a valid one for such an injector. But in general I think that there are not many other cases out there where it makes sense to write custom injectors.

Also on a technical level, John Mitchell published his article “Using Sling Feature Flags to Manage Continous Releases“ on the Adobe Tech Blog. He introduces Sling Features and how you can use them to implement feature flags. That’s something I have not seen used in the wild yet, and the documentation is quite sparse on it. But he gives a good starting point, although a more practical example would be great 🙂

The third article I like the most. Kevin Nenning writes on “CRXDE Lite, the plague of AEM“. He outlines why CRXDE Lite has gained such a bad reputation within Adobe that disabling CRXDE Lite has been part of the go-live checklist for quite some time. But on the other hand, he loves the tool, because it’s a great way to do quick hacks on your local development instance and a good general read-only tool. This is an article every AEM developer should read.
And in case you haven’t seen it yet: AEM as a Cloud Service offers the repository browser in the Developer Console for a read-only view on your repo!

And finally there is Yuri Simione (an Adobe AEM champion), who published 2 articles discussing the question “Is AEM a valid Content Services Plattform?” (article 1, article 2). He discusses an implementation based on Jackrabbit/Oak and Sling (but not AEM) to replace an aging Documentum system, and he offers an interesting perspective on the future of Jackrabbit. Definitely a read if you are interested in a broader use of AEM and its foundational pieces.

That’s it for December. I hope you enjoy these articles as much as I did, and that you can learn from them and get some new inspiration and insights.