The web, an eventually consistent system

For many large websites, CDNs are the foundation for delivering content quickly to their customers around the world. The ability of CDNs to cache responses close to consumers also allows these sites to operate on a much smaller hardware footprint than if they delivered all content through their own systems. But this comes at a cost: your CDN may now deliver content that is out of sync with your origin, because you changed the content on the origin system and that change does not propagate in an atomic fashion (“atomic” in the same sense as in the ACID properties of database implementations).
This is a conscious design decision, driven primarily by the CAP theorem. It states that in a distributed data storage system you can only achieve 2 of these 3 guarantees:

  • Consistency
  • Availability
  • Partition tolerance

And in the case of a CDN (which is a highly distributed data storage system), its developers usually opt for availability and partition tolerance over consistency. That is, they accept delivering content that is out of date because the originating system has already updated it.

To mitigate this situation, the HTTP protocol has built-in features which help to solve the problem at least partially. Check out the latest RFC draft on HTTP caching, it is a really good read. The main feature is the “TTL” (time-to-live): the CDN delivers a cached version of the content only for a configured time; afterwards it fetches a fresh version from the origin system. The technical term for this is “eventually consistent”, because at that point the state of the system with respect to that content is consistent again.

This is the approach all CDNs support, and it works very reliably, but only if you accept that content changed on the origin system reaches your consumers with this delay. The delay is usually set to a period of time that is empirically determined by the website operators, balancing the need to deliver fresh content (which requires a very low or no TTL) against the number of requests the CDN can answer instead of the origin system (for which the TTL should be as high as possible). Usually it is in the range of a few minutes.
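For illustration, here is a minimal sketch of how such a TTL is expressed through HTTP caching headers; the plain Java servlet and the 5-minute value are just assumptions for the example:

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// A minimal sketch: the origin tells the CDN (and browsers) how long the response may be reused.
public class CacheableContentServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // max-age is honored by browsers, s-maxage by shared caches like CDNs
        response.setHeader("Cache-Control", "public, max-age=300, s-maxage=300");
        response.setContentType("text/html;charset=UTF-8");
        response.getWriter().write("<html><body>cacheable content</body></html>");
    }
}

A CDN in front of this origin may serve the cached response for up to five minutes before it fetches a fresh copy; during that window consumers may see the old version.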

(Even if you don’t use a CDN for your origin systems, you need these caching instructions, otherwise browsers will make assumptions and cache the requested files on their own. Browsing the web without caching is slow, even on very fast connections. Not to mention what happens when using a mobile device over a slow 3G line … Eventual consistency is an issue you can’t avoid when working on the web.)

Caching is an issue you will always have to deal with when creating web presences. Try to cache as much as possible, without neglecting the need to refresh or update content at arbitrary times.

You need to constantly address eventual consistency. Atomic changes (meaning changes that are immediately visible to all consumers) are possible, but they come at a price. You can’t use CDNs for this content; you must deliver it all directly from your origin system. In this case you need to design your origin system so that it can function without eventual consistency at all (and eventual consistency is built into many systems), not to mention the additional load it will have to handle.

And for this reason I would always recommend not relying on atomic updates or consistency across your web presence. Always factor in eventual consistency in the delivery of your content. And in most cases even business requirements where “immediate updates” are required can be solved with a TTL of 1 minute. Still not “immediate”, but good enough in 99% of all cases. For the remaining 1% where consistency is mandatory (e.g. real-time stock trading) you need to find a different solution. And I am not sure if the web is always the right technology then.

And as an afterthought regarding TTL: Of course many CDNs offer you the chance to actively invalidate content, but this often comes at a price. In many cases you can only invalidate single files. Often it is not an immediate action, but takes seconds up to many minutes. And the price is always that you have to have the capacity to handle the load when the CDN needs to refetch a larger chunk of content from your origin system.

How to use Runmodes correctly (update)

Runmodes are an essential concept within AEM; they form the main way to assign roles to AEM instances. The primary use case is to distinguish between the author and publish role; another common use case is to split between PROD, Staging and Development environments. Technically it’s just a set of strings assigned to an instance, which is used by the Sling framework on a few occasions, the most prominent being the Sling JCR Installer (which handles the /apps/myapp/config, /apps/myapp/config.author, etc. directories).

But I see other use cases as well: cases where the runmodes are fetched and compared against hardcoded strings. A typical example:

boolean isAuthor() {
    return slingSettingsService.getRunModes().contains("author");
}

From a technical point of view this is fully correct, and works as expected. The problem arises when some code is based on the result of this method:

if (isAuthor()) {
    // do something
}

Because now the execution of this code is hardcoded to the author environment, which can become problematic if this code must not be executed on the DEV authoring instances (e.g. because it sends email notifications). It is not a problem to change this to:

if (isAuthor() && !isDevelopmentEnvironment()) {
    // do something
}

But now it is hardcoded again 😦

The better way is to rely solely on the OSGi framework. Just make your OSGi components require a configuration, and provide that configuration only for the runmodes where the component should be active.

@Component(configurationPolicy = ConfigurationPolicy.REQUIRE)
public class MyServiceImpl implements MyService {
    // ...
}

This case requires NO CODING at all; instead you just use the functionality provided by Sling and OSGi. If the configuration is provided only in a runmode-specific config folder (e.g. /apps/myapp/config.author), the component does not even activate on instances which do not have that runmode!

Long story short: Whenever you see a reference to SlingSettingsService.getRunModes(), it’s very likely being used wrongly. And we can generalize it to “If you add a reference to the SlingSettingsService, you are doing something wrong”.

There are only a very few cases where the information provided by this service is actually useful for non-framework purposes. But I bet you are not writing a framework 🙂

Update (Oct 17, 2019): In a Twitter discussion Ahmed Musallam and Justin Edelson pointed out that there are use cases where this is actually useful and the right API to use. Possibly yes, I cannot argue about that, but these are the few cases I mentioned above; I have never encountered them personally. And as a general rule of thumb it’s still applicable, because every rule has its exceptions.

You think that I have written on that topic already? Yes, I have, actually twice already (here and here). But it seems that only repetition helps to get this message through. I still find this pattern in too many codebases.

Ways to achieve content reuse in AEM

Whenever an AEM project starts, you have a few important decisions to make. I already wrote about content architecture (here and here) and its importance to a successful project and an efficient content development and maintenance process. A part of this content architecture discussion is the aspect of content reuse.

Content reuse happens on every AEM project, and often it plays a central role. Because requirements are so different, there are many ways to achieve content reuse. In this blog post I want to outline some prominent ways to reuse content in AEM. Each one comes with some unique properties, so pay attention to them.

I identified 2 main concepts of content reuse: Reuse by copy and Reuse by reference.

Reuse by copy

The AEM Multi Site Manager (MSM) is probably the most prominent approach for content reuse in AEM. It has been part of the product for a long time and therefore a lot of people know it (even if you have just started with AEM, you might have come across its idioms). It’s an approach which creates independent copies of the source and helps you keep these copies (“livecopies”) in sync with the original version (“blueprint”). On the other hand, you can still work with the copies as you like, that means modify them, create and delete parts of pages or even complete pages. With the help of the MSM you can always get back to the original state, or a change on the blueprint can be propagated to all livecopies (including conflict handling). So you could call this approach a “managed copy”.

The MSM is a powerful tool, but it comes with its own set of complexity and error cases; you and your users should understand how it works and what situations can arise from it. It also has performance implications, as copies are created; rolling out changes on the blueprint to the livecopies can also be complex and consume quite some server resources. If you don’t have the requirement to modify the copies, the MSM is the wrong approach for you!

Unlike the MSM, the language copy approach just creates simple copies; once these copies have been created, there is no relationship anymore between the source and the target of the language copy. It’s an “un-managed copy”. Personally I don’t see much use for it in a standalone way (if used as part of a translation workflow, the situation is different).

Reuse by reference

Reuse by reference is a different approach. It does not duplicate content, but just adds references, and the reference target is then injected or displayed. Thus a reference will always display the same content as the reference target; deviations and modifications are not possible. Referencing larger amounts of content (beyond the scope of parts of a single page) can be problematic and hard to manage, especially if these references are not explicitly marked as such.

The main benefit of reuse by reference is that any change to the reference target is picked up immediately and reflected in the references; and that the performance impact is negligible. Also the consistency of the display of the reference with the reference target is guaranteed (when caching effects are ignored).

This approach is often used for page elements which have to be consistent all over a site, for example page headers or footers. But the DAM is also used in this way: you don’t embed the asset itself into the page, but rather just add a reference to it.

If you implement reuse by reference, you always have to think about dispatcher cache invalidation, as in many cases a change to a reference target is not propagated to all references, so the dispatcher will not know about it. You often have to take care of that yourself.

Having said that, what are the approaches in AEM to implement reuse by reference?


Do it on your own: In standard page rendering scripts you already do includes, typically of child nodes inside the page itself. But you can also include nodes from different parts of the repository, no problem. That’s probably the simplest approach and widely used.
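As a hedged illustration, such an include of a resource from a different part of the repository could look like this in Java; the paths and the resource type are made up, and in rendering scripts the same is typically done with the script language’s own include mechanism:

import java.io.IOException;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.request.RequestDispatcherOptions;

// Sketch: render a shared footer resource into the current response.
public class SharedFooterInclude {

    void includeFooter(SlingHttpServletRequest request, SlingHttpServletResponse response)
            throws ServletException, IOException {
        RequestDispatcherOptions options = new RequestDispatcherOptions();
        options.setForceResourceType("myapp/components/footer"); // assumed resource type
        RequestDispatcher dispatcher =
                request.getRequestDispatcher("/content/myapp/shared/footer/jcr:content/par", options);
        if (dispatcher != null) {
            dispatcher.include(request, response);
        }
    }
}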

Another approach is Content Fragments and Experience Fragments. They are more sophisticated approaches and come with proper support in the authoring interface, plus components to embed them. That makes them much easier to use and start with, and they also offer some nice features on top, like variants. But from a conceptual point of view it’s still a reference.

A special form of reuse by reference is “reuse by inheritance“. Typically it is implemented by components like the iparsys or (when you code your own components) by using the InheritanceValueMap. In this case the reference target is always the parent (page/node). This approach is helpful when you want to inherit content down the tree (e.g. from the homepage of the site to all individual pages); with the iparsys it’s the content of a parsys, with the InheritanceValueMap it’s properties.
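For illustration, reading an inherited property via the InheritanceValueMap could look like the following sketch; the property name “contactEmail” is an assumption:

import com.day.cq.commons.inherit.HierarchyNodeInheritanceValueMap;
import com.day.cq.commons.inherit.InheritanceValueMap;
import com.day.cq.wcm.api.Page;

// Sketch: walk up the page hierarchy until a page defines the property.
public class InheritedPropertyExample {

    String getContactEmail(Page currentPage) {
        InheritanceValueMap inheritedProperties =
                new HierarchyNodeInheritanceValueMap(currentPage.getContentResource());
        return inheritedProperties.getInherited("contactEmail", String.class);
    }
}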

What approach should I choose?

The big differentiator between reuse by copy and reuse by reference is the question whether reused content should be adapted or changed at the location where it is reused. As soon as you have the requirement “I would like to change the content provided to me”, you need reuse by copy, and in AEM this normally means MSM. Because content is not created once, but needs to be maintained and updated, and at scale the MSM is the best way to do that. But if you don’t have that requirement, use reuse by reference.
You might even use both approaches, “reuse by copy” to manage the reuse of content over different sites, and “reuse by reference” for content within a site.

Your friendly business consultant can help you find out which reuse strategy makes sense for your requirements.

Why JCR search is not suited for site search

Many websites, especially larger ones, have an integrated search functionality which lets users find content of the site directly, without using external search engines like Google or Bing. If properly implemented and used, it can be a tremendous help in getting visitors directly to the information they need and want.

I’ve gotten questions in the past about how one can implement such a site search using JCR queries. And at least in recent years my answer has always been: don’t use JCR search for that. Let me elaborate on that.

JCR queries are querying the repository, but not the website

With JCR queries you are querying the repository, but you are not querying the website. That sounds a bit strange, because the website lives in the repository. This is true, but in reality the website is much more. A rendered page consists of data stored below a cq:Page node plus data from other parts of the repository: for example you pull assets into a page and also add some of the asset metadata into the rendered page, or you add references to other pages and include data from there.

This means that the rendered page contains a lot of meaningful and relevant information which can and should be leveraged by a search function to deliver the best results. And this data is not part of the cq:Page structure in the repository.

Or to put it in other words: You do SEO optimization for your page so that it delivers the most relevant results, hoping that its rank on Google gets higher and more relevant for users searching for specific terms. Do you really think that your own integrated site search should deliver less relevant results for the same search?

As a site owner I do not expect that for a certain search keyword combination Google delivers page A as the highest ranked page on my site, while my internal search delivers a different page B which is clearly less relevant for those keywords.

That means that you should provide your site search with the same information and metadata as you provide to Google. With JCR queries you only have the repository structure and the information stored there; and you should not optimize that structure for relevant search results either, because the JCR repository structure aims for different goals (like performance, maintainability, evolvability and others).
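To make this concrete, here is a hedged sketch of what such a JCR-query-based site search typically looks like (the path and the statement are assumptions); it can only evaluate what is stored in the repository, not the fully rendered page:

import javax.jcr.RepositoryException;
import javax.jcr.Session;
import javax.jcr.query.Query;
import javax.jcr.query.QueryManager;
import javax.jcr.query.QueryResult;

// Sketch: a naive full-text search over pages below an assumed site root.
public class JcrSiteSearchExample {

    QueryResult search(Session session, String term) throws RepositoryException {
        String statement = "SELECT * FROM [cq:Page] AS page "
                + "WHERE ISDESCENDANTNODE(page, '/content/mysite') "
                + "AND CONTAINS(page.*, '" + term + "')"; // naive term handling, illustration only
        QueryManager queryManager = session.getWorkspace().getQueryManager();
        Query query = queryManager.createQuery(statement, Query.JCR_SQL2);
        return query.execute();
    }
}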

JCR queries implement functionality not needed for site search

The JCR query implementation needs to take some concepts into account, which are often not relevant for site search, but which are quite expensive. Just to name a few:

  • Nodetype inheritance and Mixins:  On every search there are checks for nodetypes, sometimes with the need to traverse the hierarchy and check the mixin relationships. That’s overhead.
  • ACL checks: Every search result needs to be acl checked before returning it, which can be huge overhead, especially if in the most simple case all (relevant) content is public and there should be no need to do such checks at all.
  • And probably much more.

JCR is not good at features which I would expect from a site search

  • Performance is not always what I expect from a site search.
  • With the current Oak implementation you should test every query to see whether it’s covered by indexes; as site search queries are often unpredictable (especially if you do not only allow a single search term, but also want to include wildcards etc.), you always run the risk that something unexpected happens. And then it’s not only about bad performance if your query is not covered by a matching index; it can also mean that you deliver the wrong search results (or no search results at all).
  • Changing index definitions (even adding synonyms or stopwords) is necessarily an admin task, and if done improperly, it impacts the overall system. Not to mention the need for reindexing 😦

From my point of view, if you cannot rely solely on external search engines (Google, Bing, DuckDuckGo, …), you should not implement your site search on top of JCR queries. It often causes more trouble than adding a dedicated Solr instance which crawls your site and is embedded into your site to deliver the search results. You can take this HelpX article as a starting point for how to integrate Solr into your site. But of course any other search engine is possible as well.
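For illustration, here is a minimal SolrJ sketch of querying such a dedicated Solr instance from your site; the URL, core name and fields are assumptions, not a definitive integration:

import java.io.IOException;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

// Sketch: query a Solr core that a crawler has filled with the rendered pages of the site.
public class SolrSiteSearchExample {

    void search(String term) throws SolrServerException, IOException {
        try (SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mysite").build()) {
            SolrQuery query = new SolrQuery(term);
            query.setRows(10);
            QueryResponse response = solr.query(query);
            for (SolrDocument doc : response.getResults()) {
                // e.g. doc.getFieldValue("url"), doc.getFieldValue("title") for the result list
            }
        }
    }
}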

Do I need a dedicated instance for page preview?

Every now and then there is this question about how to integrate a dedicated preview instance into the typical “author – publish” setup. Some seem to be confused why there is no such instance in the default setups, one which allows you to preview content exactly as on publish, just not yet visible to the public.

The simple answer to this is: There should be no need to have such a preview instance.

When creating content in AEM, you work in a full WYSIWYG environment, which means that you should always have a perfect view of the context your content lives in. Everything should be usable, and even more complex UI interfaces (like single page applications) should allow you to have a proper preview. Even most integrations should work flawlessly. So getting the full picture should always be possible on authoring itself, and this must not be the reason to introduce a preview publish.

Another reason often brought up in these discussions is approvals. When authors finish their work, they need to get an approval from someone who is not familiar with AEM. The typical workflow is then outlined like this: “I drop her the link, she clicks the link, checks the page and then responds with an OK or not. And then I either implement her remarks or activate the page directly”.

The problem here is that this is an informal workflow which happens on a different medium (chat, phone, email) and which is not tracked within AEM. You don’t use the means offered by the product (approval workflows), which leaves you without any audit trail. One could ask whether you have a valid approval process at all then…

Then there’s the aspect of “Our approvers are not familiar and not trained with AEM!”. Well, you don’t have to train them much in AEM. If you have SSO configured and the approvers get email notifications, approving itself is very easy: click the link to the inbox, select the item you want to preview, open it, review it, and then click approve or reject for it in the inbox. You can definitely explain that workflow in a 5 minute video.

Is there no reason at all to justify a dedicated preview instance? I won’t argue that there will never be a need for such a preview instance, but in most cases you don’t need it. I am not aware of any right now.

If you think you need a preview instance: Please create a post over at the AEM forum, describe your scenario, ping me and I will try to show you that you can do it easier without it 🙂

Content architecture: dealing with relations

In the AEM forums I recently came across a question about slow queries. After some back and forth I understood that the poster wanted to do thousands of such queries to render a page: when rendering a product page he wanted to reference the assets associated with it.

For me the approach used by the poster was straightforward, based on the assumption that the assets can reside anywhere within the repository. But that’s rarely the case. The JCR repository is not a relational database, where all you have are queries; with JCR you can also iterate through the structure. It’s a question of your content architecture and how you map it to AEM.

That means that for requirements like the one described, you can easily design your application in a way that all assets of a product are stored below the product itself.

Or for each product page there is a matching folder in the DAM where all the assets reside. So instead of a JCR query you just do a lookup of a node at a fixed location (in the first example the subnode “assets” below the product) or you compute the path for the assets (/content/dam/products/product_A/assets). That single lookup will always be more performant than a query, plus it’s also easier for an author to spot and work with all assets belonging to a product.
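A minimal sketch of such a fixed-path lookup (the path convention is an assumption):

import com.day.cq.dam.api.Asset;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;

// Sketch: resolve the assets of a product via a computed path instead of a JCR query.
public class ProductAssetsExample {

    void listAssets(ResourceResolver resolver, String productName) {
        Resource assetsFolder = resolver.getResource("/content/dam/products/" + productName + "/assets");
        if (assetsFolder != null) {
            for (Resource child : assetsFolder.getChildren()) {
                Asset asset = child.adaptTo(Asset.class);
                if (asset != null) {
                    // render a reference to asset.getPath() ...
                }
            }
        }
    }
}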

Of course this is a very simplified case. Typically requirements are more complex, and asset reuse is often required, so this approach does not work that easily anymore.
There is no universal recipe for this, but there are ways to deal with it.

To create such relations between content we often use tags. Content carrying the same tag is related and can automatically be added to the list of related content or assets. Using tags as a level of indirection is fine, and in the context of the forum post it is also quite performant (albeit the resolution itself is backed by a single query).
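A hedged sketch of such a tag-based lookup with AEM’s TagManager (the tag ID and search root are made up; please check the exact signatures against your AEM version):

import com.day.cq.commons.RangeIterator;
import com.day.cq.tagging.TagManager;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;

// Sketch: everything below the search root carrying the given tag is treated as related content.
public class RelatedContentExample {

    void collectRelated(ResourceResolver resolver) {
        TagManager tagManager = resolver.adaptTo(TagManager.class);
        RangeIterator<Resource> tagged =
                tagManager.find("/content/mysite", new String[] { "products:category/outdoor" });
        while (tagged != null && tagged.hasNext()) {
            Resource related = tagged.next();
            // add 'related' to the list of related content ...
        }
    }
}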

Another approach to modelling the content structure is to look at the workflows the authoring users are supposed to use. They also need to understand the relationships between content, which normally leads to something intuitive. Looking at these details might also give you a hint how it can be modeled; maybe just storing the paths of the referenced assets as part of the product is already enough.

So, as already said in an earlier post, there are many ways to come up with a decent content architecture, but rarely recipes. In most cases it pays off to invest time into it and consider the effects it has on the authoring workflow, performance and other operational aspects.

Creating the content architecture with AEM

In the last post I tried to describe the difference between information architecture and content architecture; and from an architectural point of view the content architecture is quite important, because your application design will emerge based on it. But how can you get to a stable and well-thought-out content structure?

Well, there’s no bullet-proof approach for it. When you design the content architecture for an AEM-based application, it’s best to have some experience with the hierarchical approach offered by the repository. I will try to outline a process which might help you get there.
It’s not a definite guideline and I cannot guarantee that it will work for you, as it is just based on my experience with the projects I did. But I hope that it will give you some input and can act as a kind of checklist. My colleague Alex Klimetschek did a presentation about this at the adaptTo() conference 2012.

The tree

But before we start, I want to remind you of the fact that everything you do has to fit into the JCR tree. This tree is typically a big help, because we often think in trees (think of decision trees, divide-and-conquer algorithms, etc.), and URLs are also organized in a tree-ish way. Many people in IT are familiar with the hierarchical way filesystems are organized, so it’s both a comfortable and easy-to-explain approach.

Of course there are cases where it makes things hard to model; if you hit that problem, you should try to choose a different approach. Building any n:m relation in the AEM content tree is counter-intuitive, hard to implement and typically not really performant.

Start with the navigation

Coming from the information architecture, you typically have some idea of what the navigation of the site should look like. In the typical AEM-based site, the navigation is based on the content tree; that means that traversing the first 2-3 levels of your site tree will create the navigation (tree). If you map it the other way around, you can get from the navigation to the site tree as well.

This definition definitely has an impact on your site, as the navigation is now tied to your content structure; changing one without the other is hard. So make your decision carefully.
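As a hedged sketch of this mapping, building the main navigation by traversing the first level of the site tree might look like this (the site root path is an assumption):

import java.util.Iterator;
import com.day.cq.wcm.api.Page;
import com.day.cq.wcm.api.PageFilter;
import com.day.cq.wcm.api.PageManager;

// Sketch: the children of the site root become the first-level navigation entries.
public class NavigationExample {

    void buildNavigation(PageManager pageManager) {
        Page siteRoot = pageManager.getPage("/content/mysite/en");
        if (siteRoot != null) {
            Iterator<Page> children = siteRoot.listChildren(new PageFilter());
            while (children.hasNext()) {
                Page navEntry = children.next();
                // navEntry.getTitle() and navEntry.getPath() feed the navigation rendering
            }
        }
    }
}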

Consider content-reuse

As the next step, consider the parts of the website which have to be identical, e.g. header and footer. You should organize your content in a way that these central parts are maintained once for the whole site, and that any change to them can be inherited down the content tree. When you choose this approach, it’s also very easy to implement a feature which allows you to change that content at any level and inherit the changed content down the tree, effectively breaking the inheritance at this point.

When you are at this level, also consider dispatcher invalidation. Whenever you change such a “centralized” piece of content, it should be easily possible to purge the dispatcher cache; in the best case the activation of the changed content will trigger the invalidation of all affected pages (and not more!), assuming that you have your /statfileslevel parameter set correctly.

Consider access control

As the third step, let’s consider the already existing structure under the aspect of access control, which you will need on the authoring environment.
On smaller sites this topic isn’t that important, because you have only a single content team which maintains all the pages. But especially in larger organizations you have multiple teams, and each team is responsible for dedicated parts of the site.

When you design your content structure, overlay it with these authoring teams, and make sure that you avoid any situation where a principal has write access to a page but not to any of its child pages. While this is not always possible, try to follow these guidelines regarding access control:

  • When looking from the root node of the tree to a node on a lower level, always add more privileges, but do not remove them.
  • Every author for that site should have read access to the whole site.

If you have a very complicated ACL setup (and you’ve already failed to make it simpler), consider changing your content structure at this point, and give the ACL setup a higher priority than, for example, the navigation.

My advice at this point: Try to keep your ACL setup very simple; the more complex it gets, the more time you will spend debugging your group and permission setup to find out what’s going on in a certain situation, and the harder it will be to explain it to your authors.

Multi-Site with MSM

Having gone through these 3 steps, you already have some idea of how your final content structure needs to look. There is another layer of complexity if you need to maintain multiple sites using the Multi Site Manager (MSM). The MSM allows you to inherit content and content structure to another site, which is typically located in a parallel subtree of the overall content tree. Choosing the MSM will keep your content structures consistent, which also means that you need to plan and set up your content master (in MSM terms it is called the blueprint) in a way that the resulting structure is well-suited for all copies of it (in MSM: live copies).

And on top of the MSM you can add more specifics, features and requirements, which also influence the content structure of your site. But let’s finish here for the moment.

When you are done with all these exercises, you already have a solid basis and have considered a lot of relevant aspects. Nevertheless you should still ask others for a second opinion. Scrutiny really pays off here, because you are likely to live with this structure for a while.

Information architecture & content architecture

Recently I had a discussion in the AEM forums about how to reuse content. During this discussion I was reminded again of the importance of the way you structure content in your repository.

For this, often the term “information architecture” is used, but from my point of view that’s not 100% correct. Information architecture covers the various aspects of how your website itself is structured (in terms of navigation and layout, but also content). Its most important aspect is the efficient navigation and consumption of the content on the website by end users (see the Wikipedia article on it). But it doesn’t care about aspects like content reuse (“where do I maintain the footer navigation”), relations between websites (“how can I reduce the work to maintain similar sites”), translations, or access control for the editors of these systems.

Therefore I want to introduce the term “content architecture“, which deals with questions like these. The information architecture has a lot of influence on it, but it is solely focused on the resulting website; the content architecture focuses on the way such sites can be created and maintained efficiently.

In the AEM world the difference can be made visible very easily: you can see the information architecture on the website, while you can see the content architecture within CRXDE Lite. Omitting any details: the information architecture is the webpage, the content architecture is the repository tree.
If you have some experience with AEM you know that the structure of the website typically matches some subtree below /content. But in the repository tree you don’t find a “header” node at the top of the “jcr:content” subtree of every page, and the same holds for the footer. This piece of the resulting rendered website is taken from elsewhere and not maintained as part of every page, although the information architecture mandates that every page has a header and a footer.

Besides that, the repository also holds a lot of other supporting “content” which is important for the information architecture but not directly mandated by it. You have certain configurations which control the rendering of a page; for example they might control which contact email address is displayed in the page footer. From an information architecture point of view it’s not important where this is stored; but from a content architecture point of view it is very important, because you might have the chance to control it at a single location which then takes effect for all pages. Or at multiple locations, which results in changing it for individual pages. Or in a per-subtree configuration, where all pages below a certain page are affected. Depending on the requirement this will result in different content architectures.

Your information architecture will influence your content architecture (in some areas it may even be a 1:1 relation), but the content architecture goes way beyond it and deals with other “*bilities” like “manageability”, “evolvability” (how future-proof is the content if there are changes to the information architecture?) or “customizability” (how flexible in terms of individualization per page/subsite is my content architecture?).

You can see that it’s important to be aware of the content architecture, because it will have a huge influence on your application. Your application typically has a lot of built-in assumptions about the way content is structured. For example: “The child nodes below the content root node form the first-level navigation of the site”. Or “the homepage of the site uses a template called ‘homepage'” (which is, by the way, also not covered by any information architecture, but an essential part of the content architecture).

In the JCR world there is the second rule of David’s Model: “Drive the content hierarchy, don’t let it happen”. That’s the rule I quote most often, and even though it’s 10 years old, it’s still very true. Because it focuses on the aspect of managing the content tree (= content architecture), and on deciding carefully while considering the consequences.

And rest assured: It’s easier to change your application than to change the content tree! (At least if it’s designed properly. If it isn’t, … It’s even harder to change them both.)

AEM and docker – a question of state

The containerization of the IT world continues. What started with virtualization in the early 2000s has, with Docker, reached a state where it’s again a hype topic.

Therefore it’s natural that people also started to play with AEM in Docker (https://adapt.to/2016/en/schedule/running-aem-in-docker.html, https://www.linkedin.com/pulse/running-aem-docker-satendra-singh and many more).

Of course I was challenged with the requirement to run AEM in Docker too: customers and partners asking how to run AEM in Docker, and whether I can provide Dockerfiles etc. I am hesitant to do it, because for me Docker and AEM are not a really good fit (right now, with AEM 6.3 in 2017).

Some background first: Docker containers should be stateless. Only if the application within the container does not hold any persistent state can you shut it down (which means deleting all the files created by the application in the container itself), start it up, replace it by a different container holding a new version of the application, etc. The whole idea is to make the persistent state somebody else’s problem (typically a database’s). Deployments should be as easy as starting new Docker instances (from pre-tested and validated Docker images) and shutting down the old ones. No more working and testing in production.

So, how does that collide with AEM? AEM is not only an application; the application is closely tied to a repository, which holds state. Typically the application is stored within the repository, next to the “user data” (= content). This means that you cannot just replace an AEM instance inside Docker by a new instance without losing this content (or resetting it to the state which is shipped with the Docker image). Losing content is of course not acceptable.

So the typical Docker rollout approach for new application versions (bringing new instances live based on a new Docker image and shutting down the old ones) does not work with AEM; the content sitting in the repository is the problem.

People then came up with the idea that the repository can be stored outside of the Docker image, so it isn’t lost on restart/replacement of the image. Docker calls this a “host directory as data volume” (https://docs.docker.com/engine/tutorials/dockervolumes/#locate-a-volume).

Storing the repo as data volume on the host filesystem

That idea sounds neat, and of course it works. But then we have a different problem: when you start a new Docker image and mount the data volume containing the repository state, your AEM still runs the “old” version of your application. Starting the repository from a different Docker image doesn’t bring any benefit then.

Docker image version 2 still starts application version 1.0

When you want to update your AEM application inside the repository, you would still need to perform an installation of your application into a running repository, working in a production environment. And that’s not the idea behind using Docker.
With Docker we just wanted to start the new images and stop the old ones.

Therefore I do not recommend using Docker with AEM; there is rarely value in it, and it makes the setup more complicated without any real benefit.

The only exceptions I would accept are really short-lived instances, where hosting the repository inside the Docker container isn’t a problem and purging the repo on shutdown is even a feature. Typically these are short-lived development instances (e.g. triggered by a continuous integration pipeline, where you automatically create dedicated Docker instances for feature branches). But that’s it.

And as a sidenote: this does not only affect TarMK-based AEM instances. If you have MongoDB-based instances, the application is also stored within the (Mongo) repo. Just running AEM in a new Docker image doesn’t update the application magically.

To repeat myself: this reflects the current state. I know that AEM engineering is perfectly aware of this fact, and I am sure that they will try to address it. Let’s wait for the future 🙂