In the September session of the Ask the expert series (passcode: “Dispatch”) I talked about problems arising out of the requirement to serve multiple sites, each with its own domain, where a Sling mapping is used to map the long repository paths to shorter URLs (like mapping /content/geometrixx/en/services.html to http://geometrixx.com/services.html). I already tried to answer this question in the Q&A part of the session, but I will cover it here in more depth.
In the session on AEM dispatcher setups there was a question about how to deal with shared content. If you configure the dispatcher in a straightforward way and map a shared content path (be it assets or pages) into the site structure of a site, the content is cached at this mapped location in the dispatcher cache, but the invalidation happens only once, at the “original” path. So the content cached under the mapped paths in the site structure is not invalidated at all.
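To make the mismatch concrete, here is a deliberately simplified toy model of the dispatcher cache as a set of file paths. The paths /content/shared and /content/site-a are made up for illustration, and the real dispatcher works with files and .stat files on disk rather than a set, so treat this as a sketch of the effect only.

```python
# Toy model: the dispatcher cache as a set of cached file paths.
# All paths are hypothetical examples, not taken from a real setup.
cache = {
    "/content/shared/legal.html",     # cached under the original path
    "/content/site-a/en/legal.html",  # same page, cached again under the mapped path
}


def invalidate(cached_paths, handle):
    """Return the cache without the entries for `handle` (any extension) or below it."""
    return {
        path
        for path in cached_paths
        if not (path.startswith(handle + ".") or path.startswith(handle + "/"))
    }


# The activation only reports the original path ...
remaining = invalidate(cache, "/content/shared/legal")
# ... so the copy under the mapped path stays in the cache and goes stale.
print(sorted(remaining))
```

The invalidation request only ever names the original path, so nothing tells the dispatcher that the file under the site structure holds the same content.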
This is a problem, but you can look at it from more than one angle.
The first question is whether you really need to share this content at all. I am not an SEO expert, but from what I have heard, having duplicate content on multiple domains hurts your page rank. Also, from my point of view, at some point the need arises to customize this shared content per tenant, which often leads to copying a shared page into the site and customizing it there, essentially not using the shared content anymore. If there is a risk of running into this, you should think about using the MSM (Multi Site Manager) to avoid this “copy-and-adapt” workflow and keep it manageable. In that case you have true local copies and you don’t need to map the pages into the site content structure, which avoids the caching and invalidation problem completely.
The second question is whether it makes sense to offload all this shared content into a dedicated “shared content” domain, which is used by all sites; in that case the need to duplicate is avoided as well.
These are two suggestions to avoid some of the problems of the “shared content” approach. If you cannot use them, you have to accept duplicate content at dispatcher level, with all the implications it has, mainly:
- potential SEO problems because of duplicate content
- increased disk consumption on dispatcher level
To deal with the problem of duplicate content and invalidation you have to build custom invalidation logic, which is aware of your special setup and invalidates all affected locations accordingly. See the dispatcher documentation on this topic.
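As a rough sketch of what such a custom logic could look like: for every activated path, compute the original handle plus all mapped locations, and send one invalidation request per handle. The mapping table, the paths and the function names below are assumptions made up for this example. The CQ-Action and CQ-Handle headers are the ones the dispatcher expects on a request to /dispatcher/invalidate.cache, but the code only builds them and does not send anything.

```python
# Hypothetical mapping: shared repository root -> places where it is mapped
# into the site structures. In a real setup this must mirror your Sling mappings.
SHARED_MOUNTS = {
    "/content/shared": ["/content/site-a/en/shared", "/content/site-b/en/shared"],
}


def handles_to_invalidate(activated_path, mounts=SHARED_MOUNTS):
    """Return the original path plus every mapped path that caches the same content."""
    handles = [activated_path]
    for shared_root, targets in mounts.items():
        if activated_path == shared_root or activated_path.startswith(shared_root + "/"):
            suffix = activated_path[len(shared_root):]
            handles.extend(target + suffix for target in targets)
    return handles


def invalidation_headers(handle):
    """Build the headers of a dispatcher invalidation request for one handle."""
    return {"CQ-Action": "Activate", "CQ-Handle": handle, "Content-Length": "0"}


# One request per handle would go to /dispatcher/invalidate.cache on each dispatcher.
for handle in handles_to_invalidate("/content/shared/legal"):
    print(invalidation_headers(handle))
```

Where this logic lives (a custom replication agent, a workflow step, or a script in front of the dispatcher) depends on your setup; the important part is that the list of mapped locations is kept in sync with the actual mappings.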