AEM Resource Mapping and Apache mod_rewrite, Teamwork!

After spending a few days (or weeks!) looking for information on hiding parts of URLs with Apache Web Server and Adobe Experience Manager, I believe I have pulled it all together. This is my attempt at explaining how it works.

First, what is going on between Browser, Web Server, Cache and AEM?
rewrite_pic1

Out of the box, everything works because the full path is used all the way from the browser -> web server -> cache. If a page is updated, the Flushing agent can invalidate the right page because the path in AEM is the same as the path in the cache.

Now, since I am a developer and I have access to create my own resource mapping, I am going to add a configuration and point a shortened URL to my landing page.

sling:internalRedirect     string     /content/geometrixx/en$1

sling:match                string     site.com/en(.*)$
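To see what that pair of properties does to an incoming request, here is a minimal Python sketch of the substitution. It is only an illustration, not how Sling is implemented: the `resolve` function is hypothetical, the plain host-plus-path string is a simplification of what Sling really matches against, and the target is assumed to keep the `en` segment so that en.html lands on the landing page.

```python
import re

# Hypothetical stand-ins for the two mapping properties.
# Assumption: the redirect target keeps the "en" segment (en$1).
SLING_MATCH = r"site\.com/en(.*)$"
SLING_INTERNAL_REDIRECT = r"/content/geometrixx/en\1"  # Python writes $1 as \1


def resolve(host_and_path: str) -> str:
    """Return the internal repository path, or the input unchanged if nothing matches."""
    m = re.match(SLING_MATCH, host_and_path)
    if m is None:
        return host_and_path
    return m.expand(SLING_INTERNAL_REDIRECT)


print(resolve("site.com/en.html"))           # /content/geometrixx/en.html
print(resolve("site.com/en/products.html"))  # /content/geometrixx/en/products.html
```

The second call uses a made-up page name just to show that everything after `en` is carried over by the capture group.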

I save it, replicate it to publish and try it out. Click, http://site.com/en.html, sweet, it works. Wow, that was easy, looks good, all done! Wait a minute, when I update the page the change doesn’t show up for the user. What is going on?

Let’s take a look.

rewrite_pic2
Ok, so my Browser -> Web Server -> Cache are all looking for site.com/en.html, and the mapping in AEM points that to /content/geometrixx/en.html, so it works from that end. But my Flushing agent is invalidating /content/geometrixx/en.html in the cache, and that file never exists there! The cache is putting my page at /en.html! That works the first time, but whenever I update the page I will have to clear the cache by hand, and my Flushing agent will be doing nothing for me.
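The mismatch is easier to see in a toy model. The Python sketch below is not dispatcher code; `cache`, `serve` and `flush` are hypothetical names, and the cache is reduced to a set of file paths. It only assumes what is described above: pages are stored under the path the browser requested, and the Flushing agent invalidates by the AEM path.

```python
# Toy model of the cache: a set of file paths under the docroot.
cache = set()


def serve(request_path: str) -> None:
    """Store the response under the path the browser asked for."""
    cache.add(request_path)


def flush(aem_path: str) -> bool:
    """Invalidate by AEM path; return True only if a cached file was removed."""
    if aem_path in cache:
        cache.remove(aem_path)
        return True
    return False


serve("/en.html")                           # browser asked for the short URL
hit = flush("/content/geometrixx/en.html")  # AEM activated the full path
print(hit)            # False: nothing to invalidate
print(sorted(cache))  # ['/en.html'] is still there, now stale
```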

Well, scratch that, let me try to do this on the web server. I think I remember something about RewriteRule…

RewriteRule ^/$ /content/geometrixx/en.html [PT]

Ok, now when a request comes in for the site root, it goes to my landing page and ends up in the correct cache folder.
rewrite_pic3
Nice! This works, the cache looks good and the Flushing agent can invalidate the page! Then I click on another link on the page and the whole path pops up again! I guess this is good enough, who really looks at the URL once they are in the site anyway? Said no customer ever….
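Why does the full path come back? The rule only touches the bare site root. Here is a small Python stand-in for it, with the caveat that `rewrite` is a hypothetical helper and the products page is a made-up example link, not something from my site.

```python
import re

# Python stand-in for: RewriteRule ^/$ /content/geometrixx/en.html [PT]
ROOT_PATTERN = re.compile(r"^/$")
LANDING_PAGE = "/content/geometrixx/en.html"


def rewrite(path: str) -> str:
    """Rewrite only the bare site root; every other path passes through untouched."""
    return LANDING_PAGE if ROOT_PATTERN.match(path) else path


print(rewrite("/"))                                     # /content/geometrixx/en.html
print(rewrite("/content/geometrixx/en/products.html"))  # unchanged, full path stays visible
```

Nothing here shortens the links inside the page, so every click after the first one shows the long URL.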

Well, let’s try putting them together!
rewrite_pic4
This works on the initial call and all following calls to en.html, but why? Here is the breakdown.
First, you will notice the RewriteRule is slightly different.

RewriteRule ^/content/geometrixx/(.*)$ /$1 [PT]

This lets site.com/en.html through with no changes because it does not match the rule’s pattern, and the mapping in AEM points it to /content/geometrixx/en.html. When the Flushing Agent fires, its path hits the rule in the web server and is rewritten to /en.html, so it invalidates the proper page. So everything seems to work; the only issue is that the cache no longer reflects the folder structure in AEM. If cache structure is not a consideration for your project, then this solution could work.
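Here is the same idea as a rough Python sketch. It assumes the rule carries a capture group, `^/content/geometrixx/(.*)$` rewritten to `/$1`, and it takes my description above at face value, namely that the Flushing Agent’s path goes through the same rule as browser requests. `cache_key` is a hypothetical name for "where the file ends up in the cache".

```python
import re

# Assumed form of the rule: RewriteRule ^/content/geometrixx/(.*)$ /$1 [PT]
PATTERN = re.compile(r"^/content/geometrixx/(.*)$")
SUBSTITUTION = r"/\1"  # Python writes $1 as \1


def cache_key(path: str) -> str:
    """Apply the rewrite; paths that do not match the pattern are left alone."""
    return PATTERN.sub(SUBSTITUTION, path)


browser_key = cache_key("/en.html")                    # short URL, no match
flush_key = cache_key("/content/geometrixx/en.html")   # full path, rewritten

print(browser_key)               # /en.html
print(flush_key)                 # /en.html
print(browser_key == flush_key)  # True: the flush now targets the cached file
```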

You will notice we only took one page, en.html, into consideration, along with only the content folder. A final solution for all assets is more complex and I am working on a write-up for it. Also, I am looking into keeping all of the settings on the web server so I can have everything in one spot.

Resources
http://httpd.apache.org/docs/current/mod/mod_rewrite.html
http://www.wemblog.com/2012/07/how-to-use-dispatcher-with-mapped.html
