Remove trailing slash from Sitecore URLs

Our clients once reported that many pages of their site were appearing twice in Google Analytics reports, for example http://mydomain.com/about-us and http://mydomain.com/about-us/. On a site with hundreds of pages, collecting the unique pages and their counts from such a report becomes very time-consuming. Duplicate URLs for the same page can also lower its rank when search engines index the site.

By default, Sitecore (LinkManager.GetItemUrl(Item item)) does not append a slash to the end of an auto-generated URL (for non-.aspx URLs). So if we use this API properly, there is no chance of getting duplicate URLs. But if a developer or content author adds a slash to a URL by mistake, or an end user adds one intentionally, those URLs will be tracked separately in the analytics data.

At first we thought of creating a custom processor in the httpRequestBegin pipeline. That approach is described very well at https://aidandegraaf.wordpress.com/tag/sitecore-pipeline-processor-google-search-index-trailing-slash/.

But we did not want to put extra load on the Sitecore engine, and that approach needs extra development and QA effort to handle every kind of URL correctly. Later we learned that the IIS URL Rewrite module can serve the same purpose, so we decided to use it instead of custom code, as described below.

Step 1: Open URL Rewrite Module

- Open IIS Manager.
- Click your website in the left pane.
- Click "URL Rewrite" under the IIS section.
- If you cannot find it, you need to install the URL Rewrite module for IIS first.

Step 2: Add an "Append or remove the trailing slash symbol" Rule

- Click "Add Rule(s)..."
- Select "Append or remove the trailing slash symbol" from the SEO section.

Step 3: Set the rule to remove the trailing slash if it exists

- From the dropdown, select "Removed if it exists" and click OK.

Alternative to the above steps:

Instead of following the steps above, you can add the configuration below directly to your Web.config file, inside the <system.webServer> section. (URL Rewrite must be installed for this as well.)
Note: If you follow the first 3 steps in IIS Manager, IIS ends up writing equivalent configuration to your application's Web.config anyway. So both approaches do the same thing.
<rewrite>
 <rules>
  <rule name="myRemoveTrailingSlashRule" stopProcessing="true">
   <match url="(.*)/$" />
   <conditions>
    <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
    <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
   </conditions>
   <action type="Redirect" url="{R:1}" />
  </rule>
 </rules>
</rewrite>
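
To get a feel for which requests this rule redirects, here is a minimal sketch in Python (not Sitecore or IIS code) that mimics the match pattern. It assumes the pattern is applied to the request path without its leading slash, and the `physical_paths` parameter is a hypothetical stand-in for the IsFile / IsDirectory conditions, not something IIS exposes.

```python
import re

# Same pattern as the <match url="(.*)/$" /> element in the rule.
TRAILING_SLASH = re.compile(r"(.*)/$")


def redirect_target(path, physical_paths=frozenset()):
    """Return the path to redirect to, or None when the rule would not fire.

    `path` is the request path without the leading slash.
    `physical_paths` (hypothetical) lists request paths that map to a real
    file or directory on disk; the rule's conditions skip those.
    """
    match = TRAILING_SLASH.match(path)
    if match is None:
        return None  # no trailing slash, nothing to redirect
    if path in physical_paths:
        return None  # physical file or directory, rule conditions fail
    return match.group(1)  # {R:1}: everything before the final slash


print(redirect_target("about-us/"))             # about-us
print(redirect_target("about-us"))              # None
print(redirect_target("upload/", {"upload/"}))  # None
```

The site root is left alone as well, because its path is empty and so contains no trailing slash to match.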

That's it! We now have a better way to provide unique URLs by removing the trailing slash, which gives us cleaner site analytics and SEO indexing.

Apart from this, URL Rewrite is a very powerful module that can be used for many other purposes, such as:
- Creating a reverse proxy using IIS
- Creating a load-balanced web farm using IIS
- Other URL rewrites

Our Learning on Publishing Sitecore Sublayouts

After we started using Web Deploy to publish sublayouts in our multisite environment, we began seeing random caching issues on our CD servers. We have multiple publishing target databases and multiple load-balanced CD servers. As I mentioned in my previous blog, when a sublayout is published, Web Deploy sends it to one server, and the sublayout is then replicated to all the other servers using DFS.

What kind of issues did we find?

Sometimes the published sublayout showed up on a few servers while others still served the older content. Surprisingly, the published sublayout file had been replicated to all servers, yet different servers still returned different output. When this happened, we used to clear the caches on the affected servers, assuming it was a Sitecore caching issue. Later we found a workaround: publishing those sublayouts again after 1 or 2 minutes.

You can see how critical it is that published sublayouts appear on the live servers quickly, so that go-lives, re-brandings, news releases and press releases succeed in one go.

Note: We saw this only with cacheable sublayouts.

Our learnings to fix such issues

1. Web Deploy publishing should be synchronous.

On rare occasions, we found that the following sequence had happened:
  1. We start publishing. As described in my previous blog, Web Deploy starts deploying the sublayout in async mode (by default, Web Deploy in Sitecore is configured as asynchronous). So the sublayout item is published and the sublayout's physical file is deployed in parallel.
  2. As a result, the item may get published before the sublayout file is copied.
  3. Once the item is published, the CD servers raise the "publish:end:remote" event and clear the HTML (sublayout) cache.
  4. Before the newly published sublayout file reaches the CD server, an end user requests a page that uses the same sublayout. The HTML cache is generated again from the older sublayout.
  5. Web Deploy then delivers the new sublayout file, but the HTML cache still holds the output of the older sublayout.

Learning: Sublayout publishing should always be synchronous, so that publishing waits until the sublayout has been deployed to the CD server. This can be configured in WebDeploy.config as below.
<event name="publish:begin">
 <handler type="SitecoreTactics.Publishing.BeginWebDeploy, SitecoreTactics" method="PublishSublayouts">
  <synchronous>true</synchronous>
  <tasks hint="list:AddTask">
   ......
  </tasks>
 </handler>
</event>
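
The race described above, and why the synchronous setting avoids it, can be illustrated with a small model. This is a sketch in Python, not Sitecore code: the `CdServer` class, its methods and the "v1"/"v2" labels are all hypothetical names, and the HTML cache is reduced to a single cached value.

```python
class CdServer:
    """Toy model of one CD server: a sublayout file plus its HTML cache."""

    def __init__(self):
        self.sublayout_file = "v1"  # version of the file on disk
        self.html_cache = None      # cached rendering, if any

    def render(self):
        # A page request fills the HTML cache from the file currently on disk.
        if self.html_cache is None:
            self.html_cache = self.sublayout_file
        return self.html_cache

    def clear_html_cache(self):
        # What happens when publish:end:remote fires.
        self.html_cache = None

    def deploy(self, version):
        # Web Deploy delivers the new sublayout file.
        self.sublayout_file = version


def publish_async(server):
    server.render()            # cache is warm with the old sublayout
    server.clear_html_cache()  # item published first, cache cleared
    server.render()            # visitor request re-caches the OLD file
    server.deploy("v2")        # new file arrives too late
    return server.render()


def publish_sync(server):
    server.render()            # cache is warm with the old sublayout
    server.deploy("v2")        # publishing waits for the file first
    server.clear_html_cache()  # then the cache is cleared
    return server.render()


print(publish_async(CdServer()))  # v1 (stale output)
print(publish_sync(CdServer()))   # v2
```

In the asynchronous ordering the visitor keeps getting the old output until the cache is cleared again, which matches the workaround of republishing after a minute or two.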

2. Target Database should be the first publishing target.

We have 3 publishing targets, say web, web-2 and web-3, and every publish runs against them in that order. Earlier, the target database in WebDeploy.config was set to the last publishing target, web-3. We may have done this in the past to make sure the item was published before the sublayout was deployed.

But based on our recent experience and findings, it should be the first publishing target, which in our case is web. This helps when there are multiple servers in DFS and DFS takes extra time to replicate to the other servers. To reduce the chance of a delayed sublayout deployment, we should keep web as the target database in WebDeploy.config as below.
<tasks hint="list:AddTask">  
 <default type="Sitecore.Publishing.WebDeploy.Task">  
   <!-- It should be the first Publishing target database -->  
   <targetDatabase>web</targetDatabase> 
 </default>
</tasks>

3. We can delay clearing the HTML cache by a few seconds.

If both of the above are in place, the CD server that Web Deploy sends the sublayout file to should not run into any issues. But consider the worst case, where DFS takes longer to replicate sublayouts to the other servers. The HTML cache may then be cleared before the sublayout replication has finished.

To avoid such cases, we can add a delay of a few seconds, say 2 or 3 seconds, before clearing the HTML cache (only when sublayouts are being published).
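
The idea of a conditional delay can be sketched as follows. This is Python rather than Sitecore code, and every name in it (`schedule_html_cache_clear`, `clear_cache`, `sublayouts_published`) is hypothetical. In a real solution the same logic would presumably live in a custom handler that replaces the default HTML cache clearing, which is an assumption on our side and not shown here.

```python
import threading


def schedule_html_cache_clear(clear_cache, sublayouts_published, delay_seconds=3.0):
    """Clear the HTML cache now, or after a delay if sublayouts were published.

    Returns the started threading.Timer when the clear is delayed, else None.
    """
    if not sublayouts_published:
        clear_cache()  # nothing to wait for, clear immediately
        return None
    # Give DFS a few seconds to replicate the sublayout files first.
    timer = threading.Timer(delay_seconds, clear_cache)
    timer.daemon = True
    timer.start()
    return timer


if __name__ == "__main__":
    timer = schedule_html_cache_clear(lambda: print("HTML cache cleared"), True, 2.0)
    timer.join()  # wait for the delayed clear before exiting
```

The delay only reduces the chance of caching stale output; if replication takes longer than the chosen delay, the same problem can still occur.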

Finally, we have hassle-free, one-time sublayout publishing working well without any caching issues! I'm sure this will be helpful to others who are facing the same kind of issues.