Things to remember while using CDN for Sitecore websites

We learned a lot while using a CDN for our different types of Sitecore websites, so I thought I would share it here in case it is useful to others. Before that, I want to share one interesting incident that taught us to configure a CDN carefully.

We tend to believe that Sitecore resolves the site from the host name in the request URL, so for media requests we do not need to forward cookies or any other parameters from the CDN to the Sitecore CD servers. That is true, but only partially. If you want to provide the best experience to end users, you need to take care of more than this.

Now, consider a case where a user publishes a new content page, and a few images related to it are published as well, to one or more publishing target databases. The first request for this page reaches one of many clustered servers at the same moment the publishing finishes. What are the chances that all those images will be visible on that first view of the page? 100%? No, not at all. Let's see why.

As you know, sticky sessions are maintained using a cookie (e.g. AWSELB, the cookie set by AWS Elastic Load Balancing behind CloudFront). We normally need sticky sessions for content pages, so we never forget to forward cookies from the CDN to the CD servers. We do not do the same for media files, because media files have nothing to do with cookies. In such a case, the first request for the content page goes to one server, say A, but the images may be served from different servers, say B, C, etc. (because we set media requests not to carry cookies). Now suppose the images are not yet available on some of these servers because of a publishing or caching delay of a second or so. The content of the page will be served properly, but the images will return 404, and the CDN will cache that response for a few minutes. This means we are still living with the chance that end users will see a broken page layout for a few minutes. The same applies to stylesheet or JavaScript files if they are served from Sitecore items. We can fix the issue by forwarding the sticky-session cookie for media requests as well.

CDN Configurations for caching media files

  1. Forward the cookies that play a role in maintaining sticky sessions (e.g. the AWSELB cookie) for all media items (to fix the issue explained above).
  2. Forward query strings (to support the media query string parameters used for responsive websites).
  3. Never cache media files that are protected, e.g. those that have disclaimers. You can keep them in a separate media folder and apply a rule not to cache such URL patterns.
  4. If your media items change rarely, use a longer caching duration, e.g. 1 hour; otherwise keep it short, e.g. 5 minutes. Or, to get more accurate results, instead of all the above rules you can rely on conditional requests (the If-Modified-Since request header and 304 Not Modified responses) to serve media requests.
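
To make the last rule a little more concrete, here is a minimal sketch of the decision behind a conditional media request. It is plain .NET and not Sitecore's own implementation; the class and method names (ConditionalMedia, StatusFor) are made up for illustration only.

```csharp
using System;
using System.Globalization;

public static class ConditionalMedia
{
    // Returns the HTTP status an origin could send for a media request:
    // 304 when the client's copy is still current, otherwise 200.
    public static int StatusFor(DateTimeOffset lastModified, string ifModifiedSince)
    {
        DateTimeOffset since;
        bool parsed = DateTimeOffset.TryParseExact(
            ifModifiedSince, "r", CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal, out since);

        // No header, or an unreadable one: send the full response.
        if (!parsed) return 200;

        // HTTP dates carry whole seconds only, so drop sub-second precision.
        long extraTicks = lastModified.UtcTicks % TimeSpan.TicksPerSecond;
        DateTimeOffset truncated = lastModified.AddTicks(-extraTicks);

        return truncated <= since ? 304 : 200;
    }
}
```

With this in place the CDN can revalidate cheaply: the origin answers 304 without a body as long as the media item has not changed since the date the CDN sends.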

CDN Configurations for caching content pages

  1. If it is a purely static site without any user logins or protected content, you can serve the site without forwarding any parameters.
  2. If your site is developed for multi-device support, you must forward the Referer and User-Agent request headers.
  3. If your site has any kind of login facility, requires a session, or has a cookie-based responsive or adaptive architecture, you must forward the Cookie header.
  4. If you have implemented security based on IP addresses, you must forward the X-Forwarded-For header.
  5. If you have implemented browser-based content negotiation, you must forward the Accept, Accept-Language, etc. headers.
  6. If you are using functionality such as personalization or secured content, avoid caching content on the CDN.
  7. Never cache HTML content served in response to anything other than a GET request.
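
The rules above can be read as a simple lookup from what the site does to what the CDN has to forward. The following is only a sketch of that reasoning in plain C#; the SiteTraits enum and the CdnContentRules class are hypothetical names and are not part of Sitecore or of any CDN API.

```csharp
using System;
using System.Collections.Generic;

[Flags]
public enum SiteTraits
{
    None = 0,
    MultiDevice = 1,
    LoginOrSession = 2,
    IpBasedSecurity = 4,
    ContentNegotiation = 8,
    Personalized = 16
}

public static class CdnContentRules
{
    // Request headers the CDN should pass through to the CD servers.
    public static List<string> HeadersToForward(SiteTraits traits)
    {
        var headers = new List<string>();
        if (traits.HasFlag(SiteTraits.MultiDevice))
        {
            headers.Add("Referer");
            headers.Add("User-Agent");
        }
        if (traits.HasFlag(SiteTraits.LoginOrSession)) headers.Add("Cookie");
        if (traits.HasFlag(SiteTraits.IpBasedSecurity)) headers.Add("X-Forwarded-For");
        if (traits.HasFlag(SiteTraits.ContentNegotiation))
        {
            headers.Add("Accept");
            headers.Add("Accept-Language");
        }
        return headers;
    }

    // Only GET responses of non-personalized pages are candidates for caching.
    public static bool MayCache(SiteTraits traits, string httpMethod)
    {
        if (!string.Equals(httpMethod, "GET", StringComparison.OrdinalIgnoreCase)) return false;
        return !traits.HasFlag(SiteTraits.Personalized);
    }
}
```

For example, HeadersToForward(SiteTraits.MultiDevice | SiteTraits.IpBasedSecurity) yields Referer, User-Agent and X-Forwarded-For, while a purely static site (SiteTraits.None) forwards nothing.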

So, to get the best out of a CDN while keeping the best user experience, you must know how your website is developed and how it behaves.

Avoid UNC file share to prevent File Change Notification issues

Recently we experienced many unexpected application pool recycles caused by File Change Notifications (FCN) on our newly created sandbox Sitecore CM instances. At intervals ranging from 20 to 40 hours, only one of the two CM instances restarted automatically with the error below.

Change Notification for critical directories. File Change Notification Error in App_LocalResources HostingEnvironment initiated shutdown CONFIG change HostingEnvironment caused shutdown

What we tried

After spending a good amount of time on it, we came up with a few things to try:
  • Stop the anti-virus software on the server.
  • Use Process Monitor to watch the file system for the web root, the Temporary Internet Files folder, etc. to find out what is actually causing this.
  • Check that no unauthorized user has been given access to the web root.
But everything looked good, and we did not get much help from Google either. So the final option we had was to debug the crash dumps.

Then one more thought came to mind: recycling was happening on only one instance, the one that was using sublayouts from the other Sitecore instance over a UNC file share, and that other instance never got recycled! We knew that DFS was the best option here, but we could not quite believe that a UNC share could really cause this issue. At the same time, we found a nice post where someone had already faced the same issue and fixed it by no longer sharing sublayouts over a UNC path. And yes, it worked for us too.
(Note: the recycling was not caused by the number of recompilations, i.e. "numRecompilesBeforeAppRestart".)

- http://www.dnnsoftware.com/forums/threadid/318762/scope/posts/file-change-notification-issues-web-farm-over-unc-share
- http://blogs.msdn.com/b/tess/archive/2006/08/02/686373.aspx

What we learned and investigated:

- We learned one more reason why the application pool can recycle.
- Avoid using UNC shares for compilable files such as ASPX, ASCX, RESX and App_Code files, and use DFS for them instead, as Sitecore recommends in the Sitecore Scalability Guide.

Sitecore media streaming issue after publishing!

Recently we came across a strange behaviour of Sitecore media streaming with the MediaCache: overwritten media files were not reflected after publishing. Just to be clear, this is not the browser caching issue mentioned in the Sitecore KB article.

What's the issue?

We created a media item on CM and published it, and it was visible on the live site. Then we uploaded a new media file over the same media item and published it again (either using Detach/Attach from the Content Editor or using "Overwrite existing media" from the Page Editor). Surprisingly, we were still getting the older media file! We published again and again, but the newly published media did not show up.

Also, this is a very random issue and occurs very rarely.

How we tried to troubleshoot?

  1. We have multiple servers in a cluster with 2 target databases. We found that a few servers for both target databases were serving the older media file while the rest were serving the latest one.
  2. Then we thought there might be an Item Path Cache or Item Cache clearing issue (which happens in Sitecore sometimes). So we cleared both these caches for this media item using the Sitecore API. The result was the same.
  3. Then we cleared the whole Item Cache and Data Cache using the Sitecore API. The result was the same.
  4. Then we cleared all Sitecore caches using the /sitecore/admin/cache.aspx page. The result was the same.
  5. The final option we had was to delete all the physical media cache files (Website\App_Data\MediaCache) so that Sitecore would recreate the media cache from the database and serve the latest file. Even after we deleted all the files and folders there, new files were generated, but they were still the older ones.
So, no solution at all after applying this many tricks!

How we fixed?

We had no other option but to recycle the application pool. Finally, the master key worked for us. :)

What we concluded and what's the solution?

The only conclusion we could draw was that Sitecore stores media files somewhere in server memory as well. Strange, right?

We raised it with Sitecore Support for further investigation. Many thanks to Andrey Krupskiy from Support, who investigated and confirmed that Sitecore really does store media files in RAM as well, that this might have caused the issue, and who provided the solution below.

There is an internal media cache in RAM. This cache is used when media is not yet saved to the file system. If you check the code of the Sitecore.Resources.Media.MediaCache class in Reflector, it says the same. Sitecore serves the media file from RAM before its actual file cache is generated on disk (possibly to serve media faster), which is the default behaviour of Sitecore. We can disable this behaviour by changing the configuration below in Web.config.
<setting name="Media.StreamPartiallyCachedFiles" value="false" />

We disabled the Media.StreamPartiallyCachedFiles setting as shown above on CM and CD servers.
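
If you prefer not to edit Web.config directly, the same change can be sketched as a patch include file, assuming your instance loads patch files from App_Config\Include in the usual way. The file name (for example Media.StreamPartiallyCachedFiles.config) is just a suggestion; only the setting name and value come from the solution above.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Do not stream media from the in-memory cache before the file cache exists on disk -->
      <setting name="Media.StreamPartiallyCachedFiles">
        <patch:attribute name="value">false</patch:attribute>
      </setting>
    </settings>
  </sitecore>
</configuration>
```

Keeping it in a separate file makes it easy to deploy the same change to all CM and CD servers.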

It has been more than a month now, and we haven't faced the issue again.

Dealing with duplicate item names in Sitecore publishing

A very common challenge in Sitecore is dealing with duplicate item names under a single parent. Our content authors often complained that they could not see the latest content for an item even after publishing it many times. Let's see why users were getting older content.

What they actually do?

1. We have one item, say Test, under Home, and we published it.

[Screenshot: On CM Server]

[Screenshot: On CD Server]

2. Now we deleted the Test item, created another item with the same name, Test, and published it.
(Or we created a new item named Test and deleted the older one.)

[Screenshot: On CM Server]

[Screenshot: On CD Server]
Uff, how can we get the latest content when the live database contains two items with the same name under a single parent? Now, when you request http://sitecoretactics/test/, Sitecore may render either of the two items. Because the number of such cases kept increasing, we decided to build a permanent solution to keep items unique on live sites.

Solution to have unique item on live sites

We will create a custom processor in the <publishItem> pipeline that takes care of this. When the item Test is published, it checks whether any other item exists at the same path in the target database. If one exists with a different item ID, it deletes that old item.

Create a class that inherits from PublishItemProcessor, as below.
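using Sitecore.Data.Items;
using Sitecore.Publishing.Pipelines.PublishItem;

namespace SitecoreTactics.Publishing
{
    public class RemoveDuplicateItems : PublishItemProcessor
    {
        public override void Process(PublishItemContext context)
        {
            // Item being published, read from the source database
            Item sourceItem = context.PublishHelper.GetSourceItem(context.ItemId);
            if (sourceItem != null)
            {
                // Item currently sitting at the same path in the target database
                Item targetItem = context.PublishOptions.TargetDatabase.GetItem(sourceItem.Paths.Path);

                // Same path but a different ID means a stale duplicate, so remove it
                if (targetItem != null && targetItem.ID != sourceItem.ID)
                {
                    context.PublishHelper.DeleteTargetItem(targetItem.ID);
                }
            }
        }
    }
}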

Using a patch config file, add this processor to the pipeline as below. You may need to change the processor's position if you have other customizations in the pipeline.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <publishItem>
        <processor type="SitecoreTactics.Publishing.RemoveDuplicateItems, SitecoreTactics" />
      </publishItem>
    </pipelines>
  </sitecore>
</configuration>
Now repeat the same steps as above, and you will find that the older item gets deleted!

Remove trailing slash from Sitecore URLs

Our clients once reported that many pages of their site were appearing twice in Google Analytics reports, e.g. http://mydomain.com/about-us and http://mydomain.com/about-us/. If they have hundreds of pages on the site, collecting the unique pages and their counts from the report becomes very time-consuming. Having duplicate URLs for the same page can also lower the page rank in SEO indexing.

By default, Sitecore (LinkManager.GetItemUrl(Item item)) does not append a slash at the end of auto-generated URLs (for non-.aspx URLs). So, if we use this API properly, there is no chance of getting duplicate URLs. But there is still a chance that a developer or content author adds a slash to a URL by mistake, or that an end user adds one intentionally, and such URLs will surely be tracked in analytics data.

Earlier we thought of creating a custom processor in the httpRequestBegin pipeline. We found the same approach described very well at https://aidandegraaf.wordpress.com/tag/sitecore-pipeline-processor-google-search-index-trailing-slash/.

But we did not want to put extra load on the Sitecore engine, and this approach needs extra development and QA effort to do full justice to every URL. Later on, we learned that the IIS URL Rewrite module can serve the same purpose, and decided to use it instead of our custom code, as below.

Step 1: Open URL Rewrite Module

- Open IIS Manager.
- Click your website in the left pane.
- Click on "URL Rewrite" under the IIS section.
- If you cannot find it, you have to install the URL Rewrite IIS module.

Step 2: Add an "Append or remove the trailing slash symbol" rule

- Click on "Add Rule(s)..."
- Select "Append or remove the trailing slash symbol" from SEO section

Step 3: Set rule to "Remove trailing slash if exists"

- From the dropdown select "Removed if it exists" and click on OK.

Alternative to the above steps:

As an alternative to the above steps, you can directly add the configuration below to your Web.config file. (You must have URL Rewrite installed for this as well.)
Note: If you do the first 3 steps from IIS Manager, IIS ultimately writes the configuration below to your application's Web.config anyhow. So both approaches do the same thing.
<rewrite>
 <rules>
  <rule name="myRemoveTrailingSlashRule" stopProcessing="true">
   <match url="(.*)/$" />
   <conditions>
    <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
    <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
   </conditions>
   <action type="Redirect" url="{R:1}" />
  </rule>
 </rules>
</rewrite>
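
To see what the match pattern and the {R:1} back-reference do, here is a small sketch in plain C# that applies the same regular expression to the part of the URL that the rule sees (the path without the leading slash). It only imitates the matching step for illustration; the TrailingSlashRule class is a made-up name and this is not how IIS is implemented.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrailingSlashRule
{
    // Same pattern as the <match url="(.*)/$" /> element of the rule.
    private static readonly Regex Pattern = new Regex("(.*)/$");

    // Returns the redirect target ({R:1}) for a path that ends with a slash,
    // or null when the rule would not match.
    public static string RedirectTarget(string path)
    {
        Match match = Pattern.Match(path);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

So "about-us/" is redirected to "about-us", "about-us" is left alone, and the site root (an empty path) never matches. The two conditions in the rule additionally skip requests that map to a physical file or directory.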

That's it, we have a better way to provide unique URLs by removing the trailing slash, for better site analytics and SEO indexing!

Apart from this, URL Rewrite is a very powerful module that we can use for multiple purposes, like:
- Creating a reverse proxy using IIS
- Creating a load-balanced web farm using IIS
- Other URL rewrites

Our Learning on Publishing Sitecore Sublayouts

After we started using Web Deploy to publish sublayouts in our multisite environment, we were getting random caching issues on our CD servers. We have multiple publishing target databases and multiple load-balanced CD servers. So, as I mentioned in my previous blog post, when a sublayout is published, it is deployed to one server with the help of Web Deploy and then replicated to all the other servers using DFS.

What kind of issues we found?

Sometimes we found that the published sublayout was reflected on a few servers, while on a few others we still got older content. Surprisingly, the published sublayout file had been replicated to all servers, and still we were getting different output from different servers. When this happened, we used to clear the caches on the affected servers, assuming it was a Sitecore caching issue. Later on, we found a workaround: publishing those sublayouts again after 1 or 2 minutes.

Now you can understand how critical it is to publish sublayouts and get them reflected on the live servers quickly, so that go-lives, re-brandings, news releases or press releases succeed in one go.

Note: We found this only for cacheable sublayouts.

Our learnings to fix such issues

1. Web Deploy publishing should be synchronous.

We found that the following sequence happened, though rarely.
  1. First we start publishing, so, as per my previous blog post, Web Deploy starts deploying the sublayout in async mode (by default, Web Deploy in Sitecore is configured to be asynchronous). So the sublayout item is published and the sublayout physical file is deployed in parallel.
  2. So there is a chance that the item gets published before the sublayout file is copied.
  3. Now the item is published, so the CD servers raise the "publish:end:remote" event and clear the HTML (sublayout) cache.
  4. Now, before the newly published sublayout file is deployed to the CD server, an end user requests a page that uses the same sublayout. So the HTML cache is generated again from the older sublayout.
  5. Now Web Deploy delivers the new sublayout. (So we are left with the HTML cache of the older sublayout.)

Learning: Sublayout publishing should always be synchronous, so that publishing is put on hold until the sublayout has been deployed to the CD server. This can be configured in WebDeploy.config as below.
<event name="publish:begin">
 <handler type="SitecoreTactics.Publishing.BeginWebDeploy, SitecoreTactics" method="PublishSublayouts">
  <synchronous>true</synchronous>
  <tasks hint="list:AddTask">
   ......
  </tasks>
 </handler>
</event>

2. Target Database should be the first publishing target.

We have 3 publishing targets, say web, web-2 and web-3, and any publishing is done in this sequence. Earlier, we found that the target database in WebDeploy.config was set to the last publishing target, i.e. web-3. We might have done this in the past to make sure the item is published before the sublayout is deployed.

But, as per our recent experience and findings, we should set it to the first publishing target. So, in our case, it should be web. This helps when there are multiple servers in DFS and DFS takes some more time to replicate to the other servers. So, to reduce the chances of a delayed sublayout deployment, we should keep web as the target database in WebDeploy.config as below.
<tasks hint="list:AddTask">  
 <default type="Sitecore.Publishing.WebDeploy.Task">  
   <!-- It should be the first Publishing target database -->  
   <targetDatabase>web</targetDatabase> 
 </default>
</tasks>

3. We can delay clearing the HTML cache by a few seconds.

If we have done the above, there is no chance that the CD server to which Web Deploy sends the sublayout file will have any issues. But consider a worst case where DFS takes more time to replicate sublayouts to the other servers. Then there is a chance that the HTML cache gets cleared before the sublayout replication is done.

To avoid such cases, we can add a delay of a few seconds, say 2 or 3 seconds, before clearing the HTML cache (only when sublayouts are being published).
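
The idea of the delay can be sketched as below. This is plain C# and deliberately not tied to the Sitecore API: the cache-clearing step is passed in as a delegate, and the class name DelayedHtmlCacheClearer and its members are hypothetical. In a real solution this logic would sit in whatever handler you have registered for publish:end:remote, with the delegate calling your existing HTML cache clearing code.

```csharp
using System;
using System.Threading;

public class DelayedHtmlCacheClearer
{
    private const string SublayoutRoot = "/sitecore/layout/sublayouts/";

    private readonly Action clearHtmlCache;
    private readonly TimeSpan delay;

    public DelayedHtmlCacheClearer(Action clearHtmlCache, TimeSpan delay)
    {
        this.clearHtmlCache = clearHtmlCache;
        this.delay = delay;
    }

    // True when the published root item lives under the sublayouts folder.
    public static bool IsSublayoutPath(string itemPath)
    {
        return itemPath != null &&
               itemPath.IndexOf(SublayoutRoot, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Waits only for sublayout publishes, then clears the HTML cache as usual.
    public void OnPublishEnd(string publishedRootPath)
    {
        if (IsSublayoutPath(publishedRootPath))
        {
            // Give file replication a head start before cached HTML is dropped.
            Thread.Sleep(delay);
        }
        clearHtmlCache();
    }
}
```

Content publishes are not slowed down at all, because the wait applies only when the published item is a sublayout. Note that a fixed delay reduces the chance of the race but cannot rule it out if replication takes longer than the delay.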

Finally, we have hassle-free, one-time sublayout publishing that works very well without any caching issues! I'm sure this will be helpful to others who are facing the same kind of issues.

Sitecore Publish Selected Sublayouts using WebDeploy

On our multisite Sitecore instance, we have thousands of sublayouts. Content authors or developers should be able to modify and publish selected sublayouts. That is, only the selected sublayout should be deployed to the CD servers along with its item.

We achieved this using Web Deploy, which can be configured as described in the Sitecore Scalability Guide.

What approach we chose to conditionally sync Sublayouts?

  1. On the CM environment, we have all sublayouts stored in a folder called SiteSublayouts. If we synced this folder with the CD server's corresponding folder on publishing, all sublayouts would get synced instead of just the selected one. So we came up with the idea of creating another directory, PublishedSublayouts, at the same level.
  2. On publishing a sublayout, it is first copied from the SiteSublayouts folder to the PublishedSublayouts folder, and then Web Deploy is invoked. This syncs all sublayouts from PublishedSublayouts (i.e. the actually published or publishable sublayouts) to the live server's SiteSublayouts folder.
Note: Here, we created the SiteSublayouts and PublishedSublayouts folders outside the web root for ease of use.

How we implemented this approach?

  1. Configure the Web Deploy settings in Sitecore: enable the App_Config\Include\Webdeploy.config file and make the changes below.
    <?xml version="1.0" encoding="utf-8" ?>
    <configuration>
      <sitecore>
        <events>
          <event name="publish:begin">
            <handler type="SitecoreTactics.SublayoutPublish, SitecoreTactics" method="DeploySublayout">
              <tasks hint="list:AddTask">
                <default type="Sitecore.Publishing.WebDeploy.Task">
                  <!-- Publishing to the target database will trigger this deployment task. -->
                  <!-- You should prefer to write here first publishing target (if have multiple target DBs) -->
                  <targetDatabase>web</targetDatabase>
    
                  <!-- Target server is where we want to send sublayouts. If omitted, operation is performed on the local server. -->
                  <targetServer>x.x.x.x</targetServer>
    
                  <!-- userName and password are optional. If omitted, local user identity or credentials saved in Windows Vault will be used to connect to the server. -->
                  <userName>Administrator</userName>
                  <password>Password</password>
    
                  <!-- localRoot is optional. If omitted, the website root is used. -->
                  <localRoot>E:\CMS\Sitecore\PublishedSublayouts</localRoot>
    
                  <!-- remoteRoot is physical path where sublayouts are stored on remote server -->
                  <remoteRoot>E:\CMS\Sitecore\SiteSublayouts</remoteRoot>
                  
                  <!-- Paths, relative to the localRoot, which will be deployed to the remote location. -->
                  <items hint="list:AddPath">
                    <media>SiteSublayouts/</media>
                  </items>
                  
                </default>
              </tasks>
            </handler>
          </event>
        </events>
      </sitecore>
    </configuration>
    
    Here, we sync the PublishedSublayouts folder of the CM server with the corresponding SiteSublayouts folder of the CD server using Web Deploy. We also customized Sitecore's default Web Deploy handler to copy the sublayouts, as described in step 2.
  2. Create a class that inherits from Sitecore.Publishing.WebDeploy.PublishHandler, as below. When publishing of a sublayout starts, in the handler for the publish:begin event that we configured in step 1, we copy the sublayout from the SiteSublayouts folder to the PublishedSublayouts folder and then invoke Web Deploy.
    using System;
    using System.IO;
    using Sitecore.Data.Items;

    namespace SitecoreTactics
    {
      public class SublayoutPublish : Sitecore.Publishing.WebDeploy.PublishHandler
      {
        // Verbatim strings, so the backslashes are not treated as escape sequences
        private const string SourceFolder = @"E:\CMS\Sitecore\SiteSublayouts";
        private const string DeployFolder = @"E:\CMS\Sitecore\PublishedSublayouts";

        // Handler method referenced from the publish:begin event configuration
        public void DeploySublayout(object sender, EventArgs args)
        {
          Item rootItem = ((Sitecore.Publishing.Publisher)((Sitecore.Events.SitecoreEventArgs)args).Parameters[0]).Options.RootItem;
          if (rootItem != null && rootItem.Paths.Path.ToLower().IndexOf("/sitecore/layout/sublayouts/") >= 0)
          {
            // Placeholder: resolving the sublayout's relative file path is not shown in this post
            string relativePath = "<Relative Path of the sublayout>";
            string sublayoutSourceFile = SourceFolder + relativePath;
            string sublayoutDeployFile = DeployFolder + relativePath;

            // Copy the sublayout being published to the deployable folder,
            // overwriting the copy left by an earlier publish
            File.Copy(sublayoutSourceFile, sublayoutDeployFile, true);

            // Invoke Web Deploy to sync the published sublayouts
            base.OnPublish(sender, args);
          }
        }
      }
    }
  3. We have multiple CM servers, so we replicated all these published sublayouts across all servers with the help of DFS.
It's done! We can also use the same approach for publishing file-based media items.

Very soon I will post a few learnings we had after implementing sublayout publishing!

Sitecore Lock / Unlock Item without modifying Statistics

Sitecore updates the item statistics (Updated and Updated by fields) on each lock or unlock operation on an item. Sometimes this is misleading for content authors.

When this is misleading

  • When a user just locks an item without making any modifications, other users cannot tell whether that user has already changed something on the item or not. In our multisite environment, where hundreds of users work on a single Sitecore instance, this confusion occurs frequently.
  • Some content authors also ask for their items to be unlocked automatically after publishing, without the modified date and time being updated.

How we can achieve this?

We will see how we can lock or unlock items without modifying item statistics.

Approach#1

The best way to achieve this is to use the item's RuntimeSettings.
public void LockItem(Item item)
{
    if (!item.Locking.IsLocked())
    {
        item.RuntimeSettings.ReadOnlyStatistics = true;
        item.Locking.Lock();
        item.RuntimeSettings.ReadOnlyStatistics = false;
    }
}
public void UnlockItem(Item item)
{
    if (item.Locking.IsLocked())
    {
        item.RuntimeSettings.ReadOnlyStatistics = true;
        item.Locking.Unlock();
        item.RuntimeSettings.ReadOnlyStatistics = false;
    }
}
Here, item.RuntimeSettings.ReadOnlyStatistics = true; prevents the statistics from being updated for the current instance of the item. So locking or unlocking will not update the statistics.

Approach#2

Another simple way is to update the __lock field through the API with updateStatistics set to false, as below. I personally do not recommend this approach, but it still works well in the many cases where the above approach does not.
public void LockItem(Item item)
{
    if (!item.Locking.IsLocked())
    {
        using (new EditContext(item, false, false))
        {
            item["__lock"] = "<r owner=\"" + Context.User.Name + "\" date=\"" + DateTime.Now.ToString("yyyyMMddTHHmmss") + "\" />";
        }
    }
}
public void UnlockItem(Item item)
{
    if (item.Locking.IsLocked())
    {
        using (new EditContext(item, false, false))
        {
            item["__lock"] = "<r />";
        }
    }
}
Happy to see it works and happy to see our content authors happy!

Get Optimized Images For Sitecore Responsive Websites

If you have responsive or adaptive websites built in Sitecore and use Sitecore image parameters to resize images on the fly, this post will be helpful to you!

Recently, while working on responsive websites, we found that when resizing an image to smaller dimensions, Sitecore produces an image with a larger file size than the original. This is really not expected, because it also increases page load time.

Example

Below is the image I found on the Sitecore website's homepage, which has dimensions of 660px × 441px and a size of 48.45 kB (49,614 bytes). Image: http://dijaxps1e29ue.cloudfront.net/~/media/Redesign/Common/Heros/600x441/Homepage_hero_600x441.ashx?ts=121514021455931&la=en



Now suppose I request Sitecore to produce the image at a lower resolution, i.e. with a width of 600px. It generates an image with dimensions of 600px × 401px and a size of 57.4 kB (58,778 bytes). Image: http://dijaxps1e29ue.cloudfront.net/~/media/Redesign/Common/Heros/600x441/Homepage_hero_600x441.ashx?ts=121514021455931&la=en&w=600


Is it a bug from Sitecore?

No. But by default Sitecore resizes images using a lossy compression algorithm, so reducing the image dimensions does not necessarily reduce the file size. Also, Sitecore uses an image quality of 95% by default, which generates images with a larger file size. To know more, you can check the ResizeImageStream() function of the Sitecore.Resources.Media.ImageEffectsResize class. The relevant default settings are:
<setting name="Media.UseLegacyResizing" value="false" />
<setting name="Media.Resizing.Quality" value="95" />
Reducing the above quality setting may give us an image with a smaller file size, but should we compromise on image quality?

Then how to solve this?

There is an alternative that Sitecore provides, which was the default behaviour in earlier versions of Sitecore: enabling legacy resizing. You can learn more about it from the ResizeLegacy() function of the Sitecore.Resources.Media.ImageEffectsResize class. You can use the settings below to get a reduced file size. Thanks to Paul Kravchenko from Sitecore Support for guiding me in this direction.
<setting name="Media.UseLegacyResizing" value="true" />
<setting name="Media.InterpolationMode" value="Low" />

Media.UseLegacyResizing
This setting controls whether to use legacy resizing (ie. bypass the Sitecore.ImageLib library).

Possible values can be:
true
false (Sitecore's default value)


Media.InterpolationMode
The interpolation mode to use when resizing images. The available values are those of the System.Drawing.Drawing2D.InterpolationMode enum. We can use any of the values below as per our need.

Possible values can be:
Bicubic
Bilinear
Default
High (Sitecore's default value)
HighQualityBicubic
HighQualityBilinear
Low
NearestNeighbor

Again, Sitecore defines these settings in a configuration file, so the values stay the same for every image resize, which makes them not that useful as a generic solution. I am eager to know if someone has a generic solution for resizing any kind of image that maintains quality while reducing file size, the way Photoshop or Paint.NET does.

Sitecore MVP 2015 - Achievement Unlocked Once Again!


I feel proud and honored to be selected as a "Technology Most Valuable Professional (MVP)" once again by Sitecore. Here is the list of all Sitecore Technology MVPs 2015: http://www.sitecore.net/Events/Public-MVP-site/MVPs-2015/Technology.aspx

I want to thank everyone who frequented my blogs, my family, coworkers, Sitecore community, my employer - Investis and yes Sitecore!