URL Canonicalization & Internal Linking Structure: Consistency is Key

In SEO, it’s important to go back to the basics occasionally and cover issues that affect the inner architecture of your website. After all, a strong optimization campaign, however complex, starts with your site. As Content Management Systems (CMS) become more powerful and more customizable, it’s also important to speak with your vendor about certain issues before you jump on board, and to consider some situations outside of the CMS and how they affect your site.

In this blog post I am going to look at servers, specifically IIS and Apache, and how they handle URL structures.

Because more content management systems allow webmasters to customize URL structure, many sites are now graced with static-appearing URLs that are good for everyone: search engines and users. Instead of mile-long URLs with a bunch of dynamic parameters, many webmasters have the option of using title case, lower case, hyphens, underscores, etc. (depending on their system) to customize URL structure.

Of course, the search engines are getting better at indexing dynamic URLs and are even offering tools to help with URL rewrites in their specific engine, but in the end it’s tough to argue that a parameter-rich URL is as beneficial as a clean and simple one. It’s also important to consider how servers handle URL structure. One specific area I would like to discuss is how IIS and Apache handle URL cases. It’s pretty simple, and can make a difference in how you decide to structure your URLs.

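Simply put, in a typical setup IIS is not case-sensitive and Apache is. The difference largely comes from the file systems underneath: IIS runs on Windows, whose file system ignores case by default, while Apache usually runs on Linux or Unix, where file names are case-sensitive.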

This means that on IIS, the following URLs will render the exact same page in a browser:

http://www.yoursite.com/Product-Category-One
http://www.yoursite.com/product-Category-One
http://www.yoursite.com/product-category-One
http://www.yoursite.com/product-category-one
http://www.yoursite.com/Product-category-one

(you get the point)

On Apache, only the URL whose case matches the actual page will render directly. Depending on the Apache server setup, the other variants will either return a 404 or redirect to the correct version.

So What Does this Mean?

You can see that on IIS this may present some URL canonicalization issues. However, I find that the real issue lies with the internal linking structure. On either server, IIS or Apache, linking to multiple versions of the same page can be very harmful to the internal architecture of your website. For example, instead of 5 links all pointing to http://www.yoursite.com/Product-Category-One, you end up dispersing those links over multiple versions of the same URL, fracturing the internal link popularity of the page.
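To make the fracturing concrete, here is a minimal Python sketch of how you might spot it. It assumes you already have a list of internal link targets (for example, exported from a site crawl). The function name group_case_variants is hypothetical, and lowercasing the entire URL is a simplification that will not suit sites with case-sensitive query strings.

```python
from collections import Counter, defaultdict


def group_case_variants(link_targets):
    """Group link targets that differ only by letter case.

    Returns a dict keyed by the lowercased URL. Each value maps every
    spelling found to the number of links using it. URLs that are always
    linked with one spelling are left out.
    """
    groups = defaultdict(Counter)
    for url in link_targets:
        # Simplification: the whole URL is lowercased to build the key.
        groups[url.lower()][url] += 1
    return {
        key: dict(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


# Illustrative data: five internal links that all mean the same page.
links = [
    "http://www.yoursite.com/Product-Category-One",
    "http://www.yoursite.com/product-Category-One",
    "http://www.yoursite.com/product-category-one",
    "http://www.yoursite.com/product-category-one",
    "http://www.yoursite.com/Product-category-one",
]

fractured = group_case_variants(links)
for key, variants in fractured.items():
    print(key)
    for spelling, count in variants.items():
        print("  ", count, spelling)
```

Here the five links are split across four spellings, so no single version of the URL receives all of them.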

Avoid URL Inconsistencies

Much of the responsibility for URL consistency lies in the hands of those adding links within the website. It’s best practice to have a consistent, pre-determined URL structure and to make sure everyone who has permission to make changes to the site understands the importance of this consistency. Explain that even though the page renders (on IIS) or redirects (on Apache), inconsistencies lead to link popularity dilution and fracturing, which can hurt the site’s search engine positioning potential.

In my experience, it helps to make the URL structure as simple as possible. I like lower case. It’s simple, it’s clean, and users tend to prefer this style when directly accessing or linking to the site.
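If you settle on lower case, a small helper can enforce it wherever links are generated or checked. The sketch below is one possible approach, not a standard; the function name to_lowercase_url is hypothetical. It lowercases the host and path but leaves the query string and fragment alone, on the assumption that parameter values may be case-sensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def to_lowercase_url(url):
    """Return the URL with scheme, host, and path lowercased.

    The query string and fragment are left untouched, on the assumption
    that their values may be case-sensitive.
    """
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.lower(),
        parts.query,
        parts.fragment,
    ))


print(to_lowercase_url("http://www.yoursite.com/Product-Category-One"))
```

Running every internal link through a helper like this before it is published keeps the lower-case convention from depending on each editor's memory.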

Also, if you’re having issues, take the time to research tools and other options that can help you control canonicalization problems. There are various techniques and server add-ons for dealing with URL inconsistencies.
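As one illustration of such a technique, the sketch below shows a permanent (301) redirect from any mixed-case path to its lower-case version, written as a small piece of Python WSGI middleware. This is an application-level example only, not an IIS or Apache feature. The names lowercase_redirect and demo_app are hypothetical, and the code assumes the application is mounted at the site root and that every real URL on the site is lower case.

```python
def lowercase_redirect(app):
    """Wrap a WSGI app so mixed-case paths get a 301 to the lowercase path."""
    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        lowered = path.lower()
        if path != lowered:
            # Keep the query string as-is; only the path is normalized.
            query = environ.get("QUERY_STRING", "")
            location = lowered + ("?" + query if query else "")
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        # Path is already lowercase: hand the request to the real app.
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Stand-in application used only for illustration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


application = lowercase_redirect(demo_app)
```

A redirect like this consolidates visitors and crawlers onto one version of each URL, but it is a safety net. The internal links themselves should still point at the lower-case version.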