Duplicate Content & the Canonical URL

Among webmasters, duplicate content has long been a source of confusion, discussion and urban myth. One widely held belief is that duplicate content will get your site penalized by the search engines, hurting its chances of ranking well or even showing up on a SERP.

Last year, a post on Google’s webmaster blog called this something of a misconception. Unless the duplicate content is scraped and republished for malicious purposes, it’s not typically grounds for removal from a search result.

Duplicate content isn’t always the result of scraping and republishing. Often it is the result of multiple URLs on the same domain serving the same content, for any number of reasons. This won’t necessarily hurt you, but your content’s strength can be diluted across those URLs.

Google’s web crawler typically does a good job of recognizing non-malicious duplicate content and can choose the most relevant version of the content to display.

But now Google is going a step further to allow webmasters to point Google’s crawler in the right direction.

According to a post yesterday on the Google webmaster blog, for simplicity and control of your URLs, all you have to do is insert this tag inside the <head> section of the site pages that are duplicates:

<link rel="canonical" href="http://www.example.com/product.php?item=1234" />

This tells Google the exact URL you prefer to have associated with the content in question. Google will see this line of code and know that any duplicates on the site refer to the canonical URL.
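
To make the placement concrete, here is a minimal sketch of what a duplicate page might look like. The duplicate's URL (with an extra sort parameter) and the page title are hypothetical, chosen only for illustration; the canonical URL is the one from the tag above.

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Hypothetical duplicate, assumed to be served at
       http://www.example.com/product.php?item=1234&sort=price -->
  <title>Example product</title>
  <!-- Names the preferred URL for this content -->
  <link rel="canonical" href="http://www.example.com/product.php?item=1234" />
</head>
<body>
  <p>Same product content as the canonical page.</p>
</body>
</html>
```

Each duplicate variant of the page would carry the same tag, all pointing to the one URL you prefer.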

And it’s really as simple as that. This new tag gives you the power to control your duplicate URLs, reduce URL clutter and highlight the URLs that you feel best represent your content, all with a single line of code.