Search engines fight duplication with canonical
Google, Microsoft and Yahoo are now providing a way for web site developers to specify a preferred URL for any piece of content on a web site. The problem has been that sites may, legitimately, serve the same content at different URLs. This causes problems for search engines, which can't easily tell these duplicates apart. Now, a new "canonical" link relation allows a page to declare the preferred URL for a search engine to use.
Google explains the process with a step-by-step example. In the example, they show some URLs that point to the same page:
The differences in the examples are caused first by a category parameter and secondly by a tracking id and session id. To set the preferred URL, the page maintainer adds a link element with the rel attribute set to "canonical" and the href attribute set to the preferred URL. The link element goes into the head section of the page. For the examples above this would be:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
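For sites that generate their pages from a script, the element could be emitted by a small helper. The following is only a minimal sketch in Python: the function name canonical_link_tag is hypothetical and not part of any search engine's tooling, and it assumes the preferred URL is already known to the page template.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build a <link rel="canonical"> element for the page's head section.

    The helper name is hypothetical. The URL is HTML-escaped so that
    characters such as '&' in a query string stay valid inside the
    href attribute.
    """
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}" />'


if __name__ == "__main__":
    print(canonical_link_tag("http://www.example.com/product.php?item=swedish-fish"))
```

Every duplicate variant of the page would emit the same element, so all of them point at the one preferred URL.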
Google's search will now understand that the duplicate URLs all belong to http://www.example.com/product.php?item=swedish-fish, and additional URL properties, like PageRank and related information, says Google, will be transferred as well.
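To illustrate the consuming side, here is a rough sketch of how a crawler might read the canonical reference from a page. It is not how any of the search engines actually implement it; the class and function names are hypothetical, and the sketch assumes it is enough to take the first canonical link element found inside the head section.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> inside <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head and self.canonical is None:
            attr_map = dict(attrs)
            # rel may hold several space-separated tokens; compare case-insensitively.
            rel_tokens = (attr_map.get("rel") or "").lower().split()
            if "canonical" in rel_tokens and attr_map.get("href"):
                self.canonical = attr_map["href"]

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html_text: str) -> Optional[str]:
    """Return the canonical URL declared in the page's head, or None."""
    parser = CanonicalLinkParser()
    parser.feed(html_text)
    return parser.canonical


if __name__ == "__main__":
    page = (
        "<html><head><title>Swedish Fish</title>"
        '<link rel="canonical" '
        'href="http://www.example.com/product.php?item=swedish-fish" />'
        "</head><body></body></html>"
    )
    print(find_canonical(page))
```

A link element placed in the body is deliberately ignored here, matching the article's statement that the element goes into the head section.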
W3C (the World Wide Web Consortium) specifically provided the rel attribute in the link element for use by web developers to define relationships between pages for consumption by search engines.
Matt Cutts, a Google engineer, also published a presentation on the link element and pointed out that canonical plug-ins have already appeared for WordPress, Magento and Drupal.