On-Site Duplicate Content No More: Canonical URL Tag
Think your site is affected by a duplicate content penalty? The big three have good news for you: Google, Yahoo and MSN announced a new tag this week to specify the canonical URL for each page on your site.
Duplicate content has long been an issue for search engines and marketers alike. Search engines want to avoid cluttering results with copies of the same text, while marketers and website owners often need (or create, whether or not they need them) several copies of the same text at different URLs on their sites for usability or other purposes.
The new tag allows webmasters to specify which URL the search engines should use for the content, regardless of what session ID, link parameter, sort parameter, parameter order or other variable has been appended to the end of the URL in a link. The new tag goes in the <head> section of a page:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
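For sites that build pages from templates, the tag can be generated rather than typed by hand. The following is a minimal sketch, not an official method: the parameter names in IGNORED_PARAMS and the helper names canonical_url and canonical_link_tag are assumptions made for illustration, and your own site will have its own list of parameters that do not change the content.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that do not change what the page shows.
IGNORED_PARAMS = {"sessionid", "sort", "ref"}


def canonical_url(url: str) -> str:
    """Drop ignored parameters and sort the rest so parameter order is irrelevant."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    # Rebuild the URL without a fragment, since fragments are not sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Render the <link> element that belongs in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.example.com/product.php?sessionid=abc123&item=swedish-fish&sort=price"
    ))
```

Every variation of the product URL (a different session ID, a different sort, parameters in another order) produces the same tag, which is the point of declaring a canonical URL.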
Google’s and Yahoo’s blog entries address some common questions as well. rel="canonical" (the Canonical Link Tag) is “a hint that we honor strongly” according to Google, which promises to “take your preference into account, in conjunction with other signals, when calculating the most relevant page to display in search results.”
The Canonical Link Tag can work with relative URLs and <base> links, but both teams recommend using absolute URLs in the Canonical Link Tag. Different subdomains for a canonical URL vs a display URL are okay; however, the tag cannot be used to specify a URL on a different domain as the canonical URL of a page. For that case, they recommend the old standby, a 301 redirect.
The URL specified in the rel="canonical" tag can itself redirect to the canonical URL, but the search engines don’t recommend specifying a 404 page as a canonical URL. The tag allows slight differences in content between the duplicate and the canonical page (Google gives the example of sort order on a table of products).
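To see why absolute URLs are the safer choice, it helps to work out what a relative href would resolve to. The sketch below is an audit helper under stated assumptions, not how any search engine actually processes the tag: the function names are made up for illustration, and registrable_domain uses a naive "last two labels" rule that is wrong for suffixes such as co.uk (a real check would need the Public Suffix List).

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def resolve_canonical(page_url: str, href: str, base_href: Optional[str] = None) -> str:
    """Resolve a canonical href the way any relative link on the page resolves."""
    # A <base> element, when present, replaces the page URL as the reference point.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, href)


def registrable_domain(host: str) -> str:
    # Naive assumption: the last two labels identify the site (fails for "co.uk").
    return ".".join(host.lower().split(".")[-2:])


def same_site(page_url: str, canonical: str) -> bool:
    """True when both URLs share a domain; differing subdomains are allowed."""
    page_host = urlsplit(page_url).hostname or ""
    canonical_host = urlsplit(canonical).hostname or ""
    return registrable_domain(page_host) == registrable_domain(canonical_host)


if __name__ == "__main__":
    page = "http://www.example.com/shop/product.php?item=swedish-fish&sort=price"
    print(resolve_canonical(page, "product.php?item=swedish-fish"))
    print(same_site(page, "http://shop.example.com/product.php?item=swedish-fish"))
    print(same_site(page, "http://www.example.org/product.php?item=swedish-fish"))
```

A page whose resolved canonical fails the same_site check is a candidate for a 301 redirect instead of the tag.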
Because the tag is only a “strong hint,” the big three don’t promise to always use the canonical URL. In fact, Google says:
“Our algorithm is lenient: We can follow canonical chains, but we strongly recommend that you update links to point to a single canonical page to ensure optimal canonicalization results.”
So even with the tag in place, a lot of links to a noncanonical URL can still influence which URL the search engines use, or can undermine your link equity.
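Canonical chains (page A names B as canonical, and B names C) are easy to create by accident, so it can be worth checking for them in your own crawl data. This is a rough sketch only: the declared mapping, the example.com URLs and the max_hops limit are all hypothetical, and nothing here reflects how the search engines themselves follow chains.

```python
from typing import Dict, List


def follow_canonical_chain(start: str, declared: Dict[str, str], max_hops: int = 5) -> List[str]:
    """Return the URLs visited while following rel="canonical" declarations."""
    chain = [start]
    while len(chain) <= max_hops:
        target = declared.get(chain[-1])
        # Stop at a page with no declaration, a self-reference, or a loop.
        if target is None or target in chain:
            return chain
        chain.append(target)
    return chain


if __name__ == "__main__":
    # Hypothetical crawl result: page URL -> the canonical URL that page declares.
    declared = {
        "http://www.example.com/fish?sort=price": "http://www.example.com/fish?ref=home",
        "http://www.example.com/fish?ref=home": "http://www.example.com/fish",
        "http://www.example.com/fish": "http://www.example.com/fish",
    }
    chain = follow_canonical_chain("http://www.example.com/fish?sort=price", declared)
    if len(chain) > 2:
        print("Chain found; point every page directly at", chain[-1])
```

Any chain longer than two entries means at least one page declares a canonical that is not the final one, which is the situation Google recommends cleaning up.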
Google cites a live example at http://starwars.wikia.com/wiki/Nelvana_Limited, which specifies its rel="canonical" as http://starwars.wikia.com/wiki/Nelvana. Wikipedia has long had a problem with noncanonical URLs, so Wikia’s participation may be a sign that the issue will be addressed on the encyclopedia, too.
Duplicate content has been a major concern for years. Will you be implementing this tag on your sites?