Canonicalization issues can really be a pain in the butt. They can wreak havoc on all your hard SEO work – all that great content you created, all those hard-won backlinks you built, all the other work you put into your site can be diluted like pouring a bucket of water into a glass of beer.
Making things worse, canonicalization problems come in more flavors than you can find in your local ice cream store. There are so many ways you can have canonical URL problems it makes my internet marketing brain spin just thinking about it. I’m not going to make your head spin by listing them all out here.
Oh, and for even more fun, the solutions can be just as overwhelming as the problems… yay! If you’re a developer-type, maybe a coder or a programmer or both, then the solutions aren’t as daunting. But if you’re an average webmaster like me, and maybe use WordPress like me, the solutions don’t come as easily.
So, what, is this post all canonical URL & canonicalization issues doom and gloom? What the heck is canonicalization, anyway, you ask? Glad you asked.
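It’s maddeningly simple: a canonicalization problem is what you get when the same content is available on different URLs, and nothing tells the search engines which URL is the "real" (canonical) one.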
This can happen in many ways. One of the most common is having both the http:// and http://www versions of your site indexed by Google.
Or by having dynamically-generated pages generate the same content on different URLs, very common with Ecommerce sites.
Or by having scrapers or thieves steal and publish your content on their “websites” (quotes on purpose, as those sites are usually a messy mish-mash that no visitor derives value from, but the site owner does this to try to rank highly in the Search Engine Results Pages, or SERPs).
Then there are pagination and navigation issues that can cause canonicalization issues and duplicate content problems (think long articles broken up into separate pages, with a printer-friendly version also being indexed), duplicate websites created by the same person, sub-domains with duplicated content…
…Is your brain spinning yet?
Right now, let’s cover one of the most common, and perhaps most damaging, forms of canonicalization problem, one that I see literally (yes, literally) every day in my job as an SEO working in an Internet Marketing agency:
Both the http:// AND http://www versions of the site exist, and both are ranked by Google.
How To Find The Problem
Go to your browser address bar, type in http://YourDomain.com, and hit enter. (Use .net or whatever your extension is, and of course substitute “YourDomain” with your actual domain name. For this site, that would be InternetMarketingBrain.)
Watch closely to see if the address changes over to http://www.YourDomain.com or stays at http://YourDomain.com.
Assuming there is no change and you stay at the URL http://YourDomain.com, now add in the www’s – place your cursor after the http:// and add the www’s so you now have http://www.YourDomain.com – and hit enter.
What happens? Does it revert to the http-only version or stay at the www version?
If the address doesn’t change in either case when you hit enter, you have a canonicalization problem… Yikes! You have a problem, Houston! You see, technically speaking, the www version is actually a sub-domain of the http-only (non-www) version, so search engines can treat the two as separate sites.
Why is this a problem? In brief, Google doesn’t like to show multiple results with the same content for the same search query, so your two versions force Google to decide which one it will show. Even worse, you may have backlinks to pages on your http-only version as well as your www version, which means you’re splitting up your precious link juice! Ouch and double ouch.
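If you would rather not do the browser test by hand, here is a rough Python sketch of the same check. It is only an illustration under a few assumptions: the function names (final_host, has_canonical_problem, check_domain) are made up for this example, and it assumes your site answers plain http:// requests and redirects at the server level rather than with JavaScript.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def final_host(url, timeout=10):
    """Request the URL, follow any redirects, and return the host we land on."""
    with urlopen(url, timeout=timeout) as response:
        # geturl() gives the address after redirects, like watching the address bar
        return urlsplit(response.geturl()).hostname


def has_canonical_problem(bare_final_host, www_final_host):
    """True when the two versions end up on different hosts,
    meaning neither one redirects to the other."""
    return bare_final_host.lower() != www_final_host.lower()


def check_domain(domain):
    """Run the manual test for one domain, e.g. 'YourDomain.com' (needs network access)."""
    bare = final_host("http://" + domain)
    www = final_host("http://www." + domain)
    return has_canonical_problem(bare, www)
```

Calling check_domain("YourDomain.com") with your own domain should return True if both versions stay put, which is the problem case described above.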
How to Fix the Problem:
If you’re on a Windows-based server (Microsoft IIS Server, to be more specific) you’ll need to ask your webmaster to do an ISAPI rewrite. Use a backlink analysis tool like Market Samurai to determine whether the http-only or the www version has more backlinks, and have your webmaster do 301 (permanent) redirects from the version with the lower number of backlinks to the version with the higher number of backlinks.
If you’re hosted on an Apache/Linux-based server, you’ll need mod_rewrite code put into an .htaccess file placed in the root of your site (see your webmaster). Ask your webmaster to write the mod_rewrite code to perform wild-card matches, so that every URI (page name) redirects to the corresponding URI (page) on the version you’re redirecting to.
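The "which version wins" decision is simple enough to write down. A tiny Python sketch, where pick_canonical_host is a made-up name and the backlink counts are numbers you would copy by hand from your backlink tool. One assumption to flag: on a tie this keeps the www version, which is an arbitrary choice that the advice above does not cover.

```python
def pick_canonical_host(domain, bare_backlinks, www_backlinks):
    """Return the host to keep; the other version gets 301-redirected to it.

    Assumption: a tie goes to the www version (arbitrary choice).
    """
    if bare_backlinks > www_backlinks:
        return domain
    return "www." + domain
```

For example, with 120 backlinks to the http-only version and 45 to the www version, this keeps the http-only version, and the www version is the one that gets redirected.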
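To make the "wild-card match" idea concrete, here is a small Python sketch of the mapping a redirect rule should produce. This is not mod_rewrite code and not what your server actually runs. It only illustrates the behavior you want to ask for. The name redirect_target is invented for this example, and it assumes a site on the default port.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url, canonical_host):
    """Return the URL a 301 should point to, or None if the URL is already on the canonical host."""
    parts = urlsplit(url)
    if parts.hostname is None or parts.hostname.lower() == canonical_host.lower():
        return None
    # Swap only the host; the page path and query string stay exactly the same
    return urlunsplit((parts.scheme, canonical_host, parts.path, parts.query, parts.fragment))
```

So http://yourdomain.com/blog/post?id=7 maps to http://www.yourdomain.com/blog/post?id=7, not to the home page. If every old URL lands on the home page instead of its matching page, the redirect is not doing a proper wild-card match.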
I was going to post a couple of examples of mod_rewrite code one of the developers I work with provided, but I figured this post is already techie-enough. If you’re super-interested and want to see them anyway, let me know in the comments section below and I’ll add them here.
Bottom Line and Summary:
I wish the world of websites were not so technical, but it *just is*, and the more you dive into it, the deeper it gets. I work with coders and programmers all day long and it never ceases to amaze me how much there is to know about how websites really work. Test your website for the http and www versions I noted above. If you find both exist, fix it and fix it fast. You’ll be doing your SEO a huge favor!
photo credit: PopCultureGeek.com