Soon after WordPress 2.7 was released, one problem went unnoticed in all the excitement and hype. A couple of webmasters complained that posts with more than 50 comments created multiple URLs for the same post, because the comments split into separate pages every time the count crossed 50.
Obviously, this feature was included in WordPress 2.7 for ease of navigation and a better user experience. By default, WordPress splits the page when the number of comments exceeds 50 and creates a new URL for the comments starting from the 51st.
The setting can be adjusted on the WordPress > Discussion Area page.
QOT has discussed this duplicate content issue in this post, and it is also covered here at WP Hacks, where the suggested way out is simply to uncheck this setting.
Now, I wish the duplicate content issue had ended here. Unfortunately, there was more.
Another webmaster friend of mine (he doesn’t want to be named) complained about something similar.
He runs a website that’s really popular in India for news articles. In one of the sections of the site where he has WordPress installed, something strange was happening with comments.
With the traffic his site gets, Google picks up every newly published story on Blog Search and Google.com within seconds of publishing. But lately he found that whenever a comment was published (after approval) on a new, already indexed story, the comment URL got picked up by Google.
Ex:- site.com/category/story-commentpage1
From then on, it gets really strange. Soon after the comment page is indexed, the original URL is replaced in the index, so much so that the original URL no longer appears for the search terms it used to rank for. Now the comments URL is indexed instead, and it does not rank for those terms.
Adding to the problem, every new story published in the category after this does not get indexed on Google.com or Blog Search.
So that means, if he lets even one comment get published on any story in the section:
- The original URL, which used to rank for the appropriate term on Google, vanishes.
- The comments URL replaces the original URL but does not rank for the search terms the original used to rank for.
- The comment URL gets picked up in Blog Search.
- No new stories published after this get indexed on Blog Search (only comment URLs do).
- So instead of new posts getting indexed and ranked within minutes of publishing, indexing of actual post URLs gets delayed for hours, then stops altogether.
Now, that’s a tricky situation, right?
This is how I saw it. Technically, as soon as a new comment is published, the post goes into paginated mode, a new URL is created, and Google indexes it. But that is a clear case of duplication.
According to the search engines,
- There is one original URL, which got indexed first. (Ex:- site.com/category/story.htm)
- And then there’s this comment URL, which acts as duplicate content. (Ex:- site.com/category/story-commentspage1.htm)
So technically, every new comment page shows up as duplicate content to Google. (Check the screenshots from Google Webmaster Tools for proof.)
So how do we get over this issue?
- If only we could 301 redirect all the comment URLs to the original URL. (That takes a lot of complicated, confusing commands in .htaccess.)
- If only we could stop the comment URLs from appearing. (Done. Removing comment pagination does this automatically, but nothing else changed: the indexing issue for new posts remained.)
- If we could block the robots from indexing all the duplicate comment pages. (Done, but that’s too tedious a job.)
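For the curious, the mapping behind the first option (redirecting comment URLs to the original) is simple to sketch. The Python below is only an illustration of the URL rewrite, not the .htaccess rules themselves. It assumes comment pages end in /comment-page-N/, the form WordPress uses with pretty permalinks, and the name redirect_target is made up for this example.

```python
import re

# Assumption: paginated comment URLs look like <post-url>/comment-page-<N>/,
# optionally followed by a query string or a fragment such as #comments.
COMMENT_PAGE = re.compile(r"/comment-page-\d+/?(?=$|[?#])")


def redirect_target(url):
    """Return the original post URL for a comment-page URL, or None.

    None means the URL is not a paginated comment URL, so no redirect applies.
    """
    stripped, count = COMMENT_PAGE.subn("/", url, count=1)
    return stripped if count else None


if __name__ == "__main__":
    for url in (
        "http://site.com/category/story/comment-page-2/",
        "http://site.com/category/story/",
    ):
        print(url, "->", redirect_target(url))
```

A server would answer any URL with a non-None target with a 301 pointing at that target, and serve everything else as usual.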
And then came the canonical tag for duplicate content from Google (and others, of course)!
With the canonical tag, we could mark every duplicate comments page so that, even if WordPress created duplicate instances of the same page, the robots would be directed back to the original content on the original URL. (Tadaaaa!)
Now, this really worked out. But adding canonical tags manually would have meant a whole lot of time wasted. Instead, we used the Canonical Tag plugin from Yoast. This plugin added a canonical tag stating the original URL to every duplicate content page, and now the problem is sorted.
- No duplicate comments URLs.
- No indexing issues for newer posts.
- Search result rankings restored and working fine.
If you have only a couple of pages on your site with such a problem, you could also do this manually by adding the canonical tag to each of them, picking their URLs from Google Webmaster Tools. But if duplicate content keeps popping up, I’d recommend using this plugin so that it is taken care of automatically.
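If you go the manual route, the tag itself is a single link element in the head of each duplicate page. Here is a rough Python sketch of generating it, assuming you keep a small hand-made table of duplicate URLs and their originals. The table contents and the names DUPLICATES, canonical_tag and tag_for are hypothetical, purely for illustration.

```python
from html import escape

# Hypothetical, hand-maintained table: duplicate URL -> original post URL.
DUPLICATES = {
    "http://site.com/category/story/comment-page-1/": "http://site.com/category/story/",
    "http://site.com/category/story/comment-page-2/": "http://site.com/category/story/",
}


def canonical_tag(original_url):
    """Build the <link rel="canonical"> element pointing at the original URL."""
    # escape() keeps characters such as & and " from breaking the attribute.
    return '<link rel="canonical" href="%s" />' % escape(original_url, quote=True)


def tag_for(page_url, duplicates=DUPLICATES):
    """Return the canonical tag for a known duplicate page, or None otherwise."""
    original = duplicates.get(page_url)
    return canonical_tag(original) if original else None


if __name__ == "__main__":
    print(tag_for("http://site.com/category/story/comment-page-1/"))
```

The printed line is what would go into the head section of the duplicate page. Pages that are not in the table get no tag from this sketch.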
This is quite a serious issue, provided you have 50 comments on a post. Definitely something blogmasters with busy blogs have to be cognisant of!
Reply
Wow! What a pain for SEO. Duplicate content is a big SEO no-no! I would think that WP would have considered that before trotting out the new platform.
Thanks for the hack!
Reply
This canonical tag thing is really big news, fantastic that there’s finally something we can do to remedy the situation!
Reply
I would use a workaround and increase the comments per page from 50 to, say, 150 or 200, based on the average number of comments I receive in a day or two. This should avoid the issue.
Cheers,
Eddie Gear
Reply
You don’t need an extra plugin to make canonical url work with wordpress – all you need is a small tweak to the header file in the theme.
Reply
Mani Karthik
Replied:
Maybe, but a majority of WP users consider using a plugin easier. Thanks for the instructions.
Reply
It’s good that this tag was recently invented. It comes in handy just in time for resolving an issue such as this one.
Reply
I am not sure why we need canonical tags when adding a meta noindex tag on duplicate content will suffice.
I introduced an option to add a noindex meta tag to the comments pages of posts as soon as WP2.7 was released.
Reply
Rajesh
Replied:
oh…i didn’t mention that the option was added in my Platinum SEO plugin…
Reply
Mani Karthik
Replied:
Technically, it sounds right, Rajesh. But a nofollow might just take things too far. The canonical tag is specifically built to “show the way” to the original page, always. The contents of the comments are still indexed, which could bring in some traffic, especially on high traffic websites.
I’d assume we still want the pages indexed but want the bots to understand that the original story is “over there” and that these (the comments pages) are only a continuation of it.
A noindex tag would sort out the duplicate issue, but it is as good as deleting the comment pages.
Hope it’s clear.
Reply
Rajesh
Replied:
Are you sure that the pages will be indexed? And do you think yoast’s plugin sorted this comment pagination out? I don’t think so… In fact, the plugin adds canonical tags only to archives (day, month, year, etc.), and there it points to the same URL, i.e. the year link or the month link.
it doesn’t work at all in the way that google or other SEs will understand…
Reply
Ah Rajesh. I run the site Mani here is talking about. And while I did not try out your plugin, I know about it and it just might have worked for me too.
Not just Dup Content, but also Messed up indexing. That’s Measles on top of Malaria there!
But that is not the point. Not only did WP2.7 give me dup content issues that showed up in Webmaster Tools, start indexing comment pages, screw up blog indexing, and delay and then stop post indexing; I also ended up in a situation where I could not publish a single comment. Mind you, I am not talking about posts with 10 comments. Even if a post had only 1 comment, publishing it would screw up indexing. Don’t ask me why, ask Google / WP!
In fact, in theory, when you uncheck the pagination in Discussion settings, the earlier comment-page-1, -2, kind of URLs get 301 redirected back to the actual URL. So that means there should not have been any problem in indexing at all. However, there was. Despite doing that, every published comment instantly stopped Google’s indexing of actual post URLs.
Using the canonical tag plugin fixed the problem INSTANTLY.
Your solution too looks workable. That’s another approach to the issue. The two solutions don’t seem mutually exclusive.
Reply
Thanks for this one. It has given me a headache since yesterday, when Mr G saw duplicate content because of comments. Previously I was on the 2nd page, then all of a sudden, BOOM! I was gone.
Thank you much!
Reply
its a new method.. the sounds look good…
Reply
I literally appreciate this information and thanks to you for sharing with us.
Reply