Filed Under (Wordpress) by Mani Karthik on 27-07-2007

WordPress is search engine friendly by default - sometimes a little too friendly, leaving plenty of room for duplicate content.
Your archive pages, feeds and index page are all sources of duplicate content.

And duplicate content is a serious issue with Google, as it can push pages into the supplemental index - and that hurts your traffic. Oleg Ishenko has written a great article on how WordPress creates duplicate content, and how you can keep duplicate content from appearing on your WordPress blog.


He suggests adding ‘noindex, follow’ tags, adding unique meta descriptions, using the ‘more’ tag, and preventing spiders from crawling feeds and auxiliary pages - that should do the trick!
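As a quick illustration of the ‘noindex, follow’ suggestion, here is a minimal sketch you could drop into a theme's header.php, inside the HEAD section (the conditional tags are standard WordPress functions; exactly which pages you exclude is up to you):

<?php
// Sketch: mark archive, search and paginated pages as noindex,follow
// so bots follow the links but don't index the duplicate listings.
if (is_archive() || is_search() || is_paged()) {
    echo '<meta name="robots" content="noindex,follow" />';
}
?>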

I would add one more point to it - avoid posting articles in more than one category. Labelling posts with more than one category simply multiplies the duplicate content across those category pages. Sticking to a single category is also more user-friendly. :)

Salutes to the man - indeed a great article!

Here’s a WordPress plugin that can cure content duplication on your blog. It says it adds the nofollow meta tags to the category pages, so that’s half the job done. I personally think you still have to follow the other steps suggested by Oleg Ishenko above to ensure that duplicate content doesn’t appear on your blog at all.

Filed Under (Wordpress) by Mani Karthik on 26-07-2007

Recently, I moved from Blogger to WordPress, on a new host. It was not a cakewalk, and I came across many hurdles while doing so. Yet I’ve managed to retain my traffic and Google juice.
In the light of that experience, here are a few things to watch out for if you are planning a migration from Blogger to WordPress without losing your valuable traffic and Google juice.
I’ve tried to make this a consolidated post covering the migration process, the SEO factors involved, and coping with the after-effects. If you have any questions, please feel free to ask.

Smooth transfer of your posts.
The biggest bottleneck when moving from Blogger to WordPress is moving or copying your posts to the new host.
Earlier, there was a plugin to get this done. But since Blogger updated its feed, it’s broken.

But WordPress has been upgraded (version 2.2.1 as of today), and the Import feature is now well equipped to import your Blogger posts from the RSS feed.
So all you have to do is make sure that your WP installation is up to date. If your host provides an old one, grab the latest version from www.wordpress.org and install it yourself. Directions to install WordPress are here.

Importing the posts from Blogger
Go to your Wordpress Dashboard > Manage > Import > Blogger

On selecting Blogger, it will ask you to grant permission to access your Google account.
Log in with the email account you use to blog on Blogger, and click the “Grant Permission” button.
Presto! That’s it - posts imported.

Note
WordPress imports your posts from the RSS feed. So if there are any additions to your RSS feed, like footer text, remove them from the blog feed settings, or that text will also appear in your new posts. (For a live example, see my posts imported from Blogger - you’ll find a “Visit the blog for more articles” line beneath every post.)

Tweaking your Wordpress.

Go to your .htaccess file via Control Panel > File Manager and make it writable (CHMOD 666)
Customize your permalinks - WordPress will then write its rewrite rules into the .htaccess file (see the sample below)
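
For reference, this is the standard rewrite block WordPress writes into .htaccess when pretty permalinks are enabled (assuming WordPress lives in the root directory - you normally don’t need to edit it by hand):

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress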

Dealing with categories - In your Blogger account you may have posted articles under multiple labels, so take some time and prune the categories on each post. This can be time consuming, but it helps in the long run, so you’d better get it right now.

Part 2 - Letting Google know.

After successfully setting up your WordPress blog and importing posts, it’s time to let the search engines know.

- For this, go to your Google Webmasters account and add the new domain there.
- Do the verification as Google suggests.
- Make a sitemap
Create a sitemap of your new site using this sitemap generator tool and save the sitemap.xml file to your new host’s root directory (newdomain.com/sitemap.xml). A minimal example follows this list.
- Go to your Google Webmasters account dashboard > Select the new domain > Sitemaps and submit the new sitemap url there.
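
For illustration, a minimal sitemap.xml has this shape (the post URL and date are hypothetical placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://newdomain.com/sample-post/</loc>
    <lastmod>2007-07-26</lastmod>
  </url>
</urlset>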

Now you have to wait 3-4 days for the Google bots to crawl your site. A smart way to make this quicker is to get someone to link to you.

As Google crawls the new site, it will show the number of URLs crawled in the Statistics tab. It should match the number listed in your sitemap.xml file.

Now, check if there are any broken URLs in your statistics. If there are, download the CSV file.

This is again a time-consuming task - you have to go through all your posts, check for broken links, and correct them. I can’t find a plugin or tweak to beat it - if you know of one, please let me know.

Now that Google has indexed both your new domain and your old Blogger account, go to the Blogger account and delete the blog.

It’s also a good idea to request deletion of the old content from Google. But I’m not sure if they will entertain it if your blog is a small one, say with fewer than 100 posts. I really don’t know the criteria by which they decide whether or not to approve deletion from the index, but they declined my request.

Note - Ask all your backlinks (search for link:yourolddomain.com on Google) to correct their links to your new domain. Write them an email - it works!


The robots.txt file is used to control crawler activity on a website or blog. It helps you keep some directories away from crawlers while allowing others. For example, if you have two folders, Articles and Javascripts, and you wish to exclude Javascripts from being crawled by robots, you can set that rule in the robots.txt file.

A few basics about the robots.txt file -
- It is found in the root folder, e.g. www.google.com/robots.txt
- It’s a plain text file and can be edited
- It tells the robots what to crawl and what not to
- It helps the crawlers locate the sitemap on your site

If you are on the Blogger platform, you can’t upload a robots.txt file. Panic not - there is another option you can use, which I’ll discuss towards the end of this article. First, let’s look at a normal robots.txt implementation on a hosted site.

Implementing the robots.txt file on a web-hosted site (WordPress)

Prerequisites - I assume that you have a self-hosted WordPress site with cPanel/FTP access.

- Find the file in your public_html folder. If it isn’t there, create a blank text document named robots.txt.

Excluding a folder from crawling by SE bots.
Suppose you don’t want Google to index one of your folders.
In the robots.txt file, you have to specify two things: which crawler agent (Google, Yahoo, MSN) you want to keep out, and which folder or folders you want to exclude.

The general syntax in the robots.txt file looks like this:

User-agent: *
Disallow: /yourfolder/

Here, User-agent: * means all search agents (Google, MSN, Yahoo, etc.).
Disallow: /yourfolder/ keeps that folder from being crawled. Note that its sub-folders will not be crawled either.

In order to keep all agents away from crawling ALL folders, use this code.

User-agent: *
Disallow: /

You can specify individual crawler agents by name (replacing the *), like Googlebot or Yahoo’s Slurp. If you are issuing a general command to all search engine crawlers, keep the * in the User-agent line.
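
For example, to keep only Google’s crawler out of the Javascripts folder from the earlier example (Googlebot is Google’s crawler name; other agents remain free to crawl it):

User-agent: Googlebot
Disallow: /Javascripts/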

Specifying a sitemap with the Robots.txt file
Thanks to a recent agreement among the major search engines, there is now a common directive they all follow to detect sitemaps from the robots.txt file. The directive is -

Sitemap: Sitemap url here
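
For example, with the sitemap at the root of a hypothetical domain:

Sitemap: http://www.example.com/sitemap.xml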

Robots.txt for Blogger users.

Blogger users cannot upload a robots.txt file. Instead, they can use the robots meta tag to control the crawling of bots on particular pages.

These tags should be included in the HEAD section of the relevant page template.

<meta name="robots" content="noindex">

This tag tells the bots not to index the page in which it is included.

<meta name="robots" content="nofollow">

This tag tells the bots not to follow or parse the links present on the page where it appears in the head section. Blogger users can use this option to their advantage when making posts. If you want every new page to be crawled by the bots, include the following code in the head section of your Blogger template.

<meta name="robots" content="index, follow">

Happy driving the robots. :)
