Whenever content gets copied, nine out of ten people who have talked to me about it panic!
After all, having your content copied is not a good thing. Plagiarism and scraping have been on the web for as long as blogging has, but they are far more widespread these days. Earlier, only high-traffic, money-making blogs were scraped, but these days even smaller, unpopular blogs are copied.

Wikipedia says:

Web Scraping (sometimes called harvesting) generically describes any of various means to extract content from a website over HTTP for the purpose of transforming that content into another format suitable for use in another context.

So is scraping bad? How does it affect you, and how can you block scrapers?

First off, I’d suggest that you relax in such situations. There’s nothing to panic about when someone is scraping your website or content.

The reason: if someone is stealing your content, that person is clearly struggling and most likely has a fairly new site with no authority whatsoever. In most cases the site will have no original content at all and a lot of outgoing links, its pages will be far too similar to one another (remember those templates where only your name is changed?), and there will be duplicate content all over the place.

So in effect, he is not going to affect you in any way, whether by stealing your traffic on Google SERPs or otherwise.
All he is doing is copying content from different sites like yours into a pre-made template, in the futile hope that aggregating all that content will automatically push him up the SERPs. That clearly does not happen.

Now, the case is different when an authority site or a relatively older or more popular site scrapes content. While I fully believe that an authority site will not knowingly do such a thing, accidents can happen.

An example here: Yahoo India had outsourced its content development to a private firm in Bangalore. These smart chaps went around the Internet copying content related to Yahoo’s requests and supplied plenty of it. What they were actually doing was rephrasing existing content and repackaging it for Yahoo. But some content, like Indian cooking recipes, left very little to “rephrase”, so they ended up handing the same thing over to Yahoo.

Now, the original site that wrote the content came to know about this and filed a complaint. Long story short - Yahoo lost the case and sacked the outsourcing partner.

The problem here was that the outsourcing partner copied content from a relatively lesser-known site (but one with original content) and provided it to Yahoo. Yahoo, being an authority site, ranked better than the original site for the very same content, and this in fact hurt the lesser-known site.

So in cases where the “copycat” has more authority than you, you have a reason to panic, but not otherwise. With XML scrapers in particular, there is mostly no need to panic.

How can you block scrapers from copying your site’s content?

There is more than one way to do it.

  • Always link to one of your previous articles on the blog
    A simple text link in your article that points to one or more of your previous articles makes sure that when scrapers copy your content, the link remains and you can trace them from the backlink. It sometimes also earns you another incoming link, which helps when counting backlinks on Yahoo, if not Google, and some folks even think this is a good thing.
    So include a text link to your own blog articles and you have a solution.
  • Use partial feeds
    A main source of scraping is XML feeds. As most blogs these days publish a full feed, it’s easy for scrapers to just leech the content from the full feed. Offering only partial feeds is a clever option, but it could also result in unhappy genuine subscribers. So it is up to you to decide whether you want to do it the harsh way or not.
  • Use a copyrighter plugin
    If you are on WordPress, there are plugins available that will help you include a copyright notice in your XML feeds.
    You can get a nice plugin here
  • Use a linkback and signature credit text
    A wiser option is to use a signature text on all your feeds, which may contain a linkback to the original blog and your signature. This will appear on all the scrapers’ sites too, thereby ensuring credit to your blog.
    Example: the one Lorelle has on her articles
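
To make the partial feed and signature ideas above a little more concrete, here is a rough Python sketch of how a feed item could be cut down to an excerpt and given a credit line with a linkback. It is only an illustration: the function name `partial_feed_item`, its parameters, the 60-word default and the wording of the credit line are all made up for this example, and it is not how any particular plugin works.

```python
import html
import re


def partial_feed_item(title, url, body_html, blog_name, max_words=60):
    """Build a feed description: a short excerpt plus a linkback signature.

    Hypothetical helper for illustration only; names and defaults are assumptions.
    """
    # Crude tag stripping; a real feed generator may want a proper HTML parser.
    text = html.unescape(re.sub(r"<[^>]+>", " ", body_html))
    words = text.split()

    # Partial feed: keep only the first max_words words of the post.
    excerpt = " ".join(words[:max_words])
    if len(words) > max_words:
        excerpt += " ..."

    # Signature: a credit line that travels with the item if it is scraped.
    signature = (
        f'<p>Originally published as <a href="{html.escape(url, quote=True)}">'
        f"{html.escape(title)}</a> on {html.escape(blog_name)}.</p>"
    )
    return f"<p>{html.escape(excerpt)}</p>\n{signature}"


if __name__ == "__main__":
    print(partial_feed_item(
        "Is scraping bad?",
        "https://example.com/is-scraping-bad",  # placeholder URL
        "<p>Whenever content gets copied, most people panic.</p>",
        "Example Blog",
        max_words=5,
    ))
```

If you would rather keep full feeds for your subscribers, the same signature can simply be appended to the complete post body instead of the excerpt.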

If you are looking at content theft and how to fight it legally, Amit has a good article on it.

There are more technical ways to curb scraping, but I leave those to the discussion as I’m unsure of them. Essentially, while scraping or copying of content can be annoying, there isn’t much to worry about because these copycats are going to be weeded out soon.
So ignore them. They keep mushrooming here and there and are not worth the attention. As someone said, you could maybe use them to get some backlinks. I can’t agree that they are of any real use, but yes, if you are fond of numbers, maybe it will help.

COMMENTS >>
Ahilosu on 1 May, 2008 at 12:36 pm

There’s this plugin that I used when I was writing longer posts to prevent theft. It was called “byrev scrambler text” or so… very useful.


Binny V A on 1 May, 2008 at 7:29 pm

To anyone who has been a victim of scraping…

Scraping is a fact of life on the net - don’t panic over it. Try to take advantage of the fact.

But please, please, don’t make your feed a partial feed. That will be a great disservice to your readers.


Maneesh J on 2 May, 2008 at 5:02 pm

I am running a small career site with low traffic (http://www.careerdrive.co.in), and recently a well-known job site, clickjobs, copied an article from my site as it is. They had stated my site as the source at the bottom of the page. I called them out, and a person from the company told me that their act would help in getting more traffic for my site. It was on engineering.clickjobs.com. I told him to either give me a link or remove the content. The guy removed the content immediately and I was happy. In my case, I think search engines would always consider the content at clickjobs to be the original. That’s why I wanted it removed. Is that right?

