What is Content Scraping?

Have you run into content thieves, also known as content scrapers, during your blogging tenure? If not, you probably will at some point in your writing career. How do you know if your site has been scraped? All of your content will be routed through another site’s servers. Some sites, such as BuzzMyFx, lead you to believe that they can check whether your content is being stolen: you simply enter your site name to see if your site is being scraped. The result will most likely state that you are not being scraped; however, these sites appear to exist merely to steal your content once you enter your site name. Thankfully, BuzzMyFx is now down! But beware: it (and others like it) will probably pop up on different domains.

You may be wondering how sites like BuzzMyFx work. When you visit this type of site and enter your website name to check whether your content is being scraped, your browser connects to their server first. Next, their server mimics a browser and requests your pages from your server. Once it has that content, the scraper site rewrites the code on the page so that all links point to the scraper. These sites also insert their own StatCounter code and rewrite the ad code so that they collect the revenue from all of the ads on the page.
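To make the link rewriting described above easier to spot, here is a minimal Python sketch that lists the absolute links on a page that point somewhere other than your own domain. The function name `links_off_site`, the class `LinkCollector`, and the domains `myblog.example` and `scraper.example` are hypothetical and used only for illustration. Ordinary pages contain outside links too, so treat the output as a starting point for a manual check, not as proof of scraping.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_off_site(html, own_host):
    """Return absolute links whose host is not own_host or one of its subdomains."""
    collector = LinkCollector()
    collector.feed(html)
    off_site = []
    for href in collector.hrefs:
        host = urlparse(href).hostname  # None for relative links such as "/archive"
        if host is None:
            continue
        if host != own_host and not host.endswith("." + own_host):
            off_site.append(href)
    return off_site


# Hypothetical scraped copy: the first link was rewritten to point at the scraper.
page = (
    '<a href="http://scraper.example/proxy/about">About</a>'
    '<a href="http://www.myblog.example/contact">Contact</a>'
    '<a href="/archive">Archive</a>'
)
print(links_off_site(page, "myblog.example"))
```

If links that should lead to your own pages show up in this list, the copy you are looking at has probably been rewritten.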

Content scraping sites are committing copyright infringement as well as stealing the ad revenue from the sites they scrape. So why do people create sites that scrape content? Because once they have lots of sites redirected to them, it is a very lucrative business.

At this point, we’re sure you are wondering how to stop content scrapers. Every server keeps a log of each visit to one of its hosted sites. These logs include the visitor’s IP address, the date and time of the visit, the page that was loaded and the website that referred the visitor. Webmasters know how to get the necessary information from these logs. If you suspect that your site has been scraped, contact your webmaster immediately! They can make changes to your .htaccess file (the file that controls who can see your site and what they can see) so that the pages on the scraped version of your site are replaced with a warning that your site has been scraped. In this warning, you can tell visitors to go to your real site, but be sure your real site isn’t linked or it will be scraped again. Just spell out your site like this: www dot WebsiteDesignAustinTexas dot com. If you are using a WordPress site, you can install the WP-Ban plugin to keep content thieves off of your site. With this plugin, you can list the IP addresses that you want to keep off of your site and even leave a personal message for each IP address.
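For readers curious about what digging through those server logs looks like, here is a rough Python sketch that counts requests per IP address. It assumes the log uses the Apache "combined" format, which your host may or may not use, and the sample lines and the name `requests_per_ip` are made up for illustration. An IP address that requests an unusually large number of pages is a candidate worth a closer look, nothing more.

```python
import re
from collections import Counter

# Apache "combined" log format (an assumption; your server may log differently):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def requests_per_ip(lines):
    """Count how many log lines each IP address produced; skip lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


# Made-up sample lines using documentation-only IP ranges.
sample_log = [
    '203.0.113.7 - - [10/Oct/2012:13:55:36 -0500] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2012:13:55:37 -0500] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2012:14:02:10 -0500] "GET / HTTP/1.1" 200 5120 "http://www.example.com/" "Mozilla/5.0"',
]

# Print the busiest IP addresses first.
for ip, count in requests_per_ip(sample_log).most_common():
    print(ip, count)
```

The IP addresses that stand out here are the kind of thing your webmaster would then block in the .htaccess file or that you would add to the WP-Ban list.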

One more word of caution in this situation: if you contact a content scraper site that has stolen your content, DO NOT EVER sign anything that they send you. BuzzMyFx has been known to sell domain protection packages. This sounds great at first; however, these packages actually take away your legal right to your content, login credentials, mailing list and any other personal information on your site. How can they do that? Well, it’s simple: if you sign a form given to you by a content thief, you have just signed your rights to the information on your site over to said thief.