In “SEO for Google Shopping” I discussed the need to optimize product feeds. I stated that adding keyword-rich product descriptions and titles to the feeds is scalable with “scraping,” but I didn’t describe it any further.
In this post I explain scraping. I will review why it is useful and how it relates to search engine optimization. Scraping can speed up many tasks, eliminating hours of manual labor.
What is scraping?
Scraping is the process of extracting elements such as text, code, and images from a web page. Scraper applications range from browser extensions to standalone software.
Scraping replaces the manual copying and pasting of items on a page with a mouse and keyboard. For example, a human could spend hours manually copying 500 title tags. With a good scraper, it would take a minute.
Scraping is becoming more and more common. For example, the web crawler Screaming Frog uses scraping to extract data from a website.
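To make that concrete, here is a minimal Python sketch that uses only the standard library. It pulls the title tag out of pages whose HTML has already been downloaded. The names `TitleParser` and `extract_title` and the sample pages are made up for illustration, and a real scraper would also need to fetch the pages and cope with messier markup.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace and trim both ends.
        return " ".join("".join(self._parts).split())


def extract_title(html):
    """Return the title text of one HTML document (empty string if none)."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title


# Hypothetical sample data: URL path -> already-downloaded HTML.
pages = {
    "/blue-widget": "<html><head><title> Blue Widget | Shop </title></head></html>",
    "/red-widget": "<html><head><title>Red Widget | Shop</title></head></html>",
}

for path, html in pages.items():
    print(path, extract_title(html))
```

The same loop works just as well for 500 pages as for two, which is where the time savings come from.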
Google scrapes websites to show rich snippets in organic search results. The text in Google’s answer boxes comes from scraping.
For years retailers have been scraping the product pages of their competitors to get their prices quickly. Your website may be getting scraped as you read this.
Scraping your own site can come in handy. Scraping allows all of your products and prices to be quickly summarized in a single table for further analysis.
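As a rough illustration of building that table, the sketch below turns product markup into CSV rows. It assumes a hypothetical page structure in which each product has a `<span class="name">` followed by a `<span class="price">`; your own templates will almost certainly use different tags and class names, so treat `ProductParser` and `products_to_csv` as placeholders to adapt.

```python
import csv
import io
from html.parser import HTMLParser


class ProductParser(HTMLParser):
    """Collects [name, price] rows from the assumed span-based markup."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished [name, price] rows
        self._field = None   # "name" or "price" while inside such a span
        self._current = {}   # fields gathered for the product in progress

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = (dict(attrs).get("class") or "").split()
        for field in ("name", "price"):
            if field in classes:
                self._field = field
                self._current[field] = ""  # start this field fresh

    def handle_data(self, data):
        if self._field:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._field:
            self._field = None
            # A row is complete once both fields have been seen.
            if "name" in self._current and "price" in self._current:
                self.rows.append([
                    self._current["name"].strip(),
                    self._current["price"].strip(),
                ])
                self._current = {}


def products_to_csv(html):
    """Return a CSV string with one row per product found in the HTML."""
    parser = ProductParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "price"])
    writer.writerows(parser.rows)
    return out.getvalue()


# Hypothetical category page, already downloaded.
sample_html = """
<div class="product"><span class="name">Blue Widget</span>
  <span class="price">$9.99</span></div>
<div class="product"><span class="name">Red Widget</span>
  <span class="price">$12.50</span></div>
"""

print(products_to_csv(sample_html))
```

The resulting CSV opens directly in a spreadsheet, which is usually the easiest place to do the further analysis.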
Content thieves use scraping to reproduce articles and images. Spammers rely on scraping tools to copy websites and imitate their success. Such tools also make it easier for spammers to scrape selected content and rework it into new posts. Google doesn’t like this because the result is usually low-value pages. For spammers, however, it can be a quick way to game Google with sheer volume. Sometimes it works, but nowhere near as well as it used to.
SEO tools scrape Google’s search results to determine rankings. These tools perform millions of searches every day to get updated ranking information. Google has tried pressuring rank-tracking companies to stop. Scraping costs Google money because it has to serve every results page to the bot. It also inflates search-volume metrics.
Scraping can handle SEO tasks at scale. For example, let’s say a competitor’s website often shows up on the first page of Google for a handful of terms. You can search for each term and write down the results, or you can scrape the results from Google. A good scraper will also let you export the data.
Almost anything on the web can be scraped. The fun part is figuring out when and how to do it. For example, a customer recently wanted to update all of their logos on the Internet as part of a branding exercise. With ScrapeBox and a few minutes of setup, I had a full table of every website Google knew of that showed the outdated logo. Each row listed the specific image URL and the page where it appeared.
Websites sometimes prohibit scraping as part of their terms and conditions. For example, a few years ago LinkedIn sued 100 people who used scrapers to copy user data. Knowing what a website will allow (or not allow) in terms of scraping is important.
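One machine-readable signal you can check before scraping is a site’s robots.txt file. It is not the same thing as the terms and conditions, which still have to be read by a human, but it does state which paths the site asks bots to stay out of. Below is a minimal sketch using Python’s standard `urllib.robotparser`; the helper name `allowed_by_robots`, the user agent `my-scraper`, and the example rules are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content, already downloaded from the target site.
example_rules = """\
User-agent: *
Disallow: /private/
"""

for url in ("https://example.com/products/1", "https://example.com/private/data"):
    print(url, allowed_by_robots(example_rules, "my-scraper", url))
```

A permissive robots.txt does not override a site’s terms, so this check is a first filter rather than a green light.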
Scraping opens up options that you may never have considered. If you have ever asked, “Is there some way to get all of this data at once?” a well-thought-out scraping strategy could be the answer.