Web Scraper Review

By | March 5, 2017


Ever wanted to extract data from a web page automatically, without spending the extra time and energy to copy and paste it by hand? No judgment; I have been there, too. In my experience, the best Google Chrome extension for extracting data from a web page is Web Scraper.

Web Scraper is a lightweight web scraping tool that provides a point-and-click interface for capturing the data you need. In my view it is the best free web scraping Chrome extension for non-programmers who want to extract data from a website without hiring a web scraping professional. I use it frequently for my own simple web scraping needs.

I know there are many web scraping tools out there, but I consider this free tool the best for basic scraping of website data into CSV.

Salient Features:

  • Export data to CSV
  • Download Images
  • Extract data from dynamic websites
  • Time delay between requests
  • Extract data from many pages

Pros:
Takes a little bit of getting used to – but once you’ve played around with it for a while you realise this is an extremely powerful tool. And what’s more, it’s free!

Cons:
Has a bit of a learning curve. It is of no use in advanced web scraping scenarios such as submitting forms or bypassing CAPTCHAs.

Product Information:

Price/Cost: Free
Platform/Deployment: Chrome browser extension
Training: Comprehensive, well-structured user documentation and demonstrative video tutorials
Seller Information: webscraper.io
Learning Curve: Has a bit of a learning curve
Functionality: Basic
Usability: Easy to install, learn, and understand
Data Export Options: CSV, CouchDB

Web Scraper In Practice:

As an example, we will extract data from an e-commerce website with pagination, namely Amazon.

#1. Install extension
Install the extension from the Chrome Web Store. After installing it, restart Chrome to make sure the extension is fully loaded.

#2. Create Plan (Sitemap)
Create the sitemap by providing the following information:

  • Name: Name of the sitemap
  • Start URL: The URL on the target website where scraping will start.

Create Sitemap
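For readers who like to see the plan as data: a sitemap can also be written down as JSON. Here is a minimal sketch in Python, assuming the JSON uses `_id`, `startUrl`, and `selectors` keys. The helper `make_sitemap`, the sitemap name, and the URL are placeholders made up for illustration, not the ones used in this review.

```python
import json


def make_sitemap(name, start_url):
    """Return a minimal sitemap dict: a name, a start URL, and no selectors yet.

    The key names are an assumption about the sitemap JSON layout.
    """
    if not name or not start_url.startswith(("http://", "https://")):
        raise ValueError("a sitemap needs a name and an absolute start URL")
    return {"_id": name, "startUrl": [start_url], "selectors": []}


# Hypothetical name and URL, used only for this example.
sitemap = make_sitemap("example-shop", "https://example.com/products?page=1")
print(json.dumps(sitemap, indent=2))
```

The selectors list is empty at this point; it gets filled in during the next step.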

#3. Create Selectors
After creating the plan, or sitemap, we have to create selectors. Selectors identify the elements on the target website that contain the data points. You can add, edit, and delete selectors in the Selectors panel.

Create Selectors

Web Scraper provides a point-and-click interface for creating selectors. While creating a selector, you can preview the selected element and the data being scraped to make sure the right element is selected.
Point and Click Interface

The following types of selectors are available:

  • Text selector: Used to extract text.
  • Link selector: Used to extract URLs and navigate the website.
  • Link Popup selector: Used to extract data from a popup.
  • Image selector: Used to extract an image URL or download the image.
  • Table selector: Used to extract data from tables.
  • Element attribute selector: Used to extract the value of an element's attribute, for example the alt attribute of an image.
  • HTML selector: Used to extract the inner HTML of the selected element.
  • Grouped selector: Used to group multiple elements (data points) into one record.
  • Element selector: Used as a parent when the element contains multiple data items.
  • Element Scroll Down selector: Used to scroll down the page to load more elements.
  • Element Click selector: An Element selector with click behavior.

Selector Types
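To make the parent and child idea concrete, here is a rough Python sketch of how a few selectors for a paginated product listing might be written down as data. Everything in it is an assumption for illustration: the type names (`SelectorElement`, `SelectorText`, `SelectorLink`), the field names, the helper `selector`, and the CSS selectors are hypothetical and are not taken from the Amazon sitemap in this review.

```python
def selector(sel_id, sel_type, css, parents=("_root",), multiple=False):
    """Describe one selector as a plain dict (assumed field names)."""
    return {
        "id": sel_id,
        "type": sel_type,
        "selector": css,
        "parentSelectors": list(parents),
        "multiple": multiple,
    }


selectors = [
    # Link selector that follows the "next page" link; it lists itself as a
    # parent so that it is applied again on every page it opens.
    selector("next", "SelectorLink", "a.next", parents=("_root", "next")),
    # Element selector wrapping each product, on the first and on later pages.
    selector("product", "SelectorElement", "div.product",
             parents=("_root", "next"), multiple=True),
    # Text selectors that read one data point inside each product element.
    selector("name", "SelectorText", "h2.title", parents=("product",)),
    selector("price", "SelectorText", "span.price", parents=("product",)),
]

for s in selectors:
    print(s["id"], s["type"], "<-", ", ".join(s["parentSelectors"]))
```

Wrapping `name` and `price` in one Element selector is what keeps each product's fields together in a single record.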

#4. Inspect Selector Graph
After you have created all the selectors for the website you are going to scrape, you can inspect their graph structure in the Selector graph panel. It shows the order in which data extraction traverses the selectors and the parent-child relationships between them. The screenshot below shows the selector graph for our example.

Selector Graph
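The same parent-child structure can be sketched in a few lines of Python. This is only an illustration of the idea, not how the extension draws its graph; the selector ids and the `parentSelectors` field are assumptions, and `build_graph` and `render` are hypothetical helpers.

```python
from collections import defaultdict


def build_graph(selectors):
    """Map each parent id to the ids of the selectors that run under it."""
    children = defaultdict(list)
    for sel in selectors:
        for parent in sel["parentSelectors"]:
            children[parent].append(sel["id"])
    return children


def render(children, node="_root", depth=0, path=()):
    """Return the tree as indented lines, marking pagination loops."""
    lines = ["  " * depth + node]
    for child in children.get(node, []):
        if child in path + (node,):
            # A selector that is its own ancestor (e.g. a "next page" link).
            lines.append("  " * (depth + 1) + child + " (loops back)")
        else:
            lines.extend(render(children, child, depth + 1, path + (node,)))
    return lines


selectors = [
    {"id": "next", "parentSelectors": ["_root", "next"]},
    {"id": "product", "parentSelectors": ["_root", "next"]},
    {"id": "name", "parentSelectors": ["product"]},
]
print("\n".join(render(build_graph(selectors))))
```

Reading the printout top to bottom gives the traversal order: the start page first, then each paginated page, with the product fields nested under their element.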

#5. Scrape
After completing the sitemap with all the required selectors, you can start scraping the website by selecting the Scrape option from the sitemap menu. When you select Scrape, the scraper asks for the following two parameters:

  • Request interval (ms): Number of milliseconds between web requests.
  • Page load delay (ms): Number of milliseconds the scraper waits after a page loads before scraping it.

Finally, press the “Start Scraping” button. A new window pops up and the scraping process begins.
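If it helps to picture what those two numbers do, here is a simplified Python sketch of a polite crawl loop. It is my own interpretation of the two settings, not the extension's actual scheduling logic; the function `crawl`, its default values, and the stand-in `fetch` are assumptions chosen for illustration.

```python
import time


def crawl(urls, fetch, request_interval_ms=2000, page_load_delay_ms=500,
          sleep=time.sleep):
    """Visit urls in order, pausing between requests and after each page load."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(request_interval_ms / 1000)  # gap between consecutive requests
        page = fetch(url)
        sleep(page_load_delay_ms / 1000)       # give dynamic content time to load
        results.append((url, page))
    return results


# Demo with stand-ins, so nothing touches the network or actually waits:
# str.upper plays the role of "fetching" and waits.append records each pause.
waits = []
pages = crawl(["page-1", "page-2", "page-3"], fetch=str.upper, sleep=waits.append)
print(pages)
print(waits)
```

A longer request interval is gentler on the target site, while a longer page load delay matters mostly on dynamic pages where the data appears after the initial load.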

#6. Browse Extracted data
After scraping finishes successfully, an alert pops up to notify you that the scraper has finished, and you can view the extracted data by opening the Browse panel.
Browse Extracted Data

#7. Export
You can export data to CSV as shown below.
Export Data As CSV
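As a rough picture of what the CSV contains, the Python sketch below writes a couple of made-up records to CSV text with one column per selector id. The records, column names, and the helper `to_csv` are hypothetical; the extension's real export may order or name its columns differently.

```python
import csv
import io


def to_csv(records):
    """Serialise scraped records to CSV text, one column per selector id."""
    # Collect column names in first-seen order across all records.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing fields are written as empty cells
    return buffer.getvalue()


# Made-up sample data, not real scrape results.
records = [
    {"name": "Example Widget", "price": "9.99"},
    {"name": "Sample Gadget", "price": "19.50",
     "image-src": "https://example.com/g.jpg"},
]
print(to_csv(records))
```

Because every selector becomes its own column, a sitemap with many selectors produces a wide file, so it pays to keep only the selectors you really need.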

#8. Import the sitemap that I created to extract data from Amazon
Download the sitemap and import it using the Import Sitemap option.
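Before importing a sitemap someone else has shared, a quick sanity check can save some head-scratching. The Python sketch below parses the JSON and reports obvious problems. It assumes the sitemap JSON has `_id`, `startUrl`, and `selectors` keys and that each selector lists its parents under `parentSelectors`; `check_sitemap` is a hypothetical helper, not part of the extension.

```python
import json

# Assumed top-level keys of a sitemap JSON document.
REQUIRED_KEYS = {"_id", "startUrl", "selectors"}


def check_sitemap(text):
    """Parse sitemap JSON and return a list of problems (empty if none found)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err.msg}"]
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(data))]
    sels = [s for s in data.get("selectors", []) if isinstance(s, dict)]
    known = {"_root"} | {s.get("id") for s in sels}
    for s in sels:
        for parent in s.get("parentSelectors", []):
            if parent not in known:
                problems.append(
                    f"selector {s.get('id')!r} has unknown parent {parent!r}")
    return problems


# Hypothetical sitemap text with one dangling parent reference.
sample = ('{"_id": "demo", "startUrl": ["https://example.com"], "selectors": '
          '[{"id": "name", "parentSelectors": ["product"]}]}')
print(check_sitemap(sample))
```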

For more details, you can read the official documentation and watch the video tutorials here.

Thanks for reading!

71 thoughts on “Web Scraper Review”

  1. Charles H Alban

    other videos show a right click to scrape… “scrape similar”. your instructions don’t mention that, and i don’t get “scrape similar” with right click.

    has this feature been removed or have i not installed it properly?

  2. innit

    Not so great at scraping an e-commerce site. I tried creating link selectors for all the categories in a mega-nav, leading to links for the items shown and to pagination links (each a child of itself, to follow through all pages), and I also created text selectors for name and price.
    Firstly, webscraper.io gets very confused and mixes up data or randomly extracts whatever it wants.
    Secondly, I can't seem to reuse created selectors such as item or name without confusing the scraper and breaking it.
    Thirdly, the export creates a column for each selector. So in the use case above, if you create extra selectors to fix the issues mentioned, the export you eventually get has many columns, which makes it hard to read.
    I can't say I would advise using Web Scraper for anything useful.
