How do you get data from a web page into an Excel spreadsheet automatically? The answer is web scraping. A number of tools, such as Easy Web Extract, Fminer, and Helium Scraper, help you extract data from web pages, but none of them is as easy to use as Scraper, a Google Chrome extension.
Scraper is a Google Chrome extension that lets you extract data from a website and then export it as TSV or to an Excel spreadsheet. It is a simple, easy-to-use web scraper for anyone who wants web page data in a spreadsheet quickly.
Download and install the Scraper extension for free from the Chrome Web Store. Once it is installed, you can access it by right-clicking any element on a web page and choosing the “Scrape similar…” option.
Scraper is based on XPath and jQuery selectors. If you are new to either, quickly go through the W3Schools XPath tutorial and jQuery selectors tutorial. You can also install the XPath Helper Chrome extension to learn XPath and get help writing expressions.
Using XPath and jQuery selectors, you specify which web page elements to scrape.
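To make the idea of selecting elements with XPath concrete, here is a minimal Python sketch. It is not how Scraper itself works internally: it uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath, whereas Scraper runs expressions in the browser. The page fragment, the `cities` table, and the `scrape_rows` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <body>
    <table id="cities">
      <tr><th>City</th><th>Country</th></tr>
      <tr><td>Paris</td><td>France</td></tr>
      <tr><td>Lima</td><td>Peru</td></tr>
    </table>
    <p>Unrelated paragraph.</p>
  </body>
</html>
"""


def scrape_rows(markup, row_xpath, column_xpaths):
    """Return one list per element matched by row_xpath, with one cell per column XPath."""
    root = ET.fromstring(markup)
    rows = []
    for element in root.findall(row_xpath):
        cells = []
        for column_xpath in column_xpaths:
            match = element.find(column_xpath)
            # A column with no match becomes an empty cell.
            cells.append("" if match is None else "".join(match.itertext()).strip())
        rows.append(cells)
    return rows


# The row selector picks table rows that contain <td> cells (skipping the header row);
# each column selector is evaluated relative to the matched row.
rows = scrape_rows(PAGE, ".//table[@id='cities']/tr[td]", ["td[1]", "td[2]"])
print(rows)
```

The split into one row selector plus one selector per column loosely mirrors the way a scrape is described as a main selector and a set of columns.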
Scraper has the following useful features:
- Both XPath and jQuery selectors are supported
- Add/Remove columns
- Live Preview of Results
- Copy data to clipboard in TSV (Tab Separated Values) format
- Export Results to Google Docs
- Reset the options to their previous values
- Save your current scraper project settings as a preset to quickly restore them in the future
- Option to exclude empty results from final output
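Two of these features, TSV output and excluding empty results, are easy to picture in code. The following Python sketch is only an illustration under assumed names (`rows_to_tsv` and the sample rows are hypothetical); it does not reproduce Scraper's own implementation.

```python
import csv
import io


def rows_to_tsv(rows, exclude_empty=True):
    """Serialize rows as tab-separated text, optionally skipping rows whose cells are all blank."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for row in rows:
        if exclude_empty and not any(cell.strip() for cell in row):
            continue  # drop rows that carry no data
        writer.writerow(row)
    return buffer.getvalue()


# Made-up sample rows, including one empty result.
sample = [["City", "Country"], ["", ""], ["Paris", "France"]]
tsv_text = rows_to_tsv(sample)
print(tsv_text)
```

Text in this format can be pasted straight into a spreadsheet, because each tab starts a new column and each line starts a new row.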
Let us understand Scraper by taking a practical example:
#1. Load target web page:
Go to the target web page. In our case… Wikipedia page…
#2. Scrape the data
#3. Fine tuning
Scraper will set some defaults. If you are not happy with them, go ahead and set a different XPath selector to suit your needs. Press the Scrape button to update the results.
Customized XPath in our case:
#4. Export the data
Have a look at the video…
The only drawback of the Scraper extension is that it extracts data from a single web page at a time. It cannot extract data from multiple web pages or from websites with pagination. If you need that, have a look at another nice extension called Web Scraper.
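To show what scraping a paginated site involves beyond a single page, here is a rough Python sketch of the loop. It says nothing about how the Web Scraper extension works; `scrape_all_pages`, `fake_fetch`, `fake_scrape`, and the stand-in data are hypothetical names invented so the example runs offline.

```python
def scrape_all_pages(fetch_page, scrape_page, first_page=1, max_pages=50):
    """Collect rows page by page until a page yields no rows or max_pages is reached."""
    all_rows = []
    for page_number in range(first_page, first_page + max_pages):
        markup = fetch_page(page_number)
        rows = scrape_page(markup)
        if not rows:
            break  # an empty page is treated as the end of the listing
        all_rows.extend(rows)
    return all_rows


# Stand-in "site": page number -> comma-separated items (purely illustrative).
FAKE_SITE = {1: "a,b", 2: "c", 3: ""}


def fake_fetch(page_number):
    return FAKE_SITE.get(page_number, "")


def fake_scrape(markup):
    return [[cell] for cell in markup.split(",") if cell]


combined = scrape_all_pages(fake_fetch, fake_scrape)
print(combined)
```

In a real scraper, the fetch step would download each page and the scrape step would apply the selectors; the `max_pages` cap is a simple guard against looping forever.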
Congratulations! After reading this tutorial, you should have a basic understanding of how the Scraper extension can be used to extract data from a web page. It is one of several options available for this task. You can also have a look at another nice video tutorial on this extension…
Scraper is open-sourced under a BSD license. Source code is available on GitHub.
I hope this tutorial is helpful to you.
If you have any questions about the Scraper extension or anything related to web scraping, feel free to ask.