Ever wanted to extract data from a web page automatically instead of spending the time and energy to copy and paste it by hand? No judgment, I have been there too. In my experience, the best Google Chrome extension for extracting data from a web page is Web Scraper.
Web Scraper is a lightweight web scraping tool with a point-and-click interface for capturing the data you need. It is the best free web scraping Chrome extension I know of for non-programmers who want to extract data from a website without hiring a web scraping professional. I use it frequently for my own simple web scraping needs.
I know there are many web scraping tools out there, but I consider this free tool the best for basic scraping of website data into CSV.
- Export data to CSV
- Download Images
- Extract data from dynamic websites
- Time delay between requests
- Extract data from many pages
It takes a little getting used to, but once you’ve played around with it for a while you realise it is an extremely powerful tool. And what’s more, it’s free!
It has a bit of a learning curve, and it is of no use in advanced web scraping scenarios such as submitting forms or bypassing CAPTCHAs.
|Platform/Deployment|Chrome browser extension|
|Training|Comprehensive, well-structured user documentation and demonstrative video tutorials|
|Learning curve|Has a bit of a learning curve|
|Usability|Easy to install, learn and understand|
|Data export options|CSV, CouchDB|
Web Scraper In Practice:
As an example, we will extract data from an e-commerce website with pagination, namely Amazon.
#1. Install extension
Install the extension from the Chrome Web Store. After installing it, restart Chrome to make sure the extension is fully loaded.
#2. Create Plan (Sitemap)
Create the sitemap by providing the following information:
- Name: Name of the sitemap
- Start URL: The URL on the target website where scraping starts.
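Behind the form, a sitemap is just a small JSON document. Here is a minimal sketch of what those two fields might look like. The key names `_id`, `startUrl` and `selectors` are my assumption of how the extension stores a sitemap, so compare them with a sitemap you export yourself; the sitemap name and the URL are placeholders.

```python
import json


def make_sitemap(name, start_url):
    """Return a minimal sitemap dict that has no selectors yet.

    The key names are assumed, not taken from the official documentation.
    """
    return {
        "_id": name,              # Name of the sitemap
        "startUrl": [start_url],  # Start URL (kept in a list)
        "selectors": [],          # filled in during the next step
    }


sitemap = make_sitemap("example-shop", "https://example.com/products?page=1")
print(json.dumps(sitemap, indent=2))
```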
#3. Create Selectors
After creating the plan (sitemap), we have to create selectors. Selectors identify the elements on the target website that contain the data points. You can add, edit, and delete selectors in the selectors panel.
Web Scraper provides a point-and-click interface for creating selectors. While creating a selector, you can preview the selected element and the data being scraped to make sure the right element is selected.
The following types of selectors are available:
- Text Selector: Used to extract text.
- Link Selector: Used to extract URLs and navigate the website.
- Link Popup Selector: Used to extract data from a popup.
- Image Selector: Used to extract an image URL or download the image.
- Table Selector: Used to extract data from tables.
- Element Attribute Selector: Used to extract the value of an element's attribute, for example the alt attribute of an image.
- HTML Selector: Used to extract the inner HTML of the selected element.
- Grouped Selector: Used to group multiple elements (data points) into one record.
- Element Selector: Used as a parent when the element contains multiple data items.
- Element Scroll Down Selector: Used to scroll down the page to load more elements.
- Element Click Selector: An Element Selector with click behavior.
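To make the list above more concrete, here is a rough sketch of how a few selectors might look as data once they are saved in a sitemap. Treat everything here as an assumption: the type names (`SelectorElement`, `SelectorText`, `SelectorLink`), the field names, and the CSS selectors are illustrative, and `dangling_parents` is a helper I made up to catch typos in parent names.

```python
# Hypothetical selectors for a paginated product listing.
selectors = [
    {"id": "product", "type": "SelectorElement",
     "parentSelectors": ["_root", "next"], "selector": "div.product", "multiple": True},
    {"id": "title", "type": "SelectorText",
     "parentSelectors": ["product"], "selector": "h2", "multiple": False},
    {"id": "price", "type": "SelectorText",
     "parentSelectors": ["product"], "selector": "span.price", "multiple": False},
    # The pagination link lists itself as a parent so every page gets visited.
    {"id": "next", "type": "SelectorLink",
     "parentSelectors": ["_root", "next"], "selector": "a.next", "multiple": False},
]


def dangling_parents(selectors):
    """Return (selector id, parent id) pairs whose parent is not defined."""
    known = {"_root"} | {s["id"] for s in selectors}
    return [(s["id"], p) for s in selectors
            for p in s["parentSelectors"] if p not in known]


print(dangling_parents(selectors))
```

An empty result means every parent name points at `_root` or at another selector in the list.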
#4. Inspect Selector Graph
After you have created all the selectors for the website you are going to scrape, you can inspect the graph structure of the selectors in the Selector graph panel. It shows the data extraction traversal process and the parent-child relationships between selectors. The screenshot below shows the selector graph for our example.
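If you want a feel for what the selector graph represents, the parent-child relationships can be sketched as a small tree in code. This is only an illustration under my own assumptions: the selector ids and the `parentSelectors` field are hypothetical, and the extension draws its graph in its own way.

```python
from collections import defaultdict

# Hypothetical selectors: only the ids and parents matter for the graph.
selectors = [
    {"id": "product", "parentSelectors": ["_root", "next"]},
    {"id": "title", "parentSelectors": ["product"]},
    {"id": "price", "parentSelectors": ["product"]},
    {"id": "next", "parentSelectors": ["_root", "next"]},
]


def build_graph(selectors):
    """Map each parent id to the ids of its child selectors."""
    children = defaultdict(list)
    for s in selectors:
        for parent in s["parentSelectors"]:
            children[parent].append(s["id"])
    return children


def render(children, node="_root", depth=0, seen=()):
    """Return indented lines for the tree, stopping when a node repeats on a path."""
    lines = ["  " * depth + node]
    if node in seen:  # the pagination link is its own parent, so guard against loops
        return lines
    for child in children.get(node, []):
        lines.extend(render(children, child, depth + 1, seen + (node,)))
    return lines


print("\n".join(render(build_graph(selectors))))
```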
After completing the sitemap with all the required selectors, you can start scraping the website by selecting the Scrape option from the sitemap menu. The scraper then asks for the following two parameters:
- Request interval (ms): Number of milliseconds between web requests
- Page load delay (ms): Number of milliseconds the scraper waits after a page loads before scraping it.
Finally, press the “Start Scraping” button. A new window will pop up and the scraping process begins.
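To show what these two parameters mean, here is a small sketch of a crawl loop that honours both delays. It is not how the extension is implemented; `crawl`, `fetch` and the default values are names and numbers I picked for illustration. The `sleep` and `clock` arguments are injectable so the example runs instantly.

```python
import time


def crawl(urls, fetch, request_interval_ms=2000, page_load_delay_ms=500,
          sleep=time.sleep, clock=time.monotonic):
    """Fetch each URL, spacing out requests and waiting after each page load."""
    results = []
    last_request = None
    for url in urls:
        if last_request is not None:
            # Request interval: wait until enough time has passed since the last request.
            wait = request_interval_ms / 1000 - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        page = fetch(url)
        # Page load delay: give the page time to finish rendering before scraping it.
        sleep(page_load_delay_ms / 1000)
        results.append(page)
    return results


# Demo with a fake fetch, a recording sleep and a frozen clock, so nothing really waits.
waits = []
pages = crawl(["a", "b"], fetch=str.upper, sleep=waits.append, clock=lambda: 0.0)
print(pages, waits)
```

On slow or heavily dynamic sites, a longer page load delay is usually the first thing I would try when data comes back empty.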
You can export data to CSV as shown below.
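Once you have the CSV file, a few lines of Python are enough to work with it. The sketch below assumes the export has a header row and uses made-up `title` and `price` columns with made-up values; your real columns will be named after your own selectors.

```python
import csv
import io

# Made-up sample standing in for an exported file.
sample_export = "title,price\nBlue mug,4.50\nRed mug,5.25\n"


def load_rows(csv_file):
    """Read an exported CSV into a list of dicts keyed by column name."""
    return list(csv.DictReader(csv_file))


rows = load_rows(io.StringIO(sample_export))
total = sum(float(row["price"]) for row in rows)
print(len(rows), "rows, total price:", total)
```

For a real export, replace `io.StringIO(sample_export)` with `open("your-export.csv", newline="", encoding="utf-8")`, where the file name is whatever you saved the download as.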
#8. Import the sitemap that I created to extract data from Amazon
Download the sitemap and import it using the Import Sitemap option.
For more details, you can read the official documentation and watch the video tutorials here.
Thanks for reading!