HTML screen scraping with the HtmlAgilityPack library

added by ebizdom
8/28/2010 5:26:02 PM


What is screen scraping? Screen scraping is the process of reading a web page and extracting data from its HTML tags. In this article, I will examine how to scrape a given web page using the HtmlAgilityPack library. The library, available at http://htmlagilitypack.codeplex.com/, is a .NET code library that allows you to parse "out of the web" HTML files. In this tutorial, I will read my own web site, http://savebigbucks.ca, which offers daily deals in Canada. Here is the code snippet that reads the web page.
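The original snippet is not included in this summary, so what follows is only a minimal sketch of the general technique, not the author's code. It assumes the HtmlAgilityPack NuGet package is referenced. The class name `ScrapeSketch`, the method `ExtractLinks`, and the sample markup are all hypothetical, since the structure of the real page is not shown here. The sketch parses an HTML string and collects the text and `href` of every link.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class ScrapeSketch
{
    // Returns the visible text and href of every <a> element that has an href attribute.
    public static List<(string Text, string Href)> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<(string Text, string Href)>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            string href = anchor.GetAttributeValue("href", string.Empty);
            // Decode entities such as &amp; and trim surrounding whitespace.
            string text = HtmlEntity.DeEntitize(anchor.InnerText).Trim();
            links.Add((text, href));
        }

        return links;
    }

    public static void Main()
    {
        // Hypothetical sample markup; the real page's structure is an assumption.
        string html =
            "<html><body>" +
            "<div class='deal'><a href='/deal/1'>50% off pizza</a></div>" +
            "<div class='deal'><a href='/deal/2'>Spa day &amp; lunch</a></div>" +
            "</body></html>";

        foreach (var (text, href) in ExtractLinks(html))
        {
            Console.WriteLine($"{text} -> {href}");
        }

        // To scrape a live page instead, HtmlWeb can fetch and parse in one step:
        // HtmlDocument live = new HtmlWeb().Load("http://savebigbucks.ca");
    }
}
```

Keeping the parsing in a method that takes a string makes it easy to try against saved HTML before pointing it at a live site. The XPath expression `//a[@href]` would need to be adapted to whatever elements actually hold the data on the target page.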


1 comment

Fernir
6/13/2011 8:49:44 AM
HtmlAgilityPack is a really nice parsing library, but for data scraping (gathering) it's better to use something like the Gogybot library, which can retrieve the HTML pages, while HtmlAgilityPack parses the data.