search-engine web-scraping web-crawler

What is the difference between web-crawling and web-scraping?


Is there a difference between crawling and web scraping?

If there's a difference, what's the best method to use in order to collect some web data to supply a database for later use in a customised search engine?


Solution

  • Crawling is essentially what Google, Yahoo, MSN, etc. do: following links across the web and looking for ANY information. Scraping is generally targeted at certain websites and at specific data, e.g. for price comparison, so the two are coded quite differently.

    Usually a scraper is bespoke to the websites it is supposed to scrape, and does things a (good) crawler wouldn't do, for example:

    • Have no regard for robots.txt
    • Identify itself as a browser
    • Submit forms with data
    • Execute JavaScript (if required to act like a user)
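
To make the contrast concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a production crawler or scraper: the page markup, the `price` class name, the `ExampleCrawler` user agent, and the helper names `allowed_links` and `scrape_prices` are all hypothetical. The crawler-style function collects every link and filters it through robots.txt; the scraper-style function ignores links and pulls one known field from a known layout.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Crawler-style parsing: collect every link, whatever the page is about."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


class PriceExtractor(HTMLParser):
    """Scraper-style parsing: pull one specific field from a known layout."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # Assumed (hypothetical) layout: <span class="price">...</span>
        if tag == "span" and dict(attrs).get("class") == "price":
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def allowed_links(html, base_url, robots_txt, user_agent="ExampleCrawler"):
    """Return the links a polite crawler may follow according to robots.txt."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    extractor = LinkExtractor(base_url)
    extractor.feed(html)
    return [url for url in extractor.links if robots.can_fetch(user_agent, url)]


def scrape_prices(html):
    """Return the price strings found in the assumed page layout."""
    extractor = PriceExtractor()
    extractor.feed(html)
    return extractor.prices


# Hypothetical sample data, so the sketch runs without network access.
PAGE = (
    '<a href="/products/1">Widget</a> '
    '<a href="/private/admin">Admin</a> '
    '<span class="price">$9.99</span>'
)
ROBOTS = "User-agent: *\nDisallow: /private/"

print(allowed_links(PAGE, "https://example.com/", ROBOTS))
print(scrape_prices(PAGE))
```

In this sketch the crawler side knows nothing about the page's content and only decides which URLs it may visit next, while the scraper side is tied to one site's markup and would need rewriting if that markup changed.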