programming-languages, performance, screen-scraping

Screen Scraping Efficiency


We are going to scrape thousands of websites each night to update client data, and we are deciding which language to use for the scraping.

We are not locked into any platform or language, and I am simply looking for efficiency. If I have to learn a new language to make my servers perform well, that is fine.

Which language/platform will provide the highest scraping efficiency per dollar for us? I'm really looking for real-world experience with high-volume scraping. The goal is to get the most out of our CPU, memory, and bandwidth.


Solution

  • You will be I/O-bound anyway: the scraper spends nearly all of its time waiting on the network, so the performance of your code won't matter at all (unless you're a really bad programmer).
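
To make the I/O-bound point concrete, here is a minimal sketch in Python. It is not a real scraper: `fake_fetch`, the 50 ms delay, the `example.com` URLs, and the `max_concurrency` parameter are all assumptions made up for illustration. The network wait is simulated with `asyncio.sleep`, so the only thing that changes between runs is how many requests are allowed to be in flight at once.

```python
import asyncio
import time
from typing import Awaitable, Callable, Iterable


async def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for an HTTP request: it waits, but burns no CPU."""
    await asyncio.sleep(0.05)  # assumed 50 ms round trip, purely illustrative
    return f"<html>{url}</html>"


async def scrape_all(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[str]] = fake_fetch,
    max_concurrency: int = 100,
) -> list[str]:
    """Fetch every URL, keeping at most max_concurrency requests in flight."""
    limit = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> str:
        # The semaphore caps how many fetches wait on the "network" at once.
        async with limit:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


def timed_run(n_urls: int, max_concurrency: int) -> float:
    """Return wall-clock seconds needed to 'scrape' n_urls made-up pages."""
    urls = [f"https://example.com/page/{i}" for i in range(n_urls)]
    start = time.perf_counter()
    asyncio.run(scrape_all(urls, max_concurrency=max_concurrency))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"one at a time: {timed_run(20, 1):.2f}s")
    print(f"20 in flight:  {timed_run(20, 20):.2f}s")
```

With the simulated delay, the run that allows 20 requests in flight should finish far sooner than the one-at-a-time run, even though the per-page code is identical. In a real deployment you would swap `fake_fetch` for an actual HTTP client call and tune the concurrency limit against your bandwidth and the target sites' tolerance, which is where the per-dollar efficiency is more likely to come from than from the choice of language.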