web-scraping, computer-science, ranking

Is web-scraping legal for scientific purposes?


I am writing a research paper on a service ranking algorithm, and I want to demonstrate its performance and accuracy by running it on public data, for example from the Apple Store, Google Play, Expedia, etc. Can I parse their data from HTML and use it in my research, or would I be committing an illegal act (web scraping)?

And should I explicitly mention in my research that the data is used only for scientific purposes?

I've read about web scraping and the controversy over its legality, but I did not find any article that addresses its use for purely scientific purposes.

Thanks in advance


Solution

  • There is nothing inherently illegal about web-scraping a site.

    However, I would suggest that you read the particular site's "Terms of Use" to see whether they expressly forbid it. For example, the Expedia Terms of Use at http://www.expedia.ie/p/support/termsofuse state:

    you may not visit or make available the website or any part of the web pages of the website by automatic means, such as by using crawlers or shop bots to systematically retrieve or copy information or connect the content of the website functionally to another website via links

    That being said, as long as you don't put an unreasonable load on the site or republish its content as your own, I don't expect you will run into any problems.
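
To make the "unreasonable load" point concrete, here is a minimal sketch of one way to space out requests. It assumes the site's terms permit automated access in the first place. The `Throttle` class and the two-second interval are hypothetical choices for illustration, not something any particular site prescribes.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first one

    def wait(self):
        """Pause until min_interval_s has passed since the previous call.

        Returns the number of seconds slept (0.0 if no pause was needed).
        """
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval_s - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

In a scraping loop you would call `throttle.wait()` immediately before each page request, for example with `throttle = Throttle(2.0)`, so that the site never receives requests from you more often than once every two seconds.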