I need some kind of service that extracts the title from a web page and returns it in the form of JSON. I would rather not parse the web page myself or waste any unnecessary CPU cycles. The call should look something like this:
curl "http://api.someservice.com/fetch?url=google.com&element=title&out=json"
The response from the API would be:
{
  "response": {
    "title": "Google",
    "source": "google.com"
  },
  "status": "success"
}
Any hint would be highly appreciated.
You should have a look at YQL - it's a general-purpose service from Yahoo! that can do this kind of scraping really easily. Try this:
select * from html where url="google.com" and xpath='//title'
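To call this from a script, the query has to be URL-encoded and sent to a YQL REST endpoint with JSON output requested. The following is only a rough sketch: the endpoint URL and the `q`/`format` parameters are assumptions based on the historical public YQL service and may no longer be reachable, and `build_yql_url` and `to_reply` are hypothetical helper names. It builds the request URL and shapes a title into the layout asked for in the question, without making any network call.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumption: the historical public YQL endpoint; it may no longer be available.
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def build_yql_url(page_url, xpath="//title"):
    """Return a GET URL that submits the YQL query and asks for JSON output."""
    query = f"select * from html where url=\"{page_url}\" and xpath='{xpath}'"
    # urlencode percent-escapes quotes, spaces and slashes in the query text.
    return YQL_ENDPOINT + "?" + urlencode({"q": query, "format": "json"})


def to_reply(title, source):
    """Shape an extracted title into the response layout from the question."""
    return {"response": {"title": title, "source": source}, "status": "success"}


if __name__ == "__main__":
    print(build_yql_url("google.com"))
    print(to_reply("Google", "google.com"))
```

The resulting URL can be passed to curl (in quotes) or to any HTTP client; the exact structure of the JSON that comes back is not shown here, so pulling the title out of it is left to the caller.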