php, javascript, screen-scraping

Remotely processing a form on another website and realtime screen scraping


OK, this is quite complicated and I'm not even sure it is possible. I need some insight from knowledgeable people on how I should proceed.

I need to process a form on a remote site, screen scrape the results (on the fly), parse the information and display it back to the end user.

--More clearly explained by example--

  • [1] my site is -> sitea.com
  • [2] the form is on -> somebodyelseswebsite.com (no DB access, but the form is public)


Here's my logic:

  1. I can replicate the form from site [2] and make an exact copy on my site [1].
  2. When the user submits the form, I need some kind of object in the POST (JavaScript?) that will assign the user's input to ... and process the form on site [2], screen scrape the results, and return the data in an array, which I can display on my site [1].


key points:

  • The user must not be aware of the transaction with site [2].
  • This must happen in real time and be fast.


So can this be done? If yes, how? I know about PHP cURL. Can I use only PHP, or do I need something else?



Solution

  • Yes, this can be done, and cURL is one way to do it. You need some pretty heavy error checking and validation for any sort of reliability, though. You'd use a cURL POST to replicate the behavior of that form's fields (assuming the remote host doesn't use any sort of form key, IP block, referer checking, etc.). Then you'd need to scrape the response, and I think that's the difficult part.

    For that part, I'd use a DOM parser so you can target very specific elements. Here is a post on how to do that.
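
To make the two steps concrete, here is a rough PHP-only sketch: a cURL POST that replays the form, followed by a DOMDocument/DOMXPath pass over the returned HTML. It is one possible shape, not a definitive implementation. The function names `submitRemoteForm` and `extractResultRows`, the `/search` path, the `q` field, and the `results` table id are all made up for illustration; the real values depend on the markup of the form on site [2]. It also assumes the remote host has no form key, IP block, or referer check.

```php
<?php
declare(strict_types=1);

/**
 * POST the given fields to a remote form handler and return the response HTML.
 * $url and $fields are placeholders: take the real action URL and field
 * names from the remote form's markup.
 */
function submitRemoteForm(string $url, array $fields, int $timeoutSeconds = 10): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Could not initialise cURL');
    }
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields), // urlencoded body, like a normal form
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow a redirect issued after the POST
        CURLOPT_CONNECTTIMEOUT => 5,     // fail fast so the end user is not left waiting
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);

    $body   = curl_exec($ch);
    $error  = curl_error($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false) {
        throw new RuntimeException("Request failed: $error");
    }
    if ($status < 200 || $status >= 300) {
        throw new RuntimeException("Unexpected HTTP status $status");
    }
    return $body;
}

/**
 * Return the text of every <td> in each row matched by $rowQuery,
 * as an array of rows. Rows without <td> cells (e.g. header rows) are skipped.
 */
function extractResultRows(string $html, string $rowQuery): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    $previous = libxml_use_internal_errors(true); // scraped HTML is rarely valid; silence warnings
    $doc      = new DOMDocument();
    $loaded   = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        throw new RuntimeException('Could not parse the response as HTML');
    }

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($rowQuery);
    if ($nodes === false) {
        throw new InvalidArgumentException("Invalid XPath expression: $rowQuery");
    }

    $rows = [];
    foreach ($nodes as $row) {
        $cells = [];
        foreach ($xpath->query('./td', $row) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Hypothetical usage on site [1] (URL path, field name, and table id are assumptions):
// $html = submitRemoteForm('https://somebodyelseswebsite.com/search', ['q' => $_POST['q'] ?? '']);
// $rows = extractResultRows($html, '//table[@id="results"]//tr');
```

Because the request runs server-side, the end user only ever talks to site [1]. The parsing half is kept separate from the network half so it can be checked against a saved copy of the remote page, which helps when the remote markup changes and the XPath query needs adjusting.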