I want to load HTML content through the Tor network and execute its JavaScript so that additional content loaded via AJAX also goes through Tor. This must be automated by a script that runs on a Linux server without any human interaction. I can't find a combination of tools that enables automated execution of JavaScript that was received through the Tor network.
I want to write an application with these characteristics:
environment
features
The environment constraints forbid the use of an interactive web browser. Everything must be done by programs or scripts. The feature constraints require that the JavaScript is executed without connecting directly to the internet; all of its requests must go through the Tor network.
Tor
To use the Tor network I can run a Tor client that provides a socket on my machine. Then I write a Perl script that connects to this socket. The Perl script sends HTTP and HTTPS requests through this socket to the Tor client, which subsequently routes them through the Tor network. All responses come back the same way.
I've tested this and it works fine. But in a Perl script it is really hard to execute the JavaScript that comes with the received HTML documents. I would have to write a JavaScript emulator in Perl to make this possible, and that is far beyond my available time and my skills.
JavaScript
To execute embedded or attached JavaScript I can use a tool like PhantomJS or SlimerJS (PhantomJS does not work properly on Ubuntu 12.04, so I use SlimerJS, which offers almost the same features). With these tools I can load HTML documents and automatically get all the JavaScript that comes with them executed, so I also receive all content that is not part of the initial HTML document but gets loaded later by AJAX or similar techniques. In addition, I can easily analyze the document's DOM tree to extract the items I am interested in.
I've tested this too and it also works fine, but the tools I know (PhantomJS and SlimerJS) use their own procedures to connect to the internet. There seems to be no way to tell them to connect to a socket and communicate with the internet through it.
Is there a way to automatically execute Ajax calls through the Tor network?
To me there seem to be two possible ways:
If you have a Tor client running, you can use the address it's listening on as the proxy setting. Check the docs for the proxy options you need to pass:
The proxy type will be SOCKS. Remember that you need the local address and port the Tor client's socket is bound to.
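As a rough illustration of the proxy settings described above, here is a minimal Python sketch of a wrapper that starts the headless browser with a SOCKS proxy. It assumes the Tor client listens on 127.0.0.1:9050 (Tor's default SOCKS port; yours may differ) and that your PhantomJS or SlimerJS version accepts the `--proxy` and `--proxy-type` command-line options, so verify both against the docs. The names `build_command`, `run_through_tor` and `scrape.js` are made up for this example.

```python
import subprocess

# Assumptions: the Tor client runs on this machine and uses its default
# SOCKS port. Adjust both values to match your Tor configuration.
TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9050


def build_command(binary, script, host=TOR_SOCKS_HOST, port=TOR_SOCKS_PORT):
    """Return the argument list that starts `binary` behind a SOCKS5 proxy.

    The options are placed before the script name, because anything after
    the script name is passed to the script itself.
    """
    return [
        binary,
        f"--proxy={host}:{port}",
        "--proxy-type=socks5",
        script,
    ]


def run_through_tor(binary, script):
    """Run the headless browser script and return what it wrote to stdout."""
    result = subprocess.run(
        build_command(binary, script),
        capture_output=True,
        text=True,
        check=True,  # raise if the browser exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Only print the command here; call run_through_tor() once Tor and the
    # headless browser are installed. "scrape.js" is a hypothetical script.
    print(" ".join(build_command("slimerjs", "scrape.js")))
```

The same options can of course be typed directly on the command line; the wrapper only makes it easier to call the browser from a cron job or another script. Before relying on it, load a page that reports your IP address to confirm the traffic really leaves through Tor.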