Let's assume I browse a specific web page that uses JavaScript to update its view constantly (using Web 2.0 techniques to talk to its server to retrieve data updates).
Now I'd like to run some code on my own computer that monitors the contents and alerts me if some specific data appears on the page, so that I can record that data, for instance.
I am looking for ways to accomplish that. Since it's a private project, I am flexible in the choices of my tools (I can program in C and REALbasic, and could manage a little JavaScript as well). The only thing out of my control is the page I want to monitor.
I would prefer a solution I can employ on Mac OS X, but Linux or Windows would be feasible, too.
First, I wonder if there are already solutions for this out there. Something like a user-scriptable web browser, for instance.
If that's not available, I wonder how best to approach this by programming it myself. E.g., can someone tell me if Apple's WebKit allows me to introspect a dynamically updating web page?
As a last resort, I guess I would have to insert my own JavaScript code into the viewed web page (I think I could do that easily at the time the page is loaded over the internet), and then have that script run periodically, introspecting the page it's in. The only thing I don't know in this case is how to get it to communicate with the outside, i.e. my computer. I could certainly write an app for it to talk to, but how could the script access my computer's resources to establish such a communication at all? As far as I understand the sandboxing of web pages, they cannot read or write local files or communicate with a socket on the computer they're running on, or can they?
So, any ideas are welcome, as long as they take into account that I have to let a browser or its engine render the page and run the page's JavaScript.
This sounds like it could be pretty easy using Jetpack in Firefox.
Jetpack lets you create browser extensions using JavaScript. It's still in alpha, but it looks workable (and awesome)...
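To make the idea more concrete, here is a rough sketch of the kind of script an extension (or the injected-script approach from the question) might run inside the page. It does not use Jetpack's own API, which I'm not showing here; it uses only plain DOM and `fetch` calls as found in a current browser. The names `WATCH_PATTERN`, `REPORT_URL`, `findMatches`, `newMatches` and `report` are made up for illustration, and the sketch assumes that you run a small listener of your own on `127.0.0.1:8765`. Whether the browser actually lets the request through depends on its security policy (mixed-content rules, the page's content security policy), so treat this as a starting point rather than a guaranteed solution.

```javascript
// Hypothetical names throughout: nothing here comes from Jetpack or from
// the monitored page; adjust the pattern and the URL to your own setup.
const WATCH_PATTERN = /ALERT:\s*(\w+)/;             // assumed text to look for
const REPORT_URL = "http://127.0.0.1:8765/report";  // assumed local listener

// Return the first capture group (or the whole match if the pattern has
// no group) of every occurrence of `pattern` in `text`.
function findMatches(text, pattern) {
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags); // matchAll needs the g flag
  return Array.from(text.matchAll(re), (m) => m[1] ?? m[0]);
}

// Return only the matches not reported before, and remember them in `seen`.
function newMatches(matches, seen) {
  const fresh = [];
  for (const m of matches) {
    if (!seen.has(m)) {
      seen.add(m);
      fresh.push(m);
    }
  }
  return fresh;
}

// POST the fresh matches to the local listener. "no-cors" means the script
// cannot read the reply, but the listener does not need to send CORS headers.
function report(items) {
  return fetch(REPORT_URL, {
    method: "POST",
    mode: "no-cors",
    headers: { "Content-Type": "text/plain" },
    body: JSON.stringify(items),
  });
}

// Only start polling when actually running inside a web page.
if (typeof document !== "undefined") {
  const seen = new Set();
  setInterval(() => {
    const found = findMatches(document.body.innerText, WATCH_PATTERN);
    const fresh = newMatches(found, seen);
    if (fresh.length > 0) {
      report(fresh).catch(console.error);
    }
  }, 2000); // check every two seconds
}
```

The listener on the other end can be any small program that accepts HTTP POST requests on that port and logs or alerts on the body it receives. Polling the page text is the simplest thing that could work; if it turns out to be too slow or too coarse, watching specific elements would be the next thing to try.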