
read browser "in memory HTML" from EXE


I need to read the "in memory" HTML of a webpage that is displayed in "Internet Explorer 8" from an external process (EXE application).

To put it more simply: let's say you load a page in your browser that shows some text INPUT fields. You fill in the INPUTs, and before submitting the page I need to switch to my EXE application and read the values entered in them.

I tried Spy++, but there is no window class for any INPUT in the webpage (as there would be for textboxes in a normal Win32 app), and the entire client area of the browser shows up as a single window of class "Internet Explorer_Server".

I have done this many times to integrate data between applications, but always against Win32 apps. This is the first time I'm trying to read from a browser, and I'm really at a loss here.

The only thing I understand for sure is that I need, somehow, to access the actual DOM of the running browser.

Note that I cannot use some kind of web control to load the page and then parse it, since the freshly loaded page will be empty and what I need is the data the user entered before submitting the page.

Any suggestion where to start looking will be appreciated :)


Solution

  • You cannot reach the individual page elements through HWNDs, because they do not have any: IE draws the whole page inside the single "Internet Explorer_Server" window. Starting from the HWND of that window, however, you can obtain an IHTMLDocument2 interface and then use IE's DOM interfaces to access and manipulate the browser contents as needed.

    How to obtain an IHTMLDocument2 from an HWND

    Interfaces and Scripting Objects

    IHTMLDocument2 Interface