Tags: electron, chromium, puppeteer, chromium-embedded, chrome-devtools-protocol

What's the most reliable way to programmatically control a Chromium instance?


I'm researching reliable ways to programmatically control instances of Chrome/Chromium so that a Node.js/C#/Java application can use its web page rendering capabilities. In short, I want to do the following:

  • Open/close a browser window.
  • Minimize, maximize browser window.
  • Navigate to a certain URL.
  • Set cookies.

To be clear: I need a headful browser that displays web pages to end users. It can either be embedded in my app or be a standalone browser (for example, a separately shipped instance of Chromium).

I was not able to find any public Chrome/Chromium APIs that I can use from a Node.js/C#/Java environment. The APIs available to Chrome Extensions do not apply to my project, because I want to control the browser from the outside, as Selenium WebDriver does. So far I have found the following ways to control the browser the way I need:

  1. Use the Puppeteer/WebDriver APIs.
  2. Use the chrome-remote-interface Node.js library.
  3. Rely on Chromium Embedded Framework (CEF) capabilities.
  4. Rely on Electron capabilities.
  5. Build my own library that includes the Chromium modules as dependencies (similar to what the Electron team implemented).

The first two options are similar in that all the mentioned libraries ultimately rely on the Chrome DevTools Protocol (CDP). The risk of CDP being retired or deprecated is quite substantial for our project. Another concern is that CDP is intended for debugging and test automation, not application development. Moreover, leaving a debugging port open in Chrome on the user's machine seems like a vulnerability.

The CEF and Electron paths concern me because they depend on the update cadence of the embedded Chromium. Although the Electron team aims to update with every other Chromium release, that can still be a security concern, because the embedded Chromium cannot be updated as soon as a new version ships with a security patch. Moreover, when I need the real browser experience (and that is the case), I won't have it out of the box and will have to implement browser features such as buttons, tabs, and the address bar myself.

Option #5 seems extremely complex to implement, as it appears to require team competency in Chromium internals, C++ development, and C++ build tooling.

Is there anything I missed in the options list or in my assumptions? Any tips, thoughts, or suggestions would be greatly appreciated!


Solution

  • Some of your options are about controlling a browser (#1, #2) while others are about embedding a browser (#3, #4). These are two different use cases and what you need depends on what your goal is.

    Controlling a browser

    If you want to control a browser to execute tasks, maybe even in the background without the end user noticing, you should go for option 1 (Puppeteer) or option 2 (chrome-remote-interface).

    I recommend using Puppeteer, as it is developed by the Google Chrome developers and comes with many functions for your use case (opening browser windows, navigating, setting cookies).
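
    As a rough illustration of the Puppeteer route, here is a minimal sketch of opening a headful window, setting a cookie, and navigating. The `LauncherLike`, `BrowserLike`, and `PageLike` interfaces and the `openPageWithCookie` helper are hypothetical names introduced for this example; they mirror only the Puppeteer methods `launch`, `newPage`, `setCookie`, `goto`, and `close`, so the block compiles without the library installed.

```typescript
// Minimal structural types covering only the Puppeteer calls used below.
// These names are assumptions for this sketch, not Puppeteer's own typings.
interface CookieParam {
  name: string;
  value: string;
  url: string;
}

interface PageLike {
  goto(url: string): Promise<unknown>;
  setCookie(...cookies: CookieParam[]): Promise<void>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}

interface LauncherLike {
  launch(options: { headless: boolean }): Promise<BrowserLike>;
}

// Hypothetical helper: opens a visible (headful) browser window, sets one
// cookie, and navigates to the given URL.
async function openPageWithCookie(
  launcher: LauncherLike,
  url: string,
  cookie: CookieParam,
): Promise<BrowserLike> {
  // headless: false asks for a real window the end user can see.
  const browser = await launcher.launch({ headless: false });
  const page = await browser.newPage();
  // Set the cookie before navigating so the first request can carry it.
  await page.setCookie(cookie);
  await page.goto(url);
  // The browser is returned still open; the caller decides when to close it.
  return browser;
}
```

    With the real library, passing the object imported from `puppeteer` as the launcher is intended to satisfy these types, and calling `close()` on the returned browser closes the window. Minimizing and maximizing are left out of this sketch.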

    I do not see any reason to worry about the Chrome DevTools Protocol being abandoned anytime soon. Chrome DevTools relies entirely on this protocol. In addition, Firefox (Mozilla bug tracker: #1316741, #1523097) and Edge already partly support the protocol, which makes it even less likely to be abandoned.
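
    To make the protocol less abstract: CDP commands are JSON messages with an `id`, a `method`, and `params`. The sketch below builds the messages for the tasks listed in the question, using the CDP methods `Page.navigate`, `Network.setCookie`, `Browser.setWindowBounds`, and `Browser.close`. The `createCommandFactory` helper and its method names are hypothetical. The transport (WebSocket or pipe) and target/session attachment are deliberately omitted, and the `windowId` would normally be obtained first via `Browser.getWindowForTarget`.

```typescript
// Shape of a CDP command as it is serialized onto the transport.
interface CdpCommand {
  id: number;
  method: string;
  params: Record<string, unknown>;
}

type WindowState = "normal" | "minimized" | "maximized" | "fullscreen";

// Hypothetical helper: hands out incrementing ids so that responses can be
// matched to the requests that caused them.
function createCommandFactory() {
  let nextId = 0;
  const make = (method: string, params: Record<string, unknown>): CdpCommand => ({
    id: ++nextId,
    method,
    params,
  });
  return {
    // Navigate the attached page to a URL.
    navigate: (url: string) => make("Page.navigate", { url }),
    // Set a cookie scoped to the given URL.
    setCookie: (name: string, value: string, url: string) =>
      make("Network.setCookie", { name, value, url }),
    // Minimize, maximize, or restore a browser window.
    setWindowState: (windowId: number, windowState: WindowState) =>
      make("Browser.setWindowBounds", { windowId, bounds: { windowState } }),
    // Close the whole browser.
    closeBrowser: () => make("Browser.close", {}),
  };
}

const cdp = createCommandFactory();
// The window id 1 is a placeholder value for illustration only.
const messages: string[] = [
  cdp.navigate("https://example.com/"),
  cdp.setCookie("session", "abc123", "https://example.com/"),
  cdp.setWindowState(1, "maximized"),
].map((command) => JSON.stringify(command));
```

    Libraries such as Puppeteer and chrome-remote-interface take care of sending such messages and matching responses by `id`, which is why they are usually preferable to speaking the protocol by hand.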

    Embedding a browser

    If you need to embed a browser, meaning you want to show a browser inside your application, you should focus on option 3 (Chromium Embedded Framework) or option 4 (Electron).

    The Chromium Embedded Framework is a lower-level approach that puts a separate browser into your application. I cannot go into detail here, as I have never used it myself.

    Electron, on the other hand, is essentially a browser itself, meaning the whole application is developed as a web application. You can embed another browser view (webview) into your application window and control it in much the same way that Puppeteer controls a browser.

    Directly using the Chromium code (option 5)

    Although the Chromium project is split into multiple components, it sounds like you need a full browser. I once compiled the Chromium source code myself, and it literally takes hours. Keep in mind that the code base consists of roughly 35 million lines of code. Even if you figure out which parts of the code to use, it is more likely that some low-level parts will change and break your implementation than that the DevTools Protocol will be abandoned. So I definitely recommend against this approach.

    Alternatives

    Depending on your use case, you could also take a look at DOM simulation libraries like jsdom or cheerio. These libraries are very limited in terms of their functionality and you might have to implement parts of the browser yourself, e.g. downloading the document, reading and setting headers to deal with cookies, etc.
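
    To give a feel for what "implementing parts of the browser yourself" can mean, here is a deliberately tiny sketch of cookie handling between requests. The `TinyCookieJar` class is a hypothetical name for this example; it keeps only name/value pairs and ignores attributes such as Domain, Path, Expires, Secure, and SameSite, all of which a real browser honors.

```typescript
// Extremely simplified cookie jar: stores name=value pairs only.
class TinyCookieJar {
  private readonly cookies = new Map<string, string>();

  // Record the value of one Set-Cookie response header.
  store(setCookieHeader: string): void {
    // Everything after the first ";" is attributes, which this sketch ignores.
    const [pair = ""] = setCookieHeader.split(";");
    const separator = pair.indexOf("=");
    if (separator <= 0) return; // skip malformed or nameless entries
    const name = pair.slice(0, separator).trim();
    const value = pair.slice(separator + 1).trim();
    this.cookies.set(name, value);
  }

  // Build the value of the Cookie request header for the next request.
  header(): string {
    return Array.from(this.cookies, ([name, value]) => `${name}=${value}`).join("; ");
  }
}

const jar = new TinyCookieJar();
jar.store("session=abc123; Path=/; HttpOnly");
jar.store("theme=dark; Max-Age=3600");
const cookieHeader = jar.header(); // "session=abc123; theme=dark"
```

    Everything this sketch skips (scoping, expiry, security flags) is work a real browser already does, which is the main cost of the DOM-simulation route.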


    All in all, I recommend Puppeteer if you want to control a browser to execute tasks, primarily in the background. If you need a browser window as part of your application, go for Electron.