Tags: javascript, node.js, puppeteer, nodejs-server

Can't export wsEndpoint created from puppeteer browser


I am trying to open a Puppeteer browser on startup and export its wsEndpoint, so that I can connect to that browser instead of launching a new one every time I call the function.

Here is the code snippet in the file app.js that is the entry point for node.

const mainFunction = async () => {
    const browser = await puppeteer.launch()
    const wsEndpoint = browser.wsEndpoint()
    return wsEndpoint
}

mainFunction().then(async endpoint => {
    console.log(endpoint)
    module.exports = endpoint
})

On startup, the console log above prints a link, which I then export. Here is the code snippet in the utility file equities.js:

const puppeteer = require("puppeteer")
const endpoint = require("../../app.js")
module.exports = async(symbol)=>{
  console.log(endpoint)
  const browser = await puppeteer.connect({connectWSEndpoint: endpoint})

}

Every time I call the function, the console log prints only an empty object, meaning that the export in app.js failed for some reason. I googled a few things and tried different ways of exporting, but none seem to work. Can someone help guide me? Thank you so much in advance.


Solution

  • A few things here seem amiss to me -- this code feels like it wasn't tested along the way, leading to multiple points of failure. Try to take smaller steps so you can isolate problems instead of accumulating them.


    For starters, the mainFunction code abandons the browser object, leaking a subprocess that can no longer be closed.

    I'd return or store the browser variable along with the endpoint so someone can clean it up through a function. Or just return the browser and let the client code pull the endpoint out of it if they want, as well as manage and close the resource.


    Next, the export code:

    mainFunction().then(async endpoint => {
        console.log(endpoint)
        module.exports = endpoint
    })
    

    I don't understand the motivation for this extra then wrapper that receives an async resolution function that never uses await. You may think Node awaits all of this code, then sets the module.exports value before the client file's require runs synchronously. That's not the case, which can be determined with a simpler piece of code:

    app.js (in the same folder throughout this post for convenience):

    const mainFunction = async () => 42;
    
    mainFunction().then(async endpoint => {
        console.log("endpoint:", endpoint)
        module.exports = endpoint
    })
    

    index.js:

    const endpoint = require("./app");
    
    console.log("imported:", endpoint);
    

    node index gives me:

    imported: {}
    endpoint: 42
    

    The promise resolved after the require, which synchronously brought in the default blank object module.exports -- probably not what you expected.
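The same ordering can be reproduced in a single file. This is only a sketch: `fakeModule` is a hypothetical stand-in for Node's real `module` object, and the synchronous read plays the role of `require`.

```javascript
// `fakeModule` is a hypothetical stand-in for Node's real `module` object.
const fakeModule = {exports: {}};
const events = [];

const mainFunction = async () => 42;

// Same shape as app.js: the assignment is deferred to a .then callback.
mainFunction().then(endpoint => {
  events.push("then callback ran");
  fakeModule.exports = endpoint;
});

// Same timing as require(): a synchronous read as soon as the module body ends.
const imported = fakeModule.exports;
events.push("synchronous read finished");

// A timer fires only after pending promise callbacks have run.
setTimeout(() => {
  console.log("imported:", imported);
  console.log("exports now:", fakeModule.exports);
  console.log("order:", events);
}, 0);
```

The synchronous read captures the original empty object, and the `then` callback replaces `exports` only afterwards, which is too late for anything that already read it.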

    If you have async code, it has to stay async forever, including exports and imports. Try exporting the async function (or the promise it returns) directly, then awaiting the result in the client:

    app.js:

    const mainFunction = async () => 42;
    module.exports = mainFunction;
    

    index.js:

    const getEndpoint = require("./app");
    
    getEndpoint().then(endpoint => console.log("imported:", endpoint));
    

    Running node index gives me: imported: 42.


    The client code in equities.js looks more reasonable because it exports an async function synchronously, but it's going to have to await the endpoint promise it imported anywhere it uses it.

    Also, Puppeteer throws on puppeteer.connect({connectWSEndpoint: endpoint}) with Error: Exactly one of browserWSEndpoint, browserURL or transport must be passed to puppeteer.connect, because connectWSEndpoint is not one of the option names listed in that message. I'll leave that up to you to work out based on your goals.
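Since the error message lists the accepted option names, one likely fix is to pass the endpoint as `browserWSEndpoint`. Here is a minimal sketch, assuming that is what you want. `connectToShared` is a hypothetical helper name, and the Puppeteer module is passed in as a parameter only so the helper can be tried with a stub.

```javascript
// Hypothetical helper: connect to an already-running browser through its
// WebSocket endpoint, run some work, then detach without closing it.
const connectToShared = async (puppeteer, endpoint, work) => {
  // `endpoint` may be a string or a promise for one; await covers both.
  const browserWSEndpoint = await endpoint;
  const browser = await puppeteer.connect({browserWSEndpoint});
  try {
    return await work(browser);
  } finally {
    // disconnect() detaches this client; the browser process keeps running.
    await browser.disconnect();
  }
};
```

With the real library this might be called as `connectToShared(require("puppeteer"), endpoint, async browser => (await browser.pages()).length)`. Whoever launched the browser still has to close it.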

    Here's a rewrite sketch that fixes the promise problems, but is only a proof of concept which will need tweaks to do whatever you're trying to do:

    app.js:

    const puppeteer = require("puppeteer");
    
    const browserPromise = puppeteer.launch();
    
    const endpointPromise = browserPromise
      .then(browser => browser.wsEndpoint())
    ;
    
    module.exports = {browserPromise, endpointPromise};
    

    equities.js:

    const puppeteer = require("puppeteer");
    const {browserPromise, endpointPromise} = require("./app");
    
    module.exports = async symbol => {
      const endpoint = await endpointPromise;
      console.log(endpoint);
      //const browser = await puppeteer.connect({connectWSEndpoint: endpoint}) // FIXME
      const browser = await browserPromise;
      await browser.close();
    };
    

    index.js:

    const equitiesFn = require("./equities");
    
    (async () => {
      await equitiesFn();
    })();
    

    Run node index and you should see the WebSocket endpoint printed.

    Note that you can wrap the exported promises in functions or in an object if you want, which is a layer of abstraction more typical of a library interface. This doesn't change the fundamental asynchrony: the client calls the exported functions and awaits the endpoint and/or browser promises through that extra layer of indirection:

    require("./app").getBrowser().then(browser => { /* ... */ });
    

    versus

    require("./app").browserPromise.then(browser => { /* ... */ });
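One way the function-based variant could look is a small factory that launches lazily and reuses the same promise afterwards. This is a sketch under assumptions, not the only design: `makeBrowserHandle`, `getEndpoint` and `closeBrowser` are hypothetical names, and `launch` is any function returning a promise for a browser, for example `() => require("puppeteer").launch()`.

```javascript
// Hypothetical factory wrapping a shared browser promise behind functions.
const makeBrowserHandle = launch => {
  let browserPromise = null;

  // Launch on first use, then hand back the same promise every time.
  const getBrowser = () => {
    if (!browserPromise) browserPromise = launch();
    return browserPromise;
  };

  // Resolve with the WebSocket endpoint of the shared browser.
  const getEndpoint = () =>
    getBrowser().then(browser => browser.wsEndpoint());

  // Close the browser if it was ever launched; a later call relaunches.
  const closeBrowser = async () => {
    if (!browserPromise) return;
    const pending = browserPromise;
    browserPromise = null;
    await (await pending).close();
  };

  return {getBrowser, getEndpoint, closeBrowser};
};
```

In app.js this could be exported as `module.exports = makeBrowserHandle(() => require("puppeteer").launch());`. Clients still have to await whatever they get back.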
    

    If you don't want to expose the browser object, that's fine, but I'd suggest exposing a function that closes the underlying browser so you can get a clean exit, e.g.

    app.js:

    const puppeteer = require("puppeteer");
    
    const browserPromise = puppeteer.launch();
    
    const closeBrowser = () => 
      browserPromise.then(browser => browser.close())
    ;
    
    module.exports = {closeBrowser};
    

    index.js:

    require("./app")
      .closeBrowser()
      .then(() => console.log("closed"))
    ;