Headless Chrome Not Sending WebSocket URL in StandardOutput


I am trying to start headless Chrome from .NET Core using the code shared below. According to https://developer.chrome.com/docs/chromium/new-headless, Chrome should send the WebSocket URL to standard output. After starting the process, I added a delay to give Chrome time to start, then read both standard output and standard error, but both are empty. I have tried several variations based on the link above, but I am still facing the issue.

Apart from this, which is the better approach for scraping websites whose HTML is modified by JavaScript: using "--dump-dom", or using this WebSocket URL from .NET Core?

static void Main(string[] args)
 {

     StartBrowser().GetAwaiter().GetResult();    

 }
 public static async Task StartBrowser()
 {
     using (var process = new Process())
     {

         process.StartInfo.FileName = "C:\\Program Files\\Google\\Chrome\\Application\\chrome";
         process.StartInfo.CreateNoWindow = false;
         process.StartInfo.UseShellExecute = false;  
         process.StartInfo.RedirectStandardOutput = true;
         process.StartInfo.RedirectStandardError = true;
         process.StartInfo.RedirectStandardInput = true;

         string arguments = "--headless=new --remote-debugging-port=0 https://developer.chrome.com/";
         process.StartInfo.Arguments = arguments;
         Console.WriteLine("Start Time:" + DateTime.Now.ToString());
         process.Start();
         await Task.Delay(20000);
         Task<string> output = process.StandardOutput.ReadToEndAsync();
         Task<string> error = process.StandardError.ReadToEndAsync();
         var task =await Task.WhenAll(output,error);
         Console.WriteLine(process.StandardOutput.ReadToEnd());
         Console.WriteLine(process.StandardError.ReadToEnd());
         Console.WriteLine("End Time: " + DateTime.Now.ToString());

         process.WaitForExit();

     }
 }

Besides that, when I connect to the CDP socket using Postman and try to send commands such as "Page.enable" and "Network.enable", many of them do not work; only "Target"-related commands work.


Solution

  • ReadToEnd() is the culprit! It tries to read until the end of the stream, which only happens when Chrome exits.

    You can remove all the awaits, tasks, and the delay, and read standard error line by line instead:

    // your existing code till process.Start()
    
    while (true)
    {
        string? stdErrorLine = process.StandardError.ReadLine();
    
        if (stdErrorLine != null)
        {
            Console.WriteLine(stdErrorLine);
        }
        else
        {
            break;
        }
    }
    
    Console.WriteLine("End Time: " + DateTime.Now.ToString());
    
    process.WaitForExit();
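
The loop above prints every line and only ends when the stream closes. If only the WebSocket URL is needed, a minimal sketch like the following could stop at the first matching line. It assumes Chrome announces the endpoint on standard error with the prefix "DevTools listening on "; the `DevToolsEndpoint` class and `ReadWebSocketUrl` method are hypothetical names introduced here for illustration.

```csharp
using System;
using System.IO;

public static class DevToolsEndpoint
{
    // Assumed prefix of the line Chrome writes before the WebSocket URL.
    private const string Prefix = "DevTools listening on ";

    // Reads line by line until the prefix is seen and returns the text after it.
    // Returns null if the stream ends without a matching line.
    public static string? ReadWebSocketUrl(TextReader stderr)
    {
        string? line;
        while ((line = stderr.ReadLine()) != null)
        {
            int index = line.IndexOf(Prefix, StringComparison.Ordinal);
            if (index >= 0)
            {
                return line.Substring(index + Prefix.Length).Trim();
            }
        }
        return null;
    }
}
```

After `process.Start()`, this could be called as `string? wsUrl = DevToolsEndpoint.ReadWebSocketUrl(process.StandardError);`. Note that once reading stops, anything Chrome keeps writing to the redirected streams is no longer drained, so a long-running session may still need something to keep reading them.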
    

    Now, to shed some light on Page.enable etc. not working, this may help.
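
One possible explanation, offered as an assumption about this setup rather than a confirmed diagnosis: the URL Chrome prints is the browser-level endpoint, while domains such as Page and Network belong to page targets. In that case, a command like Page.enable would need to go through a session attached to a page, for example via `Target.attachToTarget` with `flatten: true`, after which each message carries the returned `sessionId`. The sketch below only builds the JSON messages; `CdpMessages` and its methods are hypothetical helper names, and the target ID would come from something like `Target.getTargets`.

```csharp
using System;
using System.Text.Json;

public static class CdpMessages
{
    // Browser-level request: attach to a target and ask for a flat session.
    public static string AttachToTarget(int id, string targetId) =>
        JsonSerializer.Serialize(new
        {
            id,
            method = "Target.attachToTarget",
            @params = new { targetId, flatten = true }
        });

    // Session-scoped request: the sessionId from the attach response
    // routes the command (e.g. "Page.enable") to the attached page.
    public static string ForSession(int id, string method, string sessionId) =>
        JsonSerializer.Serialize(new { id, method, sessionId });
}
```

The same strings can be pasted into Postman: send the attach message first, copy `sessionId` from the response, then send `CdpMessages.ForSession(2, "Page.enable", sessionId)`.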