javascript java selenium-webdriver selenium-chromedriver jsoup

How can I parse a site after it has loaded its JavaScript?


I want to display the webpage HTML after its JavaScript has run so that I can get an accurate representation of the tables.

I have tried other JARs, but this one is the only one that seems to work for me because the rest look outdated.

    System.setProperty("webdriver.chrome.driver", "D:\\Download bestanden\\chromedriver_win32\\chromedriver.exe");

    ChromeOptions options = new ChromeOptions();
    //options.addArguments("headless");
    WebDriver driver = new ChromeDriver(options);

    driver.get("https://www.flashscore.com/");
    System.out.println(driver.getTitle());

    Document doc = Jsoup.parse(driver.getPageSource());
    System.out.println(doc.select("ul.submenu.hidden li a").text());
    driver.close();
    driver.quit();
    System.out.println("Completed");

If I search for lmenu_17, I expect more results than just the Superliga link under Albania. I expect First Division, Albanian Cup and Super Cup to be displayed as well, as they are in the browser inspector. Thank you in advance; any help is appreciated!


Solution

  •         ChromeDriver driver = new ChromeDriver();
            driver.Navigate().GoToUrl("https://www.flashscore.com/");
    
            // Works once the page is fully loaded.
            // Goes to the bottom line.
    
            string href = driver.FindElementByXPath("//*[@id='lmenu_17']/ul/li[1]/a").GetAttribute("href"); // albanian link
            //driver.Navigate().GoToUrl(href);
    
            foreach (var element in driver.FindElements(By.XPath("//*[@id='lc']/div[6]/ul/li/a")))
            {
                Console.WriteLine(element.GetAttribute("href"));
            }
    
            driver.FindElementByXPath("//*[@id='lc']/div[6]/ul/li[12]/a").Click();
            Thread.Sleep(1000);
    
            foreach (var element in driver.FindElements(By.XPath("//*[@id='lc']/div[9]/ul/li/a")))
            {
                Console.WriteLine(element.GetAttribute("href"));
            }
    
            Console.ReadKey();
    

    You don't need to get the page source; Selenium can query the loaded page directly, as the FindElements calls above do.
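One caveat with the code above is the fixed Thread.Sleep(1000): if the submenu takes longer than a second to appear, the second loop finds nothing. A common alternative is to poll until the content shows up or a timeout expires. Selenium ships its own WebDriverWait class for this; the sketch below only illustrates the idea with the Java standard library. The class name PollingWait, the method until, and the simulated probe in main are hypothetical names introduced for this example, not part of Selenium.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper: polls a condition instead of sleeping for a fixed time. */
final class PollingWait {

    private PollingWait() {}

    /**
     * Calls probe repeatedly until it returns a non-empty Optional or the
     * timeout elapses. The probe always runs at least once.
     * Returns the probe's value, or Optional.empty() on timeout or interrupt.
     */
    static <T> Optional<T> until(Supplier<Optional<T>> probe, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            Optional<T> result = probe.get();
            if (result.isPresent()) {
                return result;
            }
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty(); // timed out
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        // Simulated probe (assumption): the "submenu" becomes available on the third poll.
        AtomicInteger polls = new AtomicInteger();
        Optional<String> state = PollingWait.<String>until(
                () -> polls.incrementAndGet() >= 3
                        ? Optional.of("submenu loaded")
                        : Optional.<String>empty(),
                Duration.ofSeconds(2),
                Duration.ofMillis(100));
        System.out.println(state.orElse("timed out") + " after " + polls.get() + " polls");
    }
}
```

With Selenium's Java bindings, the probe would typically wrap driver.findElements(By.xpath(...)) and return Optional.empty() while the returned list is still empty. Polling returns as soon as the links exist, so it is usually both faster and more reliable than a fixed sleep.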

    I hope this helps.