I am trying to parse the following webpage, https://shop.sprouts.com/shop/flyer, using .NET, Selenium, and PhantomJS. The text I get back from the elements is completely different from what I see on the screen. Is there a better way to parse the webpage?
using Microsoft.VisualStudio.TestTools.UnitTesting;
using OpenQA.Selenium;
using OpenQA.Selenium.PhantomJS;

[TestClass]
public class UnitTest1
{
    const string PhantomDirectory = @"..\..\..\packages\PhantomJS.2.1.1\tools\phantomjs";

    [TestMethod]
    public void GetSproutsWeeklyAdDetails()
    {
        using (IWebDriver phantomDriver = new PhantomJSDriver(PhantomDirectory))
        {
            phantomDriver.Navigate().GoToUrl("https://shop.sprouts.com/shop/flyer");
            var elements = phantomDriver.FindElements(By.ClassName("cell-title-text"));
        }
    }
}
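The product titles on https://shop.sprouts.com/shop/flyer are bound through the AngularJS ng-bind-html attribute, so they are filled in by script after the page loads. To read the text you actually see on screen, use a WebDriverWait that waits for all of the desired elements to become visible before you read them. You can use the following solution: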
Solution:
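// Requires: using System; using System.Collections.Generic; using OpenQA.Selenium.Support.UI;
// (newer Selenium releases moved ExpectedConditions to the SeleniumExtras.WaitHelpers package)
// driver is your IWebDriver instance (phantomDriver in the question).
IList<IWebElement> elements = new WebDriverWait(driver, TimeSpan.FromSeconds(3))
    .Until(ExpectedConditions.VisibilityOfAllElementsLocatedBy(
        By.XPath("//span[@class='cell-title-text' and @ng-bind-html='productTitle()']")));
foreach (IWebElement element in elements)
{
    Console.WriteLine(element.GetAttribute("innerHTML"));
}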
Equivalent Python example:
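from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# driver is an already-created WebDriver instance
driver.get('https://shop.sprouts.com/shop/flyer')
myList = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//span[@class='cell-title-text' and @ng-bind-html='productTitle()']")))
for item in myList:
    print(item.text)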
Console Output:
Sweet Corn, 1 EA
Cantaloupe Melons, 1 LB
Red Cherries
Half Chicken Breast
Roma Tomatoes
100% Grass Fed Ground Beef Value Pack
Colby Jack Rbst Free
Walnut Halves & Pieces
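
To show why the explicit wait makes a difference, here is a simplified sketch of the polling idea behind WebDriverWait: call a condition repeatedly until it returns something truthy or a timeout expires. This is not Selenium's actual implementation, and it runs without a browser. The names wait_until, WaitTimeout, and FakePage are hypothetical and exist only for this illustration; FakePage stands in for a page whose titles appear only after its scripts have run.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy within the timeout."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page that renders its titles late."""

    def __init__(self, ready_after_polls):
        self._polls_left = ready_after_polls

    def find_titles(self):
        # Return an empty list until the "page" has finished rendering.
        if self._polls_left > 0:
            self._polls_left -= 1
            return []
        return ["Sweet Corn, 1 EA", "Roma Tomatoes"]


page = FakePage(ready_after_polls=3)
# Reading immediately would give []; polling keeps trying until titles exist.
titles = wait_until(page.find_titles, timeout=2.0, poll_interval=0.01)
print(titles)
```

A single immediate lookup, like the FindElements call in the question, reads the page once at whatever state it happens to be in, while the polling version keeps retrying until the content is there or the timeout is reached.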