using HtmlUnit for scraping data from the internet, and I need to login the following page https://accounts.google.com/login.
When use the "getPage()" method i keep getting this exceptio, how can i solve it? thank you in advance
Exception in thread "main" ======= EXCEPTION START ========
Exception class=[net.sourceforge.htmlunit.corejs.javascript.JavaScriptException]
com.gargoylesoftware.htmlunit.ScriptException: AssertionError: Assertion failed: No element found with className: signin-card (script in https://accounts.google.com/login?hl=es#identifier from (2653, 11) to (2753, 10)#2660)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:894)
at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:628)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:513)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:776)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:752)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:740)
at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptIfPossible(HtmlPage.java:916)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:307)
at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:368)
at com.gargoylesoftware.htmlunit.html.HtmlScript$2.execute(HtmlScript.java:238)
at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:257)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:773)
at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:730)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1209)
at net.sourceforge.htmlunit.cyberneko.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1111)
at net.sourceforge.htmlunit.cyberneko.filters.DefaultFilter.endElement(DefaultFilter.java:207)
at net.sourceforge.htmlunit.cyberneko.filters.NamespaceBinder.endElement(NamespaceBinder.java:337)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3137)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2100)
at net.sourceforge.htmlunit.cyberneko.HTMLScanner.scanDocument(HTMLScanner.java:927)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:506)
at net.sourceforge.htmlunit.cyberneko.HTMLConfiguration.parse(HTMLConfiguration.java:459)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:980)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:241)
at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:187)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:269)
at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:157)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:512)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:386)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:304)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:451)
at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:436)
at prog.htmlUnit.Scrapeo.iniciaSesion(Scrapeo.java:74)
at prog.htmlUnit.ProgramaPruebas.main(ProgramaPruebas.java:24)
Caused by: net.sourceforge.htmlunit.corejs.javascript.JavaScriptException: [object Object] (script in https://accounts.google.com/login?hl=es#identifier from (2653, 11) to (2753, 10)#2660)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1006)
at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:798)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:105)
at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:411)
at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:252)
at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3286)
at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.exec(InterpretedFunction.java:115)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$3.doRun(JavaScriptEngine.java:767)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:879)
... 35 more
JavaScriptException value = [object Object]
======= EXCEPTION END ========
The part that throw the exception is just as simple as this:
public HtmlPage iniciaSesion(String correo, String pass) throws FailingHttpStatusCodeException, MalformedURLException, IOException{
HtmlPage pagActual;
HtmlTextInput cajaTexto;
HtmlButton boton;
pagActual= cliente.getPage("https://accounts.google.com/login?hl=es#identifier");
return pagActual;
The main program just calls this method and uses the .asXml() method but it throws the exception before using it.
You need to enable Javascript on your client. This code should work:
LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog");
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(Level.OFF);
WebClient client = new WebClient(BrowserVersion.CHROME);
client.getOptions().setJavaScriptEnabled(true);
client.getOptions().setThrowExceptionOnScriptError(false);
client.getOptions().setThrowExceptionOnFailingStatusCode(false);
String url = "https://accounts.google.com/login";
final HtmlPage page = client.getPage(url);
System.out.println(page.asText());