Goal: Get inner text of JavaScript element from Yahoo Finance Page. Please refer to
I can get the innerHTML using the code below:
document.getElementsByClassName('D(ib) Va(t)')[15].childNodes[2].innerHTML
But I can't find a way to run this against the Yahoo Finance page from Java.
I've briefly tried the following APIs:
I think Nashorn can get the text I'm looking for, but I haven't been able to do it yet.
If anyone has done something similar or can point me in the right direction, that would be much appreciated.
Let me know if more details are needed.
HtmlUnit seems to have problems with this site, since the response is incomplete as well. You could use PhantomJS. Just download the binary for your OS and create a script file (see API).
Script (yahoo.js):
var page = require('webpage').create();
var fs = require('fs');
page.settings.userAgent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36';
// resourceTimeout is given in milliseconds, so use a number rather than a string
page.settings.resourceTimeout = 5000;
page.open('http://finance.yahoo.com/quote/AAPL/profile?p=AAPL', function(status) {
    console.log("Status: " + status);
    if (status === "success") {
        // Save the rendered DOM so the Java side can parse it
        var path = 'yahoo.html';
        fs.write(path, page.content, 'w');
    }
    phantom.exit();
});
Java code:
import java.io.File;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

try {
    // Change these paths to your phantomjs binary and your script file
    String phantomJSPath = "bin" + File.separator + "phantomjs";
    String scriptFile = "yahoo.js";
    // Passing the arguments as an array keeps paths containing spaces intact
    Process process = Runtime.getRuntime().exec(new String[] {phantomJSPath, scriptFile});
    process.waitFor();

    // Parse yahoo.html, which the script writes to the working directory, with Jsoup
    Elements elements = Jsoup.parse(new File("yahoo.html"), "UTF-8")
            .select("div.asset-profile-container p strong");
    for (Element element : elements) {
        if (element.attr("data-reactid").contains("asset-profile.1.1.1.2")) {
            System.out.println(element.text());
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
Output:
Consumer Goods
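One thing to watch with the Java snippet: process.waitFor() blocks forever if PhantomJS hangs on a slow page. Below is a minimal sketch of the same launch step using ProcessBuilder with a timeout. The class name PhantomRunner, the helper names and the 30-second limit are my own choices for illustration, not part of PhantomJS or the code above.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

class PhantomRunner {

    // Builds the argument list; passing arguments separately avoids
    // the whitespace splitting a single command string is subject to.
    static List<String> buildCommand(String phantomJsPath, String scriptFile) {
        return Arrays.asList(phantomJsPath, scriptFile);
    }

    // Runs the command and returns its exit code, or -1 if it timed out.
    static int run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // forward the script's console.log output to this JVM's console
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // give up on a hung PhantomJS instance
            return -1;
        }
        return process.exitValue();
    }

    public static void main(String[] args) throws InterruptedException {
        // Same placeholder paths as in the snippet above; adjust to your setup.
        List<String> command =
                buildCommand("bin" + File.separator + "phantomjs", "yahoo.js");
        try {
            // 30 seconds is an arbitrary limit chosen for this sketch.
            System.out.println("PhantomJS exit code: " + run(command, 30));
        } catch (IOException e) {
            System.out.println("Could not start PhantomJS: " + e.getMessage());
        }
    }
}
```

If run returns 0, yahoo.html should be ready for the Jsoup step; on -1 or any other non-zero code it is probably safer to skip parsing than to read a stale or missing file.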
Note:
The following link returns a JSON object containing the company information. I'm not sure, though, whether the crumb parameter changes or is constant for a company:
https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?formatted=true&crumb=hm4%2FV0JtzlL&lang=en-US&region=US&modules=assetProfile%2CsecFilings%2CcalendarEvents&corsDomain=finance.yahoo.com
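If that JSON route works for you, pulling one field out of the response is straightforward. Here is a rough sketch that extracts a "sector" value from a JSON string using only the standard library. Note the assumptions: the sample string is hand-written, I have not verified the real response layout or that a field named "sector" sits under assetProfile, and SectorExtractor is a made-up name. For anything beyond a quick check, a proper JSON parser is the better choice, since this regex does not handle escaped quotes.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SectorExtractor {

    // Matches "sector":"<value>"; escaped quotes inside the value are not handled.
    private static final Pattern SECTOR =
            Pattern.compile("\"sector\"\\s*:\\s*\"([^\"]*)\"");

    // Returns the first "sector" value found in the text, if any.
    static Optional<String> extractSector(String json) {
        Matcher m = SECTOR.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hand-written sample; the real response shape is an assumption.
        String sample = "{\"quoteSummary\":{\"result\":[{\"assetProfile\":"
                + "{\"sector\":\"Consumer Goods\"}}],\"error\":null}}";
        System.out.println(extractSector(sample).orElse("(sector not found)"));
    }
}
```

Fetching the response itself (and dealing with the crumb parameter) is left out here, because it is unclear how that parameter behaves.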