
Parsing a webpage with Java


I'm looking to parse the real-time rates from this webpage, http://www.truefx.com/, into my Java program; i.e., I want the data on the page, which refreshes every second, to be continuously streamed into my program.

I would like to do this using only the standard Java libraries, if possible. I'm aware of libraries like jsoup and possibly others, but I'd rather not download and install anything extra: the computer I'm using has its hard drive based in California, and everything but a few core programs (Eclipse being one of them) gets deleted every night when the system restarts.

So if anyone knows of a package in the standard Eclipse/JDK download that can do this, please let me know. Thanks!


OK, so I got this working, but it seems very slow. The data changes every second, and even though I'm re-fetching the page once a second (using Thread.sleep(1000)) and getting a new copy of it each time, the value only updates once every minute or so. What gives?

Here's what my code looks like (I used what you posted above as my URL reader):

 public String getPage(String urlString) {
        StringBuilder result = new StringBuilder();
        //Access the page
        try {
         // Create a URL for the desired page
         URL url = new URL(urlString);
         // Read all the text returned by the server
         BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
         String str;
         while ((str = in.readLine()) != null) {
             // str is one line of text; readLine() strips the newline character(s)
             result.append(str);
         }
         in.close();
        } catch (MalformedURLException e) {
            e.printStackTrace(); // bad URL string
        } catch (IOException e) {
            e.printStackTrace(); // network or read failure
        }
        return result.toString();
    }

    public static void main(String[] args) {
        int i = 0;
        Reading r = new Reading();

        while (true) {
            try { Thread.sleep(1000); } catch (InterruptedException e) { }
            String page = r.getPage("http://www.fxstreet.com/rates-charts/forex-rates/");
            int index = page.indexOf("last_3212166");
            //System.out.println(i+page);
            i++;
            System.out.println(i + " GBP/USD: " + page.substring(index + 14, index + 20));
        }
    }
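One thing worth checking when a polled URL appears stale is connection-level caching: `URLConnection` (and any proxy in between) may serve a cached copy of the page. This is a hedged guess at the cause here, not a confirmed diagnosis, but a fetch with caching explicitly disabled can be sketched like this (the class name `FreshFetch` is made up; the methods are standard `java.net` API):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FreshFetch {
    // Fetch a URL with HTTP caching explicitly disabled, so each poll
    // goes back to the server instead of returning a cached copy.
    static String fetch(String urlString) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlString).openConnection();
        conn.setUseCaches(false);                             // bypass the URLConnection cache
        conn.setRequestProperty("Cache-Control", "no-cache"); // ask intermediaries not to cache
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Swapping `FreshFetch.fetch(...)` in for `r.getPage(...)` in the loop above would rule caching in or out as the culprit.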

Solution

  • With no external library, you can fetch the page with the function below, importing only classes from java.net and java.io:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;

    static public String getPage(String urlString) {
        StringBuilder result = new StringBuilder();
        //Access the page
        try {
         // Create a URL for the desired page
         URL url = new URL(urlString);
         // Read all the text returned by the server
         BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
         String str;
         while ((str = in.readLine()) != null) {
             // str is one line of text; readLine() strips the newline character(s)
             result.append(str);
         }
         in.close();
        } catch (MalformedURLException e) {
            e.printStackTrace(); // bad URL string
        } catch (IOException e) {
            e.printStackTrace(); // network or read failure
        }
        return result.toString();
    }
    

    Then use java.util.regex to match the data you want from the page and parse it into your labels. Don't forget to put all of this in a thread with a while(true) loop and a sleep(some_time) so you get second-by-second information.
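    The regex step can be sketched like this. The id `last_3212166` comes from the question's code, but the sample HTML fragment and the pattern around it are made up for illustration, since the real page's markup isn't shown:

    ```java
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RateParser {
        // Pull a decimal rate out of an HTML fragment, keyed on the element id.
        // The surrounding markup assumed by the pattern is a guess.
        static String extractRate(String html) {
            Pattern p = Pattern.compile("last_3212166[^>]*>([0-9]+\\.[0-9]+)");
            Matcher m = p.matcher(html);
            return m.find() ? m.group(1) : null;
        }

        public static void main(String[] args) {
            String sample = "<td id=\"last_3212166\">1.5623</td>"; // hypothetical fragment
            System.out.println("GBP/USD: " + extractRate(sample)); // prints GBP/USD: 1.5623
        }
    }
    ```

    Anchoring on the element id and capturing the number is more robust than the fixed `substring(index+14, index+20)` offsets in the question, which silently break if the markup shifts by a character.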