Cannot scrape price from website with Jsoup


I have a college project in which I need to build software that tracks item prices based on URLs entered by the user (currently only from Banggood.com).

I have just started learning how to scrape information from websites, and I am stuck right at the beginning. I managed to scrape the item title but was unsuccessful with the item price. My current code is below.

I could not find the right information on the Jsoup site or on Google.

import java.io.IOException;
import java.util.Scanner;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class ProjectLearning 
{
    public static void main(String args[])
    {
        Scanner scan = new Scanner(System.in);
        
        print("Please enter your item URL: ");
        String userUrl = scan.next();
        print("Give me a sec..");
        
        Document document;
        try {
            //Get Document object after parsing the html from given url.
            document = Jsoup.connect(userUrl).get();

            String title = document.title(); //Get title
            print("Item name: " + title); //Print title.
            
            //Get price
            Elements price = document.select("div#item_now_price");
        
            for (int i=0; i < price.size(); i++) 
            {
                print(price.get(i).text());
            }

        } catch (IOException e) 
        {
            e.printStackTrace();
        }
        print("There you go");
    }

    public static void print(String string) {
        System.out.println(string);
    }
}

Output:

Please enter your item URL: 
https://www.banggood.com/3018-3-Axis-Mini-DIY-CNC-Router-Standard-Spindle-Motor-Wood-Engraving-Machine-Milling-Engraver-p-1274569.html?rmmds=flashdeals&cur_warehouse=CN

Give me a sec..

Item name: 3018 3 axis mini diy cnc router standard spindle motor wood engraving machine milling engraver Sale - Banggood.com

Solution

  • That is because your selector, div#item_now_price, looks for an element whose id is item_now_price, but on this page item_now_price is a class, not an id.

    Looking at the page for the URL you entered, the element that holds the price looks like this:

    <div class="item_now_price" oriattrmin="0" oriattrmax="0" noworiprice="149.9" oriprice="149.9" oriprice3="146.7" oriprice10="145.16" oriprice30="143.64" oriprice100="142.11">US$149.90</div>
    

    The selector should therefore use a dot (class) instead of a hash (id): Elements price = document.select("div.item_now_price");

    See https://jsoup.org/cookbook/extracting-data/selector-syntax to learn more about selector syntax.

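    Update: I've looked at your code, and the reason you still get no price in the output is that the price is loaded through a separate Ajax request. Jsoup.connect(userUrl).get() only fetches and parses the HTML returned for that URL and does not execute JavaScript, so unfortunately Jsoup alone cannot help you here.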

    For more information, look into this answer: Fetch contents(loaded through AJAX call) of a web page
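
The difference between the id selector and the class selector can be checked offline. The sketch below is an illustration, not part of the original answer: it assumes Jsoup is on the classpath, uses a hypothetical class name SelectorDemo, and parses a trimmed copy of the div quoted above from a string instead of fetching the live page. It therefore says nothing about content the site loads later through Ajax.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

// Hypothetical demo class; the HTML is a trimmed copy of the element quoted in the answer.
class SelectorDemo {
    static final String HTML =
        "<div class=\"item_now_price\" noworiprice=\"149.9\" oriprice=\"149.9\">US$149.90</div>";

    // Number of elements in the sample HTML that match the given CSS query.
    static int countMatches(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    // Text of the first matching element, or an empty string if nothing matches.
    static String firstText(String cssQuery) {
        Elements found = Jsoup.parse(HTML).select(cssQuery);
        return found.isEmpty() ? "" : found.first().text();
    }

    public static void main(String[] args) {
        // '#' matches an id; the sample div has no id, so nothing is found.
        System.out.println(countMatches("div#item_now_price")); // prints 0
        // '.' matches a class, which is what the sample div actually has.
        System.out.println(countMatches("div.item_now_price")); // prints 1
        System.out.println(firstText("div.item_now_price"));    // prints US$149.90
        // The numeric value is also present as an attribute on the same element.
        System.out.println(
            Jsoup.parse(HTML).select("div.item_now_price").attr("noworiprice")); // prints 149.9
    }
}
```

If the class selector matches in a test like this but returns nothing for the document fetched from the real URL, the element is most likely not in the HTML that Jsoup downloaded, which is consistent with the Ajax explanation in the update.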