I am trying to use JSoup to scrape the poster image from an IMDb link and save it so that my program can use it later. This is what I have so far:
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Attributes;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JSoupTest
{
    public static void main(String[] args)
    {
        String address = "https://www.imdb.com/title/tt1270797/";
        try
        {
            Document doc = Jsoup.connect(address).get();
            Element link = doc.select().select();
        }
        catch (IOException e)
        {
            // Auto-generated catch block
            e.printStackTrace();
        }
    }
}
I know the image is inside a div with the class "poster", but I cannot figure out how to extract it. Please bear with me, as I have no prior experience with JSoup. Thanks a lot.
I've been using JSoup for a while, but I've never tried to download an image from an HTML source.
After getting the document as you did above, you can get the div you want with:
Elements divs = doc.getElementsByClass("poster");
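The code above returns all elements with the class 'poster'. (Elements lives in org.jsoup.select, so you will need to import org.jsoup.select.Elements.)
If you are sure there's only one div with the class 'poster', you can do: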
Element poster = divs.first();
If you aren't sure of that, you'll need to find a way to differentiate that div from the others.
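One way to check how many matches there are, and to narrow them down, is a CSS selector. The following is only a rough sketch: it parses a made-up HTML snippet (SAMPLE_HTML and the example.com base URL are invented for illustration), and the real IMDb markup may well be structured differently, so the selector would need adjusting.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class PosterSelectorSketch {
    // Hypothetical markup for illustration only; not taken from the real page.
    static final String SAMPLE_HTML =
        "<div class=\"poster\"><a href=\"/some/link\">"
      + "<img src=\"/images/sample-poster.jpg\" alt=\"Poster\"></a></div>"
      + "<div class=\"poster small\"><a href=\"/other\">"
      + "<img src=\"/images/other.jpg\"></a></div>";

    // Returns the absolute URL of the first img that is a child of a link
    // directly inside a div with class "poster", or "" if nothing matches.
    static String firstPosterImageUrl(Document doc) {
        Element img = doc.selectFirst("div.poster > a > img");
        return img == null ? "" : img.absUrl("src");
    }

    public static void main(String[] args) {
        // The base URI lets absUrl() resolve relative src values.
        Document doc = Jsoup.parse(SAMPLE_HTML, "https://example.com/");
        Elements posters = doc.select("div.poster");
        System.out.println("poster divs found: " + posters.size());
        System.out.println("first poster image: " + firstPosterImageUrl(doc));
    }
}
```

If the count is greater than one, a more specific selector (an extra class, an id, or a parent element) is usually the simplest way to single out the div you want.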
Now that you have your 'poster' div, you can get the link inside it with:
Elements image = poster.getElementsByTag("a");
The code above returns all links inside the 'poster' div. As before, if you're sure there's only one link inside it, you can do:
Element downloadImage = image.first();
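Now you have the link element that wraps the image you want. Note that this is the a tag, not the image URL itself. Assuming the img tag sits inside that link, downloadImage.getElementsByTag("img").first().absUrl("src") should give you the image's absolute URL, which you can then download.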
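For the saving part, here is a minimal sketch that uses only the JDK. It assumes you already have the image's absolute URL as a string (for example from absUrl("src") on the img element). The class name PosterSaver and the method saveImage are made up for this example, and some servers may refuse plain requests without extra headers, which this sketch does not handle.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class PosterSaver {
    // Copies the bytes behind imageUrl into target, replacing any existing
    // file, and returns the number of bytes written.
    static long saveImage(String imageUrl, Path target) throws IOException {
        try (InputStream in = URI.create(imageUrl).toURL().openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: PosterSaver <image-url> <output-file>");
            return;
        }
        long bytes = saveImage(args[0], Paths.get(args[1]));
        System.out.println("saved " + bytes + " bytes to " + args[1]);
    }
}
```

Your program can then load the saved file later from the path you chose, instead of fetching the page again.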