I have been trying to extract a link from the HTML source of a website, but I can't get it to print the result. I'm fairly new to extracting links, so my code could be all wrong (any clarification would be helpful). The link I'm looking to output is https://shop.ccs.com/checkout/cart/add/uenc/aHR0cHM6Ly9zaG9wLmNjcy5jb20vaGFwcHktc29ja3Mtd2l6LWtoYWxpZmEtYmxhY2stYW5kLWJsdWUtc29ja3MtOS0xMQ,,/product/383628/ from the productUrl https://shop.ccs.com/happy-socks-wiz-khalifa-black-and-blue-socks-9-11
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class mains {
    public static void main(String[] args) throws IOException {
        // Product page given above; it was never declared in the original snippet.
        String productUrl = "https://shop.ccs.com/happy-socks-wiz-khalifa-black-and-blue-socks-9-11";
        Document doc = Jsoup.connect(productUrl)
                .userAgent("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36")
                .get();
        // Select every <form> element that carries an action attribute.
        Elements forms = doc.select("form[action]");
        // Print each form's action resolved to an absolute URL,
        // so the add-to-cart form is not hidden behind an earlier form on the page.
        for (String action : forms.eachAttr("abs:action")) {
            System.out.println(action);
        }
    }
}
The short answer: if you want to add the product to the basket, you can just visit this URL: https://shop.ccs.com/checkout/cart/addAjax/?product=383628&related_product=&qty=1
The long answer: this site submits the form to a URL that returns no real HTML content but uses JavaScript to process your request further. Jsoup does not execute JavaScript, so it can't handle that, but we can cheat and use the web browser's debugger to peek at what happens next. That's how I obtained the URL above.
You can easily use the same link with a different product id and quantity. Remember that if you want to make another request, for example to check your basket contents, you should also pass along the cookies obtained from the previous request. Without them, your basket will always be empty.
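As a rough sketch of both points (varying the product id and quantity, and carrying cookies from one request to the next), something like the following might work. It uses the JDK's `java.net.http.HttpClient` with a `CookieManager` instead of Jsoup purely to stay dependency-free; with Jsoup the same idea would be to read `cookies()` from the first `Connection.Response` and pass that map to `cookies(...)` on the next connection. The class name `CartSketch`, the helper names, and the basket URL `https://shop.ccs.com/checkout/cart/` are assumptions for illustration and were not confirmed against the site.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CartSketch {
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/72.0.3626.109 Safari/537.36";

    // Assumed basket page; not verified against the site.
    static final String CART_URL = "https://shop.ccs.com/checkout/cart/";

    // Builds the add-to-cart URL from the answer for any product id and quantity.
    static String addToCartUrl(int productId, int qty) {
        return "https://shop.ccs.com/checkout/cart/addAjax/?product=" + productId
                + "&related_product=&qty=" + qty;
    }

    // Sends a GET request through the shared client and returns the response body.
    static String fetch(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // The CookieManager stores cookies set by the first response and
        // replays them on later requests made through the same client.
        CookieManager cookies = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(cookies)
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        // First request: add the product to the basket.
        fetch(client, addToCartUrl(383628, 1));

        // Second request: reuses the session cookies, so the basket is not empty.
        String cartHtml = fetch(client, CART_URL);
        System.out.println("Basket page length: " + cartHtml.length());
    }
}
```

Calling `addToCartUrl(383628, 1)` reproduces the URL given above; the key design point is that both requests go through the same client, so the session cookie from the first is sent with the second.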