Tags: clojure, jsoup, clojure-java-interop

Specifying class of object in Clojure


I want to scrape a website that requires me to log in, and I've decided to use Jsoup to do this. I'm having trouble translating this line of Java to Clojure properly:

Connection.Response loginForm = Jsoup.connect("**url**")
        .method(Connection.Method.GET)
        .execute();

Without specifying the class Connection.Response in my Clojure code, the result has the class org.jsoup.helper.HttpConnection, which lacks the methods I need to get cookies from the session.

So far I've come up with the following Clojure code:

(import (org.jsoup Jsoup Connection
               Connection$Response Connection$Method))
(do
 (def url "*URL*")
 (def res (doto (org.jsoup.Jsoup/connect url)
   (.data "username" "*USERNAME*")
   (.data "password" "*PASSWORD*")
   (.method Connection$Method/POST)
   (.execute)))
 (type res))

Solution

  • The problem is that you are using doto where you should use the -> threading macro:

    (let [url "*URL*"]
      (-> url
          (org.jsoup.Jsoup/connect)
          (.data "username" "*USERNAME*")
          (.data "password" "*PASSWORD*")
          (.method Connection$Method/POST)
          (.execute)))
    

    The doto form is typically used to set up a Java object whose setter-like methods return void, which prevents you from threading the calls.

    (doto (SomeClass.)
      (.setA 1)
      (.setB 2)
      (.execute))
    

    Translates into:

    (let [obj (SomeClass.)]
      (.setA obj 1)
      (.setB obj 2)
      (.execute obj)
      obj)
    

    As you can see, doto doesn't return the result of the last method call but the object provided as its first argument (the SomeClass object in this case). So your current code returns the object created by the Jsoup/connect method (org.jsoup.helper.HttpConnection, as you noticed) instead of the Connection.Response returned by the execute() call.
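To see this behaviour without Jsoup on the classpath, here is a minimal sketch that uses a plain JDK class instead. java.util.ArrayList and the name result are stand-ins chosen for illustration only; they are not part of the original question.

```clojure
;; Illustration only: java.util.ArrayList stands in for the Jsoup connection.
(def result
  (doto (java.util.ArrayList.)
    (.add "a")
    (.add "b")
    (.size)))          ; the value returned by .size is discarded

;; doto hands back the list it was given, not the Integer from .size
(class result)         ; java.util.ArrayList
(vec result)           ; ["a" "b"]
```

The last call inside doto still runs, but its return value is thrown away, which is exactly why the Connection.Response from execute is lost in the question's code.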

    What you need is:

    (-> (SomeClass.)
        (.withA 1)
        (.withB 2)
        (.execute))
    

    where the with* methods are builder methods that return this instead of void.

    The above threading form is equivalent to:

    (.execute
      (.withB
        (.withA
          (SomeClass.)
          1)
        2))
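The same point can be checked at a REPL with a short sketch. It assumes StringBuilder as a stand-in for a builder-style API (its append method returns the builder itself), and it uses macroexpand-1 on the hypothetical SomeClass form from above; quoting the form means SomeClass never has to exist.

```clojure
;; Illustration only: StringBuilder plays the role of a builder-style object.
(def built
  (-> (StringBuilder.)
      (.append "user=")
      (.append "alice")
      (.toString)))    ; -> returns the result of the LAST call

built                  ; "user=alice"

;; Ask the compiler how the threading form is rewritten.
;; The form is quoted, so the hypothetical SomeClass is never resolved.
(def expansion
  (macroexpand-1 '(-> (SomeClass.) (.withA 1) (.withB 2) (.execute))))

expansion              ; (.execute (.withB (.withA (SomeClass.) 1) 2))
```

Because each step receives the previous step's return value, -> only works when every intermediate method returns something useful, as Jsoup's data and method calls do by returning the Connection.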