
Clojure - No matching method found for select method in DataFrame when using Flambo


I'm using Flambo to work with Spark. I want to select a given set of columns from a DataFrame by name. I wrote a simple function as follows:

(defn make-dataset
  ([data-path column-names and-another]
   (let [data (sql/read-csv sql-context data-path)
         cols (map #(.col data %) column-names)]
     (.select data (Column. "C0")))))

I get the following exception when I execute it:

IllegalArgumentException No matching method found: select for class org.apache.spark.sql.DataFrame clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:80)

What am I doing wrong? Why does .col work whereas .select doesn't, when both are methods of the same class?


Solution

  • The DataFrame.select method you are trying to call has the following signature:

    def select(cols: Column*): DataFrame
    

    As you can see, it accepts a vararg of Column, whereas you pass it a single, bare Column value. That doesn't match the signature, hence the exception. Scala compiles varargs into a single parameter of type scala.collection.Seq. You can wrap your column(s) in something that implements Seq using the following code:

    (scala.collection.JavaConversions/asScalaBuffer [(Column. "C0")])
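
    Applied to the function from the question, the wrapping could look like the sketch below. It assumes that `sql` is the flambo.sql alias and that `sql-context` is already defined, as in the question. The second function, `make-dataset-array`, is a hypothetical name used only for illustration; it relies on the Java-friendly `select(Column...)` overload that Spark generates through `@scala.annotation.varargs`, which is worth checking against your Spark version.

```clojure
(import '[org.apache.spark.sql Column]
        '[scala.collection JavaConversions])

;; Assumes `sql` (flambo.sql) and `sql-context` are defined as in the question.
(defn make-dataset
  [data-path column-names]
  (let [data (sql/read-csv sql-context data-path)
        ;; Resolve each column name to a Column of this DataFrame.
        cols (map #(.col data %) column-names)]
    ;; Wrap the columns in a Scala Buffer (a Seq) to match select(cols: Column*).
    (.select data (JavaConversions/asScalaBuffer (vec cols)))))

;; Alternative (hypothetical name), assuming the Java varargs overload
;; select(Column...) is present: pass a typed Java array instead of a Seq.
(defn make-dataset-array
  [data-path column-names]
  (let [data (sql/read-csv sql-context data-path)
        cols (map #(.col data %) column-names)]
    (.select data (into-array Column cols))))
```

    Either form gives Clojure's reflector a single argument whose type matches one of the one-argument overloads of select. Note that scala.collection.JavaConversions was deprecated in Scala 2.12 and removed in 2.13, so on newer Scala versions the array form may be the simpler option.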