I'm using Flambo to work with Spark. I want to retrieve a DataFrame that contains only the given column names. I wrote a simple function as follows:
(defn make-dataset
  ([data-path column-names and-another]
   (let [data (sql/read-csv sql-context data-path)
         cols (map #(.col data %) column-names)]
     (.select data (Column. "C0")))))
I get the following exception when I execute it:
IllegalArgumentException No matching method found: select for class org.apache.spark.sql.DataFrame clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:80)
What am I doing wrong? Why does .col work while .select doesn't, when both methods are available on the same class?
The DataFrame.select you are trying to call has the following signature:

def select(cols: Column*): DataFrame
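If you are curious what Clojure's reflector actually sees, you can list the select overloads from the REPL (just a quick check using plain Java reflection; nothing here is Flambo-specific). None of them accepts a bare Column, which is why no matching method is found:

;; list the select overloads available on DataFrame
(->> (.getMethods org.apache.spark.sql.DataFrame)
     (filter #(= "select" (.getName %)))
     (map str))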
As you can see, select accepts a vararg of Column, whereas you pass it a single, bare Column value, which doesn't match the signature, hence the exception. (col, on the other hand, is declared as def col(colName: String): Column, a plain one-argument method, so reflection finds a match for it.) At the bytecode level Scala varargs are passed as a scala.collection.Seq, so you have to wrap your column(s) in something that implements Seq, for example with the following code:
(scala.collection.JavaConversions/asScalaBuffer [(Column. "C0")])
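Putting it together, here is a sketch of how make-dataset could look with that conversion applied. It reuses the sql/read-csv call and the column-name mapping from your question, and it takes the SQLContext as an argument instead of relying on a global var; treat the names as illustrative and adjust them to your setup:

(ns example.dataset
  (:require [flambo.sql :as sql]))

(defn make-dataset
  [sql-context data-path column-names]
  (let [data (sql/read-csv sql-context data-path)
        ;; org.apache.spark.sql.Column objects for each requested name
        cols (map #(.col data %) column-names)]
    ;; .select expects a scala.collection.Seq of Columns, not a bare Column,
    ;; so convert the Clojure collection first
    (.select data
             (scala.collection.JavaConversions/asScalaBuffer (vec cols)))))

;; e.g. (make-dataset sql-context "/path/to/data.csv" ["C0" "C1"])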