Is there a concise way to drop a column from a DataFrame in SparkR, analogous to df.drop("column_name") in pyspark?
This is the closest I can get:

df <- new("DataFrame",
          sdf = SparkR:::callJMethod(df@sdf, "drop", "column_name"),
          isCached = FALSE)
This can be achieved by assigning NULL to the column of the Spark DataFrame:

df$column_name <- NULL
See the original discussion at the related Spark JIRA ticket.
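For completeness, a minimal sketch of the NULL-assignment approach (this assumes an active SparkSession and an existing SparkDataFrame; the data frame and column names here are illustrative). Later SparkR releases also expose a drop() function, so it is worth checking which your version supports:

```r
library(SparkR)

# Assumes a SparkSession has already been started, e.g. via sparkR.session().
# Build a small illustrative SparkDataFrame with two columns.
df <- createDataFrame(data.frame(a = 1:3, b = c("x", "y", "z")))

# Drop column "b" by assigning NULL, mirroring base R's data.frame idiom.
df$b <- NULL
columns(df)   # "b" should no longer be listed

# In newer SparkR versions, drop() offers the pyspark-like spelling:
# df <- drop(df, "b")
```

Both forms return a new logical plan rather than mutating data, so reassigning the result (or relying on the replacement form of `$<-`) is what actually removes the column from subsequent operations.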