I have a spark data frame:
library(SparkR); library(magrittr)
as.DataFrame(mtcars) %>%
groupBy("am")
How can I ungroup this data frame? There doesn't seems to be any ungroup function in the SparkR library!
There doesn't seems to be any ungroup function in the SparkR library
That's because groupBy
doesn't have the same meaning as group_by
in dplyr
.
SparkR::group_by
/ SparkR::groupBy
returns not a SparkDataFrame
but a GroupData
object which corresponds to GROUP BY
clause in SQL. To convert it back to a SparkDataFrame
you should call SparkR::agg
(or if you prefer dplyr
nomenclature SparkR::summarize
) which corresponds to SELECT
component of the SQL
query.
Once you aggregate you get back SparkDataFrame
and grouping is no longer present.
Additionally SparkR::groupBy
doesn't have dplyr group_by(...) %>% mutate(...)
equivalent. Instead we use window functions with frame definition.
So the take away message is - if you don't plan to aggregate don't use groupBy
.