I am trying to count the number of rows by group in a DataFrame. The following code generates a new column, called x1, which has the intended information:
by(df, [:grouping_var_1, :grouping_var_2], nrow)
However, I do not know how to generate this column with a name other than x1. The solution I have found so far is:
@pipe df |> by(_, [:grouping_var_1, :grouping_var_2], nrow) |> rename(_, :x1 => :my_desired_name);
Is there any way I could do this directly, without having to use rename?
Thanks in advance.
Please update DataFrames.jl to version 0.21.
Then use:
combine(groupby(df, [:grouping_var_1, :grouping_var_2]), nrow => :my_desired_name)
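For anyone who wants to try this out, here is a minimal sketch of the call above on a small made-up table. The data values are hypothetical (only the column names come from the question), and it assumes DataFrames.jl 0.21 or later is installed.

```julia
using DataFrames

# Hypothetical toy data; only the column names follow the question.
df = DataFrame(
    grouping_var_1 = ["a", "a", "b", "b", "b"],
    grouping_var_2 = [1, 1, 1, 2, 2],
)

# Count the rows in each group and name the result column directly,
# so no rename step is needed afterwards.
counts = combine(
    groupby(df, [:grouping_var_1, :grouping_var_2]),
    nrow => :my_desired_name,
)
```

Here counts should have one row per combination of grouping_var_1 and grouping_var_2, with the group size stored in my_desired_name.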
Two comments:
1. The by function is deprecated and you are recommended not to use it (you can see the warning if you start Julia with --depwarn=true).
2. Instead of the general syntax source_columns => function => target_column_name, you can use the shorthand source_columns => function, in which case the name of the target column is generated automatically. A special case is nrow (without anything) and nrow => target_column_name: for convenience, you do not have to pass the source columns to nrow.
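
The pair forms mentioned above can be compared side by side in a short sketch. The table and the names g, v, total and n are made up for illustration, and the auto-generated column names noted in the comments are what I would expect from DataFrames.jl 0.21 or later.

```julia
using DataFrames

# Hypothetical toy data for illustration.
df = DataFrame(g = ["a", "a", "b"], v = [1, 2, 3])
gdf = groupby(df, :g)

# General form: source_columns => function => target_column_name
explicit = combine(gdf, :v => sum => :total)

# Shorthand: source_columns => function; the target name is generated
# automatically from the column and function names (v_sum here).
auto = combine(gdf, :v => sum)

# nrow special case: no source columns are passed.
n_default = combine(gdf, nrow)        # result column is called nrow
n_named = combine(gdf, nrow => :n)    # result column is called n
```

The explicit form is the one to use when the auto-generated name is not what you want, which is exactly the situation in the question.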