r dataframe group-by grouping data-manipulation

Grouping into desired number of groups

I have a data frame like this, where ID is the primary key and Apples is the number of apples each person has.

ID	Apples
E1	10
E2	5
E3	NA
E4	5
E5	8
E6	12
E7	NA
E8	4
E9	NA
E10	8

I want to put the NA and non-NA values into just 2 groups and get the count of each. I tried a plain group_by(), but it does not give me the desired output.

Fruits %>% group_by(Apples) %>% summarize(n())

Apples    n()
<dbl>    <int>
 4         1            
 5         2            
 8         2            
 10        1            
 12        1            
 NA        3

My desired output:

Apples    n()
<dbl>    <int>
 non-NA    7                    
 NA        3

Solution

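We can group on is.na(Apples) directly inside group_by(), and wrap it in factor() so that the labels are set in the same step. Because FALSE sorts before TRUE, the first label applies to the non-NA rows and the second to the NA rows. Then we get the number of observations in each group with n().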

library(dplyr)

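df %>% 
  # levels are given explicitly so the labels still line up
  # if the column has no NA values (or only NA values)
  group_by(grp = factor(is.na(Apples), levels = c(FALSE, TRUE),
                        labels = c("non-NA", "NA"))) %>% 
  summarise(`n()` = n())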

#  grp     `n()`
#  <fct>  <int>
#1 non-NA     7
#2 NA         3
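
To see what the grouping expression produces before dplyr counts it, here is a minimal base R sketch of the intermediate steps. The names is_missing, grp and counts_by_group are only illustrative, and setting levels explicitly is an assumption added here so that the labels stay correct even if one of the two groups is absent.

```r
# Same Apples values as in the question
Apples <- c(10L, 5L, NA, 5L, 8L, 12L, NA, 4L, NA, 8L)

# Step 1: TRUE where the value is missing, FALSE otherwise
is_missing <- is.na(Apples)

# Step 2: relabel FALSE/TRUE; explicit levels keep the mapping fixed
grp <- factor(is_missing, levels = c(FALSE, TRUE),
              labels = c("non-NA", "NA"))

# Step 3: count the rows per level, in level order (non-NA, then NA)
counts_by_group <- tabulate(grp, nbins = nlevels(grp))
counts_by_group
# [1] 7 3
```

Note that the "NA" label here is an ordinary string level, not a missing value.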

Or in base R, we could use colSums:

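# index by column name rather than position, so it still works if columns are reordered
data.frame(Apples = c("non-NA", "NA"),
           n = c(colSums(!is.na(df))["Apples"], colSums(is.na(df))["Apples"]),
           row.names = NULL)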
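
If only the one column matters, the same base R idea can probably be written without colSums at all. This is a small sketch assuming the column is called Apples as in the question; n_missing and counts are illustrative names, not part of the original answer.

```r
# Same data as in the Data section
df <- data.frame(
  ID = paste0("E", 1:10),
  Apples = c(10L, 5L, NA, 5L, 8L, 12L, NA, 4L, NA, 8L)
)

# Count the missing values once; the rest of the rows are non-NA
n_missing <- sum(is.na(df$Apples))

counts <- data.frame(
  Apples = c("non-NA", "NA"),
  n = c(nrow(df) - n_missing, n_missing)
)
counts
```

This avoids scanning the ID column, which colSums(is.na(df)) would also check.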

Data

df <- structure(list(ID = c("E1", "E2", "E3", "E4", "E5", "E6", "E7", 
"E8", "E9", "E10"), Apples = c(10L, 5L, NA, 5L, 8L, 12L, NA, 
4L, NA, 8L)), class = "data.frame", row.names = c(NA, -10L))