I am trying to understand why the following code doesn't work. My understanding is that it takes data$Sepal.Length
(the Sepal.Length element within the nested data column) and maps the function sum over that vector.
df <- iris %>%
nest(-Species) %>%
mutate(Total.Sepal.Length = map_dbl(data$Sepal.Length, sum, na.rm = TRUE))
print(df)
But this throws an error: Total.Sepal.Length must be size 3 or 1, not 0.
The following code works. It uses an anonymous function, which is how the nested column is usually accessed:
df <- iris %>%
nest(-Species) %>%
mutate(Total.Sepal.Length = map_dbl(data, function(x) sum(x$Sepal.Length, na.rm = TRUE)))
print(df)
I am trying to understand why the first version didn't work even though I am correctly passing arguments to mutate
and map.
You should do this:
df <- iris %>%
nest(-Species) %>%
mutate(Total.Sepal.Length = map_dbl(data, ~sum(.x$Sepal.Length, na.rm = TRUE)))
Two things. First: any reason you're not using group_by?
Second: your initial mutate is trying to perform:
map_dbl(df$data$Sepal.Length, sum, na.rm = TRUE)
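This returns an empty result, because df$data$Sepal.Length is NULL. df$data is a list of tibbles, so you have to
access each list element before you can access its columns (for example, df$data[[1]]$Sepal.Length works).
Mapping over NULL produces a zero-length vector, which is why mutate complains about size 0.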
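To see this outside of mutate, here is a small sketch that inspects the nested column directly. It assumes tidyr 1.0 or later (hence the nest(data = -Species) spelling), and the variable names nested, missing_col, empty_result and first_group are just illustrative:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Nest everything except Species into a list column called `data`
nested <- iris %>% nest(data = -Species)

# `data` is a list of three tibbles, so `$` looks for a list element
# named "Sepal.Length" and finds none
missing_col <- nested$data$Sepal.Length
is.null(missing_col)

# Mapping over NULL yields a zero-length double vector,
# which is what mutate() then rejects
empty_result <- map_dbl(missing_col, sum, na.rm = TRUE)
length(empty_result)

# Picking out one list element first reaches the inner tibble's column
first_group <- nested$data[[1]]$Sepal.Length
length(first_group)
```

So map_dbl has to iterate over data itself (one tibble per group) and pull Sepal.Length out inside the function.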
The corrected code gives:
# A tibble: 3 × 3
Species data Total.Sepal.Length
<fct> <list> <dbl>
1 setosa <tibble [50 × 4]> 250.
2 versicolor <tibble [50 × 4]> 297.
3 virginica <tibble [50 × 4]> 329.
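On the group_by question: if you don't actually need the nested data column, a rough sketch of the grouped version could look like this (totals is just an illustrative name, and this drops the per-group tibbles that nest keeps):

```r
library(dplyr)

# One row per Species with the summed Sepal.Length; no list column involved
totals <- iris %>%
  group_by(Species) %>%
  summarise(Total.Sepal.Length = sum(Sepal.Length, na.rm = TRUE))

print(totals)
```

If you do need the nested tibbles later (say, to fit a model per group), stick with the nest plus map_dbl approach.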