Given a tibble looking as the one below, I'm trying to use the Tidyverse to perform a conditional sum based on the value of Item
in each of the two Fields. Specifically, both for foo
and bar
, I'd like to sum the value corresponding to item a
to that of item b
and then remove the row of the former. The result I'm looking for is exemplified in table 2.
data <- tibble(Field = rep(c("foo", "bar"), each=4),
Item = rep(c("a", "b", "c", "d"), 2),
Value = runif(8))
# table 1 # table 2
| Field | Item | Value | | Field | Item | Value |
|-------|------|-----------| |-------|------|-----------|
| foo | a | 0.8167347 | | foo | b | 0.9583989 | <== 0.8167347 + 0.1416642
| foo | b | 0.1416642 | | foo | c | 0.7054814 |
| foo | c | 0.7054814 | | foo | d | 0.1196948 |
| foo | d | 0.1196948 | |--------------------------|
|--------------------------| | bar | b | 0.6177568 | <== 0.3604500 + 0.2573068
| bar | a | 0.3604500 | | bar | c | 0.7003040 |
| bar | b | 0.2573068 | | bar | d | 0.8131556 |
| bar | c | 0.7003040 |
| bar | d | 0.8131556 |
So far I didn't manage to get anywhere near the expected result. I know how to use dplyr
's grouping function to isolate items belonging to either one of the two fields, but I have no idea how to select the value of a
and sum it to b
after the grouping has been performed.
You can change Item, to receive b
when it is equal a
, then summarise
library(dplyr)
set.seed(123)
data <- tibble(Field = rep(c("foo", "bar"), each=4),
Item = rep(c("a", "b", "c", "d"), 2),
Value = runif(8))
# A tibble: 8 x 3
Field Item Value
<chr> <chr> <dbl>
1 foo a 0.288
2 foo b 0.788
3 foo c 0.409
4 foo d 0.883
5 bar a 0.940
6 bar b 0.0456
7 bar c 0.528
8 bar d 0.892
data %>%
mutate(Item = if_else(Item == "a","b",Item)) %>%
group_by(Field,Item) %>%
summarise(Value = sum(Value,na.rm = TRUE)) %>%
ungroup()
# A tibble: 6 x 3
Field Item Value
<chr> <chr> <dbl>
1 bar b 0.986
2 bar c 0.528
3 bar d 0.892
4 foo b 1.08
5 foo c 0.409
6 foo d 0.883