i have a dataframe like this
df = data.frame(day = c("1", NA, NA, NA, NA, "2", NA, NA, NA),
Unit = c("unit1", NA, NA, NA, "unit2", "unit1", NA, NA, "unit2"),
Problem = c("Oil", "Engine", "Electric", NA, NA, "Oil", "Power", NA, NA),
duration = c(2, 5, 1, NA, NA, 1.5, 3, NA, NA))
row 1:5 is day1 and 6:9 day2, if there are duplicated values in the same column, the value is NA.
i tried to use
df %>%
pivot_wider(names_from = Problem, values_from = duration)
but its not worked, my expected df is like this
df1 = data.frame(day = c("1", "1", "2", "2"),
Unit = c("unit1", "unit2", "unit1", "unit2"),
Oil = c(2, 0, 1.5, 0),
Engine = c(5, 0, 0, 0),
Electric = c(1, 0, 0, 0),
Power = c(0, 0, 3, 0),
NoProblem = c(0, 0, 0, 0))
We fill
the NA with previous non-NA elements, get the distinct
rows, and reshape to wide with pivot_wider
library(dplyr)
library(tidyr)
df %>%
fill(day, Unit, Problem) %>%
distinct(day, Unit, Problem, .keep_all = TRUE) %>%
mutate(duration = replace_na(duration, 0)) %>%
pivot_wider(names_from = Problem, values_from = duration,
values_fill = 0) %>%
mutate(NoProblem = 0)
-output
# A tibble: 4 × 7
day Unit Oil Engine Electric Power NoProblem
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 unit1 2 5 1 0 0
2 1 unit2 0 0 0 0 0
3 2 unit1 1.5 0 0 3 0
4 2 unit2 0 0 0 0 0