I'm trying to make a connection to other columns within a function given a conditional.
Essentially, I want to make a dataframe go from long to wide given a conditional, where those values in one column are NA
relative to another column that has values in that same row, turn the NAs
into a specific numeric.
Although the values assigned have to be column specific. So if 2010
has NAs
whilst 2019
has a value, then return 16
otherwise, if 2019
has NAs when in that same row, 2010
has values, return 16
.
what I have tried:
# A tibble: 26 x 4
year locality_id landcover pland
<chr> <chr> <int> <dbl>
1 2010 L2228604 10 0.0645
2 2010 L2228604 13 0.935
3 2010 L452817 8 0.0968
4 2010 L452817 9 0.0323
5 2010 L452817 12 0.613
6 2010 L452817 13 0.194
7 2010 L452817 14 0.0645
8 2010 L596267 0 0.194
9 2010 L596267 9 0.0323
10 2010 L596267 11 0.0645
# ... with 16 more rows
p <- function(x){
if(x == 'NA') {
return(16)
} else {
17
}
}
#wrong outcome
# A tibble: 26 x 4
locality_id pland `2010` `2019`
<chr> <dbl> <dbl> <dbl>
1 L2228604 0.0645 17 NA
2 L2228604 0.935 17 NA
3 L452817 0.0968 17 NA
4 L452817 0.0323 17 NA
5 L452817 0.613 17 NA
6 L452817 0.194 17 NA
7 L452817 0.0645 17 NA
8 L596267 0.194 17 NA
9 L596267 0.0323 17 NA
10 L596267 0.0645 17 NA
# ... with 16 more rows
What I am expecting:
# A tibble: 26 x 4
locality_id pland `2010` `2019`
<chr> <dbl> <int> <int>
1 L2228604 0.0645 10 17
2 L2228604 0.935 13 17
3 L452817 0.0968 8 17
4 L452817 0.0323 9 17
5 L9185766 0.54838710 16 8
6 L9185766 0.19354839 16 9
7 L9185766 0.03225806 16 13
8 L9185766 0.16129032 16 14
9 L9234578 1.00000000 16 12
reproducible code:
structure(list(year = c("2010", "2010", "2010", "2010", "2010",
"2010", "2010", "2010", "2010", "2010", "2010", "2010", "2010",
"2010", "2010", "2019", "2019", "2019", "2019", "2019", "2019",
"2019", "2019", "2019", "2019", "2019"), locality_id = c("L2228604",
"L2228604", "L452817", "L452817", "L452817", "L452817", "L452817",
"L596267", "L596267", "L596267", "L596267", "L152650", "L910180",
"L910180", "L910180", "L4791597", "L4791597", "L9149985", "L9149985",
"L9149985", "L9185766", "L9185766", "L9185766", "L9185766", "L9185766",
"L9234578"), landcover = c(10L, 13L, 8L, 9L, 12L, 13L, 14L, 0L,
9L, 11L, 13L, 13L, 0L, 8L, 9L, 5L, 8L, 10L, 11L, 12L, 4L, 8L,
9L, 13L, 14L, 12L), pland = c(0.0645161290322581, 0.935483870967742,
0.0967741935483871, 0.032258064516129, 0.612903225806452, 0.193548387096774,
0.0645161290322581, 0.193548387096774, 0.032258064516129, 0.0645161290322581,
0.709677419354839, 1, 0.4375, 0.34375, 0.03125, 0.566666666666667,
0.0333333333333333, 0.1, 0.0333333333333333, 0.866666666666667,
0.0645161290322581, 0.548387096774194, 0.193548387096774, 0.032258064516129,
0.161290322580645, 1)), row.names = c(NA, -26L), class = c("tbl_df",
"tbl", "data.frame"))
You can get the data in wide format and based on column name replace the NA
value.
library(dplyr)
library(tidyr)
df %>%
pivot_wider(names_from = year, values_from = landcover) %>%
mutate(across(`2010`:`2019`, ~if(cur_column() == '2019')
replace_na(.x, 17) else replace_na(.x, 16)))
# locality_id pland `2010` `2019`
# <chr> <dbl> <dbl> <dbl>
# 1 L2228604 0.0645 10 17
# 2 L2228604 0.935 13 17
# 3 L452817 0.0968 8 17
# 4 L452817 0.0323 9 17
# 5 L452817 0.613 12 17
# 6 L452817 0.194 13 17
# 7 L452817 0.0645 14 17
# 8 L596267 0.194 0 17
# 9 L596267 0.0323 9 17
#10 L596267 0.0645 11 17
# … with 16 more rows