I have an R dataframe that looks something like this:
Date | Price UP | Price Down |
---|---|---|
2023-01-01 | NA | NA |
2023-01-02 | 10 | NA |
2023-01-03 | NA | NA |
2023-01-04 | NA | 4 |
2023-01-05 | NA | 3 |
2023-01-06 | 10 | NA |
2023-01-07 | NA | 2 |
I tried using a for loop to assign "phase numbers" every time the price goes up and then comes down. For example, if Price UP is 10 and the next Price Down after that is 4, then the dates from the up through that down are Phase 1, and so on for Phase 2, 3, ... n.
Is there a dplyr-friendly way to get the same result?
The final output should look something like this:
Date | Price UP | Price Down | PhaseIndicator |
---|---|---|---|
2023-01-01 | NA | NA | |
2023-01-02 | 10 | NA | Phase 1 |
2023-01-03 | NA | NA | Phase 1 |
2023-01-04 | NA | 4 | Phase 1 |
2023-01-05 | NA | 3 | |
2023-01-06 | 10 | NA | Phase 2 |
2023-01-07 | NA | 2 | Phase 2 |
Here is my proposal; let me know if it matches what you need.
library(dplyr)
library(tidyr) # you need tidyr for the fill function
df <- data.frame(date = 1:7, up = c(NA, 10, NA, NA, NA, 10, NA), down = c(NA, NA, NA, 4, 3, NA, 2))
df
date up down
1 1 NA NA
2 2 10 NA
3 3 NA NA
4 4 NA 4
5 5 NA 3
6 6 10 NA
7 7 NA 2
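Here the phase number is determined by how many times the price has gone up so far: row_number(up > 0) numbers the rows where up is not NA as 1, 2, ... and leaves the other rows as NA.
With fill(), the phase number (nup) is then carried forward to the following rows until the next up.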
df %>%
mutate(nup = row_number(up > 0)) %>%
fill(nup) %>%
mutate(phase_start = ifelse(!is.na(nup), paste0("Phase ", nup), NA))
date up down nup phase_start
1 1 NA NA NA <NA>
2 2 10 NA 1 Phase 1
3 3 NA NA 1 Phase 1
4 4 NA 4 1 Phase 1
5 5 NA 3 1 Phase 1
6 6 10 NA 2 Phase 2
7 7 NA 2 2 Phase 2
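One difference from the desired output in the question: row 5 (down = 3) still gets phase 1 here, whereas the question leaves it blank because the phase already ended at the first down (row 4). Below is a minimal sketch of one way to handle that, assuming a phase closes at the first non-NA down after an up. It swaps row_number() plus fill() for cumsum(!is.na(up)), and the helper columns ndown and closed are names I made up for illustration.

```r
library(dplyr)

df <- data.frame(
  date = 1:7,
  up   = c(NA, 10, NA, NA, NA, 10, NA),
  down = c(NA, NA, NA, 4, 3, NA, 2)
)

result <- df %>%
  # running count of "up" events: 0 before the first up, then 1, 2, ...
  mutate(nup = cumsum(!is.na(up))) %>%
  group_by(nup) %>%
  mutate(
    # number of "down" events seen so far within the current up-group
    ndown = cumsum(!is.na(down)),
    # TRUE once a down has already occurred on an earlier row of the group
    closed = lag(ndown, default = 0L) > 0,
    # label only rows after an up and before the phase has closed
    PhaseIndicator = if_else(nup > 0 & !closed,
                             paste("Phase", nup),
                             NA_character_)
  ) %>%
  ungroup() %>%
  select(-ndown, -closed)

result
```

On the sample data this should leave rows 1 and 5 as NA, label rows 2 to 4 as Phase 1 and rows 6 to 7 as Phase 2. If your real data can have a down before any up, or several ups in a row without a down, you may want to decide separately how those cases should be labelled.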