r, dplyr, data-manipulation

Assigning phases to stock prices in R


I have an R dataframe that looks something like this:

Date        Price UP  Price Down
2023-01-01  NA        NA
2023-01-02  10        NA
2023-01-03  NA        NA
2023-01-04  NA        4
2023-01-05  NA        3
2023-01-06  10        NA
2023-01-07  NA        2

I tried using a for loop to assign phase numbers each time the price goes up and then comes down. For example, if Price UP is 10 and the next Price Down after it is 4, the dates from the first to the second belong to Phase 1; later up/down pairs are marked Phase 2, 3, ..., n.

Is there a dplyr-friendly way to get the same result?

The final output should look something like this:

Date        Price UP  Price Down  PhaseIndicator
2023-01-01  NA        NA
2023-01-02  10        NA          Phase 1
2023-01-03  NA        NA          Phase 1
2023-01-04  NA        4           Phase 1
2023-01-05  NA        3
2023-01-06  10        NA          Phase 2
2023-01-07  NA        2           Phase 2

Solution

  • Here is my proposal; tell me if it matches what you need.

    library(dplyr)
    library(tidyr) # needed for fill()
    df <- data.frame(date = 1:7, up = c(NA, 10, NA, NA, NA, 10, NA), down = c(NA, NA, NA, 4, 3, NA, 2))
    
    df
      date up down
    1    1 NA   NA
    2    2 10   NA
    3    3 NA   NA
    4    4 NA    4
    5    5 NA    3
    6    6 10   NA
    7    7 NA    2
    

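    Here the phase number is the running count of rows where the price goes up. row_number(up > 0) ranks the non-NA entries, which gives 1, 2, ... in row order as long as every up value is positive; fill() then carries that number (nup) forward to the rows below until the next up.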

    df %>%
      mutate(nup = row_number(up > 0)) %>%  # rank the non-NA rows of up > 0: 1, 2 here
      fill(nup) %>%                         # carry the phase number down to the following rows
      mutate(phase_start = ifelse(!is.na(nup), paste0("phase ", nup), NA))
    
      date up down nup phase_start
    1    1 NA   NA  NA        <NA>
    2    2 10   NA   1     phase 1
    3    3 NA   NA   1     phase 1
    4    4 NA    4   1     phase 1
    5    5 NA    3   1     phase 1
    6    6 10   NA   2     phase 2
    7    7 NA    2   2     phase 2
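
Note that the output above still labels row 5 with phase 1, whereas the expected output in the question leaves it blank because the phase should close at the first down price. Below is a minimal sketch of one way to get that. It assumes a phase runs from a non-NA up to the first non-NA down after it, and that a new up arriving before any down simply starts the next phase. The helper columns phase_id and downs_before are illustrative names, not part of the original answer.

```r
library(dplyr)

df <- data.frame(
  date = as.Date("2023-01-01") + 0:6,
  up   = c(NA, 10, NA, NA, NA, 10, NA),
  down = c(NA, NA, NA, 4, 3, NA, 2)
)

result <- df %>%
  # Running count of up events: 0 before the first up, then 1, 2, ...
  mutate(phase_id = cumsum(!is.na(up))) %>%
  group_by(phase_id) %>%
  mutate(
    # Number of down prices seen strictly before this row within the phase
    downs_before = lag(cumsum(!is.na(down)), default = 0L),
    # Label rows up to and including the first down; leave the rest NA
    PhaseIndicator = if_else(
      phase_id > 0 & downs_before == 0,
      paste("Phase", phase_id),
      NA_character_
    )
  ) %>%
  ungroup() %>%
  select(-phase_id, -downs_before)

# result$PhaseIndicator is:
# NA "Phase 1" "Phase 1" "Phase 1" NA "Phase 2" "Phase 2"
```

Using cumsum(!is.na(up)) instead of row_number(up > 0) avoids the fill() step and does not depend on the sign of the up values; whether that matters depends on the real data.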