Search code examples
rfrequencydetecthyper

R: Detect frequency


I have to create a function (or loop) in R to detect hyper-frequent. The requirement to detect hyper-frequent is to come 3 times in 180 days, if that requirement is met the person will be hyper-frequent, not only in the future, but in the past visits where he did not meet the hyper-frequent requirement as well.

pacient <- c(10,10,10,10,10,11,11,12,12,12,13, 13, 15, 14); pacient
date <- as.Date(c("01/01/2018","02/05/2018", "04/06/2018", "10/11/2018", "05/12/2018", "02/01/2018", "06/08/2018", "01/01/2018", "03/01/2018", "06/03/2018", "05/08/2018", "05/08/2019", "05/07/2019", "08/07/2017"), format = "%d/%m/%Y"); date 
DF <- data.frame(pacient, date); DF


count_visit <- function(x){
  DF <- data.table(DF)
  DTord<-DF[with(DF , order(DF $ date)), ]; DTord 
  DTord[,num_visit := order(date), by  = pacient];DTord 
  DTordID <- DTord[with(DTord, order(DTord$pacient)), ]; DTordID  
  DTordID[,max_visit := max(num_visit), by  = pacient];DTordID 
framedatos <- as.data.frame(DTordID)

  return(framedatos)}

REUP_visit <- count_visit(DF); head(REUP_visit)


   pacient    date      num_visit   max_visit
    10     01/01/2018      1           5
    10     02/05/2018      2           5
    10     04/06/2018      3           5
    10     10/11/2018      4           5
    10     05/12/2018      5           5 
    11     02/01/2018      1           2
    11     06/08/2018      2           2
    12     01/01/2018      1           3   
    12     03/01/2018      2           3
    12     06/03/2018      3           3
    13     05/08/2018      1           2
    13     05/08/2019      2           2
    14     08/07/2017      1           1
    15     05/07/2019      1           1

So far I have only managed to create a function that tells me the number of visits per patient and the maximum number of visits a patient has had (this is what I need for something else):

  pacient    date    num_visit  max_visit  days_visit   <180 future_hyperf  past_hyperf
    10     01/01/2018      1           5       0          1      no           yes
    10     02/05/2018      2           5       121        2      no           yes
    10     04/06/2018      3           5       33         3      yes          yes
    10     10/11/2018      4           5       159        4      yes          yes  
    10     05/12/2018      5           5       25         5      yes          yes
    11     02/01/2018      1           2       0          1      no           no 
    11     06/08/2018      2           2       216        1      no           no 
    12     01/01/2018      1           3       0          1      no           yes 
    12     03/01/2018      2           3       2          2      no           yes 
    12     06/03/2018      3           3       62         3      yes          yes  
    13     05/08/2018      1           2       0          1      no           no         
    13     05/08/2019      2           2       365        1      no           no 
    14     08/07/2017      1           1       0          1      no           no 
    15     05/07/2019      1           1       0          1      no           no 

The output I need is one that has: "day_visit", "<180", "future_hyperf" and "past_hyperf".

The objective of the variable "day_visit" is to number the patient's first visit to the emergency room at 0 and then count the days between visits.

    DF <- DF %>%
  group_by(pacient) %>%
  arrange(date) %>%
  mutate(days_visit= date - lag(date, default = first(date)))

The variable "<180" would be the variable that number 1 the first time it comes, 2 the second (if it is <180 days with the previous visit), 3 (if it is <180 days with the previous visit) and so on . If, for example, the patient reaches 2 and the third visit does not meet <180 days, it would be necessary to put 1 again (the loop would be restarted).

The variable "future_hyperf" says yes or no. It is marked as if it made the future once the patient reaches 3 in the variable <180, it does not matter if the visits are later than 180 days and does not comply. Once the criterion is met, it is forever.

The variable "past_hyperf" converts all the patients that have if in the variable "future_hyperf" in itself also to the past.

Thank you!

SOLUTION

DF3 <-  DF %>%
  arrange(pacient, date) %>%
  group_by(pacient) %>%
  mutate(days_visit = as.integer(date - lag(date, default = first(date))) ,
         less_180 = days_visit < 180) %>%
  mutate(counter = rowid(pacient, cumsum(date - shift(date, fill=first(date)) > 180)),
         future_hyperf = case_when(counter >= 3 ~ "yes",
                                   TRUE ~ "no"),
         past_hyperf = case_when(max(counter, na.rm = T) >= 3 ~ "yes",
                                 TRUE ~ "no")) 
DF3 <- DF3[with(DF3,order(pacient,date)),]

Solution

  • Try this:

    pacient <- c(10, 10, 10, 10, 10, 11, 11, 12, 12, 12, 13, 13, 15, 14)
    pacient
    date <-
      as.Date(
        c(
          "01/01/2018",
          "02/05/2018",
          "04/06/2018",
          "10/11/2018",
          "05/12/2018",
          "02/01/2018",
          "06/08/2018",
          "01/01/2018",
          "03/01/2018",
          "06/03/2018",
          "05/08/2018",
          "05/08/2019",
          "05/07/2019",
          "08/07/2017"
        ),
        format = "%d/%m/%Y"
      )
    date
    DF <- data.frame(pacient, date)
    DF
    #packages
    library(dplyr)
    library(lubridate)
    #time zone
    lct <- Sys.getlocale("LC_TIME")
    Sys.setlocale("LC_TIME", "C")
    DF <- DF %>%
      group_by(pacient) %>%
      mutate(num_visit = cumsum(pacient) / pacient) %>% # number of visits
      mutate(max_visit = max(num_visit)) %>% # max visit
      mutate(days_visit = as.Date(date, "%d/%m/%Y") - lag(as.Date(date, "%d/%m/%Y"))) %>% # days between visits
      mutate(minus_180_days = case_when(days_visit < 180 &
                                          !is.na(days_visit) ~ num_visit,
                                        TRUE ~ 1)) %>% # is days between visits < 180
      mutate(future_hyperf = case_when(minus_180_days > 3 ~ "yes",
                                       TRUE ~ "no")) %>% # future hyperf
      mutate(past_hyperf = case_when(max(minus_180_days, na.rm = T) >= 3 ~ "yes",
                                     TRUE ~ "no")) # past hyperf
    

    Hope it helps