
Data transformation: "proc transpose" equivalent in R


Hi everyone, I am new here. I have the following dataset:

member_id<-c(603,603,603)
fill_date<-c("02/17/2005","06/13/2005","08/11/2005")
drug<-rep("a",3)
days_supply<-rep(30,3)
dataset<-data.frame(member_id,fill_date,drug,days_supply)

I want to transform the data as follows: [image: Transformed data]

In SAS I use this code:

    proc sort data=claims;
       by member_id fill_dt;
       run;
    proc transpose data=claims out=fill_dates (drop=_name_) prefix=fill_dt;
    by member_id;
    var fill_dt;
    run;

    proc transpose data = claims out=days_supply (drop=_name_) prefix = days_supply;
    by member_id;
    var days_supply;
    run;

    data both;
    merge fill_dates days_supply;
    by member_id;
    format start_dt end_dt mmddyy10.;
    start_dt=fill_dt1;
    end_dt=fill_dt1+179;
    run;

I was wondering if you could help with the equivalent code in R.

Thanks


Solution

  • This may get you started.

    # in case you don't have those packages installed
    install.packages("reshape2")
    install.packages("tidyverse")
    
    library(reshape2)
    library(tidyverse)
    
    
    member_id<-c(603,603,603)
    fill_dt<- c("2005-02-17", "2005-06-13", "2005-08-11")
    days_supply<-rep(30,3)
    dataset<-data.frame(member_id,fill_dt,days_supply)
    
    
    
    
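    # Melt to long form; fill_dt and days_supply share one value column, so values become character
    dataset_melt <- melt(data = dataset, id.vars = "member_id")
    # Number the rows within each member and variable: fill_dt1, fill_dt2, ...
    dataset_melt <- dataset_melt %>%
      group_by(member_id, variable) %>%
      mutate(variable_n = paste0(variable, row_number())) %>%
      ungroup()
    
    # Cast back to wide form with reshape2 (loaded above): one row per member_id
    dataset_cast <- reshape2::dcast(data = dataset_melt, formula = member_id ~ variable_n, value.var = "value")
    dataset_cast <- dataset_cast %>% mutate(start_dt = as.Date(fill_dt1),
                                            end_dt   = start_dt + 179)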
    
    dataset_cast
    

    To get better help, I suggest creating a minimal reproducible example of what you are doing in SAS: SAS code that creates the data and produces the output that you want. Your data is not minimal because you do not use the "drug" variable.
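
If you would rather avoid extra packages, the same sort / transpose / merge steps can probably be done with base R's reshape(). This is only a sketch under some assumptions: the dates arrive as mm/dd/yyyy strings as in the question, the data frame is named claims to match the SAS code, and fill_no is a helper column I made up to number the fills within each member.

```r
# Data from the question; the dates are mm/dd/yyyy strings, so parse them explicitly.
claims <- data.frame(
  member_id   = c(603, 603, 603),
  fill_dt     = as.Date(c("02/17/2005", "06/13/2005", "08/11/2005"),
                        format = "%m/%d/%Y"),
  days_supply = rep(30, 3)
)

# Rough equivalent of proc sort: order by member and fill date.
claims <- claims[order(claims$member_id, claims$fill_dt), ]

# fill_no is a hypothetical helper column: 1, 2, 3, ... within each member.
claims$fill_no <- ave(seq_along(claims$member_id), claims$member_id,
                      FUN = seq_along)

# One row per member; sep = "" gives column names like fill_dt1, days_supply1.
both <- reshape(claims, idvar = "member_id", timevar = "fill_no",
                direction = "wide", sep = "")

# Same date arithmetic as the final SAS data step.
both$start_dt <- both$fill_dt1
both$end_dt   <- both$fill_dt1 + 179

both
```

Because fill_dt is parsed to Date before reshaping, the wide columns stay dates and the + 179 works directly, whereas the melt/dcast route stacks dates and numbers in one column and so has to convert back with as.Date() afterwards.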