
Dummy coding for a certain variable


I have a data frame that looks like this:

df <- data.frame(task       = c(1, 2,  3, 4, 5, NA),
                 day        = c(10, 6,  7, 9, 9, 10),
                 deadline   = c(7, 12, 9, 7, 9, NA),
                 completion = c(1, 1,  1, 1, 0, NA))

Now I want to create a dummy variable that shows whether a task was overdue on the day of completion. I wrote the code below for this, but it does not give me the right results.

df$overduetask <- ifelse(df$completion == 1 & df$day > df$deadline, 1, 0)

My reasoning is this: if a task was completed (completion = 1) and the day is greater than the deadline, then the task is overdue. The output I get for the overdue variable is all 0s, which I checked manually and which cannot be right.


Solution

  • It works for me:

    df$overduetask <- ifelse(df$completion == 1 & df$day > df$deadline, 1, 0)
    

    Could you have misspelled it, writing cllw$ instead of df$?

    Hi, as I said, it works for me:

    eduardo> str(df)
    'data.frame':   6 obs. of  5 variables:
     $ task       : num  1 2 3 4 5 NA
     $ day        : num  10 6 7 9 9 10
     $ deadline   : num  7 12 9 7 9 NA
     $ completion : num  1 1 1 1 0 NA
     $ overduetask: num  1 0 0 1 0 NA
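
    To see where each value comes from, it can help to evaluate the two conditions separately. This is a minimal sketch using the example data from the question; the names completed and late are introduced here only for illustration.

    ```r
    df <- data.frame(task       = c(1, 2, 3, 4, 5, NA),
                     day        = c(10, 6, 7, 9, 9, 10),
                     deadline   = c(7, 12, 9, 7, 9, NA),
                     completion = c(1, 1, 1, 1, 0, NA))

    # Evaluate each half of the condition on its own
    completed <- df$completion == 1     # TRUE TRUE TRUE TRUE FALSE NA
    late      <- df$day > df$deadline   # TRUE FALSE FALSE TRUE FALSE NA

    # NA & NA is NA, so ifelse() leaves the last row as NA
    df$overduetask <- ifelse(completed & late, 1, 0)
    df$overduetask                      # 1 0 0 1 0 NA
    ```

    If either intermediate vector looks wrong on your real data, that is the half of the condition to investigate.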
    

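    I suspect I know what your problem could be; it has happened to me many times in R. In your real data, the test completion == 1 is probably failing because of how completion is stored, for example as a floating-point value that is not exactly 1 because of rounding. You can try coercing it to integer as shown here. Note that as.integer() truncates toward zero, so if the values might sit slightly below 1 (such as 0.9999999), comparing round(df$completion) == 1 is the safer test.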

    df$overduetask <- ifelse(as.integer(df$completion) == 1 & df$day > df$deadline, 1,0)
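
    If rounding really is the cause, the difference between truncating and rounding matters. Here is a small sketch with a made-up value; x is hypothetical and not taken from the question's data, and the tolerance 1e-6 is an arbitrary choice.

    ```r
    x <- 1 - 1e-9          # hypothetical value that prints as 1 but is not exactly 1

    x == 1                 # FALSE: exact comparison fails
    as.integer(x) == 1     # FALSE: as.integer() truncates toward zero, giving 0
    round(x) == 1          # TRUE: rounding recovers the intended value
    abs(x - 1) < 1e-6      # TRUE: comparison with an explicit tolerance
    ```

    It is also worth running class(df$completion) on the real data to confirm the column is numeric at all.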
    

    I hope this helps.