
Dummyfication of a column/variable


I'm designing a neural network in R. To prepare my data, I have imported a table.

For example:

       time hour Money day
1: 20000616    1  9.35   5
2: 20000616    2  6.22   5
3: 20000616    3 10.65   5
4: 20000616    4 11.42   5
5: 20000616    5 10.12   5
6: 20000616    6  7.32   5

Now I need to dummy-code the hour column, with one 0/1 column per hour. My final table should look like this:

       time Money day 1 2 3 4 5 6
1: 20000616  9.35   5 1 0 0 0 0 0
2: 20000616  6.22   5 0 1 0 0 0 0
3: 20000616 10.65   5 0 0 1 0 0 0
4: 20000616 11.42   5 0 0 0 1 0 0
5: 20000616 10.12   5 0 0 0 0 1 0
6: 20000616  7.32   5 0 0 0 0 0 1

Is there an easy way to transform my table into this layout programmatically in R? I need to do this in R, not before the import.

Thanks in advance


Solution

  • You can create dummy variables with the dummies package. Its dummy.data.frame function expands each column listed in names into one 0/1 column per distinct value.

    library(dummies)
    
    df <- data.frame(
      time = c(20000616, 20000616, 20000616, 20000616, 20000616, 20000616), 
      hour = c(1, 2, 3, 4, 5, 6), 
      Money = c(9.35, 6.22, 10.65, 11.42, 10.12, 7.32), 
      day = c(5, 5, 5, 5, 5, 5))
    
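    # Expand hour into one 0/1 column per value; the new columns are named hour_1 ... hour_6.
    df_dummy <- dummy.data.frame(df, names=c("hour"), sep="_")
    # Rename the dummy columns to plain hour numbers (c() coerces 1:6 to "1" ... "6").
    names(df_dummy) <- c("time", 1:6, "Money", "day")
    # Reorder so the dummy columns come last.
    df_dummy <- df_dummy[c("time", "Money", "day", 1:6)]
    df_dummy
    #       time Money day 1 2 3 4 5 6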
    # 1 20000616  9.35   5 1 0 0 0 0 0
    # 2 20000616  6.22   5 0 1 0 0 0 0
    # 3 20000616 10.65   5 0 0 1 0 0 0
    # 4 20000616 11.42   5 0 0 0 1 0 0
    # 5 20000616 10.12   5 0 0 0 0 1 0
    # 6 20000616  7.32   5 0 0 0 0 0 1
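
  • If the dummies package is not available in your installation, base R should be able to produce the same layout. The following is only a minimal sketch, not part of the original answer: it assumes the same example data as above and uses model.matrix with the intercept removed so that every hour gets its own column. The name hour_dummies is purely illustrative.

```r
# Same example data as in the question.
df <- data.frame(
  time  = rep(20000616, 6),
  hour  = 1:6,
  Money = c(9.35, 6.22, 10.65, 11.42, 10.12, 7.32),
  day   = rep(5, 6)
)

# Treat hour as a factor and drop the intercept (- 1),
# so each level gets its own 0/1 indicator column.
hour_dummies <- model.matrix(~ factor(hour) - 1, data = df)

# Replace the generated names ("factor(hour)1", ...) with the plain hour values.
colnames(hour_dummies) <- levels(factor(df$hour))

# Keep the remaining columns and append the indicators.
df_dummy <- cbind(df[c("time", "Money", "day")], hour_dummies)
df_dummy
```

    Note that the indicator columns come back as numeric 0/1 values. If the table is a data.table rather than a data.frame, the column selection in the last step may need adjusting.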