
Dummification throws an error: 'x' must be atomic for 'sort.list'


The str() output of my data frame looks like this:

> str(categoricalVar)
'data.frame':   56660 obs. of  10 variables:
 $ FavouriteSource    : Factor w/ 3 levels "App","LF","None": 1 1 3 3 3 1 3 3 3 3 ...
 $ FavouriteSource30  : Factor w/ 3 levels "App","LF","None": 1 1 3 3 3 1 3 3 3 3 ...
 $ FavouriteSource90  : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
 $ FavouriteSource180 : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
 $ FavouriteSource360 : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
 $ Favorite_GameBin   : Factor w/ 594 levels " Team Umizoomi: Street Fair Fix -Up (Explorer)",..: 262 163 388 378 378 220 253 378 378 378 ...
 $ Favorite_GameBin30 : Factor w/ 309 levels "1-2-3 Dora!",..: 191 191 191 191 191 191 191 191 191 191 ...
 $ Favorite_GameBin90 : Factor w/ 332 levels "1-2-3 Dora!",..: 206 206 206 206 206 206 206 206 206 206 ...
 $ Favorite_GameBin180: Factor w/ 363 levels "1-2-3 Dora!",..: 226 226 226 226 226 226 226 226 226 226 ...
 $ Favorite_GameBin360: Factor w/ 449 levels " Team Umizoomi: Street Fair Fix -Up (Explorer)",..: 283 283 283 283 283 283 283 283 283 283 ...
> 

I'm trying to dummify these columns, but the call throws the following error:

> categoricalVar_dummy <- dummy(categoricalVar)
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

What am I doing wrong?


Solution

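  • Here are two solutions using the dummies package. I can't tell from your question whether your dummy() call comes from that package. If it does, the error most likely arises because dummy() expects a single column (a vector or factor), while a data frame is a list of columns, hence the hint "Have you called 'sort' on a list?". Regardless,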

    first some data,

    categoricalVar <- data.frame(
      FavouriteSource   = c('bar', 'foo', 'foo', 'foobar', 'foo', 'foo'),
      FavouriteSource30 = c('A', 'C', 'C', 'B', 'B', 'A')
    )
    categoricalVar
    #>   FavouriteSource FavouriteSource30
    #> 1             bar                 A
    #> 2             foo                 C
    #> 3             foo                 C
    #> 4          foobar                 B
    #> 5             foo                 B
    #> 6             foo                 A
    

    then load the dummies package,

    # install.packages(c("dummies"), dependencies = TRUE)
    library(dummies)
    

    and use dummy.data.frame() to get dummies for every column at once,

    dummy.data.frame(categoricalVar)
    #>   FavouriteSourcebar FavouriteSourcefoo FavouriteSourcefoobar FavouriteSource30A
    #> 1                  1                  0                     0                  1
    #> 2                  0                  1                     0                  0
    #> 3                  0                  1                     0                  0
    #> 4                  0                  0                     1                  0
    #> 5                  0                  1                     0                  0
    #> 6                  0                  1                     0                  1
    #>   FavouriteSource30B FavouriteSource30C
    #> 1                  0                  0
    #> 2                  0                  1
    #> 3                  0                  1
    #> 4                  1                  0
    #> 5                  1                  0
    #> 6                  0                  0
    

    or, as Sathish suggests in the comment above, apply dummy() to each column separately,

    lapply(categoricalVar, dummy)
    #> $FavouriteSource
    #>      categoricalVarbar categoricalVarfoo categoricalVarfoobar
    #> [1,]                 1                 0                    0
    #> [2,]                 0                 1                    0
    #> [3,]                 0                 1                    0
    #> [4,]                 0                 0                    1
    #> [5,]                 0                 1                    0
    #> [6,]                 0                 1                    0
    #> 
    #> $FavouriteSource30
    #>      categoricalVarA categoricalVarB categoricalVarC
    #> [1,]               1               0               0
    #> [2,]               0               0               1
    #> [3,]               0               0               1
    #> [4,]               0               1               0
    #> [5,]               0               1               0
    #> [6,]               1               0               0
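
If the dummies package is not available, a similar result can be sketched in base R with model.matrix(). This is not part of the answer above: the helper name one_hot is made up for illustration, and the sketch assumes every column has at least two levels and no missing values.

```r
# Same toy data as in the answer above
categoricalVar <- data.frame(
  FavouriteSource   = c("bar", "foo", "foo", "foobar", "foo", "foo"),
  FavouriteSource30 = c("A", "C", "C", "B", "B", "A")
)

# one_hot() is a hypothetical helper, not a function from any package.
# It builds one indicator column per level of every column in df.
one_hot <- function(df) {
  blocks <- lapply(names(df), function(col) {
    f <- as.factor(df[[col]])
    # Dropping the intercept (- 1) keeps an indicator column for every level
    m <- model.matrix(~ f - 1)
    # Name the columns <column><level>, as dummy.data.frame() does
    colnames(m) <- paste0(col, levels(f))
    m
  })
  # Bind the per-column matrices side by side into one data frame
  as.data.frame(do.call(cbind, blocks))
}

dummies_base <- one_hot(categoricalVar)
names(dummies_base)
```

On the data in the question this produces one column per level, so each Favorite_GameBin column alone would add several hundred indicator columns.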