The structure of my data frame, as shown by str(), is below:
> str(categoricalVar)
'data.frame': 56660 obs. of 10 variables:
$ FavouriteSource : Factor w/ 3 levels "App","LF","None": 1 1 3 3 3 1 3 3 3 3 ...
$ FavouriteSource30 : Factor w/ 3 levels "App","LF","None": 1 1 3 3 3 1 3 3 3 3 ...
$ FavouriteSource90 : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
$ FavouriteSource180 : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
$ FavouriteSource360 : Factor w/ 3 levels "App","LF","None": 3 3 3 3 3 3 3 3 3 3 ...
$ Favorite_GameBin : Factor w/ 594 levels " Team Umizoomi: Street Fair Fix -Up (Explorer)",..: 262 163 388 378 378 220 253 378 378 378 ...
$ Favorite_GameBin30 : Factor w/ 309 levels "1-2-3 Dora!",..: 191 191 191 191 191 191 191 191 191 191 ...
$ Favorite_GameBin90 : Factor w/ 332 levels "1-2-3 Dora!",..: 206 206 206 206 206 206 206 206 206 206 ...
$ Favorite_GameBin180: Factor w/ 363 levels "1-2-3 Dora!",..: 226 226 226 226 226 226 226 226 226 226 ...
$ Favorite_GameBin360: Factor w/ 449 levels " Team Umizoomi: Street Fair Fix -Up (Explorer)",..: 283 283 283 283 283 283 283 283 283 283 ...
I'm trying to dummify these columns, but it throws the error below:
> categoricalVar_dummy <- dummy(categoricalVar)
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?
What am I doing wrong?
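Here are two solutions using the dummies package. I can't tell from your question whether your dummy() call comes from the dummies package. If it does, the likely cause of the error is that dummy() expects a single variable, not a whole data frame.
Regardless, first some data,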
categoricalVar <- data.frame(
FavouriteSource = c('bar', 'foo', 'foo', 'foobar', 'foo', 'foo'),
FavouriteSource30 = c('A', 'C', 'C', 'B', 'B', 'A')); categoricalVar
#> FavouriteSource FavouriteSource30
#> 1 bar A
#> 2 foo C
#> 3 foo C
#> 4 foobar B
#> 5 foo B
#> 6 foo A
then load the dummies package,
# install.packages(c("dummies"), dependencies = TRUE)
library(dummies)
and here is the dummy.data.frame() function, which creates the dummies for every column of the data frame in one call,
dummy.data.frame(categoricalVar)
#> FavouriteSourcebar FavouriteSourcefoo FavouriteSourcefoobar FavouriteSource30A
#> 1 1 0 0 1
#> 2 0 1 0 0
#> 3 0 1 0 0
#> 4 0 0 1 0
#> 5 0 1 0 0
#> 6 0 1 0 1
#> FavouriteSource30B FavouriteSource30C
#> 1 0 0
#> 2 0 1
#> 3 0 1
#> 4 1 0
#> 5 1 0
#> 6 0 0
or, as Sathish suggests in the comment above, apply dummy() to each column separately,
lapply(categoricalVar, dummy)
#> $FavouriteSource
#> categoricalVarbar categoricalVarfoo categoricalVarfoobar
#> [1,] 1 0 0
#> [2,] 0 1 0
#> [3,] 0 1 0
#> [4,] 0 0 1
#> [5,] 0 1 0
#> [6,] 0 1 0
#>
#> $FavouriteSource30
#> categoricalVarA categoricalVarB categoricalVarC
#> [1,] 1 0 0
#> [2,] 0 0 1
#> [3,] 0 0 1
#> [4,] 0 1 0
#> [5,] 0 1 0
#> [6,] 1 0 0
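If the dummies package is not available to you, roughly the same indicator columns can be built in base R. This is only a sketch, not what the dummies package does internally: it swaps in model.matrix(), the helper name make_dummies is made up for illustration, and it assumes the columns contain no missing values (model.matrix() drops rows with NA by default).

```r
# Same toy data as above
categoricalVar <- data.frame(
  FavouriteSource   = c('bar', 'foo', 'foo', 'foobar', 'foo', 'foo'),
  FavouriteSource30 = c('A', 'C', 'C', 'B', 'B', 'A'))

# Hypothetical helper (not part of any package): one 0/1 column per level
# of every column in df. Assumes no NA values in df.
make_dummies <- function(df) {
  cols <- lapply(names(df), function(nm) {
    f <- factor(df[[nm]])          # also drops unused levels
    m <- model.matrix(~ f - 1)     # "- 1" removes the intercept, so no level is dropped
    colnames(m) <- paste0(nm, levels(f))  # name columns as <column><level>
    m
  })
  as.data.frame(do.call(cbind, cols))
}

out <- make_dummies(categoricalVar)
names(out)
```

The column names follow the same column-name-plus-level pattern as the dummy.data.frame() output above, so the result should be easy to compare against it.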