
AUTORECODE from SPSS to R


I want to write a function that does the same thing as the SPSS command AUTORECODE.

AUTORECODE recodes the values of string and numeric variables to consecutive integers and puts the recoded values into a new variable called a target variable.

At first I tried this way:

AUTORECODE <- function(variable = NULL){
    A <- sort(unique(variable))
    B <- seq(1:length(unique(variable)))
    REC <- Recode(var = variable, recodes = "A = B")
    return(REC)
}

But this causes an error. I think the problem lies in how A and B are passed to the recodes argument. That's why I tried

eval(parse(text = paste("REC <- Recode(var = variable, recodes = 'c(",A,") = c(",B,")')")))

within the function. But this isn't the right solution.

Ideas?


Solution

  • factor may be simply what you need, as James suggested in a comment. It stores the values as integers behind the scenes (as str shows) and just prints the corresponding labels. This can also be very useful because R has many functions that handle factors appropriately; for example, when you fit a linear model, it creates all the "dummy" variables for you.

    > x <- LETTERS[c(4,2,3,1,3)]
    > f <- factor(x)
    > f
    [1] D B C A C
    Levels: A B C D   
    
    > str(f)
     Factor w/ 4 levels "A","B","C","D": 4 2 3 1 3
    

    If you do just need the numbers, use as.integer on the factor.

    > n <- as.integer(f)
    > n
    [1] 4 2 3 1 3
    

    An alternate solution is to use match, but if you're starting with floating-point numbers, watch out for floating-point traps. factor converts everything to characters first, which effectively rounds floating-point numbers to a certain number of digits, making floating-point traps less of a concern.

    > match(x, sort(unique(x)))
    [1] 4 2 3 1 3
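
To tie this back to the question, the factor approach can be wrapped in a small function. This is only a minimal sketch: the name autorecode is made up for illustration and is not from any package, and it covers only the basic ascending recode, not the other options of the SPSS command. The last lines illustrate the floating-point trap mentioned above.

```r
# Hypothetical helper (the name `autorecode` is an assumption, not a package
# function): recode each distinct value to its rank among the sorted
# distinct values, i.e. to consecutive integers starting at 1.
autorecode <- function(variable) {
  # factor() sorts the distinct values into levels;
  # as.integer() then returns each element's level index.
  as.integer(factor(variable))
}

# Strings: same result as the console example above.
autorecode(LETTERS[c(4, 2, 3, 1, 3)])
# [1] 4 2 3 1 3

# Numbers: codes follow ascending numeric order.
autorecode(c(10, 50, 20, 50))
# [1] 1 3 2 3

# Floating-point trap: 0.1 + 0.2 is not exactly equal to 0.3.
y <- c(0.3, 0.1 + 0.2, 0.1)
match(y, sort(unique(y)))  # exact comparison treats them as two values
# [1] 2 3 1
autorecode(y)              # both convert to "0.3", so they share one code
# [1] 2 2 1
```

Missing values are left as NA by this sketch, because factor does not create a level for NA by default.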