Tags: r, boolean, data-cleaning, data-conversion

How to convert string booleans expressed in a local language into booleans


I have a large dataset collected in a local language, where each boolean column uses the word "PRAWDA" for TRUE and "FAŁSZ" for FALSE.

So far, the data frame stores those values as factor columns with 2 levels: "FAŁSZ" and "PRAWDA". My aim is to have columns that hold booleans, not string values.

How could I convert all the booleans expressed in Polish into English ones?


Solution

  • If it is just a two-valued column, use == to test for 'PRAWDA': the comparison returns TRUE where the value matches 'PRAWDA' and FALSE otherwise. As there are only two values, this should be sufficient

    df1$col2 <- df1$col1 == 'PRAWDA'
    

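    If there are multiple columns, loop over them with lapply and convert only those whose values are all 'PRAWDA', 'FAŁSZ', or NA

    # NA is listed explicitly because %in% returns FALSE for NA, so na.rm has no effect here
    df1[] <- lapply(df1, function(x) if(all(x %in% c('PRAWDA', 'FAŁSZ', NA))) x == 'PRAWDA' else x)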
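
To see the conversion end to end, here is a minimal, self-contained sketch on made-up data. The data frame contents, the extra columns id and city, and the helper name is_pl_bool are all assumptions for illustration and are not part of the original question. It also assumes the session reads the script as UTF-8, so that 'FAŁSZ' compares equal to the factor level.

```r
# Hypothetical example data: two Polish "boolean" factor columns plus
# two unrelated columns that must be left untouched.
df1 <- data.frame(
  id   = 1:4,
  col1 = factor(c("PRAWDA", "FAŁSZ", "PRAWDA", NA)),
  col2 = factor(c("FAŁSZ", "FAŁSZ", "PRAWDA", "PRAWDA")),
  city = c("Kraków", "Gdańsk", "Łódź", "Poznań"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when every non-missing value is one of the
# two Polish labels (and at least one non-missing value exists).
is_pl_bool <- function(x) {
  vals <- as.character(x[!is.na(x)])
  length(vals) > 0 && all(vals %in% c("PRAWDA", "FAŁSZ"))
}

# Find the columns to convert, then replace each with a logical vector.
# Comparing a factor with == yields NA wherever the original value was NA.
bool_cols <- vapply(df1, is_pl_bool, logical(1))
df1[bool_cols] <- lapply(df1[bool_cols], function(x) x == "PRAWDA")

str(df1)
```

Selecting the columns first with vapply, rather than testing inside the lapply call, makes it easy to inspect bool_cols and confirm which columns will be converted before anything is overwritten.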