r, dataframe, variables, na, string-concatenation

How to add a suffix to multiple variables while ignoring NA?


For this table, I need to append _T to every value of every variable, except where the value is NA.

T1:

var1        var2     var3
Argentina   Italy     NA 
Mexico      Chile     NA
France      Hungary   NA
Spain       UK        NA

I tried with this code:

o_cols <- c("var1", "var2", "var3")
out_cols <- paste0(o_cols, "_T")
output <- data.table(data_base)
output[, c(out_cols) := lapply(.SD, function(x){paste0(x, "_T")}), .SDcols = o_cols]

var_cols <- paste0(o_cols, "_value")

The problem is that _T is appended to every value, including the NA values.

The final result should look like this:

    var1_value      var2_value        var3

    Argentina_T     Italy_T             NA 
    Mexico_T        Chile_T             NA
    France_T        Hungary_T           NA
    Spain_T         UK_T                NA

Solution

  • Almost. You can add an ifelse() to your lapply().

    > result
           var1_T1    var2_T1 var3_T1
    1 Argentina_T1   Italy_T1      NA
    2    Mexico_T1   Chile_T1      NA
    3    France_T1 Hungary_T1      NA
    4     Spain_T1      UK_T1      NA
    

    Code

    # Paste each value as "value_T1" if it is not NA 
    result <- data.frame(lapply(df, function(x) ifelse(!is.na(x), paste0(x, "_T1"), x)), 
                         stringsAsFactors = FALSE)
    # Convert each column name to "name_T1"
    colnames(result) <- paste0(colnames(result), "_T1")
    

    Data

    df <- read.table(text = "var1        var2     var3
    Argentina   Italy     NA 
    Mexico      Chile     NA
    France      Hungary   NA
    Spain       UK        NA", header = TRUE, as.is = TRUE)
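
Since the question uses data.table, the same idea can be sketched there as well. This is only a sketch under assumptions: it uses the "_T" suffix and "_value" column names from the question, rebuilds the sample data inline under the hypothetical name `dt`, and relies on `fifelse()`, which needs a reasonably recent data.table version.

```r
library(data.table)

# Sample data from the question, rebuilt inline; var3 is entirely NA
dt <- data.table(
  var1 = c("Argentina", "Mexico", "France", "Spain"),
  var2 = c("Italy", "Chile", "Hungary", "UK"),
  var3 = NA_character_
)

o_cols   <- c("var1", "var2", "var3")
out_cols <- paste0(o_cols, "_value")

# fifelse() is type-strict, so both branches must be character
dt[, (out_cols) := lapply(.SD, function(x) {
  x <- as.character(x)
  fifelse(is.na(x), NA_character_, paste0(x, "_T"))
}), .SDcols = o_cols]
```

The original columns are kept and the suffixed values land in the new `*_value` columns; NA entries stay NA because `paste0()` is only applied where `is.na(x)` is FALSE.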
    

    Open question: With what type of logic do you want to convert the column names? Is one NA enough to not apply the transformation? Do all values have to be NA to not do it?
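
One possible answer to that question, given only as a sketch: the desired output in the question keeps the name var3 for the all-NA column, so the rule assumed here is "rename a column only if it contains at least one non-NA value". That rule, the helper name `has_values`, and the "_T" / "_value" naming taken from the question are assumptions, not something the asker confirmed.

```r
# Sample data from the question, rebuilt inline; var3 is entirely NA
df <- data.frame(
  var1 = c("Argentina", "Mexico", "France", "Spain"),
  var2 = c("Italy", "Chile", "Hungary", "UK"),
  var3 = NA_character_,
  stringsAsFactors = FALSE
)

# Append "_T" to non-NA values only
result <- data.frame(
  lapply(df, function(x) ifelse(is.na(x), NA, paste0(x, "_T"))),
  stringsAsFactors = FALSE
)

# Assumed rule: rename a column only if it holds at least one non-NA value
has_values <- vapply(df, function(x) any(!is.na(x)), logical(1))
names(result)[has_values] <- paste0(names(df)[has_values], "_value")
```

If a single NA should already block the renaming, `any(!is.na(x))` could be swapped for `!anyNA(x)`.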