Tags: r, loops, for-loop, crosstab

Is there a way to make 2-way tables for different pairs of variables in R?


I am cleaning data from a numeracy test.

Some test items are multiple-choice items, where students choose one of the choices (e.g. a), b), or c)).

In the dataset, I made new variables by converting the items into binary variables. For example, if the correct answer to Item1 is a), I made new_Item1 by recoding a) = 1 and everything else = 0 (NA is left as it is).

I would like to double-check that the recoding was done correctly by tabulating the original variables against the new ones. Doing this for one pair only (in this case Item1 and new_Item1) is easy, but since I have a lot of these multiple-choice items, it's not efficient to write code to tabulate each pair one by one.

Here's my question: is there any way to make 2-way tables for each pair of these original and new variables? I tried to do this with a for loop and looked for tips online, but haven't found a solution so far.

I extracted part of the dataframe below.

structure(list(ID = 1:20, gender = c("Male", "Male", "Male", 
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male", 
"Male", "Male", "Male", "Male", "Female", "Female", "Female", 
"Female", "Female"), Item1 = c("c", "c", "a", "a", NA, "c", "c", 
"b", "b", "b", "c", "c", NA, "c", "a", "d", "c", "c", "c", "c"
), Item2 = c("d", "d", "d", "d", "d", "a", "a", "a", "a", "b", 
"b", "c", "c", "c", "c", "d", NA, NA, "d", "d"), Item3 = c("b", 
"d", NA, "a", NA, "d", "c", "c", NA, "d", "c", NA, NA, "c", "d", 
"c", "d", "d", "d", "d"), new_Item1 = c(1L, 1L, 0L, 0L, NA, 1L, 
1L, 0L, 0L, 0L, 1L, 1L, NA, 1L, 0L, 0L, 1L, 1L, 1L, 1L), new_Item2 = c(1L, 
1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, NA, 
NA, 1L, 1L), new_Item3 = c(0L, 0L, NA, 0L, NA, 0L, 1L, 1L, NA, 
0L, 1L, NA, NA, 1L, 0L, 1L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA, 
-20L))

Many thanks in advance.

Shun

For a single pair, I just run library(janitor) and then tabyl(g3, Item1, new_Item1), and I can see that my recoding is correct. But I want to loop the same tabulation over Item1, Item2 and Item3 (and more) in this case. So my expected output would be something like this (if I use tabyl):
-------------------
Item1 1 0 NA
a # # #
b # # #
c # # #
d # # #
NA # # #

Item2 1 0 NA
a # # #
b # # #
c # # #
d # # #
.....
----------------------
I hope my explanation is clear.


Solution

  • You can store the original and recoded column names in two vectors and use Map to loop over them in parallel, returning one comparison table per pair. Here df is the data frame from the question.

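    library(janitor)
    # Original items (Item1, Item2, ...) and their recoded versions (new_Item1, ...)
    x <- grep('^Item\\d+$', names(df), value = TRUE)
    y <- grep('^new_Item\\d+$', names(df), value = TRUE)
    
    # Cross-tabulate each original column against its recoded counterpart
    Map(function(p, q) tabyl(df, .data[[p]], .data[[q]]), x, y)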
    
    #$Item1
    # Item1 0  1 NA_
    #     a 3  0   0
    #     b 3  0   0
    #     c 0 11   0
    #     d 1  0   0
    #  <NA> 0  0   2
    
    #$Item2
    # Item2 0 1 NA_
    #     a 4 0   0
    #     b 2 0   0
    #     c 4 0   0
    #     d 0 8   0
    #  <NA> 0 0   2
    
    #$Item3
    # Item3 0 1 NA_
    #     a 1 0   0
    #     b 1 0   0
    #     c 0 5   0
    #     d 8 0   0
    #  <NA> 0 0   5
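
If you would rather not read every table by eye, the same Map idea can be sketched in base R with table(), followed by an automatic check. This is only an illustration under some assumptions: the toy data frame below is made up (it is not the data from the question), the recoded columns are assumed to be named "new_" plus the original name, and the check merely confirms that each answer choice (including NA) maps to exactly one recoded value. It does not know which choice is the correct answer.

```r
# Toy data using the same naming scheme as the question (values are made up).
df <- data.frame(
  Item1     = c("a", "b", "c", NA, "c"),
  Item2     = c("d", "a", NA, "d", "b"),
  new_Item1 = c(0L, 0L, 1L, NA, 1L),
  new_Item2 = c(1L, 0L, NA, 1L, 0L)
)

# Original item columns, and the recoded column each one is assumed to map to.
# Pairing by name avoids relying on the two sets being in the same order.
items   <- grep("^Item[0-9]+$", names(df), value = TRUE)
recoded <- paste0("new_", items)
stopifnot(all(recoded %in% names(df)))

# One two-way table per pair; useNA = "ifany" keeps NA as its own level.
tabs <- Map(
  function(p, q) table(df[[p]], df[[q]], useNA = "ifany", dnn = c(p, q)),
  items, recoded
)
tabs

# A consistent recoding sends every answer choice to a single recoded value,
# so every row of every table should have exactly one non-zero cell.
clean <- vapply(tabs, function(tb) all(rowSums(tb > 0) == 1), logical(1))
clean
```

Any item for which clean is FALSE has at least one answer choice split across several recoded values, so that is the table worth inspecting manually.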