Is it possible to analyze items' performance from different exams using the same questions but with a different order in each version?

We produced five exams2nops exams using the same groups of items, in randomized order. All of them were schoice items. As such, five different *.rds files were obtained, each of which will be used with the corresponding scanned exams. I noticed that the *.rds files to be used in nops_eval contain the information about the *.rmd file which was used to produce each exam question.

However, after producing the nops_eval.csv that information is lost. I would like to merge all five nops_eval.csv files, using the *.rmd information to match each question, since the same question number (e.g. exercise 22) can be generated from a different *.rmd file in each version. The same 22 *.rmd files were used in all exams (all have the same 22 questions but in different orders).

I would like to obtain a data frame with the merged csv to allow me to conduct Item Response Theory and Rasch modeling analysis.

Solution

Yes, you can merge the information from the CSV files by reordering their columns based on the file/name information from the RDS files. Below I illustrate how to do this using the check.* columns from the CSV files. These are typically closest to what I need for doing an IRT analysis.

First, you read the CSV and RDS from the first version of the exam:

eval1 <- read.csv2("/path/to/first/nops_eval.csv", dec = ".")
metainfo1 <- readRDS("/path/to/first/exam.rds")

Then, you only extract the check.* columns and use the exercise file names as column names.

eval1 <- eval1[, paste0("check.", 1:length(metainfo1[[1]]))]
names(eval1) <- sapply(metainfo1[[1]], function(x) x$metainfo$file)

I'm using $file here because it is always unique across exercises. If $name is also unique in your case and has the better labels, you can also use that instead.

Then you do the same for the second version of the exam:

eval2 <- read.csv2("/path/to/second/nops_eval.csv", dec = ".")
metainfo2 <- readRDS("/path/to/second/exam.rds")
eval2 <- eval2[, paste0("check.", 1:length(metainfo2[[1]]))]
names(eval2) <- sapply(metainfo2[[1]], function(x) x$metainfo$file)
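Since the same four steps are repeated for every version, they can be wrapped in a small helper. The following is only a sketch: read_checks() is a hypothetical name, not a function from the exams package, and the temporary files with invented values merely mimic the structure used above (a semicolon-separated CSV with check.* columns, and an RDS whose first element is a list of exercises with $metainfo$file).

```r
# Hypothetical helper (not part of the exams package): read one exam version
# and return its check.* columns, named by exercise file.
read_checks <- function(csv, rds) {
  ev <- read.csv2(csv, dec = ".")
  meta <- readRDS(rds)
  # File names of the exercises in the first exam, in exam order
  files <- vapply(meta[[1]], function(x) x$metainfo$file, character(1),
    USE.NAMES = FALSE)
  ev <- ev[, paste0("check.", seq_along(files)), drop = FALSE]
  names(ev) <- files
  ev
}

# Toy files with invented content, only to make the sketch runnable
dir <- tempfile("nops")
dir.create(dir)
csv <- file.path(dir, "nops_eval.csv")
rds <- file.path(dir, "exam.rds")
write.table(
  data.frame(registration = c("001", "002"), check.1 = c(1, 0), check.2 = c(0, 1)),
  csv, sep = ";", row.names = FALSE
)
saveRDS(list(list(
  list(metainfo = list(file = "exercise_a")),
  list(metainfo = list(file = "exercise_b"))
)), rds)

eval1 <- read_checks(csv, rds)
```

With real data you would call the helper once per version, pointing it at that version's nops_eval.csv and RDS file.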

If the same exercises have been used in the construction of the two versions, the column names of eval1 and eval2 are the same, just in a different order. Then you can simply do

eval2 <- eval2[, names(eval1)]

to reorder the columns of eval2 to match those of eval1. Subsequently, you can do:

eval <- rbind(eval1, eval2)

If you have more than two versions of the exam, you just iterate the same code and rbind() everything together in the end.
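A minimal sketch of that iteration, assuming the per-version results have already been renamed by exercise file and collected in a list. The toy data frames, file names, and the name eval_all are invented for illustration.

```r
# Toy stand-ins for three versions with the same exercises in different order
# (hypothetical file names, invented values).
evals <- list(
  data.frame(a.Rmd = c(1, 0), b.Rmd = c(0, 1), c.Rmd = c(1, 1), check.names = FALSE),
  data.frame(c.Rmd = c(0, 1), a.Rmd = c(1, 1), b.Rmd = c(0, 0), check.names = FALSE),
  data.frame(b.Rmd = 1, c.Rmd = 0, a.Rmd = 1, check.names = FALSE)
)

# Reorder every version to the column order of the first one ...
ref <- names(evals[[1]])
evals <- lapply(evals, function(x) x[, ref, drop = FALSE])

# ... and stack all versions into a single data frame.
eval_all <- do.call(rbind, evals)
```

The result is called eval_all here rather than eval only to avoid masking the base R function eval().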

Similar code can also be used if the exercises are just partially overlapping between the versions of the exam. In that case I first construct a large enough NA matrix with the merged exercise file names and then insert the results:

n1 <- nrow(eval1)
n2 <- nrow(eval2)
nam <- unique(c(names(eval1), names(eval2)))
eval <- matrix(NA, nrow = n1 + n2, ncol = length(nam))
colnames(eval) <- nam
eval[1:n1, names(eval1)] <- as.matrix(eval1)
eval[(n1 + 1):(n1 + n2), names(eval2)] <- as.matrix(eval2)

Again you would need to iterate suitably to merge more than two versions.
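One way to iterate for any number of partially overlapping versions is to compute the row block of each version up front and fill the NA matrix in a loop. This is a sketch on invented toy data; the list evals, the file names, and eval_all are assumptions for illustration.

```r
# Toy stand-ins for three versions that share only some exercises
# (hypothetical file names, invented values).
evals <- list(
  data.frame(a.Rmd = c(1, 0), b.Rmd = c(0, 1), check.names = FALSE),
  data.frame(c.Rmd = c(0, 1), a.Rmd = c(1, 1), check.names = FALSE),
  data.frame(b.Rmd = 1, c.Rmd = 0, check.names = FALSE)
)

# Union of all exercise file names and number of participants per version
nam <- unique(unlist(lapply(evals, names)))
n <- sapply(evals, nrow)

# NA matrix with one row per participant and one column per exercise
eval_all <- matrix(NA_real_, nrow = sum(n), ncol = length(nam),
  dimnames = list(NULL, nam))

# First and last row of the block belonging to each version
end <- cumsum(n)
start <- end - n + 1

# Insert each version into its rows and its own columns
for (i in seq_along(evals)) {
  eval_all[start[i]:end[i], names(evals[[i]])] <- as.matrix(evals[[i]])
}
```

Exercises that a participant's version did not contain remain NA.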

In either case, the resulting eval could then be processed further to become the IRT matrix for subsequent analysis.
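As one possible form of that further processing, the merged results can be dichotomized into a 0/1 matrix. The sketch below assumes that a check value of 1 marks a fully correct answer, which you should verify against your own CSV files. The toy matrix and the names eval_all and irt are invented for illustration.

```r
# Toy merged results (invented values); NA marks an exercise that was not
# part of the participant's version.
eval_all <- matrix(c(1, 0, NA, 0.5, 1, 1), nrow = 3,
  dimnames = list(NULL, c("a.Rmd", "b.Rmd")))

# Assumption: only a value of exactly 1 counts as solved.
# The comparison keeps NA as NA, so missing items stay missing.
irt <- (eval_all == 1) * 1L
```

The resulting integer matrix has one row per participant and one column per exercise, which is the layout that IRT fitting functions typically expect.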