How to remove questions with repeated items?

Frequently, when generating questions based on parameters, some of the generated questions have to be eliminated because some of their answer items turn out to be identical.

My code is the following.

The question:

```{r, include = FALSE}
a <- sample(1:1,1)
b <- sample(1:1,1)
```

Question
========

Let $z = `r (a^2)*(b^2)`$. Hence, $\sqrt{z}$ is equal to:

Answerlist
----------

* $`r a*b`$
* $`r -a*b`$
* $`r 2*a*b`$
* $`r 3*a*b`$
* $`r 4*a*b`$

Meta-information
================
exname: My question
extype: schoice
exsolution: 10000
exshuffle: TRUE

The code to generate the several versions of the question:

library(exams)

setwd("/tmp/")

expargrid <- function(file, ...) {
  df <- expand.grid(...)
  stopifnot(nrow(df) >= 1L)
  sapply(1L:nrow(df), function(i) {
    args <- as.list(df[i,])
    args <- c(list(file = file), args)
    do.call(exams::expar, args)
  })
}

n <- 1

myquestions <- expargrid(paste0("question",sprintf("%02d", n),".Rmd"), a = 0:1, b = 0:1)

exams2moodle(myquestions,dir = "/tmp/", schoice=list(answernumbering="none"), name="PM")

Can the bad questions be eliminated automatically from the Moodle xml file generated by R/Exams?

Solution

In general, it is hard to catch such problems outside of the Rmd exercise files. It is better to write the R code in the exercise so that it ensures the question list contains only unique items and, if that is not the case, keeps re-sampling the parameters until there is a version that works.
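A minimal sketch of such a re-sampling loop for the first code chunk of the exercise follows. The parameter range `0:3` is only an assumption for illustration, and the approach is meant for parameters that are sampled randomly, not for parameters fixed from the outside.

```r
## Hypothetical parameter range (0:3); adjust to the exercise at hand.
repeat {
  a <- sample(0:3, 1)
  b <- sample(0:3, 1)
  ## The five answer items that end up in the question list.
  items <- c(a * b, -a * b, 2 * a * b, 3 * a * b, 4 * a * b)
  ## Keep this draw only if all items are distinct.
  if (!anyDuplicated(items)) break
}
```

For these particular items, duplicates occur exactly when `a * b` is zero, so the loop simply rejects draws in which either parameter is zero.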

Catching the problem in general is also difficult because it may depend on the random number generator and its seed: the duplicates might occur only very rarely.

In your special case, this is a bit different because you make a full grid of all possible combinations so that each exercise file does not have any random elements anymore. Here the best solution is to run an exams2xyz() interface (or the underlying xexams() function) once, inspect the output, eliminate the problematic exercise, and then run the desired exams2xyz() interface again.

Relying on your myquestions vector with four static variants of the dynamic question01.Rmd exercise template you could do:

myq_check <- xexams(myquestions)
## Warning messages:
## 1: In driver$read(file_tex[idj]) :
##   duplicated items in question list in '_tmp_Rtmpjhh2Lb_question01+60A32D0E+C9FE1'
## 2: In driver$read(file_tex[idj]) :
##   duplicated items in question list in '_tmp_Rtmpjhh2Lb_question01+60A32D0E+CA2C6'
## 3: In driver$read(file_tex[idj]) :
##   duplicated items in question list in '_tmp_Rtmpjhh2Lb_question01+60A32D0E+CA4CF'

This is similar to running exams2moodle(myquestions) but only weaves the Rmd files and reads them into R - without converting them to HTML and writing a Moodle XML file. Hence it's a bit faster and does not produce any files that would need to be cleaned up afterwards.

The output myq_check is a nested list with

1 list for n = 1 random replication,
each containing 4 lists pertaining to the four exercise files from myquestions,
each containing 6 list elements with
- question text,
- questionlist with the question items,
- solution text (if any, empty in this case),
- solutionlist with the solution explanations (if any, empty here),
- metainfo with the meta-information,
- supplements with the file paths of supplementary files (if any, empty here).
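To make this nesting concrete, here is a small hand-built mock that imitates the shape described above. It is not real `xexams()` output: the top-level name `exam1` and the contents of the elements are assumptions for illustration only.

```r
## Mock object imitating the shape of the xexams() result (not real output).
myq_mock <- list(
  exam1 = list(
    exercise1 = list(
      question     = "Let $z = 0$. Hence, $\\sqrt{z}$ is equal to:",
      questionlist = c("$0$", "$0$", "$0$", "$0$", "$0$"),
      solution     = character(0),
      solutionlist = character(0),
      metainfo     = list(name = "My question", type = "schoice"),
      supplements  = character(0)
    )
  )
)

## First replication, first exercise: the question items.
myq_mock[[1]][[1]]$questionlist

## Names of the six elements of one exercise.
names(myq_mock[[1]][[1]])
```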

Running xexams(myquestions) already warns about problems in three (out of the four) exercise files. (You need to use R/exams version >= 2.4-0 to get these warnings.) By comparing the number of unique items in the questionlist with its length, we can find out which exercises are affected:

## TRUE if all items in the question list of an exercise are distinct
ok <- sapply(myq_check[[1]], function(x) {
  length(x$questionlist) == length(unique(x$questionlist)) })
ok
## exercise1 exercise2 exercise3 exercise4 
##     FALSE     FALSE     FALSE      TRUE

Thus, only the last exercise in myquestions is really suitable so you should subset before proceeding with exams2moodle():

myquestions <- myquestions[ok]
exams2moodle(myquestions)

As pointed out above, this is sufficient to make the selection in this case. If there is remaining randomness in the exercise files, however, it might not be enough to catch all problems. Then it would be better to program a custom solution into the Rmd exercise.
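If some randomness remains, one partial safeguard is to check several random replications instead of one. The sketch below uses a hypothetical helper, `check_unique()`, that takes a nested list shaped like the `xexams()` result and keeps an exercise only if its items were unique in every replication. It is demonstrated on mock data, and it can only reduce the risk of rare duplicates, not rule them out.

```r
## Hypothetical helper: x is a list of replications, each a list of
## exercises that have a 'questionlist' element.
check_unique <- function(x) {
  res <- sapply(x, function(replication) {
    sapply(replication, function(ex) anyDuplicated(ex$questionlist) == 0L)
  })
  ## Rows = exercises, columns = replications.
  res <- matrix(res, ncol = length(x))
  ## An exercise passes only if it was fine in every replication.
  apply(res, 1, all)
}

## Mock data with two replications of two exercises (not real output).
mock <- list(
  exam1 = list(exercise1 = list(questionlist = c("1", "-1", "2")),
               exercise2 = list(questionlist = c("0", "0", "0"))),
  exam2 = list(exercise1 = list(questionlist = c("4", "-4", "8")),
               exercise2 = list(questionlist = c("2", "-2", "4")))
)
ok_all <- check_unique(mock)

## With R/exams this might be used as follows (assumption, not run here):
## ok_all <- check_unique(xexams(myquestions, n = 5))
```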