
Creating non-random matched pairs


I am looking for an R package that would allow me to match each subject in a treatment group to a subject in the general population who has similar characteristics (age, gender, etc.).


Solution

  • I use the MatchIt package for this kind of task. You may be advised to use propensity score matching, but that widely used approach has limitations (see: PS Not).

    library(MatchIt)   # use for matching
    library(tidyverse) # meta-package: attaches tibble, dplyr and the other core tidyverse packages
    
    set.seed(950)
    n.size <- 1000
    
    # This creates a tibble (an easier to use version of a data frame)
    myData <- tibble(
      a = lubridate::now() + runif(n.size) * 86400,
      b = lubridate::today() + runif(n.size) * 30,
      ID = 1:n.size,
      #   d = runif(1000),
      ivFactor = sample(c("Level 1", "Level 2", "Level 3", "Level 4"), n.size, replace = TRUE),
      age = round(rnorm(n = n.size, mean = 52, sd = 10), 2),
      outContinuous = rnorm(n = n.size, mean = 100, sd = 10),
      tmt = sample(c(1, 0), size = n.size, prob = c(.3, .7), replace = TRUE)
    )
    
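    # Matching follows the suggestions in Ho, Imai, King and Stuart.
    # Note: method = "nearest" with distance = "logit" pairs each treated unit with the
    # control whose logistic-regression propensity score is closest.
    # MatchIt 4.0+ documents this option as distance = "glm".
    myData.balance <- matchit(tmt ~ age + ivFactor, data = myData, method = "nearest", distance = "logit")
    
    # Check whether the matching improved balance between the treatment and control groups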
    summary(myData.balance)
    
    # Extract the matched data. Now we can use this in subsequent analyses
    myData.matched <- match.data(myData.balance)
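
If you want to avoid the propensity score altogether and see which control was paired with which treated subject, something along these lines may work. This is only a sketch, assuming MatchIt 4.0 or later (where `match.data()` returns a `subclass` column identifying each pair); the `demo` data frame and the choice of exact matching on `ivFactor` plus Mahalanobis distance on the covariates are illustrative assumptions, not part of the answer above.

```r
library(MatchIt)

set.seed(950)
n.size <- 1000

# Hypothetical data with the same covariates as myData above
demo <- data.frame(
  ID       = 1:n.size,
  ivFactor = factor(sample(paste("Level", 1:4), n.size, replace = TRUE)),
  age      = round(rnorm(n.size, mean = 52, sd = 10), 2),
  tmt      = sample(c(1, 0), size = n.size, prob = c(.3, .7), replace = TRUE)
)

# 1:1 nearest-neighbour matching without a propensity score:
# controls must have the same ivFactor level (exact), and within that level
# the closest control by Mahalanobis distance on the covariates is chosen
demo.balance <- matchit(tmt ~ age + ivFactor, data = demo,
                        method = "nearest", distance = "mahalanobis",
                        exact = ~ ivFactor)

demo.matched <- match.data(demo.balance)

# Rows sharing a `subclass` value form one treated/control pair
pairs <- demo.matched[order(demo.matched$subclass),
                      c("subclass", "ID", "tmt", "age", "ivFactor")]
head(pairs)
```

As with the propensity score version, run `summary(demo.balance)` afterwards to confirm that balance actually improved before using the matched data.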