Tags: r, lapply, matching, mahalanobis

MatchIt combined with lapply(): Error in eval(object$call$data, envir = env) : object 'x' not found


My situation is the following: I have a large dataframe containing the data I need for matching analyses. However, I need to match within subgroups defined by certain areas. Because I didn't want to do that "manually" for each subgroup (there are too many), I came up with an approach that splits the initial dataframe into sub-dataframes, each containing one unique treated area plus the control areas, and saves these dataframes in a list. I then performed matching on the dataframes in the list using the matchit function from R's MatchIt package. Here is a heavily simplified example of what the list of dataframes looks like:

> list_df
$A
   name treatment     cov1     cov2 cov3       var
1     A         1 13.65933 200.5809   13 1000.1185
2     A         1 15.80334 233.8301   13 1010.1038
3     A         1 15.16098 215.1046   13  999.8548
4     A         1 16.45487 185.4957   13  997.5585
5     A         1 15.55230 193.5955   13 1001.2822
9     U         0 16.33895 175.6502   13  999.0682
10    U         0 18.05787 197.6041   13 1003.2781
11    U         0 14.29088 229.5446   13 1002.9567
12    U         0 16.32195 238.9975   13  998.9453
13    U         0 15.25240 217.5467   13 1004.0581
14    U         0 14.69154 219.9963   13  999.3270
15    U         0 14.88606 153.6038   15  989.6423
16    U         0 14.34472 212.5205   15  994.6094
17    U         0 14.66233 231.1179   15  999.7775
18    U         0 14.69155 240.4084   15  994.9280
19    U         0 15.63663 198.3323   10 1007.4225
20    U         0 15.19980 183.5846   10  997.6229

$B
   name treatment     cov1     cov2 cov3       var
6     B         1 15.66004 187.1542   15 1004.2311
7     B         1 13.89696 197.5548   15  995.6478
8     B         1 16.17403 204.9423   15 1001.5157
9     U         0 16.33895 175.6502   13  999.0682
10    U         0 18.05787 197.6041   13 1003.2781
11    U         0 14.29088 229.5446   13 1002.9567
12    U         0 16.32195 238.9975   13  998.9453
13    U         0 15.25240 217.5467   13 1004.0581
14    U         0 14.69154 219.9963   13  999.3270
15    U         0 14.88606 153.6038   15  989.6423
16    U         0 14.34472 212.5205   15  994.6094
17    U         0 14.66233 231.1179   15  999.7775
18    U         0 14.69155 240.4084   15  994.9280
19    U         0 15.63663 198.3323   10 1007.4225
20    U         0 15.19980 183.5846   10  997.6229

In the real data, I have seven covariates, two of which are matched using the exact method.

Here is the code for matching, combining matchit (with Mahalanobis distance) and lapply:

library(MatchIt)

m_obj_Mah <- lapply(area_list, function(x) {
  matchit(Treatment ~ Cov1 + Cov2 + Cov3 + Cov4 + Cov5,
          data = x, method = "nearest", exact = ~ Cov6 + Cov7,
          distance = "mahalanobis")
})

The code above runs fine. However, when I try to extract the matched datasets, I get this error:

m_data_Mah <- lapply(m_obj_Mah,
              function(x) {match.data(x)})

Error in eval(object$call$data, envir = env) : object 'x' not found

The weirdest thing is that I used the same approach for nearest-neighbour propensity score matching with calipers on the same dataset, and the error didn't appear. The error apparently has something to do with using x as the name for each dataframe inside lapply, but I can't come up with a solution (either looping through the areas some other way or defining x in lapply differently). Any suggestions?

Sorry that I didn't provide any data. It would be quite complicated to generate a realistic dataset, and I cannot share the original. I can try to come up with some kind of dummy dataset if it's absolutely necessary.


Solution

  • Please see this issue, which asks the same question, and the documentation for match.data(), which answers it (see the data argument).

    This is an inherent limitation of match.data(), but the solution is simple and documented: supply the original dataset to the data argument of match.data(), like so:

    m_data_Mah <- lapply(seq_along(area_list), function(i) {
      match.data(m_obj_Mah[[i]], data = area_list[[i]])
    })
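
    Why supplying data helps can be illustrated without MatchIt. The following is a toy sketch in base R, not MatchIt's actual implementation: toy_fit() and toy_match_data() are hypothetical stand-ins. It assumes only what the error message (eval(object$call$data, envir = env)) suggests, namely that the fitted object stores its call and that the extractor later re-evaluates the call's data argument. Inside lapply(), that argument is just the name of the anonymous function's parameter, which no longer exists once the function has returned.

    ```r
    # Toy stand-in for a fitting function: it only records how it was called.
    toy_fit <- function(data) {
      structure(list(call = match.call()), class = "toy_fit")
    }

    # Toy stand-in for an extractor: if no data is supplied, it tries to
    # re-evaluate the `data` argument stored in the call.
    toy_match_data <- function(object, data = NULL) {
      if (is.null(data)) {
        env <- parent.frame()
        data <- eval(object$call$data, envir = env)
      }
      data
    }

    dfs <- list(A = data.frame(v = 1:3), B = data.frame(v = 4:6))

    # Each stored call is toy_fit(data = sub_df); `sub_df` only existed
    # inside the anonymous function.
    fits <- lapply(dfs, function(sub_df) toy_fit(data = sub_df))

    # Fails (assuming no global object named `sub_df`): the name cannot be
    # resolved when the stored call is re-evaluated.
    res_fail <- try(lapply(fits, toy_match_data), silent = TRUE)
    inherits(res_fail, "try-error")

    # Works: the data is passed explicitly, so nothing has to be looked up.
    res_ok <- Map(toy_match_data, fits, dfs)
    ```

    Passing the dataset explicitly removes the dependence on a name that was only valid inside the lapply() call.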
    

    If you are using version 4.2.0 or higher of MatchIt, using exact will automatically match within subgroups of the exact matching variables (i.e., it will perform separate matching procedures within each one) when using method = "nearest". Setting verbose = TRUE will show which level is currently being matched. You can also use the new rbind() method to combine the matched datasets together (in older versions, you will create statistical errors by using rbind()).
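
    Putting the pieces together, a minimal end-to-end sketch might look as follows. It assumes MatchIt 4.2.0 or higher is installed (for the rbind() method mentioned above) and uses the lalonde dataset shipped with the package in place of the original data. Splitting by married and the covariates chosen here are purely illustrative stand-ins for the areas and Cov1 to Cov7 in the question.

    ```r
    library(MatchIt)

    # Example data shipped with MatchIt; stands in for the private data
    data("lalonde", package = "MatchIt")

    # Illustrative subgroups: split by marital status instead of area
    area_list <- split(lalonde, lalonde$married)

    # Mahalanobis nearest-neighbour matching within each sub-dataframe,
    # with exact matching on one covariate
    m_obj <- lapply(area_list, function(x) {
      matchit(treat ~ age + educ + re74,
              data = x, method = "nearest", distance = "mahalanobis",
              exact = ~ nodegree)
    })

    # Supply the original sub-dataframe explicitly so match.data() does not
    # have to look up `x`
    m_data <- lapply(seq_along(area_list), function(i) {
      match.data(m_obj[[i]], data = area_list[[i]])
    })

    # Combine the matched datasets using the rbind() method for match.data()
    # output (assumed available, i.e. MatchIt >= 4.2.0)
    combined <- do.call(rbind, m_data)
    ```

    Depending on the subgroup sizes, matchit() may warn that not every treated unit received a match; that is expected when a stratum contains fewer control units than treated units.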