
Subclassification with Mahalanobis distance nearest neighbor matching in R


I am using the MatchIt package to implement nearest neighbor matching with the Mahalanobis distance. After the matching stage, how do I get it to report which control observation was matched with each treatment observation?

The following code does not work and throws the warning "No subclassification with pure Mahalanobis distance."

library("MatchIt")

data("lalonde")

lalonde_matchit_nn <-
  matchit(
    treat ~ age + educ + black + hispan + nodegree + married + re74 + re75,
    baseline.group = 1,
    data = lalonde,
    method = "nearest",
    distance = "mahalanobis",
    subclass = T
  )

Again, what I am looking for is output that has an ID for each treatment-control pair, just like the subclass reported by other matching methods (e.g., "exact" or "cem").


Solution

  • You are looking for two components of the fitted object, here lalonde_matchit_nn: nn and match.matrix.

    # nn: a basic summary table of the matched data (e.g., the number of matched units)
    smry <- lalonde_matchit_nn$nn

    # match.matrix: the row names are the names of the treatment units, which
    # come from the data frame specified in data. Each column stores the name(s)
    # of the control unit(s) matched to the treatment unit of that row.
    matchedPool <- lalonde_matchit_nn$match.matrix
    
    

    Now look at smry and matchedPool from the code above:

    smry
              Control Treated
    All           429     185
    Matched       185     185
    Unmatched     244       0
    Discarded       0       0
    
    head(matchedPool)
    
         1        
    NSW1 "PSID375"
    NSW2 "PSID341"
    NSW3 "PSID361"
    NSW4 "PSID345"
    NSW5 "PSID172"
    NSW6 "PSID237"
    
    

    smry gives the number of units in each group (all, matched, unmatched, discarded), and matchedPool gives the ID of the control unit matched to each treatment unit under your matching criterion, in this case the Mahalanobis distance. The warning "No subclassification with pure Mahalanobis distance" tells you that subclassification is not performed when matching purely on the Mahalanobis distance, so no subclass column is produced. The pairing itself is still available in match.matrix.

    For more details, it is always good practice to refer to the package documentation: https://cran.r-project.org/web/packages/MatchIt/MatchIt.pdf
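
If you want a subclass-like pair ID rather than the raw match.matrix, one way is to number the rows of match.matrix and reshape the result to one row per unit. The following is a minimal sketch in base R, not something MatchIt produces for you. To keep it self-contained it uses a small hand-built stand-in for lalonde_matchit_nn$match.matrix (values copied from the head() output above), and the names pairs, pair_long and pair_id are made up for illustration.

```r
# Toy stand-in for lalonde_matchit_nn$match.matrix (assumption: 1:1 matching,
# so there is a single column). Row names are the treatment units; the column
# holds the matched control unit.
match_matrix <- matrix(
  c("PSID375", "PSID341", "PSID361"),
  ncol = 1,
  dimnames = list(c("NSW1", "NSW2", "NSW3"), "1")
)

# Keep only treatment units that actually received a match (unmatched are NA).
matched <- !is.na(match_matrix[, 1])

# One row per matched pair, with a running pair ID (hypothetical column name).
pairs <- data.frame(
  pair_id = seq_len(sum(matched)),
  treated = rownames(match_matrix)[matched],
  control = unname(match_matrix[matched, 1]),
  stringsAsFactors = FALSE
)

# Long format: one row per unit, so both members of a pair share a pair_id.
pair_long <- data.frame(
  unit    = c(pairs$treated, pairs$control),
  pair_id = rep(pairs$pair_id, times = 2),
  treat   = rep(c(1, 0), each = nrow(pairs)),
  stringsAsFactors = FALSE
)

print(pair_long)
```

With the real object, replace match_matrix by lalonde_matchit_nn$match.matrix. Assuming the row names of lalonde are the unit names shown above, the ID can then be attached to the data with lalonde$pair_id <- pair_long$pair_id[match(rownames(lalonde), pair_long$unit)], which leaves NA for unmatched controls.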