
How to compare two R data frames to find a missing factor level?


I have two dataframes in R that look like the following:

Dataframe 1

    |Var1|Var2|Var3|
    |1   |abc |bla |
    |2   |abc |bla |
    |3   |abc |bla |
    |4   |abc |bla |
    |5   |abc |bla |
    |6   |abc |bla |

Dataframe 2

    |Var1|Var2|Var3|
    |1   |abc |bla |
    |1   |abc |bla |
    |2   |abc |bla |
    |3   |abc |bla |
    |3   |abc |bla |
    |4   |abc |bla |

Var1 is a factor variable in both data frames (I don't mind converting it if that helps). Var1 has 1070 factor levels in Dataframe 1 and 1069 in Dataframe 2. How can I find out which factor level is missing from Dataframe 2?

Thank you


Solution

  • Just take the set difference between the levels of the two factors.

    F1 <- factor(c('A', 'B', 'C'))
    F2 <- factor(c('B', 'C'))

    # Levels present in F1 but absent from F2
    setdiff(levels(F1), levels(F2))
    # [1] "A"
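Applied to data frames like the ones in the question, the same idea might look like the sketch below. The names `df1`, `df2` and `df2_sub` are hypothetical stand-ins built only for illustration. One thing to watch for: if the second data frame was created by subsetting the first, the factor usually keeps its unused levels, so `levels()` will still report the full set. In that case, call `droplevels()` first or compare the observed values instead.

```r
# Hypothetical stand-ins for the two data frames in the question
df1 <- data.frame(Var1 = factor(c(1, 2, 3, 4, 5, 6)))
df2 <- data.frame(Var1 = factor(c(1, 1, 2, 3, 3, 4)))

# Levels declared in df1$Var1 but not in df2$Var1
missing_levels <- setdiff(levels(df1$Var1), levels(df2$Var1))
print(missing_levels)

# A subset keeps the original levels, so levels() alone finds no difference
df2_sub <- df1[df1$Var1 != "5", , drop = FALSE]
unchanged <- setdiff(levels(df1$Var1), levels(df2_sub$Var1))
print(unchanged)

# Dropping unused levels first reveals the level that no longer occurs
missing_after_drop <- setdiff(levels(df1$Var1),
                              levels(droplevels(df2_sub$Var1)))
print(missing_after_drop)

# Alternative: compare the values that actually occur in each column
missing_values <- setdiff(unique(as.character(df1$Var1)),
                          unique(as.character(df2$Var1)))
print(missing_values)
```

Comparing levels tells you what each factor is declared to allow, while comparing values tells you what actually occurs in the data. Which one you want depends on how the two data frames were built.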