
Contingency Matrix in R


I am trying to construct a contingency matrix of calls between a caller and a callee. My problem is that the variable caller_id contains five-digit values, and I need to group those values by whether they begin with 1, 2, or 3. For example, my data follows this pattern:

CALLER         CALLEE
12345            1
23456            1
35643            2

Here the first digit of CALLER and the value of CALLEE can each be 1, 2, or 3, where 1 denotes white ethnicity, 2 denotes black ethnicity, and 3 denotes unknown. I then need to create a contingency matrix such as:

              White Caller     Black Caller
White Callee    # of calls    # of calls
Black Callee    # of calls    # of calls
Unknown Callee  # of calls    # of calls

If anyone has any advice on how I could go about separating the values and creating the matrix, it would be much appreciated. Thank you in advance.


Solution

  • With base R you may use

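    with(df, table(CALLER = substr(CALLER, 1, 1), CALLEE))
    #       CALLEE
    # CALLER 1 2
    #      1 1 0
    #      2 1 0
    #      3 0 1

    where substr(df$CALLER, 1, 1) extracts the first digit from df$CALLER (see ?substr; character positions in R start at 1, although a start of 0 gives the same result here) and table then cross-tabulates that digit against df$CALLEE. Note that the result has callers in rows and callees in columns, and that only values actually present in the data appear, which is why there is no column for CALLEE 3.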
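To get closer to the layout requested in the question (callees in rows, callers in columns, descriptive labels, and a row or column even for groups with no calls), one option is to convert both variables to factors before tabulating. The following is a minimal sketch, assuming the data frame is called df and has the columns CALLER and CALLEE shown in the question; the names labs, caller, callee, and tab are made up for illustration.

```r
# Example data copied from the question (the data frame name df is assumed).
df <- data.frame(CALLER = c(12345, 23456, 35643),
                 CALLEE = c(1, 1, 2))

# Code-to-label mapping as described in the question:
# 1 = White, 2 = Black, 3 = Unknown.
codes <- c("1", "2", "3")
labs  <- c("White", "Black", "Unknown")

# First digit of CALLER as a factor with all three levels declared,
# so groups with zero calls still show up in the table.
caller <- factor(substr(df$CALLER, 1, 1),
                 levels = codes, labels = paste(labs, "Caller"))

# CALLEE already holds the code itself; declare the same three levels.
callee <- factor(as.character(df$CALLEE),
                 levels = codes, labels = paste(labs, "Callee"))

# Rows are callees and columns are callers, as in the requested layout.
tab <- table(Callee = callee, Caller = caller)
tab
```

If the Unknown Caller column is not wanted, it can be dropped afterwards with tab[, c("White Caller", "Black Caller")].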