
Association rules having same support but different confidence values


I am generating rules from my data and noticed a few rules that look duplicated. These rules have the same support, lift and count values but different confidence and coverage values.

I initially thought this was due to stray whitespace in one of the product names, but I had already trimmed and cleaned the product info before mining for rules.

library(arules)

# GENERATE RULES
rules1 <- apriori(transactions,
                  parameter = list(
                    support = supportLevels[3],       # minimum support threshold
                    confidence = confidenceLevels[9], # minimum confidence threshold
                    minlen = 2,                       # at least one item on each side
                    target = "rules"
                  )
)

# VIEW THE ASSOCIATION RULES
inspect(sort(rules1,
             by = "lift", # sort from strongest to weakest rules
             decreasing = TRUE))

Below you can see the first two rules which are duplicated/symmetrical but have different confidence values.


Unfortunately, I cannot share my dataset as it's proprietary, and I could not reproduce the behaviour with the Groceries dataset in arules.

Does anyone have an idea why I could get different confidence but same support and lift for these rules?


Solution

  • This follows directly from the definition of the measures for two rules

    X => Y
    Y => X
    

    which are both created from the same frequent item set given by the union of X and Y.

    • Support is calculated on the generating frequent item set, so supp(X => Y) = supp(Y => X) = supp(X and Y)
    • Lift is symmetric, so lift(X => Y) = lift(Y => X)
    • Confidence is asymmetric and depends on the support of the left-hand-side. So if supp(X) is different from supp(Y), then conf(X => Y) will be different from conf(Y => X).
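
To make the three points above concrete, here is a minimal sketch in base R. It does not use arules: the toy vectors x and y and the helper rule_measures() are made up for illustration, and the measures are computed straight from their textbook definitions. Coverage is included on the assumption that it is the support of the left-hand side, which would also explain why it differs between the two rules in the question.

```r
# Hypothetical toy data: 10 transactions, TRUE means the item is present.
x <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
y <- c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)

# Illustrative helper (not an arules function): measures for the rule lhs => rhs.
rule_measures <- function(lhs, rhs) {
  supp_both <- mean(lhs & rhs)                        # support of the joint item set
  c(
    support    = supp_both,                           # same for both directions
    confidence = supp_both / mean(lhs),               # depends on supp(lhs)
    coverage   = mean(lhs),                           # supp(lhs), assumed definition
    lift       = supp_both / (mean(lhs) * mean(rhs)), # symmetric in lhs and rhs
    count      = sum(lhs & rhs)                       # absolute joint frequency
  )
}

# Compare the two "duplicated" rules side by side.
measures <- rbind(
  "X => Y" = rule_measures(x, y),
  "Y => X" = rule_measures(y, x)
)
print(measures)
```

With this toy data, supp(X) = 0.4 and supp(Y) = 0.6, so both rules share support 0.2, lift of about 0.83 and count 2, while confidence is 0.5 for X => Y and about 0.33 for Y => X. The two rules are therefore not duplicates; they are the two directions of one frequent item set.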