Solved – Interpreting association rules correctly

apriori, association-rules, data-mining

I am working on a problem to identify subgroups within a population. After writing some code to get my data into the correct format, I was able to use the apriori algorithm for association rule mining.

When I look at the results I see something like the following:

rule 1
0.3  0.7  18x0 -> trt1

rule 2
0.4  0.7  17x0 -> trt1

rule 3
0.3  0.7  16x1 -> trt1

The variables in the rules come from how I discretized the data and can be read as (variable name)(value), e.g. 17x0 means variable 17x took the value 0.
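
This kind of item naming can come from a one-hot encoding like the following sketch (the column 17x and its values are made up):

import pandas as pd

# Hypothetical raw discretized variable: group 17x with responses 0/1.
raw = pd.DataFrame({"17x": [0, 1, 0, 0, 1]})

# One-hot encode so that each (variable, value) pair becomes its own item,
# e.g. the column "17x0" is True where variable 17x took the value 0.
onehot = pd.get_dummies(raw, columns=["17x"], prefix_sep="").astype(bool)
print(onehot.columns.tolist())   # ['17x0', '17x1']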

I want to make sure that I'm reading this correctly: someone who got a response of 0 in group 17, or someone who got a response of 0 in group 18, would fall into the trt1 group, but not people who got a response of 0 in both categories, since there is no

18x0 17x0 -> trt1

rule.

Best Answer

When interpreting traditional (frequency-confidence) association rules, it is important to note that the discoveries do not necessarily express positive statistical dependence; they may also express negative dependence, independence, or statistically insignificant positive dependence (one that does not hold in the population). So the absence of a certain rule does not mean that the rule's antecedent and consequent are not positively associated.
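
To see this concretely, here is a minimal sketch in Python with pandas (the data and the item columns 17x0, 18x0, trt1 are made up) that computes support, confidence and lift for the candidate rule {17x0, 18x0} -> trt1 directly:

import pandas as pd

# Hypothetical one-hot encoded data: each column is an item such as
# "17x0" (variable 17x took the value 0) or "trt1" (treatment group 1).
df = pd.DataFrame({
    "17x0": [1, 1, 1, 0, 1, 0, 0, 0],
    "18x0": [1, 1, 0, 1, 1, 0, 1, 0],
    "trt1": [1, 1, 1, 0, 0, 1, 0, 0],
}).astype(bool)

antecedent = df["17x0"] & df["18x0"]      # response 0 in both groups
consequent = df["trt1"]

support = (antecedent & consequent).mean()
confidence = support / antecedent.mean()
lift = confidence / consequent.mean()
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")

# In this made-up sample: support=0.25, confidence≈0.67, lift≈1.33.
# Such a rule can fall below the chosen minimum support/confidence
# thresholds (and so never appear in the apriori output) even though
# the antecedent and consequent are positively dependent (lift > 1).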

If you want to find positive dependencies in the population, you need to use other search criteria. In principle, it is possible to first search for all association rules with minimum frequency and confidence thresholds set so low that no true associations are missed, and then filter the results with other measures that estimate the strength and/or significance of the dependence (e.g. leverage, lift, chi^2, mutual information, Fisher's p). Some apriori implementations even offer this option, but the choice of measures may be limited. However, this approach is often infeasible, because the number of rules explodes exponentially (the total number of possible rules is O(2^k), where k is the number of attributes).
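
As a rough illustration of that two-step approach, the following sketch uses the apriori and association_rules functions from the mlxtend library on a made-up one-hot DataFrame; lift and leverage are reported by association_rules, while measures such as chi^2 or Fisher's p would have to be computed separately:

import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Hypothetical one-hot encoded data (one column per discretized item).
onehot = pd.DataFrame({
    "16x1": [1, 0, 1, 0, 1, 0, 1, 0],
    "17x0": [1, 1, 1, 0, 1, 1, 0, 0],
    "18x0": [1, 1, 0, 1, 1, 0, 0, 0],
    "trt1": [1, 1, 1, 1, 1, 0, 1, 0],
}).astype(bool)

# Step 1: mine with deliberately low thresholds so that few true
# associations are missed (on real data this set can explode in size).
frequent = apriori(onehot, min_support=0.05, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.1)

# Step 2: filter the candidate rules with measures that estimate the
# strength of the dependence rather than mere frequency.
interesting = rules[(rules["lift"] > 1.0) & (rules["leverage"] > 0.0)]
print(interesting.sort_values("leverage", ascending=False)[
    ["antecedents", "consequents", "support", "confidence", "lift", "leverage"]
])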

There are also efficient algorithms that find only a condensed representation of all frequent and confident association rules (e.g. all rules whose antecedent is a closed set) and that can be used with minimal thresholds. However, they may be harder to interpret, because they impose extra criteria on which association rules are presented, and you still need to do the filtering afterwards.
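
mlxtend has no closed-itemset miner built in, but the idea of a condensed representation can be illustrated with a naive post-hoc filter that keeps only itemsets for which no proper superset has the same support (dedicated closed-set algorithms do this during the search, far more efficiently); the data below is again made up:

import pandas as pd
from mlxtend.frequent_patterns import apriori

# Hypothetical one-hot encoded data.
onehot = pd.DataFrame({
    "17x0": [1, 1, 1, 0, 1, 1, 0, 0],
    "18x0": [1, 1, 0, 1, 1, 0, 0, 0],
    "trt1": [1, 1, 1, 1, 1, 0, 1, 0],
}).astype(bool)

frequent = apriori(onehot, min_support=0.1, use_colnames=True)

def closed_itemsets(frequent):
    # An itemset is closed if no proper superset has exactly the same
    # support. This quadratic filter is for illustration only.
    keep = []
    for _, row in frequent.iterrows():
        keep.append(not any(
            (row["itemsets"] < other["itemsets"]) and
            (row["support"] == other["support"])
            for _, other in frequent.iterrows()
        ))
    return frequent[pd.Series(keep, index=frequent.index)]

print(closed_itemsets(frequent))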

A better approach is to use algorithms that search directly with statistical goodness measures, without any (or with at most minimal) minimum frequency requirements. Such methods are nowadays becoming more popular, and you can find free source code on the internet (note that the patterns may also be called classification rules or dependency rules). For a short review of such methods (and a detailed description of one algorithm), see e.g. Hämäläinen, W.: Kingfisher: an efficient algorithm for searching for both positive and negative dependency rules with statistical significance measures. Knowledge and Information Systems: An International Journal (KAIS) 32(2):383-414, 2012 (also https://pdfs.semanticscholar.org/59ff/5cda9bfefa3b188b5302be36e956b717e28e.pdf).
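
To give a flavour of the statistical measures such methods rank rules by (Kingfisher itself is a dedicated search algorithm and is not reproduced here), the following sketch computes a one-sided Fisher's exact test for the hypothetical rule 17x0 -> trt1 from a 2x2 contingency table:

import pandas as pd
from scipy.stats import fisher_exact

# Hypothetical one-hot encoded data.
df = pd.DataFrame({
    "17x0": [1, 1, 1, 0, 1, 1, 0, 0],
    "trt1": [1, 1, 1, 1, 1, 0, 1, 0],
}).astype(bool)

a = ( df["17x0"] &  df["trt1"]).sum()   # antecedent and consequent
b = ( df["17x0"] & ~df["trt1"]).sum()   # antecedent without consequent
c = (~df["17x0"] &  df["trt1"]).sum()   # consequent without antecedent
d = (~df["17x0"] & ~df["trt1"]).sum()   # neither

# One-sided test for positive dependence between antecedent and consequent;
# a small p-value suggests the association is unlikely to be pure chance.
odds_ratio, p_value = fisher_exact([[a, b], [c, d]], alternative="greater")
print(f"odds ratio={odds_ratio:.2f}, p={p_value:.3f}")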