Association analysis isn't a technique I use very often, so hopefully someone else can chime in, but I think the approach you're using makes sense. The real question is: how do you want to define the "difference" between your models? If your main concern is lift, then I believe your approach makes sense. You might also want to try measuring the difference in confidence and support as well, so you could have three sets of rules with large "differences" to explore (different lift, different support, and different confidence).
It also might be worthwhile to develop a metric that combines the differences in lift, confidence, and support. If you value one of these statistics more than the others (looks like you're mainly interested in lift) you could up/down weight particular statistics in your metric, i.e. take a weighted average. If you're going to combine these statistics, you should consider rescaling your values based on the max value each statistic achieves in a given model or across both models. You'll really have to play around a bit to decide what works best for you.
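As a sketch of what such a combined metric might look like: the function below takes a weighted average of per-rule differences in lift, confidence, and support, rescaling each by the maximum that statistic reaches across both models. The weights, rescaling scheme, and dict layout are all assumptions for illustration, not a prescribed method.

```python
# Hypothetical sketch of a combined difference metric. The weights,
# the rescaling choice, and the dict layout are assumptions.

def rescaled_diff(a, b, max_val):
    """Absolute difference, rescaled by the largest value seen across models."""
    return abs(a - b) / max_val if max_val else 0.0

def combined_difference(stats_a, stats_b, maxima, weights=(0.6, 0.2, 0.2)):
    """Weighted average of rescaled differences in (lift, confidence, support).

    stats_a / stats_b: statistics for one rule in model A / model B
    maxima: max of each statistic across both models, used for rescaling
    weights: up-weight lift if that's the statistic you care most about
    """
    keys = ("lift", "confidence", "support")
    diffs = [rescaled_diff(stats_a[k], stats_b[k], maxima[k]) for k in keys]
    return sum(w * d for w, d in zip(weights, diffs))

# One rule's statistics in each model (made-up numbers):
rule_a = {"lift": 3.2, "confidence": 0.85, "support": 0.10}
rule_b = {"lift": 1.1, "confidence": 0.40, "support": 0.02}
maxima = {"lift": 4.0, "confidence": 1.0, "support": 0.25}
score = combined_difference(rule_a, rule_b, maxima)
```

Because every rescaled difference lies in [0, 1] and the weights sum to 1, the combined score is also in [0, 1], which makes rules from different models easy to rank against each other.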
One problem I see with your approach is that you are filtering out rules below a threshold from your models: it's very likely that some rules will appear above your thresholds in one model but not the other. As a consequence, the rules that are the most different between your two models probably won't appear in your calculations at all. Perhaps one rule has very high support, confidence, and lift in one model but negligible support, confidence, and lift in the other. Theoretically, this is precisely the kind of rule you are trying to target, but you won't be able to calculate your difference metric for it at all (if I understand your stated process correctly).
Here's how I would recommend modifying your procedure:
Instead of removing rules below a threshold from the rulesets for both models, retain all the rules in each ruleset, but for the purposes of your calculation only consider rules that are above your given thresholds in at least one of your models. This way, you will target your most powerful rules, but still be able to calculate the difference between your two models when some rules have very low support, confidence, or lift in one model but not the other. Alternatively, at the very least you should ensure that you calculate the appropriate statistics for the rules that appear in each model above your threshold (i.e. trim down the ruleset for each model, but then for every rule that appears in model A but not B, calculate the statistics for that rule in model B, and vice versa).
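The first variant above (keep a rule if it clears the thresholds in at least one model) might look something like this. The threshold values, rule identifiers, and dict layout here are all made up for illustration:

```python
# Hypothetical sketch: keep a rule for comparison if it clears the
# thresholds in AT LEAST ONE model, then look its statistics up in both.
# The threshold values and the dict layout are assumptions.

THRESHOLDS = {"support": 0.05, "confidence": 0.5, "lift": 1.2}

def passes(stats):
    """True if a rule's statistics clear every threshold."""
    return all(stats[k] >= v for k, v in THRESHOLDS.items())

def rules_to_compare(model_a, model_b):
    """model_a / model_b map rule-id -> {'support': .., 'confidence': .., 'lift': ..}
    for EVERY mined rule (no pre-filtering)."""
    keep = set()
    for rule_id in set(model_a) | set(model_b):
        a = model_a.get(rule_id)
        b = model_b.get(rule_id)
        # retain the rule if it is strong in at least one model
        if (a and passes(a)) or (b and passes(b)):
            keep.add(rule_id)
    return keep

model_a = {"bread->butter": {"support": 0.12, "confidence": 0.8, "lift": 2.0}}
model_b = {"bread->butter": {"support": 0.01, "confidence": 0.1, "lift": 0.9},
           "tea->milk":     {"support": 0.02, "confidence": 0.2, "lift": 1.0}}
print(rules_to_compare(model_a, model_b))  # bread->butter qualifies via model A
```

Note that `bread->butter` survives even though it is weak in model B, which is exactly the kind of rule the filtering problem described above would otherwise hide.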
It's probable that this is already your approach and I misunderstood the description you gave of your process, in which case my only suggested modification is considering metrics that take the other statistics of interest into account. I felt it was worth pointing out in case your approach was overly naive.
The two are combined to help find "interesting" rules. As you know,
$$
\newcommand{\Kulczynski}{{\rm Kulczynski}}
\newcommand{\support}{{\rm support}}
\Kulczynski = \frac{1}{2}\big(P(A|B) + P(B|A)\big)
$$
If Kulczynski is near 0 or 1, then we have an interesting rule that is negatively or positively associated, respectively. If Kulczynski is near 0.5, then we may or may not have an interesting rule. We can have
$$
\Kulczynski = \frac{1}{2}\big(0.5 + 0.5\big) = 0.5
$$
Also, as in your case, we might also have
$$
\Kulczynski = \frac{1}{2}\big(0.863 + 0.012\big) = 0.4375
$$
While some people might consider both of these uninteresting, others might want to know about the second. To differentiate between the two situations, we can look at the Imbalance Ratio (IR), where 0 is perfectly balanced and 1 is very skewed:
$$
IR = \frac{\big|\support(A) - \support(B)\big|}{\support(A) + \support(B) - \support(A \cup B)}
$$
So completely uninteresting rules would have both $\Kulczynski=0.5$ and $IR=0$.
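Both measures can be written directly in terms of supports, since $P(A|B) = \support(A \cup B)/\support(B)$ and likewise for $P(B|A)$. A minimal sketch (the function names are mine; `supp_ab` denotes $\support(A \cup B)$, the fraction of transactions containing both):

```python
# Minimal sketch of the two measures above, written in terms of supports.
# supp_a, supp_b, supp_ab are the fractions of transactions containing
# A, B, and both together (support(A ∪ B) in the formula above).

def kulczynski(supp_a, supp_b, supp_ab):
    # average of the two conditional probabilities P(A|B) and P(B|A)
    return 0.5 * (supp_ab / supp_b + supp_ab / supp_a)

def imbalance_ratio(supp_a, supp_b, supp_ab):
    # 0 = perfectly balanced, values near 1 = very skewed
    return abs(supp_a - supp_b) / (supp_a + supp_b - supp_ab)

# A completely uninteresting rule: balanced supports, Kulczynski exactly 0.5
k = kulczynski(0.5, 0.5, 0.25)        # 0.5
ir = imbalance_ratio(0.5, 0.5, 0.25)  # 0.0
```

The example values reproduce the "completely uninteresting" case above: Kulczynski is exactly 0.5 and IR is 0.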
Best Answer
The existing answer explains how the table is calculated. If you are still confused, one way to look at it is to start with the number of people who bought things.
Say 100 people visited the cafe, and 36 bought coffee, 18 bought pie, and 8 bought both. Then this is how the numbers in your table are calculated, using the formulas given by b-r-oleary:
Out of 18 people who bought pie, 8 also bought coffee, so the confidence of (pie → coffee) is 8/18. But out of 36 people who bought coffee, only 8 also bought pie, so the confidence of (coffee → pie) is 8/36.
The numbers in bold are the ones which aren't necessarily equal. This is just a consequence of how they are defined. The names "support", "lift" etc. are just names, which hopefully hint at how the numbers should be interpreted.
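The café numbers above can be computed in a few lines, which also shows why confidence is directional while support and lift are not:

```python
# The café example above, computed directly. Out of n = 100 visitors:
n = 100
coffee, pie, both = 36, 18, 8

support_coffee = coffee / n          # 0.36
support_pie    = pie / n             # 0.18
support_both   = both / n            # 0.08

# Confidence is directional, so the two rules give different numbers:
conf_pie_to_coffee = both / pie      # 8/18 ≈ 0.444
conf_coffee_to_pie = both / coffee   # 8/36 ≈ 0.222

# Lift, by contrast, is symmetric in the two items:
lift = support_both / (support_coffee * support_pie)   # ≈ 1.23
```

A lift slightly above 1 says buying coffee and buying pie co-occur a bit more often than they would if the two purchases were independent.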