Solved – How is the support calculated using hash trees for Apriori algorithm

aprioriassociation-rules

I started studying association rules and specially the Apriori algorithm through this free chapter.

I understood most of the points in relation with this algorithm except the one on how to build the hash tree in order to optimize support calculation.

I tried to find some clear explanation through google without results.

Can some body explain this point to me please?

Best Answer

For example fig 6.11:

Hash function

  • Hash(1,4,7) = Left
  • Hash(2,5,8) = Middle
  • Hash(3,6,9) = Right

If root transaction: {1 4 5}, {1 2 4}, {4 5 7}, {1 2 5}, {4 5 8}, how to build the hash tree:

step1: {1 4 5} use the first element '1' to hash, hash(1) = Left. Count of Root-Left is 1, not full.

http://fanli7.net/uploads/allimg/2012-06-02/1338635636_6180.png

step2: {1 2 4} use the first element '1' to hash, hash(1) = Left. Count of Root-Left is 2, not full.

http://fanli7.net/uploads/allimg/2012-06-02/1338635656_3366.png

step3: {4 5 7} use the first element '1' to hash, hash(4) = Left. Count of Root-Left is 3, full.

http://fanli7.net/uploads/allimg/2012-06-02/1338635673_7280.png

step3: {1 2 5} use the first element '1' to hash, hash(1) = Left. Count of Root-Left is 4, over. => use the second element to spilt,

Now, Root-Left transaction => {1 4 5}, {1 2 4}, {4 5 7}, {1 2 5}, we use the second element to spilt

step3-1: {1 4 5} use the second element '4' to hash, hash(4) = Left. Count of Root-Left-Left is 1, not full.

step3-2: {1 2 4} use the second element '2' to hash, hash(2) = Middle. Count of Root-Left-Middle is 1, not full.

step3-3: {4 5 7} use the second element '5' to hash, hash(5) = Middle. Count of Root-Left-Middle is 2, not full.

step3-4: {1 2 5} use the second element '2' to hash, hash(2) = Middle. Count of Root-Left-Middle is 3, full.

http://fanli7.net/uploads/allimg/2012-06-02/1338635687_7115.png

After splitting on root-left, we go back to root. Now, root transaction: {4 5 8}

step4: {4 5 8} use the first element '4' to hash, hash(4) = Left. Root-Left is splitted, so use the second element '5' to hash again, hash(5) = Middle, Count of Root-Left-Middle is 4, over. => use the third element to spilt ... And so on ...

Related Question