| written 9.4 years ago by | modified 3.9 years ago by |
Consider the following transaction database.
| TID | Items |
|---|---|
| 01 | A,B,C,D |
| 02 | A,B,C,D,E,G |
| 03 | A,C,G,H,K |
| 04 | B,C,D,E,K |
| 05 | D,E,F,H,L |
| 06 | A,B,C,D,L |
| 07 | B,I,E,K,L |
| 08 | A,B,D,E,K |
| 09 | A,E,F,H,L |
| 10 | B,C,D,F |
| written 9.4 years ago by | modified 4.2 years ago by |
Step 1: Scan the database D and count the support of each candidate item. The candidate items are {A, B, C, D, E, F, G, H, I, K, L}.
C1 =
| Itemset | Support count |
|---|---|
| A | 6 |
| B | 7 |
| C | 6 |
| D | 7 |
| E | 6 |
| F | 3 |
| G | 2 |
| H | 3 |
| I | 1 |
| K | 4 |
| L | 4 |
Step 2: Compare each candidate's support count with the minimum support count (i.e. 3) and keep only the candidates that reach it.
L1 =
| Itemset | Support count |
|---|---|
| A | 6 |
| B | 7 |
| C | 6 |
| D | 7 |
| E | 6 |
| F | 3 |
| H | 3 |
| K | 4 |
| L | 4 |
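Steps 1 and 2 can be sketched in Python as below. This is only an illustrative sketch: the variable names (`transactions`, `c1`, `l1`, `MIN_SUPPORT_COUNT`) are assumptions made for the example and are not part of the original answer.

```python
from collections import Counter

# Transaction database from the question, one comma-separated row per TID.
transactions = [
    "A,B,C,D",
    "A,B,C,D,E,G",
    "A,C,G,H,K",
    "B,C,D,E,K",
    "D,E,F,H,L",
    "A,B,C,D,L",
    "B,I,E,K,L",
    "A,B,D,E,K",
    "A,E,F,H,L",
    "B,C,D,F",
]
MIN_SUPPORT_COUNT = 3  # minimum support count used in this answer

# Step 1: one scan of the database, counting every item (C1).
c1 = Counter(item for row in transactions for item in row.split(","))

# Step 2: keep only the items that reach the minimum support count (L1).
l1 = {item: count for item, count in c1.items() if count >= MIN_SUPPORT_COUNT}

for item in sorted(l1):
    print(item, l1[item])
```

Items G and I fall below the threshold, so they are dropped before any pairs are formed.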
Step 3: Generate the candidate 2-itemsets C2 from L1 and scan the database to find the support count of each.
C2 =
| Itemset | Support count |
|---|---|
| A,B | 4 |
| A,C | 4 |
| A,D | 4 |
| A,E | 3 |
| A,F | 1 |
| A,H | 2 |
| A,K | 2 |
| A,L | 2 |
| B,C | 5 |
| B,D | 6 |
| B,E | 4 |
| B,F | 1 |
| B,H | 0 |
| B,K | 3 |
| B,L | 2 |
| C,D | 5 |
| C,E | 2 |
| C,F | 1 |
| C,H | 1 |
| C,K | 2 |
| C,L | 1 |
| D,E | 4 |
| D,F | 2 |
| D,H | 1 |
| D,K | 2 |
| D,L | 2 |
| E,F | 2 |
| E,H | 2 |
| E,K | 3 |
| E,L | 3 |
| F,H | 2 |
| F,K | 0 |
| F,L | 2 |
| H,K | 1 |
| H,L | 2 |
| K,L | 1 |
Step 4: Compare each candidate's (C2) support count with the minimum support count and keep only the candidates that reach it.
L2 =
| Itemset | Support count |
|---|---|
| A,B | 4 |
| A,C | 4 |
| A,D | 4 |
| A,E | 3 |
| B,C | 5 |
| B,D | 6 |
| B,E | 4 |
| B,K | 3 |
| C,D | 5 |
| D,E | 4 |
| E,K | 3 |
| E,L | 3 |
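Steps 3 and 4 can be checked with a short script like the one below. It is a sketch under the assumption that every pair of frequent items from L1 is a candidate; the helper name `support_count` is made up for this example.

```python
from itertools import combinations

rows = [
    "A,B,C,D", "A,B,C,D,E,G", "A,C,G,H,K", "B,C,D,E,K", "D,E,F,H,L",
    "A,B,C,D,L", "B,I,E,K,L", "A,B,D,E,K", "A,E,F,H,L", "B,C,D,F",
]
transactions = [set(row.split(",")) for row in rows]
MIN_SUPPORT_COUNT = 3

# Frequent single items (L1) found in Step 2.
l1_items = ["A", "B", "C", "D", "E", "F", "H", "K", "L"]


def support_count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


# Step 3: every pair of frequent items is a candidate 2-itemset (C2).
c2 = {pair: support_count(pair) for pair in combinations(l1_items, 2)}

# Step 4: keep the pairs that reach the minimum support count (L2).
l2 = {pair: n for pair, n in c2.items() if n >= MIN_SUPPORT_COUNT}

for pair, n in l2.items():
    print(",".join(pair), n)
```

With 9 frequent items there are 36 candidate pairs, which matches the number of rows in the C2 table.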
Step 5: Generate the candidate 3-itemsets C3 from L2 and scan the database to find the support count of each.
C3 =
| Itemset | Support count |
|---|---|
| A,B,C | 3 |
| A,B,D | 4 |
| A,B,E | 2 |
| A,B,K | 1 |
| A,C,D | 3 |
| A,C,E | 1 |
| A,D,E | 2 |
| B,C,D | 5 |
| B,C,E | 2 |
| B,C,K | 1 |
| B,D,E | 3 |
| B,D,K | 2 |
| B,E,K | 3 |
| C,D,E | 2 |
| D,E,K | 2 |
| D,E,L | 1 |
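For comparison, the textbook Apriori candidate generation joins two frequent k-itemsets that share their first k-1 items and then prunes any candidate that has an infrequent k-subset. The sketch below applies that to L2; the function name `apriori_gen` is an assumption for this example. It produces fewer candidates than the table above, but every extra row in the table has a support count below 3, so L3 comes out the same either way.

```python
from itertools import combinations

# Frequent 2-itemsets (L2) from Step 4, each stored as a sorted tuple.
l2 = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
    ("B", "C"), ("B", "D"), ("B", "E"), ("B", "K"),
    ("C", "D"), ("D", "E"), ("E", "K"), ("E", "L"),
]


def apriori_gen(frequent):
    """Join k-itemsets that share their first k-1 items, then prune."""
    frequent_set = set(frequent)
    k = len(frequent[0])
    candidates = []
    for a, b in combinations(sorted(frequent), 2):
        if a[:-1] == b[:-1]:  # join step: same prefix
            candidate = a + (b[-1],)
            # Prune step: every k-subset must itself be frequent.
            if all(sub in frequent_set for sub in combinations(candidate, k)):
                candidates.append(candidate)
    return candidates


c3 = apriori_gen(l2)
for candidate in c3:
    print(",".join(candidate))
```

For example, A,C,E is pruned because its subset C,E is not in L2, so its support never needs to be counted.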
Step 6: Compare each candidate's (C3) support count with the minimum support count and keep only the candidates that reach it.
L3 =
| Itemset | Support count |
|---|---|
| A,B,C | 3 |
| A,B,D | 4 |
| A,C,D | 3 |
| B,C,D | 5 |
| B,D,E | 3 |
| B,E,K | 3 |
Step 7: Generate the candidate 4-itemsets C4 from L3 and scan the database to find the support count of each.
C4 =
| Itemset | Support count |
|---|---|
| A,B,C,D | 3 |
| A,B,D,E | 2 |
| B,C,D,E | 2 |
| B,D,E,K | 2 |
Step 8: Compare each candidate's (C4) support count with the minimum support count and keep only the candidates that reach it.
L4 =
| Itemset | Support count |
|---|---|
| A,B,C,D | 3 |
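Steps 1 to 8 repeat the same generate, count and filter cycle, so they can be written as one loop. The sketch below is a simplified version and not a reference implementation: it builds candidates as unions of two frequent k-itemsets without the pruning step, which may count a few extra candidates but cannot change which itemsets turn out to be frequent. The names `support_count` and `frequent_by_size` are assumptions for this example.

```python
from itertools import combinations

rows = [
    "A,B,C,D", "A,B,C,D,E,G", "A,C,G,H,K", "B,C,D,E,K", "D,E,F,H,L",
    "A,B,C,D,L", "B,I,E,K,L", "A,B,D,E,K", "A,E,F,H,L", "B,C,D,F",
]
transactions = [frozenset(row.split(",")) for row in rows]
MIN_SUPPORT_COUNT = 3


def support_count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# Level 1: frequent single items.
all_items = sorted(set().union(*transactions))
level = [frozenset([i]) for i in all_items
         if support_count(frozenset([i])) >= MIN_SUPPORT_COUNT]

frequent_by_size = {}
k = 1
while level:
    frequent_by_size[k] = {s: support_count(s) for s in level}
    # Simplified candidate generation: unions of two frequent k-itemsets
    # that contain exactly k + 1 items (no subset pruning).
    candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k + 1}
    level = [c for c in candidates if support_count(c) >= MIN_SUPPORT_COUNT]
    k += 1

largest = max(frequent_by_size)
for itemset, n in frequent_by_size[largest].items():
    print(sorted(itemset), n)
```

The loop stops at size 4 because a single frequent 4-itemset cannot be combined with anything to form a 5-itemset candidate.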
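Step 9: The largest frequent itemset in the data is therefore {A,B,C,D}.
Generate the association rules from this frequent itemset, with their support and confidence. The confidence of a rule X => Y is the support count of X together with Y divided by the support count of X.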
| Association Rule | Support count | Confidence | Confidence % |
|---|---|---|---|
| BCD => A | 3 | 3/5 | 60% |
| ACD => B | 3 | 3/3 | 100% |
| ABD => C | 3 | 3/4 | 75% |
| ABC => D | 3 | 3/3 | 100% |
| CD => AB | 3 | 3/5 | 60% |
| BD => AC | 3 | 3/6 | 50% |
| BC => AD | 3 | 3/5 | 60% |
| AD => BC | 3 | 3/4 | 75% |
| AC => BD | 3 | 3/4 | 75% |
| AB => CD | 3 | 3/4 | 75% |
| D => ABC | 3 | 3/7 | 43% |
| C => ABD | 3 | 3/6 | 50% |
| B => ACD | 3 | 3/7 | 43% |
| A => BCD | 3 | 3/6 | 50% |
The given minimum confidence threshold is 70%, so only the rules ACD => B, ABD => C, ABC => D, AD => BC, AC => BD and AB => CD are output.
Final rules are:
Rule 1: ACD => B
Rule 2: ABD => C
Rule 3: ABC => D
Rule 4: AD => BC
Rule 5: AC => BD
Rule 6: AB => CD
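The rule generation can be reproduced with the sketch below, which tries every non-empty proper subset of {A,B,C,D} as the left-hand side and keeps the rules whose confidence is at least 70%. The names `support_count`, `rules` and `MIN_CONFIDENCE_PERCENT` are assumptions for this example, and the threshold test uses integer arithmetic to avoid floating-point rounding.

```python
from itertools import combinations

rows = [
    "A,B,C,D", "A,B,C,D,E,G", "A,C,G,H,K", "B,C,D,E,K", "D,E,F,H,L",
    "A,B,C,D,L", "B,I,E,K,L", "A,B,D,E,K", "A,E,F,H,L", "B,C,D,F",
]
transactions = [set(row.split(",")) for row in rows]
MIN_CONFIDENCE_PERCENT = 70
itemset = ("A", "B", "C", "D")  # the frequent itemset found in Step 9


def support_count(items):
    """Number of transactions that contain every item of `items`."""
    return sum(1 for t in transactions if set(items) <= t)


full = support_count(itemset)
rules = []
for size in range(len(itemset) - 1, 0, -1):
    for antecedent in combinations(itemset, size):
        consequent = tuple(i for i in itemset if i not in antecedent)
        ante = support_count(antecedent)
        # confidence = full / ante; compare as integers: full/ante >= 70/100
        if full * 100 >= MIN_CONFIDENCE_PERCENT * ante:
            rules.append((antecedent, consequent, full, ante))

for antecedent, consequent, full_n, ante_n in rules:
    left, right = "".join(antecedent), "".join(consequent)
    print(f"{left} => {right}  confidence = {full_n}/{ante_n} = {full_n / ante_n:.0%}")
```

Six rules pass the threshold, the same six listed above (printed in a different order).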
| written 8.0 years ago by | modified 3.7 years ago by |
How is the minimum support 3 for 30%? Can you show me the calculation?
Minimum support count = 30% of the number of transactions = 0.3 * 10 = 3
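The same conversion as a tiny Python helper, in case the numbers change. The function name `min_support_count` is made up for this example, and rounding up when the result is not a whole number is an assumption (the thread only covers the exact case 0.3 * 10).

```python
def min_support_count(percent, n_transactions):
    """Convert a minimum support percentage into a transaction count.

    Uses integer ceiling division, so a fractional result is rounded up
    (an assumption; the exact case 30% of 10 needs no rounding).
    """
    return -(-(percent * n_transactions) // 100)


print(min_support_count(30, 10))  # prints 3
```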