Question: Apply the Apriori algorithm with minimum support of 30% and minimum confidence of 70%, and find all the association rules in the data set

Subject: Data Mining And Business Intelligence

Topic: Frequent pattern

Difficulty: High

Consider the following transaction database.

TID Items
01 A,B,C,D
02 A,B,C,D,E,G
03 A,C,G,H,K
04 B,C,D,E,K
05 D,E,F,H,L
06 A,B,C,D,L
07 B,I,E,K,L
08 A,B,D,E,K
09 A,E,F,H,L
10 B,C,D,F

Apply the Apriori algorithm with minimum support of 30% and minimum confidence of 70%, and find all the association rules in the data set


written 3.0 years ago by prachi.sagar, modified 18 months ago by awari.swati831

Step 1: Scan the database D and count the support of each candidate item. The candidate list is {A,B,C,D,E,F,G,H,I,K,L}.

C1=

Itemset Support count
A 6
B 7
C 6
D 7
E 6
F 3
G 2
H 3
I 1
K 4
L 4
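
The counts in C1 can be double-checked with a few lines of Python. This is only a minimal sketch: the `transactions` dict is typed in by hand from the question, and the names `transactions` and `c1` are my own.

```python
from collections import Counter

# Transaction database copied from the question (TID -> items).
transactions = {
    "01": "A,B,C,D",
    "02": "A,B,C,D,E,G",
    "03": "A,C,G,H,K",
    "04": "B,C,D,E,K",
    "05": "D,E,F,H,L",
    "06": "A,B,C,D,L",
    "07": "B,I,E,K,L",
    "08": "A,B,D,E,K",
    "09": "A,E,F,H,L",
    "10": "B,C,D,F",
}

# One scan of the database: count how many transactions contain each item.
# No item repeats inside a transaction, so a plain tally gives the support count.
c1 = Counter(item for items in transactions.values() for item in items.split(","))

for item in sorted(c1):
    print(item, c1[item])
```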

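Step 2: Compare each candidate's support count with the minimum support count. There are 10 transactions and the minimum support is 30%, so the minimum support count is 30% of 10 = 3. G (2) and I (1) fall below it and are dropped.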

L1=

Itemset Support count
A 6
B 7
C 6
D 7
E 6
F 3
H 3
K 4
L 4
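
For anyone wondering where the threshold of 3 comes from, here is a small sketch of the arithmetic and of the filtering that produces L1. The variable names are my own, and I am assuming the usual convention of rounding the threshold up to a whole number of transactions.

```python
import math

n_transactions = 10      # rows in the transaction database
min_support_pct = 30     # minimum support given in the question, in percent

# Smallest whole number of transactions that reaches 30% of 10.
min_count = math.ceil(min_support_pct * n_transactions / 100)

# Support counts of the single items (C1 from Step 1).
c1 = {"A": 6, "B": 7, "C": 6, "D": 7, "E": 6, "F": 3,
      "G": 2, "H": 3, "I": 1, "K": 4, "L": 4}

# Keep only the items whose count reaches the threshold.
l1 = {item: n for item, n in c1.items() if n >= min_count}

print(min_count)     # 3
print(sorted(l1))    # ['A', 'B', 'C', 'D', 'E', 'F', 'H', 'K', 'L']
```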

Step 3: Generate the candidate 2-itemsets C2 by pairing the items of L1, and count the support of each pair.

C2=

Itemset Support count
A,B 4
A,C 4
A,D 4
A,E 3
A,F 1
A,H 2
A,K 2
A,L 2
B,C 5
B,D 6
B,E 4
B,F 1
B,H 0
B,K 3
B,L 2
C,D 5
C,E 2
C,F 1
C,H 1
C,K 2
C,L 1
D,E 4
D,F 2
D,H 1
D,K 2
D,L 2
E,F 2
E,H 2
E,K 3
E,L 3
F,H 2
F,K 0
F,L 2
H,K 1
H,L 2
K,L 1
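
Counting 36 pairs by hand is error-prone, so it may help to recompute C2 directly from the transactions. A rough sketch, with the item list taken from L1 and all names (`transactions`, `l1_items`, `c2`, `l2`) chosen by me:

```python
from itertools import combinations

# Transactions 01..10 from the question, as sets of items.
transactions = [set(t.split(",")) for t in (
    "A,B,C,D", "A,B,C,D,E,G", "A,C,G,H,K", "B,C,D,E,K", "D,E,F,H,L",
    "A,B,C,D,L", "B,I,E,K,L", "A,B,D,E,K", "A,E,F,H,L", "B,C,D,F",
)]
l1_items = ["A", "B", "C", "D", "E", "F", "H", "K", "L"]  # frequent items (L1)

# C2: every pair of frequent items, with the number of transactions containing both.
c2 = {pair: sum(1 for t in transactions if set(pair) <= t)
      for pair in combinations(l1_items, 2)}

# L2: pairs that reach the minimum support count of 3.
l2 = {pair: n for pair, n in c2.items() if n >= 3}

for pair, n in c2.items():
    print(",".join(pair), n)
```

On this data the sketch gives 36 candidate pairs, of which 12 are frequent; B,D comes out at 6 and A,L at 2.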

Step 4: Compare candidate (C2) support count with the minimum support count

L2=

Itemset Support count
A,B 4
A,C 4
A,D 4
A,E 3
B,C 5
B,D 6
B,E 4
B,K 3
C,D 5
D,E 4
E,K 3
E,L 3

Step 5: Generate candidate C3 from L2 and find the support.

C3=

Itemset Support count
A,B,C 3
A,B,D 4
A,B,E 2
A,B,K 1
A,C,D 3
A,C,E 1
A,D,E 2
B,C,D 5
B,C,E 2
B,C,K 1
B,D,E 3
B,D,K 2
B,E,K 3
C,D,E 2
D,E,K 2
D,E,L 1

Step 6: Compare candidate (C3) support count with the minimum support count

L3=

Itemset Support count
A,B,C 3
A,B,D 4
A,C,D 3
B,C,D 5
B,D,E 3
B,E,K 3

Step 7: Generate candidate C4 from L3 and find the support.

C4=

Itemset Support count
A,B,C,D 3
A,B,D,E 2
B,C,D,E 2
B,D,E,K 2

Step 8: Compare candidate (C4) support count with the minimum support count

L4=

Itemset Support count
A,B,C,D 3

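Step 9: L4 holds a single itemset, so no 5-item candidate can be formed and the algorithm stops. The largest frequent itemset is {A,B,C,D}.

Generate the association rules from this frequent itemset, with the support count and confidence of each rule. The confidence of a rule X => Y is support(X and Y together) / support(X).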

Association Rule Support Confidence Confidence %
BCD => A 3 3/5 60%
ACD => B 3 3/3 100%
ABD => C 3 3/4 75%
ABC => D 3 3/3 100%
CD => AB 3 3/5 60%
BD => AC 3 3/6 50%
BC => AD 3 3/5 60%
AD => BC 3 3/4 75%
AC => BD 3 3/4 75%
AB => CD 3 3/4 75%
D => ABC 3 3/7 43%
C => ABD 3 3/6 50%
B => ACD 3 3/7 43%
A => BCD 3 3/6 50%

The given minimum confidence threshold is 70%, so only the rules ACD => B, ABD => C, ABC => D, AD => BC, AC => BD and AB => CD are output.

Final rules are:

Rule 1: ACD => B

Rule 2: ABD => C

Rule 3: ABC => D

Rule 4: AD => BC

Rule 5: AC => BD

Rule 6: AB => CD
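
The confidence check on the rules from {A,B,C,D} can be reproduced as follows. Again this is just a sketch under my own naming (`TRANSACTIONS`, `MIN_CONF`, `support_count`, `rules`), not the only way to enumerate rules.

```python
from itertools import combinations

# Transactions 01..10 from the question.
TRANSACTIONS = [frozenset(t.split(",")) for t in (
    "A,B,C,D", "A,B,C,D,E,G", "A,C,G,H,K", "B,C,D,E,K", "D,E,F,H,L",
    "A,B,C,D,L", "B,I,E,K,L", "A,B,D,E,K", "A,E,F,H,L", "B,C,D,F",
)]
MIN_CONF = 0.70  # minimum confidence given in the question


def support_count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


itemset = frozenset("ABCD")
rules = []
# Try every non-empty proper subset of the itemset as the left-hand side.
for r in range(1, len(itemset)):
    for lhs in combinations(sorted(itemset), r):
        lhs = frozenset(lhs)
        rhs = itemset - lhs
        # confidence(lhs => rhs) = support(lhs and rhs together) / support(lhs)
        conf = support_count(itemset) / support_count(lhs)
        if conf >= MIN_CONF:
            rules.append(("".join(sorted(lhs)), "".join(sorted(rhs)), conf))

for lhs, rhs, conf in rules:
    print(f"{lhs} => {rhs}  confidence {conf:.0%}")
```

On this data it keeps the same six rules: AB => CD, AC => BD and AD => BC at 75%, ABD => C at 75%, and ABC => D and ACD => B at 100%.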

written 3.0 years ago by prachi.sagar, modified 3.0 years ago by Ramnath

How is the minimum support 3 for 30%? Can you show me the calculation?

written 19 months ago by prachi.sagar