
1) a) Define the "Apriori principle" and briefly discuss why it is useful in association rule mining.

Apriori Principle: If an itemset is frequent, then all of its subsets must also be frequent. Equivalently, if an itemset is infrequent, then all of its supersets must be infrequent.

The Apriori principle reduces the number of candidate itemsets in association rule mining. Any candidate that contains an infrequent subset can be discarded without counting its support, so only candidates built from frequent itemsets remain.

b) Compare and contrast the FP-Growth algorithm with the Apriori algorithm.

Apriori Algorithm: It uses the Apriori property together with join and prune steps. Because a large number of candidates are generated, it requires a large amount of memory. It scans the database multiple times to generate the candidate sets. Its execution time is higher than that of FP-Growth, because time is spent producing candidates at every pass.

FP-Growth Algorithm: It constructs conditional pattern bases and conditional frequent pattern trees from the database for the items that satisfy minimum support. Because of its compact structure and the absence of candidate generation, it requires less memory. It scans the database only twice. Its execution time is lower than that of Apriori.
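The pruning that the Apriori principle allows in part a) can be sketched in a few lines of Python. This is only an illustration: the function name `has_infrequent_subset` and the set `frequent_pairs` are made up for the example and are not part of the original answer.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_smaller):
    """Return True if any (k-1)-subset of `candidate` is not frequent.

    By the Apriori principle such a candidate cannot be frequent,
    so its support never needs to be counted.
    """
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_smaller
        for subset in combinations(candidate, k - 1)
    )


# Hypothetical frequent 2-itemsets, for illustration only.
frequent_pairs = {
    frozenset(pair)
    for pair in [("A", "B"), ("A", "C"), ("B", "C"), ("A", "E")]
}

# {A, B, C}: AB, AC and BC are all frequent, so the candidate is kept.
print(has_infrequent_subset(frozenset("ABC"), frequent_pairs))  # False
# {A, B, E}: BE is not frequent, so the candidate is pruned.
print(has_infrequent_subset(frozenset("ABE"), frequent_pairs))  # True
```

The check only looks at subsets one item smaller than the candidate, which is enough when the previous level already contains every frequent itemset of that size.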

2) Consider the market basket transactions given in the following table. Let min_sup = 40% and min_conf = 40%.

a) Find all the frequent itemsets using the Apriori algorithm.

Minimum Support = 40% (at least 2 of the 5 transactions)
Minimum Confidence = 40%

Transaction ID    Items Bought
T1                A, B, C
T2                A, B, C, D, E
T3                A, C, D
T4                A, C, D, E
T5                A, B, C, D

C1 (candidate 1-itemsets)

Item    Number of Transactions    Support
A       5                         5/5 = 100%
B       3                         3/5 = 60%
C       5                         5/5 = 100%
D       4                         4/5 = 80%
E       2                         2/5 = 40%

L1 (frequent 1-itemsets)

Item    Number of Transactions
A       5
B       3
C       5
D       4
E       2

C2 (candidate 2-itemsets)

Item Pair    Number of Transactions    Support
A, B         3                         3/5 = 60%
A, C         5                         5/5 = 100%
A, D         4                         4/5 = 80%
A, E         2                         2/5 = 40%
B, C         3                         3/5 = 60%
B, D         2                         2/5 = 40%
B, E         1                         1/5 = 20%
C, E         2                         2/5 = 40%
C, D         4                         4/5 = 80%
E, D         2                         2/5 = 40%

L2 (frequent 2-itemsets; {B, E} is dropped because its support is below 40%)

Item Pair    Number of Transactions
A, B         3
A, C         5
A, D         4
A, E         2
B, C         3
B, D         2
C, D         4
C, E         2
E, D         2

Joining pairs from L2 that share their first item:

AB & AC => ABC
AB & AD => ABD
AB & AE => ABE
AC & AD => ACD
AC & AE => ACE
AD & AE => ADE
BC & BD => BCD
CD & CE => CDE

C3 (candidate 3-itemsets)

Item Set    Number of Transactions    Support
A, B, C     3                         3/5 = 60%
A, B, D     2                         2/5 = 40%
A, B, E     1                         1/5 = 20%
A, C, D     4                         4/5 = 80%
A, C, E     2                         2/5 = 40%
A, D, E     2                         2/5 = 40%
B, C, D     2                         2/5 = 40%
C, D, E     2                         2/5 = 40%

L3 (frequent 3-itemsets; {A, B, E} is dropped because its support is below 40%)

Item Set    Number of Transactions
A, B, C     3
A, B, D     2
A, C, D     4
A, C, E     2
A, D, E     2
B, C, D     2
C, D, E     2

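Joining triples from L3 that share their first two items:

ABC & ABD => ABCD
ACD & ACE => ACDE

Item Set      Number of Transactions
A, B, C, D    2
A, C, D, E    2

Both 4-itemsets meet the minimum support (2/5 = 40%). They cannot be joined into a frequent 5-itemset, so {A, B, C, D} and {A, C, D, E} are the largest sets of items that are frequently bought together.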
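The level-by-level search above can be reproduced with a short Python sketch. It is a simplified illustration rather than the exact procedure used in the tables: candidates are formed by taking unions of frequent itemsets instead of the prefix-based join, and candidates with an infrequent subset (such as {A, B, E}) are pruned before their support is counted. The names `support_count`, `apriori` and `MIN_COUNT` are chosen for this example.

```python
from itertools import combinations

transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]
MIN_COUNT = 2  # 40% of 5 transactions


def support_count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def apriori():
    """Return a dict mapping each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    level = {
        frozenset([i]) for i in items
        if support_count(frozenset([i])) >= MIN_COUNT
    }
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support_count(itemset)
        k += 1
        # Join: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        level = {c for c in candidates if support_count(c) >= MIN_COUNT}
    return frequent


result = apriori()
largest = max(len(s) for s in result)
for itemset, count in result.items():
    if len(itemset) == largest:
        print(sorted(itemset), count)
```

On these five transactions the sketch finds the same frequent itemsets as the tables: 5 single items, 9 pairs, 7 triples and the two 4-itemsets.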

b) Obtain significant decision rules.

For a rule X => Y, Confidence C = σ(X ∪ Y) / σ(X), where σ is the number of transactions that contain the itemset.

Subsets of {A, B, C, D}: {A} {B} {C} {D} {A, B} {A, C} {A, D} {B, C} {B, D} {C, D} {A, B, C} {A, C, D} {A, B, D} {B, C, D}

{A} => {B, C, D}    C = σ{A, B, C, D} / σ{A} = 2/5 = 40% Confidence
{B} => {A, C, D}    C = σ{A, B, C, D} / σ{B} = 2/3 = 66.67% Confidence
{C} => {A, B, D}    C = σ{A, B, C, D} / σ{C} = 2/5 = 40% Confidence
{D} => {A, B, C}    C = σ{A, B, C, D} / σ{D} = 2/4 = 50% Confidence
{A, B} => {C, D}    C = σ{A, B, C, D} / σ{A, B} = 2/3 = 66.67% Confidence
{A, C} => {B, D}    C = σ{A, B, C, D} / σ{A, C} = 2/5 = 40% Confidence
{A, D} => {B, C}    C = σ{A, B, C, D} / σ{A, D} = 2/4 = 50% Confidence
{B, C} => {A, D}    C = σ{A, B, C, D} / σ{B, C} = 2/3 = 66.67% Confidence
{B, D} => {A, C}    C = σ{A, B, C, D} / σ{B, D} = 2/2 = 100% Confidence
{C, D} => {A, B}    C = σ{A, B, C, D} / σ{C, D} = 2/4 = 50% Confidence
{A, B, C} => {D}    C = σ{A, B, C, D} / σ{A, B, C} = 2/3 = 66.67% Confidence
{A, C, D} => {B}    C = σ{A, B, C, D} / σ{A, C, D} = 2/4 = 50% Confidence
{A, B, D} => {C}    C = σ{A, B, C, D} / σ{A, B, D} = 2/2 = 100% Confidence
{B, C, D} => {A}    C = σ{A, B, C, D} / σ{B, C, D} = 2/2 = 100% Confidence

Subsets of {A, C, D, E}: {A} {C} {D} {E} {A, C} {A, D} {A, E} {C, D} {C, E} {D, E} {A, C, D} {A, D, E} {A, C, E} {C, D, E}

{A} => {C, D, E}    C = σ{A, C, D, E} / σ{A} = 2/5 = 40% Confidence
{C} => {A, D, E}    C = σ{A, C, D, E} / σ{C} = 2/5 = 40% Confidence
{D} => {A, C, E}    C = σ{A, C, D, E} / σ{D} = 2/4 = 50% Confidence
{E} => {A, C, D}    C = σ{A, C, D, E} / σ{E} = 2/2 = 100% Confidence
{A, C} => {D, E}    C = σ{A, C, D, E} / σ{A, C} = 2/5 = 40% Confidence
{A, D} => {C, E}    C = σ{A, C, D, E} / σ{A, D} = 2/4 = 50% Confidence
{A, E} => {C, D}    C = σ{A, C, D, E} / σ{A, E} = 2/2 = 100% Confidence
{C, D} => {A, E}    C = σ{A, C, D, E} / σ{C, D} = 2/4 = 50% Confidence
{C, E} => {A, D}    C = σ{A, C, D, E} / σ{C, E} = 2/2 = 100% Confidence
{D, E} => {A, C}    C = σ{A, C, D, E} / σ{D, E} = 2/2 = 100% Confidence
{A, C, D} => {E}    C = σ{A, C, D, E} / σ{A, C, D} = 2/4 = 50% Confidence
{A, D, E} => {C}    C = σ{A, C, D, E} / σ{A, D, E} = 2/2 = 100% Confidence
{A, C, E} => {D}    C = σ{A, C, D, E} / σ{A, C, E} = 2/2 = 100% Confidence
{C, D, E} => {A}    C = σ{A, C, D, E} / σ{C, D, E} = 2/2 = 100% Confidence

Every rule listed has a confidence of at least 40%, so all of them satisfy min_conf = 40%.
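The confidence calculations above can be checked with a small Python sketch. It assumes that a rule qualifies when its confidence is greater than or equal to min_conf; the helper names `support_count` and `rules_from` are invented for this example.

```python
from itertools import combinations

transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]


def support_count(itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)


def rules_from(itemset, min_conf):
    """List (antecedent, consequent, confidence) for rules built from `itemset`."""
    itemset = frozenset(itemset)
    total = support_count(itemset)
    rules = []
    # Every non-empty proper subset of the itemset is tried as an antecedent.
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            confidence = total / support_count(antecedent)
            if confidence >= min_conf:
                consequent = tuple(sorted(itemset - set(antecedent)))
                rules.append((antecedent, consequent, confidence))
    return rules


for lhs, rhs, conf in rules_from("ABCD", 0.4):
    print(f"{{{', '.join(lhs)}}} => {{{', '.join(rhs)}}}: {conf:.2%}")
```

Raising `min_conf` narrows the list. For example, `rules_from("ABCD", 1.0)` keeps only the three rules whose antecedents are {B, D}, {A, B, D} and {B, C, D}.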

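c) Derive the FP-Tree for the above transaction table.

Step 01 Support for each item.

A = 5/5 = 100%
B = 3/5 = 60%
C = 5/5 = 100%
D = 4/5 = 80%
E = 2/5 = 40%

Step 02 Items in each transaction reordered by descending support (A, C, D, B, E).

Transaction ID    Items Bought
T1                A, C, B
T2                A, C, D, B, E
T3                A, C, D
T4                A, C, D, E
T5                A, C, D, B

Step 03 Insert the transactions into the tree one at a time.

TID:1 =>

NULL
+-- A:1
    +-- C:1
        +-- B:1

TID:2 =>

NULL
+-- A:2
    +-- C:2
        +-- B:1
        +-- D:1
            +-- B:1
                +-- E:1

TID:3 =>

NULL
+-- A:3
    +-- C:3
        +-- B:1
        +-- D:2
            +-- B:1
                +-- E:1

TID:4 =>

NULL
+-- A:4
    +-- C:4
        +-- B:1
        +-- D:3
            +-- B:1
            |   +-- E:1
            +-- E:1

TID:5 =>

NULL
+-- A:5
    +-- C:5
        +-- B:1
        +-- D:4
            +-- B:2
            |   +-- E:1
            +-- E:1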
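The tree construction can also be sketched in Python. This is a minimal version that builds only the prefix tree: it leaves out the header table and node links that a full FP-Growth implementation would use for mining. Ties in support are broken alphabetically, which is an assumption that happens to place A before C as in the worked answer. The names `Node`, `build_fp_tree` and `render` are made up for this example.

```python
from collections import Counter

transactions = [
    list("ABC"),
    list("ABCDE"),
    list("ACD"),
    list("ACDE"),
    list("ABCD"),
]
MIN_COUNT = 2  # 40% of 5 transactions


class Node:
    """One FP-tree node: an item, its count, and its children keyed by item."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_count):
    counts = Counter(item for t in transactions for item in t)
    # Frequent items in descending support; ties broken alphabetically (assumption).
    order = sorted(
        (i for i in counts if counts[i] >= min_count),
        key=lambda i: (-counts[i], i),
    )
    rank = {item: position for position, item in enumerate(order)}
    root = Node(None)
    for t in transactions:
        node = root
        # Walk or extend the path for this transaction, incrementing counts.
        for item in sorted((i for i in t if i in rank), key=rank.get):
            if item not in node.children:
                node.children[item] = Node(item)
            node = node.children[item]
            node.count += 1
    return root


def render(node, depth=0):
    """Return the tree as indented 'item:count' lines, children in insertion order."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(render(child, depth + 1))
    return lines


tree = build_fp_tree(transactions, MIN_COUNT)
print("\n".join(render(tree)))
```

The printed tree has the same nodes and counts as the tree after TID:5: A:5 and C:5 on the shared prefix, then the B:1 branch and the D:4 branch with B:2, E:1 and E:1 beneath it.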

