- It is a
**modeling technique**that helps to identify which items should be purchased together. - Assume that, there are large number of items
**like**“Tea”, “Coffee”, “Milk” “Sugar”. Among these, the customer buys the subset of items as per the requirement and market gets the information of items which customer has purchased together. So, the market uses this information to put the items on different positions (or locations).

- The market basket analysis is used to decide the perfect location, where the items can be placed inside the store.

**For example:**If the customer buys a coffee, it is possible that the customer may buy milk or sugar along with coffee. - So keeping the coffee and sugar next to each other in store will be helpful to customers to buy the items together and improves sales of the company.
- The problem of large volume transactions can be minimized by using
**differential market basket analysis**, which is capable of finding interesting results and eliminates the large volume.

- Frequent item-sets are used in association rules.
- An itemset is said to be frequent, if X's support is no less than a minimum support threshold.
- A frequent item-set is a set of items that may appear at least in pre- defined number of transactions.

- Consider two item-sets 'X' and 'Y', if each item of 'X' is in 'Y', but there is at least one item of 'Y', that is not in 'X' then 'Y' is not a proper super-item set of 'X'. In this case item set is closed.
- If 'X' is closed and frequent, then it is called as
**closed frequent item-set.**

- The objects or items from relational databases, transactional databases or other information repositories are considered to find frequent patterns, associations and correlations.
- Association rules search for interesting relationships among the items in a given data set by examining transactions.

**For example -**Frequently purchased item- sets can be easily found out by examining the shop cart. This helps in advertising or for placement of goods in the store. - Association rules have general form:

**I**_{1}→ (where I_{1}∩ I_{2}= 0) - The rule can be read as, Given that someone has purchased the items from the set
**I**they are likely to also buy the items in the set_{1}, then**I**_{2}.

- It is a set of single items, from transactions.
- If some items occur together, then they can form an association rule.

- It is the percentage of the transactions in which the items appear.
- If
**A => B** **Support (A → B) = # _ tuples containing both A and B / total #_tuples**- The support for an association X => Y is the percentage of transactions in the database that contains
**X U Y.**

- The confidence or strength for an association rule
**A => B**is the ratio of the number of transactions that contain**A.** - Consider a rule
**A => B, which**is a measure of ratio of the number of tuples containing both A and B to the number of tuples containing**'A'.**

TID | List of items |
---|---|

T2000 | A, B, C |

T1000 | A, C |

T4000 | A, D |

T5000 | B, E, F |

C

Items | Support |
---|---|

{A} | 3 |

{B} | 2 |

{C} | 2 |

{D} | 1 |

{E} | 1 |

{F} | 1 |

L

Items | Support |
---|---|

{A} | 3 |

{B} | 2 |

{C} | 2 |

C

Items |
---|

{A, B} |

{A, C} |

{B, C} |

C

Items | Support |
---|---|

{A, B} | 1 |

{A, C} | 2 |

{B, C} | 1 |

L

Items | Support |
---|---|

{A,C} | 2 |

Data contains the frequent item 1 (A, C), so that the association rule that can be generated from 'L' are as shown in the following table with the support and confidence.

Association Rule | Support | Confidence | Confidence % |
---|---|---|---|

A - > C | 2 | 2/3 = 0.66 | 66 % |

C - > A | 2 | 2/2 = 1 | 100 % |

- This algorithm allows to find frequent item-set without generation of candidate item-set.
- Once the F- P tree is generated, it is mined by calling FP_growth (FP_tree, null).

**Procedure FP_ growth (Tree, α):** **If**tree consist of a single path 'P', then for each combination (which is denoted as 'β' ) of the path 'P' generates the pattern**β U , α**with support count = of nodes in**β;****Else**, for each**a**in the header of tree, generates the patterns_{i}**β = a**_{i}.support_count;- So, construct β's conditional pattern base first and then β's conditional FP _Tree Treeβ;
**If**Tree β ≠ Ф**Then**, call FP_ growth (Treeβ, β); }

TID | List of item |
---|---|

T1 | A1, A2, A5 |

T2 | A2, A4 |

T3 | A2, A3 |

T4 | A1, A2, A4 |

T5 | A1, A3 |

T6 | A2, A3 |

T7 | A1, A3 |

T8 | A1, A2, A3, A5 |

T9 | A1, A2, A3 |

Item | Support |
---|---|

A1 | 6 |

A2 | 7 |

A3 | 6 |

A4 | 2 |

A5 | 2 |

Item | Conditional pattern base | Conditional FP-tree | Frequent patterns generation |
---|---|---|---|

A5 | {(A2 A1: 1)} | { A2: 2, A1: 2} | A2 A5: 2, A1 A5: 2, A2 A1A5: 2 |

A4 | {A2 A1 : 1), (A2:1)} | {A2: 2} | A2 A4: 2 |

A3 | {( A2, A1 :2), (A1: 2)} | (A2: 4, A1:2), (A1:2) | A2 A3: 4, A1, A3: 2, A2 A1A3: 2 |

A1 | {(A2: 4)} | {(A2: 4)} | A2 A1: 4 |