Let's say I have a CSV file with the following structure (800k records) and I want to identify existing patterns of product combinations (e.g. a pattern showing that products X, Y, and Z are often bought together):
Customer_ID | Product_ID | Revenue
1 | A | X
1 | B | X
1 | C | X
2 | A | X
2 | D | X
3 | A | X
4 | F | X
How would you approach that from a data science perspective? Which methods would you use, and which steps would you need to take (e.g. pseudocode of the approach you would recommend, preferably in Python)?
Thank you so much for your help. It is highly appreciated! Regards, Simon
There is a standard data mining task known as association rule mining (or frequent itemset mining), aka market basket analysis.
It looks at products frequently bought together.
You really should read some basic books and Wikipedia first...
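To make the idea concrete, here is a minimal sketch of frequent itemset counting in plain Python, using the sample rows from the question. The function name `frequent_itemsets` and the parameters `min_support` and `max_size` are made up for illustration, and the brute-force enumeration is only meant to show the concept, not to be the recommended implementation for 800k records.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Sample rows from the question as (Customer_ID, Product_ID).
# Revenue is ignored here because only co-occurrence matters.
rows = [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "D"), (3, "A"), (4, "F")]


def frequent_itemsets(rows, min_support=0.25, max_size=3):
    """Return {itemset: support} for itemsets found in at least
    min_support (a fraction) of all customer baskets."""
    # Step 1: group products into one basket (set) per customer.
    baskets = defaultdict(set)
    for customer, product in rows:
        baskets[customer].add(product)

    # Step 2: count in how many baskets each combination occurs.
    counts = Counter()
    for items in baskets.values():
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                counts[combo] += 1

    # Step 3: keep only the combinations that are frequent enough.
    n_baskets = len(baskets)
    return {
        combo: count / n_baskets
        for combo, count in counts.items()
        if count / n_baskets >= min_support
    }


result = frequent_itemsets(rows)
for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
    print(itemset, support)
```

Enumerating every combination grows exponentially with basket size, so for real data you would normally use an Apriori or FP-Growth implementation instead (for example, the mlxtend library provides both), which prunes infrequent candidates early.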