Search code examples
pythonpandasmatplotliblist-comprehension

Creating a plot with the most occurring combinations of two columns


For showing my data I want to create a frequency plot. This plot will be based on the number of times a combination between two columns occur. I have a two columns in a dataset (df):

acq_cc: [CN, IN, CN, CN, CN, IN, CN, CN, CN, ZA]

tar_cc: [GB, ZA, NO, AR, VG, JP, NL, PK, PK, SG]

The value of the columns needs to be combined e.g. CN - GB, IN - ZA. Afterwards, I want to count how often a combination of occurs. In this situation CN - PK occurs twice.

I used this example but since I have no 'Client' value, I receive an error code. The final frequency plot should show a combination of the top 10 most combinations (on the x-axis) and their counts (on the y-axis).


Solution

  • Just do value_counts on multiple columns then use head to select top 10 most occurring combinations:

    c = ['acq_cc', 'tar_cc']
    df[c].value_counts().head(10).plot(kind='bar')
    

    enter image description here