python pandas venn-diagram matplotlib-venn venn

Create a Venn diagram in Python from two numeric pandas columns


I have two pandas columns (reproducible output below):

{'Test_actual': {0: 160.702, 1: 113.457, 2: 91.245, 3: 53.784, 4: 40.281, 5: 39.236, 6: 37.73, 
7: 32.692, 8: 29.983, 9: 29.983, 10: 29.69, 11: 29.69, 12: 26.232, 13: 25.779, 14: 24.094, 15: 
24.094, 16: 22.46, 17: 20.731, 18: 20.731, 19: 17.367}, 'Test_predicted': {0: 24.68385875, 1: 
21.22846956, 2: 46.41633486, 3: 0.196713859, 4: 18.22344042, 5: 13.87247076, 6: 28.5820542, 7: 
59.67648307, 8: 13.95858298, 9: 20.03071567, 10: 2.73188936, 11: 2.73188936, 12: 15.57722262, 
13: 9.469598881, 14: 0.311162267, 15: 38.29214566, 16: 11.47778436, 17: 8.754663155, 18: 
19.61416015, 19: 1.858514339}}

I am trying to build a Venn diagram from them in Python. I tried creating sets, but it does not work.

Important: I want to keep all values, not just the unique ones.

import matplotlib.pyplot as plt
from matplotlib_venn import venn2

col_one_list = rf_test_actual_pred_10['Test_actual'].tolist()
col_two_list = rf_test_actual_pred_10['Test_predicted'].tolist()

set1 = set(col_one_list)
set2 = set(col_one_list)
venn2([set1, set2], ('Test actual', 'Test predicted'))

plt.show()

Current output is wrong

Expected output

Should I round the values so that they can overlap, given that the exact values do not match?


Solution

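  • No need to convert your columns to lists first; you can build the sets directly from the columns. Casting to int truncates the decimals, so values that share an integer part count as an overlap. Note that your code builds set2 from col_one_list instead of col_two_list, so both sets hold the same values. Also keep in mind that set() keeps only the unique values:

    >>> venn2([set(rf_test_actual_pred_10["Test_actual"].astype(int)),
    ...        set(rf_test_actual_pred_10["Test_predicted"].astype(int))],
    ...       set_labels=("Test actual", "Test predicted"))
    >>> plt.show()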
    

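Because set() discards duplicates, a set-based diagram cannot satisfy the question's requirement to keep all values. The following is a minimal sketch of one possible workaround, not the answer's method: count occurrences with collections.Counter and compute the three region sizes by hand. It assumes the values are truncated to integers, as the answer does with astype(int). The helper name venn2_sizes and its key parameter are hypothetical and not part of matplotlib-venn.

```python
from collections import Counter

# Values copied from the question's two columns.
test_actual = [160.702, 113.457, 91.245, 53.784, 40.281, 39.236, 37.73,
               32.692, 29.983, 29.983, 29.69, 29.69, 26.232, 25.779,
               24.094, 24.094, 22.46, 20.731, 20.731, 17.367]
test_predicted = [24.68385875, 21.22846956, 46.41633486, 0.196713859,
                  18.22344042, 13.87247076, 28.5820542, 59.67648307,
                  13.95858298, 20.03071567, 2.73188936, 2.73188936,
                  15.57722262, 9.469598881, 0.311162267, 38.29214566,
                  11.47778436, 8.754663155, 19.61416015, 1.858514339]


def venn2_sizes(left, right, key=int):
    """Return (left only, right only, both) counts, keeping duplicates.

    `key` maps each value to the bucket used for matching;
    int truncates the decimals, round goes to the nearest integer.
    """
    a = Counter(key(v) for v in left)
    b = Counter(key(v) for v in right)
    # Multiset intersection: each bucket contributes its smaller count.
    both = sum((a & b).values())
    return sum(a.values()) - both, sum(b.values()) - both, both


sizes = venn2_sizes(test_actual, test_predicted)
print(sizes)  # (18, 18, 2)
```

The resulting tuple can be passed as venn2(subsets=sizes, set_labels=("Test actual", "Test predicted")), since venn2 also accepts the region sizes in the order (left only, right only, both). Passing key=round matches on the nearest integer instead of the truncated one.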