Search code examples
pythonpandasfrequency-table

Pandas value_counts() with multiple matches in same row


I have categorical data (A, B, etc.) in which multiple matches can exist within the same field such as A,B. I would like to break my data into additional rows just for the purpose of counting the number of instances of each value.

df = pd.DataFrame({"Values" : ["A", "B", "C", "A,B"]})
df
    Values
0   A
1   B
2   C
3   A,B

Currently:

df["Values"].value_counts()
B       1
A,B     1
A       1
C       1
Name: Values, dtype: int64

My ideal function would work something like this:

df["Values"].value_counts(split = ",")
A    2
B    2
C    1
Name: Values, dtype: int64

Solution

  • Use Series.str.split and then explode()

    print( df['Values'].str.split(',').explode().value_counts() )
    

    Prints:

    A    2
    B    2
    C    1
    Name: Values, dtype: int64
    

    EDIT:

    df = pd.DataFrame({"Values" : ["A", "B", "C", "A,B"]})
    print( df['Values'].str.split(',').explode().value_counts() )