python pandas dataframe dictionary delete-row

How to delete rows with values in the same list in Python?


Here is the dataframe:

import pandas as pd
mat = {'f1': ['A aaa', 'B sss', 'C ddd', 'B bbb'], 'f2': ['F eee', 'B bbb', 'A aaa', 'B sss']}
dic = {'A': ['A aaa'], 'B': ['B bbb', 'B sss'], 'C': ['C ddd'], 'F': ['F eee', 'F aaa']}  # named dic to avoid shadowing the built-in dict
df = pd.DataFrame(mat)

In the dictionary, the key 'B' maps to a list of two items. I need to delete every row whose f1 and f2 values belong to the same list. Here that means the second and fourth rows, where both values appear in the list for 'B'.


Solution

  • You can invert your dictionary so that each list item maps to its key, then use a groupby to keep only the rows whose values do not all map to the same key:

    dic = {'A': ['A aaa'], 'B': ['B bbb', 'B sss'],
           'C': ['C ddd'], 'F': ['F eee', 'F aaa']}
    
    dic2 = {v: k for k,l in dic.items() for v in l}
    # {'A aaa': 'A', 'B bbb': 'B', 'B sss': 'B', 'C ddd': 'C',
    #  'F eee': 'F', 'F aaa': 'F'}
    
    out = df[df.stack().map(dic2).groupby(level=0).nunique().ne(1)]
    

    alternative:

    df2 = df.stack().map(dic2).unstack()
    out = df[df2.ne(df2.iloc[:, 0], axis=0).any(axis=1)]
    

    output:

          f1     f2
    0  A aaa  F eee
    2  C ddd  A aaa
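
For reference, here is a minimal end-to-end sketch of the same idea that skips `stack` and maps each column directly. It assumes each cell appears in at most one list of `dic`; the names `keys` and `same_group` are illustrative and not part of the answer above.

```python
import pandas as pd

mat = {'f1': ['A aaa', 'B sss', 'C ddd', 'B bbb'],
       'f2': ['F eee', 'B bbb', 'A aaa', 'B sss']}
dic = {'A': ['A aaa'], 'B': ['B bbb', 'B sss'],
       'C': ['C ddd'], 'F': ['F eee', 'F aaa']}
df = pd.DataFrame(mat)

# Invert the dictionary: each list item points back to its key.
dic2 = {v: k for k, values in dic.items() for v in values}

# Replace every cell with its group key, column by column.
keys = df.apply(lambda col: col.map(dic2))

# A row is a candidate for deletion when all of its cells share one key.
same_group = keys.nunique(axis=1).eq(1)

# Keep the remaining rows.
out = df[~same_group]
print(out)
```

Inspecting `keys` before filtering can help when a row is kept or dropped unexpectedly, since it shows which key each cell was mapped to (cells missing from `dic` become NaN and are ignored by `nunique`).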