
Apply a function to multiple columns based on conditions in Python


I have a DataFrame, and when one column equals '02' I want to apply a function to five other columns. This is my code:

mask = (df['A']=='02')
z_valid = df[mask]
df.loc[((mask) & (df['B'] !=0)), 'B'] = z_valid['B'].apply(def)
df.loc[((mask) & (df['C'] !=0)), 'C'] = z_valid['C'].apply(def)
df.loc[((mask) & (df['D'] !=0)), 'D'] = z_valid['D'].apply(def)
df.loc[((mask) & (df['E'] !=0)), 'E'] = z_valid['E'].apply(def)
df.loc[((mask) & (df['F'] !=0)), 'F'] = z_valid['F'].apply(def)

I get this error: 'ValueError: Must have equal len keys and value when setting with an iterable'

But the code works when I apply the function to just two columns:

mask = (df['A']=='02')
z_valid = df[mask]
df.loc[((mask) & (df['B'] !=0)), 'B'] = z_valid['B'].apply(def)
df.loc[((mask) & (df['C'] !=0)), 'C'] = z_valid['C'].apply(def)

How can I fix this error? Thanks in advance.


Solution

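  • The exception is raised because the left-hand selection for 'D' is an empty Series, while the right-hand side still holds one value per row matched by mask: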

    Suppose the following dataframe:

    import pandas as pd

    df = pd.DataFrame({'A': ['01', '02', '02', '02'], 'B': [9, 1, 2, 3], 'C': [9, 1, 2, 3], 'D': [9, 0, 0, 0]})
    print(df)
    
    # Output:
        A  B  C  D
    0  01  9  9  9
    1  02  1  1  0
    2  02  2  2  0
    3  02  3  3  0
    

    Now if I try your code:

    def f(sr):
        return 2*sr
    
    mask = (df['A']=='02')
    z_valid = df[mask]
    df.loc[((mask) & (df['B'] !=0)), 'B'] = z_valid['B'].apply(f)
    df.loc[((mask) & (df['C'] !=0)), 'C'] = z_valid['C'].apply(f)
    df.loc[((mask) & (df['D'] !=0)), 'D'] = z_valid['D'].apply(f)
    

    Output:

    ----> 8 df.loc[((mask) & (df['D'] !=0)), 'D'] = z_valid['D'].apply(f)
    ...
    ValueError: Must have equal len keys and value when setting with an iterable
    

    The exception is raised on 'D':

    # Left side
    >>> df.loc[((mask) & (df['D'] !=0)), 'D']
    Series([], Name: D, dtype: int64)  # <- Empty Series
    
    # Right side
    >>> z_valid['D'].apply(f)
    1    0
    2    0
    3    0
    Name: D, dtype: int64
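
One possible fix, given as a minimal sketch rather than the only approach, is to select the rows with the same condition on both sides of the assignment, so the keys and the values always have the same length. The function f and the column list ['B', 'C', 'D'] are placeholders taken from the example above; the sel.any() guard is an added assumption that columns with no matching rows should simply be left unchanged.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    'A': ['01', '02', '02', '02'],
    'B': [9, 1, 2, 3],
    'C': [9, 1, 2, 3],
    'D': [9, 0, 0, 0],
})

def f(x):
    # Placeholder for the real function; Series.apply passes one scalar at a time
    return 2 * x

mask = df['A'] == '02'

for col in ['B', 'C', 'D']:
    # Build the row selection once so both sides cover exactly the same rows
    sel = mask & (df[col] != 0)
    if sel.any():  # skip columns with nothing to update, such as 'D' here
        df.loc[sel, col] = df.loc[sel, col].apply(f)

print(df)
```

With the sample data, 'B' and 'C' are doubled on rows 1 to 3 and 'D' is left untouched. Looping over the column names also removes the need to repeat the same line for each column.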