
Pandas ValueError: Must have equal len keys and value when setting with an iterable


I have a question about a transformation I want to apply to a pandas DataFrame. I have a DataFrame df with the following columns:

df.columns = Index(['S', 'HZ', 'Z', 'Demand'], dtype='object')

I want to perform the following transformation:

for s in range(S):
    for t in range(HZ):
        for z in range(Z):
            df.loc[(df['S'] == s) & (df['HZ'] == t) & (df['Z'] == z), 'Demand'] = D[s][t][z]

Here D is a NumPy array with the corresponding dimensions, and S, HZ and Z in the loops are the number of distinct values in each column. Below is a simple example of what I am trying to do (with T = 0 to keep it simple).

Here is df before:

    S  HZ  Z  Demand
0   0   0  0       0
1   0   0  1       0
2   0   1  0       0
3   0   1  1       0
4   0   2  0       0
5   0   2  1       0
6   1   0  0       0
7   1   0  1       0
8   1   1  0       0
9   1   1  1       0
10  1   2  0       0
11  1   2  1       0

Here is D:

D = [[[1, 2], 
      [3, 4],
      [5, 6]],
     [[7, 8],
      [9, 10],
      [11, 12]]]

And here is what I want:

    S  HZ  Z  Demand
0   0   0  0       1
1   0   0  1       2
2   0   1  0       3
3   0   1  1       4
4   0   2  0       5
5   0   2  1       6
6   1   0  0       7
7   1   0  1       8
8   1   1  0       9
9   1   1  1       10
10  1   2  0       11
11  1   2  1       12

This code works, but it takes a very long time to run, so I tried something else to avoid the for loops:

df.loc[df['HZ'] >= T, 'Demand'] = D[df['S']][df['HZ']][df['Z']]

This raises the following error:

ValueError: Must have equal len keys and value when setting with an iterable

I would like to know what this error means and how to fix it. If it cannot be fixed, is there another way to do what I want without for loops?

Thanks in advance


Solution

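  • After trying many things, I finally found something that works:

    # Rows to update, and the matching index values from each column
    filter_mask = df['HZ'] >= T
    df_subset_s = df.loc[filter_mask, 'S'].to_numpy()
    df_subset_hz = df.loc[filter_mask, 'HZ'].to_numpy()
    df_subset_z = df.loc[filter_mask, 'Z'].to_numpy()

    # A single subscript pairs the three index arrays row by row
    df.loc[filter_mask, 'Demand'] = D[df_subset_s, df_subset_hz - T, df_subset_z]

    The problem is with this line:

    df.loc[df['HZ'] >= T, 'Demand'] = D[df['S']][df['HZ']][df['Z']]

    Each chained subscript indexes the first axis of the previous result, so D[df['S']][df['HZ']][df['Z']] does not look up one element per row; it returns a whole block of D for every row. In addition, the index columns are not filtered, so the right-hand side covers every row of df while the left-hand side covers only the rows where HZ >= T. pandas therefore cannot match the values to the selected rows and raises the ValueError. In the solution, the index columns are filtered with the same mask as the assignment target and passed to D in a single subscript, so NumPy returns exactly one value per selected row.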
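
To see the difference between chained subscripts and a single multi-axis subscript, here is a minimal sketch on the example data from the question. It assumes D is a NumPy array of shape (2, 3, 2) and T = 0; the way df is built with np.indices and the names s, hz, z, chained_shape and values are only for illustration.

```python
import numpy as np
import pandas as pd

# Example data from the question: 2 values of S, 3 of HZ, 2 of Z.
D = np.arange(1, 13).reshape(2, 3, 2)
T = 0  # assumption: the first HZ value stored in D, as in the simplified example

# One row per (S, HZ, Z) combination, in the same order as the question's table.
s_idx, hz_idx, z_idx = np.indices(D.shape).reshape(3, -1)
df = pd.DataFrame({"S": s_idx, "HZ": hz_idx, "Z": z_idx, "Demand": 0})

# Filter the index columns with the same mask as the assignment target.
mask = df["HZ"] >= T
s = df.loc[mask, "S"].to_numpy()
hz = df.loc[mask, "HZ"].to_numpy() - T
z = df.loc[mask, "Z"].to_numpy()

# Chained subscripts: every step indexes the first axis again,
# so the result keeps two extra dimensions.
chained_shape = D[s][hz][z].shape

# Single subscript: the three arrays are paired element by element,
# giving one value per selected row.
values = D[s, hz, z]

df.loc[mask, "Demand"] = values
print(chained_shape)
print(df)
```

With a single subscript the result has one entry per selected row, which is what the assignment through df.loc expects; the chained form has the wrong shape, so it cannot be assigned to a single column.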