Tags: python, pandas, vector

Update a vector-valued column of a pandas DataFrame


I have a DataFrame with one column holding vector values:

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["one", "two", "three"])
# one 3-tuple per row, aligned to df's index
s = pd.Series([(i*10, i*11, i*12) for i in df["a"]], index=df.index)
df["vec"] = s
#df

But I cannot figure out how to update these vector values. For example:

df.loc[df["a"]>1, "vec"] = np.array((1,2,3)) # doesn't work...

I always get an error like:

ValueError: Must have equal len keys and value when setting with an iterable

Solution

  • You need to pass a Series on the right-hand side, holding one vector per selected row and indexed by those rows:

    m = df["a"]>1
    df.loc[m, "vec"] = pd.Series([np.array((1,2,3))]*m.sum(),
                                 index=df.index[m])
    

    Output:

           a  b           vec
    one    1  4  (10, 11, 12)
    two    2  5     [1, 2, 3]
    three  3  6     [1, 2, 3]
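
The same idea can be wrapped in a small helper. This is only a sketch: `set_vector` is a hypothetical name, not a pandas function, and it assumes `mask` is a boolean Series aligned with the frame's index. It stores tuples instead of arrays so the column stays uniform with the untouched rows.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["one", "two", "three"])
df["vec"] = pd.Series([(i * 10, i * 11, i * 12) for i in df["a"]], index=df.index)


def set_vector(frame, mask, column, vector):
    """Hypothetical helper: store the same vector in every row selected by mask."""
    rows = frame.index[mask]
    # One copy of the vector per selected row, indexed by those rows, so each
    # cell receives the whole vector as a single object.
    frame.loc[mask, column] = pd.Series([vector] * len(rows), index=rows)


set_vector(df, df["a"] > 1, "vec", (1, 2, 3))
print(df)
```

Wrapping the vector in a Series works because pandas then assigns one Series element per selected row, instead of trying to spread the three vector components across the two selected rows.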