python pandas nested-for-loop

Iterate over columns of Pandas dataframe and create new variables


I am having trouble figuring out how to iterate over the columns of a pandas dataframe and perform the same arithmetic operation on each.

I have a dataframe df that contains three numeric variables x1, x2 and x3. I want to create three new variables by multiplying each by 2. Here’s what I am doing:

existing = ['x1','x2','x3']
new = ['y1','y2','y3']

for i in existing:
    for j in new:
        df[j] = df[i]*2

The code above does create three new variables y1, y2 and y3 in the dataframe. But the values of y1 and y2 are being overwritten, and all three variables end up with the same values, namely those intended for y3 (x3*2). I am not sure what I am missing.

I would really appreciate any guidance or suggestions. Thanks.


Solution

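  • The nested loops run the assignment 9 times here: for each of the 3 existing columns, all 3 new columns are assigned, and each pass of the outer loop overwrites the previous one. After the last pass (x3), y1, y2 and y3 all hold x3*2.

    You may want to pair the two lists element-wise instead, something like: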

    for e, n in zip(existing,new):
        df[n] = df[e]*2
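
To make the difference concrete, here is a minimal, self-contained sketch. The sample values in df are made up for illustration (the question does not show the real data), and the names nested_pairs, zipped_pairs and df_alt are hypothetical helpers that exist only in this example.

```python
import pandas as pd

# Hypothetical sample data; the real dataframe is not shown in the question.
df = pd.DataFrame({"x1": [1, 2, 3], "x2": [10, 20, 30], "x3": [100, 200, 300]})

existing = ["x1", "x2", "x3"]
new = ["y1", "y2", "y3"]

# Nested loops visit every (existing, new) combination: 3 * 3 = 9 assignments.
nested_pairs = [(i, j) for i in existing for j in new]

# zip pairs the lists element-wise: only 3 assignments, one per column.
zipped_pairs = list(zip(existing, new))

# The fix from the answer: each new column is built from its matching source.
for e, n in zip(existing, new):
    df[n] = df[e] * 2

# A loop-free alternative (one possible approach, not the only one):
# double all source columns at once, rename them, and join them back.
df_alt = df[existing].join(
    df[existing].mul(2).rename(columns=dict(zip(existing, new)))
)

print(df)
```

The loop version is probably the easiest to read for three columns. The rename-and-join version avoids the explicit loop and may be convenient when the list of columns is long.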