I am trying to multiply a few specific columns by a portion of multiple rows and creating a new column from every result. I could not really find an answer to my question in previous stackoverflow questions or on google, so maybe one of you can help.
I would like to point out that I am quite the beginner in Python, so apologies ahead for any obvious questions or strange code.
This is how my DataFrame currently looks like:
So, for the column Rank of Hospital by Doctor_1, I want to multiply all its numbers by the values of the first row of column Rank of Doctor by Hospital_1 until column Rank of Doctor by Hospital_10. Which would result in:
1*1
2*1
3*1
4*4
...
and so on. I want to do this for every Doctor_ column. So for Doctor_2 its values should be multiplied by the second row of all those ten columns (Rank of Doctor by Hospital_. Doctor_3, multiplied by the third row etc.
So far, I have transposed the Rank of Doctor by Hospital_ columns in a new DataFrame:
and tried to multiply this by a DataFrame of the Rank of Hospital by Doctor_ columns. Here the first column of the first df should be multiplied by the first column of the second df. (and second column * second column, etc.):
But my current formula
preferences_of_doctors_and_hospitals_doctors_ranking.mul(preferences_of_doctors_and_hospitals_hospitals_ranking_transposed)
Does anybody know what I am doing wrong and how I could fix this? Maybe I could write a for loop so that a new column is created for every multiplication of columns? So Multiplication_column_1 of DF3 = Column 1 of DF1 * Column 1 of DF2 and Multiplication_column_2 of DF3 = Column 2 of DF1 * Column 2 of DF2.
Thank you in advance!
Jeff
You can multiple 2d arrays created by filtering column with filter
and values
first:
arr = df.filter(like='Rank of Hospital by').values * df.filter(like='Rank of Doctor by').values
Or:
arr = (preferences_of_doctors_and_hospitals_doctors_ranking.values *
preferences_of_doctors_and_hospitals_hospitals_ranking_transposed.values)
Notice - necessary is same ordering of columns, same length of columns names and index in both filtered DataFrames.
Get 2d array, so create DataFrame by constructor and join
to original:
df = df.join(pd.DataFrame(arr, index=df.index).add_prefix('Multiplied '))