I have several dataframes looking like this:
time_hr | cell_hour | id | attitude | hour |
---|---|---|---|---|
0.028611 | xxx | 1 | Cruise | 1.0 |
0.028333 | xxx | 4 | Cruise | 1.0 |
0.004722 | xxx | 16 | Cruise | 1.0 |
I want to do specific multiplications between rows of the 'time_hr' column.
I need to multiply each row by every other row and store the values to use later.
E.g., if the column values are [2,3,4], I would need the values 2x3, 2x4, 3x2, 3x4, 4x2, 4x3.
Part of the problem is that I have several dataframes with different numbers of rows, so I need a generic way of doing this.
Is there a way? Thanks in advance.
It sounds like a cartesian product to me:
from io import StringIO

import pandas as pd
#sample data reading
data1 = """
time_hr cell_hour id attitude hour
0.028611 xxx 1 Cruise 1.0
0.028333 xxx 4 Cruise 1.0
0.004722 xxx 16 Cruise 1.0
"""
# the sample above is whitespace-separated, so split on any run of whitespace
df = pd.read_csv(StringIO(data1), sep=r"\s+")
#filtering dataset to needed columns
df_time = df[["id", "time_hr"]]
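# cross join of the table with itself (how='cross' needs pandas >= 1.2)
df_comb = df_time.merge(df_time, how='cross')
# drop the pairs where a row is matched with itself
df_comb = df_comb[df_comb["id_x"] != df_comb["id_y"]]
df_comb["time_hr"] = df_comb["time_hr_x"] * df_comb["time_hr_y"]
# keep only the product, indexed by the pair of ids
df_comb.drop(columns=["time_hr_x", "time_hr_y"]).set_index(["id_x", "id_y"])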
#             time_hr
# id_x id_y
# 1    4     0.000811
#      16    0.000135
# 4    1     0.000811
#      16    0.000134
# 16   1     0.000135
#      4     0.000134
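As an alternative, here is a minimal sketch of the same idea using a NumPy outer product instead of a cross merge. It is not part of the approach above, just another way to get every ordered pair. The toy values [2, 3, 4] come from the question; the names `outer`, `rows`, `cols` and `pairs` are only illustrative.

```python
import numpy as np
import pandas as pd

# Toy data: ids from the sample table, values from the [2, 3, 4] example.
df = pd.DataFrame({"id": [1, 4, 16], "time_hr": [2.0, 3.0, 4.0]})

ids = df["id"].to_numpy()
values = df["time_hr"].to_numpy()

# outer[i, j] == values[i] * values[j] for every pair of row positions
outer = np.outer(values, values)

# positions of all off-diagonal cells, i.e. every row paired with every other row
rows, cols = np.nonzero(~np.eye(len(values), dtype=bool))

pairs = pd.Series(
    outer[rows, cols],
    index=pd.MultiIndex.from_arrays([ids[rows], ids[cols]], names=["id_x", "id_y"]),
    name="time_hr",
)
print(pairs)
```

Because this pairs rows by position rather than by id, it also keeps pairs of different rows that happen to share the same id, which the `id_x != id_y` filter would drop.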
If you want more generic code, you can parametrise the column names:
id_column = "id"
product_columns = ["time_hr"]
df_time = df[[id_column, *product_columns]]
df_comb = df_time.merge(df_time, how='cross')
df_comb = df_comb[df_comb[f"{id_column}_x"] != df_comb[f"{id_column}_y"]]
for column in product_columns:
    df_comb[column] = df_comb[f"{column}_x"] * df_comb[f"{column}_y"]
# names of the suffixed helper columns that are no longer needed
drop_columns = [f"{column}{suffix}" for column in product_columns for suffix in ("_x", "_y")]
df_comb.set_index([f"{id_column}_x", f"{id_column}_y"]).drop(columns=drop_columns)
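Since the question mentions several dataframes with different numbers of rows, the steps above could be wrapped in a function and applied to each of them. This is only a sketch: the function name `pairwise_products`, the `frames` dictionary and its toy values are made up for illustration, and it assumes pandas >= 1.2 for `how="cross"`.

```python
import pandas as pd


def pairwise_products(df, id_column="id", product_columns=("time_hr",)):
    """Multiply every row with every other row for the given columns."""
    subset = df[[id_column, *product_columns]]
    # cross join of the table with itself; overlapping columns get _x / _y suffixes
    comb = subset.merge(subset, how="cross")
    # drop the pairs where a row is matched with itself
    comb = comb[comb[f"{id_column}_x"] != comb[f"{id_column}_y"]]
    result = comb[[f"{id_column}_x", f"{id_column}_y"]].copy()
    for column in product_columns:
        result[column] = comb[f"{column}_x"] * comb[f"{column}_y"]
    return result.set_index([f"{id_column}_x", f"{id_column}_y"])


# Hypothetical input: two dataframes with different numbers of rows.
frames = {
    "first": pd.DataFrame({"id": [1, 4, 16], "time_hr": [2.0, 3.0, 4.0]}),
    "second": pd.DataFrame({"id": [7, 9], "time_hr": [0.5, 4.0]}),
}
results = {name: pairwise_products(frame) for name, frame in frames.items()}
print(results["first"])
```

Each entry of `results` is indexed by the pair of ids, so the stored products can be looked up later with something like `results["first"].loc[(1, 4), "time_hr"]`.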
PS. I am not sure whether this is what you were trying to achieve; if not, please add the expected output for those 3 input rows.