Here are two polars dataframes, DF and serdf. For the sake of reproducing the data I have converted them to pandas as below:
import pandas as pd

DF = pd.DataFrame({'client': {0: 'x', 1: 'y',
2: 'z', 3: 'k', 4: 'z', 5: 'k', 6: 'y', 7: 'y', 8: 'k', 9: 'z',
10: 'x', 11: 'k', 12: 'x', 13: 'y', 14: 'k', 15: 'z', 16: 'x',
17: 'z', 18: 'k', 19: 'y', 20: 'x', 21: 'z', 22: 'k'},
'a': {0: 34, 1: 31, 2: 3, 3: 10, 4: 11, 5: 4, 6: 11,
7: 12, 8: 87, 9: 90, 10: 56, 11: 88, 12: 12, 13: 45,
14: 67, 15: 81, 16: 12, 17: 18, 18: 7, 19: 56, 20: 11,
21: 34, 22: 15},
'b': {0: 13, 1: 23, 2: 12, 3: 1, 4: 3, 5: 13, 6: 10,7: 67,
8: 90, 9: 9, 10: 10, 11: 34, 12: 9, 13: 87, 14: 34, 15: 11,
16: 90, 17: 11, 18: 65, 19: 19, 20: 47, 21: 92, 22: 11}})
serdf = pd.DataFrame({'df1': {0: 1, 1: 3, 2: 5, 3: 7, 4: 9,
5: 11, 6: 13, 7: 15, 8: 17, 9: 19, 10: 21,11: 23, 12: 25,
13: 27, 14: 29, 15: 31, 16: 33, 17: 35, 18: 37, 19: 39,
20: 41, 21: 43, 22: 45, 23: 47},
'df2': {0: 2, 1: 4, 2: 6, 3: 8, 4: 10, 5: 12, 6: 14, 7: 16,
8: 18, 9: 20, 10: 22, 11: 24, 12: 26, 13: 28, 14: 30,
15: 32, 16: 34, 17: 36, 18: 38, 19: 40, 20: 42, 21: 44,
22: 46, 23: 48},
'df3': {0: -1, 1: -3, 2: -1, 3: 7, 4: -2, 5: -1, 6: -3,
7: 7, 8: -1, 9: 90, 10: -1, 11: -1, 12: 8, 13: -1, 14: -1,
15: 6, 16: -1, 17: 4, 18: -3, 19: 2, 20: -1, 21: -1,
22: -4, 23: -8},
'df4': {0: 3, 1: 7, 2: -1, 3: -8, 4: -2, 5: -1, 6: -9,
7: -1, 8: 12, 9: -1, 10: -1, 11: -1, 12: 11, 13: -1,
14: -1, 15: 1, 16: -1, 17: 2, 18: -3, 19: -1, 20: 41,
21: -1, 22: -4, 23: -8}}
)
What I am trying to do is create new columns on the fly on DF grouped by client. The first group (client x) gets a new column holding the value in the first row of the first column of serdf; the second group (client y) gets a new column holding the value in the first row of the second column of serdf; and so on. After one such iteration I repeat this for every row of serdf.
As pseudo code (here is a runnable pandas version of the loop I want to avoid):

frames = []
for i in range(len(serdf)):                        # each row of serdf
    for j, (client, grp) in enumerate(DF.groupby('client', sort=False)):
        # the j-th serdf column provides the constant value for the j-th client group
        frames.append(grp.assign(**{serdf.columns[j]: serdf.iloc[i, j]}))
For simplicity I have taken only a snapshot of serdf; the full one contains half a million rows. So, for example, after the first iteration the group for client x will be:
-----------------------
|client| a  | b  | df1|
-----------------------
|  x   | 34 | 13 | 1  |
|  x   | 12 | 9  | 1  |
|  x   | 11 | 47 | 1  |
|  x   | 56 | 10 | 1  |
|  x   | 12 | 90 | 1  |
-----------------------
The group for client y will be:
-----------------------
|client| a  | b  | df2|
-----------------------
|  y   | 31 | 23 | 2  |
|  y   | 11 | 10 | 2  |
|  y   | 12 | 67 | 2  |
|  y   | 45 | 87 | 2  |
|  y   | 56 | 19 | 2  |
-----------------------
and so on.
So in total 96 dataframes (24 serdf rows × 4 client groups) will be created on the fly out of this operation. Is it possible to do this without loops at all?
I do not know how to proceed. I appreciate any help. Thank you.
Based on what I understand of your problem, you could do the following:
import polars as pl

df_pl = pl.from_pandas(DF)        # polars versions of the pandas frames above
serdf_pl = pl.from_pandas(serdf)

out = (
    df_pl.join(serdf_pl.with_row_count(), how='cross')  # tag each serdf row, then cross join
    .partition_by('row_nr', 'client')                   # one frame per (serdf row, client)
)
[shape: (5, 8)
┌────────┬─────┬─────┬────────┬─────┬─────┬─────┬─────┐
│ client ┆ a ┆ b ┆ row_nr ┆ df1 ┆ df2 ┆ df3 ┆ df4 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 ┆ u32 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞════════╪═════╪═════╪════════╪═════╪═════╪═════╪═════╡
│ x ┆ 34 ┆ 13 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
│ x ┆ 56 ┆ 10 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
│ x ┆ 12 ┆ 9 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
│ x ┆ 12 ┆ 90 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
│ x ┆ 11 ┆ 47 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
└────────┴─────┴─────┴────────┴─────┴─────┴─────┴─────┘,
shape: (5, 8)
┌────────┬─────┬─────┬────────┬─────┬─────┬─────┬─────┐
│ client ┆ a ┆ b ┆ row_nr ┆ df1 ┆ df2 ┆ df3 ┆ df4 │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ i64 ┆ i64 ┆ u32 ┆ i64 ┆ i64 ┆ i64 ┆ i64 │
╞════════╪═════╪═════╪════════╪═════╪═════╪═════╪═════╡
│ y ┆ 31 ┆ 23 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
│ y ┆ 11 ┆ 10 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
│ y ┆ 12 ┆ 67 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
│ y ┆ 45 ┆ 87 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
│ y ┆ 56 ┆ 19 ┆ 0 ┆ 1 ┆ 2 ┆ -1 ┆ 3 │
└────────┴─────┴─────┴────────┴─────┴─────┴─────┴─────┘
The result is a list of 96 dataframes (24 serdf rows × 4 clients).
Here is what it does:

- serdf_pl.with_row_count() adds a row_nr column, so every serdf row keeps its index through the join.
- The cross join pairs every row of df_pl with every row of serdf_pl, attaching all four serdf columns at once.
- partition_by('row_nr', 'client') then splits the result into one dataframe per (serdf row, client) combination, each carrying that serdf row's values as constant columns.
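If you need to pick out a specific frame rather than scanning the list, partition_by can also return a dict; a minimal sketch, assuming the df_pl/serdf_pl frames from above (with multiple partition columns, recent polars versions key the dict by tuples of the group values):

parts = (
    df_pl.join(serdf_pl.with_row_count(), how='cross')
    .partition_by('row_nr', 'client', as_dict=True)
)
first_x = parts[(0, 'x')]   # the frame for serdf row 0 and client 'x'
len(parts)                  # 24 serdf rows x 4 clients = 96 frames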
Is this what you are looking for?