I have a dataframe (df) like the one below (there are actually more rows).
| | number |
|---|---|
| 0 | 21 |
| 1 | 35 |
| 2 | 467 |
| 3 | 965 |
| 4 | 2754 |
| 5 | 34r |
| 6 | 5743 |
| 7 | 841 |
| 8 | 8934 |
| 9 | 275 |
I want to insert 6 rows between each pair of consecutive rows. For example, I want to draw 6 random values within the range spanned by the values at index 0 and index 1 and add them as 6 new rows between index 0 and 1. The same goes for index 1 and 2, 2 and 3, and so forth until the end.
np.linspace(df["number"][0], df["number"][1],8)
Is there a function or any other method to generate 6 additional rows between each pair of the 10 existing rows, so that the final number of rows is not 10 but 64 (after adding 9 × 6 = 54 rows)?
You could try the following:
import pandas as pd
from random import uniform

def rng_numbers(row):
    # row holds the current value and the next value (NaN in the last row)
    left, right = row.iat[0], row.iat[1]
    n = left
    if pd.isna(right):
        return [n]
    if right < left:
        left, right = right, left
    # original value followed by 6 random values from the interval
    return [n] + [uniform(left, right) for _ in range(6)]

df["number"] = (
    pd.concat([df["number"], df["number"].shift(-1)], axis=1)
    .apply(rng_numbers, axis=1)
)
df = df.explode("number", ignore_index=True)
First, build a dataframe from column number and column number shifted by one position (.shift(-1)). Then .apply the function rng_numbers to the rows of this new dataframe: rng_numbers first sorts the interval boundaries and then returns a list that starts with the respective item from column number, followed by 6 random numbers from the interval. In the last row the right boundary is NaN (due to the .shift(-1)); in this case the function returns the list without the random numbers. Finally, .explode df on the new column number.

You could do something similar with NumPy, which is probably faster:
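import numpy as np
import pandas as pd

rng = np.random.default_rng()
limits = pd.concat([df["number"], df["number"].shift(-1)], axis=1)
# min/max skip the NaN in the last row, so left == right there
left = limits.min(axis=1).values.reshape(-1, 1)
right = limits.max(axis=1).values.reshape(-1, 1)
# one list per row: the original value followed by 6 random values
lists = (
    pd.Series(df["number"].values.reshape(len(df), 1).tolist())
    + pd.Series(rng.uniform(left, right, size=(len(df), 6)).tolist())
)
# the last row has no successor: keep only its original value
lists.iat[-1] = lists.iat[-1][:1]
df["number"] = lists
df = df.explode("number", ignore_index=True)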
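
To check the result, it can help to wrap the idea in a small function and count the rows. The following is only a sketch under a few assumptions: the helper name insert_random_rows and its parameters col, n and seed are made up for illustration, the column is assumed to be purely numeric with at least one row, and it builds one flat array instead of using .explode, so it is a variation on the approach above rather than the same code.

```python
import numpy as np
import pandas as pd


def insert_random_rows(df, col="number", n=6, seed=None):
    """Hypothetical helper: put n random values between consecutive rows of col.

    Assumes df[col] is numeric and has at least one row.
    """
    rng = np.random.default_rng(seed)
    values = df[col].to_numpy(dtype=float)
    # Interval boundaries for each consecutive pair, ordered so low <= high.
    low = np.minimum(values[:-1], values[1:])[:, None]
    high = np.maximum(values[:-1], values[1:])[:, None]
    fill = rng.uniform(low, high, size=(len(values) - 1, n))
    # Each block is one original value followed by its n random values.
    blocks = np.hstack([values[:-1, None], fill]).ravel()
    # The last original value has no successor, so it is appended alone.
    return pd.DataFrame({col: np.append(blocks, values[-1])})


# First four values from the question, used as a small sample.
df = pd.DataFrame({"number": [21, 35, 467, 965]})
out = insert_random_rows(df, seed=0)
print(len(out))  # 22, i.e. 4 original rows + 3 gaps * 6 new rows
```

With the 10 rows shown in the question the same count gives 10 + 9 * 6 = 64 rows, and the original values sit at every 7th position of the result (out["number"].iloc[::7]).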