I have a dataframe (220 rows × 2 columns) and a list of values [41, 84, 129, 174, 219, 45]. I want to insert new rows (containing -999, -999) into my dataframe below the index locations specified within my list. So for example.
40 400 -47.595322
41 410 13.159509
42 0 -235.865433
43 8 -102.183365
Would become:
40 400 -47.595322
41 410 13.159509
42 -999 -999
43 0 -235.865433
44 8 -102.183365
And so on and so on...
Thanks :)
What I have so far:
import pandas as pd
import numpy as np
import glob
path =r'MyPath'
allFiles = glob.glob(path + "/*.dat")
frame = pd.DataFrame()
list_ = []
for file_ in allFiles:
df = pd.read_csv(file_, delim_whitespace=True, index_col=None, header=None)
list_.append(df)
frame = pd.concat(list_)
frame.columns = ['age', 'dt']
frame = frame.reset_index(drop=True)
idx = [] + list(frame['age'][frame['age'] == 410].index) + [df.index[-1]+1]
idx = np.array(idx)
df = pd.DataFrame(
np.insert(frame.values, idx + 1, -999, axis=0), columns=frame.columns)
print(df.to_string())
If your dataframe contains a monotonically increasing index, you can do this quite simply with np.insert
:
idx = np.array([41, 84, 129, 174, 219, 45])
df = pd.DataFrame(
np.insert(df.values, idx + 1, -999, axis=0), columns=df.columns
)
If not, you'll need to call index.get_loc
to get the right index in the array:
idx = [df.index.get_loc(i) + 1 for i in idx]
And call the insert code as before.
Demo:
df
A B C
0 40 400 -47.595322
1 41 410 13.159509
2 42 0 -235.865433
3 43 8 -102.183365
idx = np.array([1, 3])
pd.DataFrame(
np.insert(df.values, idx + 1, -999, axis=0), columns=df.columns
)
A B C
0 40.0 400.0 -47.595322
1 41.0 410.0 13.159509
2 -999.0 -999.0 -999.000000
3 42.0 0.0 -235.865433
4 43.0 8.0 -102.183365
5 -999.0 -999.0 -999.000000
Beware of invalid index accesses.