I have a DataFrame like the following, and I want to split it into windows of 30 rows,
then loop over each block of data and call other functions on it.
import numpy as np
import pandas as pd
index = pd.date_range(start='2016-01-01', end='2016-04-01', freq='D')
data = pd.DataFrame(np.random.rand(len(index)), index=index, columns=['random'])
I found the following function, but I wonder if there is a more efficient way to do this.
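def split(df, chunkSize=30):
    # Ceiling division, so no empty trailing chunk when len(df) is a multiple of chunkSize
    numberChunks = -(-len(df) // chunkSize)
    listOfDf = []
    for i in range(numberChunks):
        # Integer slices on a DataFrame select rows by position
        listOfDf.append(df[i*chunkSize:(i+1)*chunkSize])
    return listOfDf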
You can use a list comprehension. See this SO post about how to access the resulting DataFrames and another way to break up a DataFrame.
n = 200000  # chunk row size
list_df = [df[i:i+n] for i in range(0, df.shape[0], n)]
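Applied to the data from the question, the same idea might look like the sketch below. It assumes a chunk size of 30 to match the question, uses .iloc to make the positional slicing explicit, and uses a hypothetical summarize function as a stand-in for the "other functions" mentioned in the question. The groupby variant at the end is one possible alternative, not necessarily the one the linked post describes.

```python
import numpy as np
import pandas as pd

# Same setup as in the question.
index = pd.date_range(start='2016-01-01', end='2016-04-01', freq='D')
data = pd.DataFrame(np.random.rand(len(index)), index=index, columns=['random'])

n = 30  # chunk row size (assumed here to match the question's window size)

# Slice by position; the last chunk simply holds whatever rows are left over.
list_df = [data.iloc[i:i + n] for i in range(0, len(data), n)]


def summarize(block):
    """Hypothetical placeholder for the per-block processing."""
    return block['random'].mean()


# Loop over the blocks and call the function on each one.
results = [summarize(block) for block in list_df]

# Alternative sketch: label each row with its block number and group on it.
groups = [block for _, block in data.groupby(np.arange(len(data)) // n)]
```

With the 92 daily rows above, both approaches give three chunks of 30 rows and a final chunk of 2 rows.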