I want to normalize my dataframe, but the normalization should be done separately for every block of 16 rows; the dataframe has 16*1550 rows and 17 columns. I implemented this with the following code, but it gives a warning. Is this the correct way to do it?
for n in range(1550):
    data_features[16*n:16*(n+1)][:] = (data_features[16*n:16*(n+1)][:] - data_features[16*n:16*(n+1)][:].mean())/data_features[16*n:16*(n+1)][:].std()
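The way you access the dataframe is wrong. To modify cells you must always use loc or iloc (or, if relevant, at and iat) and NEVER select rows from a column. Chained indexing such as data_features[a:b][:] = ... may assign to a temporary copy instead of the original dataframe, which is most likely what the warning is about. And if you want to normalize by blocks, you should process the rows block by block. So a simple fix could be: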
for n in range(1550):
    data_features.iloc[16*n:16*(n+1)] = (
        data_features.iloc[16*n:16*(n+1)]
        - data_features.iloc[16*n:16*(n+1)].mean()
    )/data_features.iloc[16*n:16*(n+1)].std()
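If the loop above turns out to be slow, the same per-block normalization can probably be done without an explicit loop by grouping the rows on their block number. The following is only a sketch under some assumptions: the toy dataframe, block_size, block_ids and normalized are names made up for the illustration, all columns are assumed to be numeric floats, and the number of rows is assumed to be a multiple of 16.

```python
import numpy as np
import pandas as pd

# Toy stand-in for data_features (assumption): 3 blocks of 16 rows, 17 float columns.
rng = np.random.default_rng(0)
data_features = pd.DataFrame(rng.normal(size=(16 * 3, 17)))

block_size = 16
# Label each row with its block number: 0 for rows 0-15, 1 for rows 16-31, ...
block_ids = np.arange(len(data_features)) // block_size

grouped = data_features.groupby(block_ids)
# transform() returns per-block statistics broadcast back to the original rows,
# so the subtraction and division line up row by row.
normalized = (data_features - grouped.transform("mean")) / grouped.transform("std")
```

This builds a new dataframe instead of writing into data_features, so no assignment through a slice is involved. Like DataFrame.std in the loop, the groupby std uses the sample standard deviation (ddof=1), so both versions should give the same values.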