Tags: python, pandas, machine-learning, imputation

Imputation seems to change non-NaN values


Running

imputed_training = impyute.imputation.cs.em(X_train2.values, loops=50)
xtrain2_imputed = pd.DataFrame(imputed_training)
columns = ('interest-over-time', 'hash-rate', ...)  # very long list
xtrain2_imputed.columns = columns

returns a DataFrame whose values look completely different from those in the original DataFrame (X_train2). How can I impute my NaNs using expectation maximization in a way that returns a DataFrame with the same columns, column order and row order as my original DataFrame?


Solution

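  • Assign the imputed array back into the original DataFrame with slice assignment. This overwrites the values in place and leaves the column names, column order and row index untouched: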

    imputed_training = impyute.imputation.cs.em(X_train2.values, loops=50)
    # Overwrite every cell in place; column labels and index are unchanged
    X_train2[:] = imputed_training
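
A minimal sketch of why this helps, under stated assumptions. A DataFrame built from a bare array gets a default 0..n-1 index, so its rows no longer line up by label with an original that carries a different index. The example below uses a toy frame with a made-up index (7, 2, 5) and a hypothetical stand-in imputer, fill_with_column_means, that fills NaNs with column means. It is not expectation maximization and is not part of impyute; it only plays the role of any function that takes an array and returns an array.

```python
import numpy as np
import pandas as pd


def fill_with_column_means(values):
    """Hypothetical stand-in for an array-based imputer (NOT expectation maximization)."""
    filled = values.copy()
    col_means = np.nanmean(filled, axis=0)
    nan_rows, nan_cols = np.where(np.isnan(filled))
    filled[nan_rows, nan_cols] = col_means[nan_cols]
    return filled


# Toy data with a non-default index (assumed here for illustration).
X_train2 = pd.DataFrame(
    {"interest-over-time": [1.0, np.nan, 3.0], "hash-rate": [10.0, 20.0, np.nan]},
    index=[7, 2, 5],
)
original = X_train2.copy()
observed = original.notna()  # True where the original value was not NaN

imputed = fill_with_column_means(X_train2.to_numpy())

# Option 1: build a new frame and pass the original labels explicitly.
xtrain2_imputed = pd.DataFrame(
    imputed, columns=X_train2.columns, index=X_train2.index
)

# Option 2: overwrite the existing frame in place, as in the answer above.
X_train2[:] = imputed

# Sanity check: cells that were observed before imputation are unchanged.
unchanged = xtrain2_imputed[observed].equals(original[observed])
print(unchanged)
```

Either option keeps the original labels. The final check compares only the cells that were not NaN to begin with, which is a quick way to confirm whether an imputer has altered observed values or whether the rows were merely relabeled.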