When I log-transform a pandas column I get NaNs. Should I replace these with 0?

I cannot find a similar question. I have a df in which some columns are highly skewed. I plan to log-transform these columns and then standardize them. However, when I log-transform I get NaNs. Should I replace these with 0s?

log_train[skew_cols] = np.log2(featuresdf[skew_cols])

The warning I get is:

RuntimeWarning: invalid value encountered in log2
  This is separate from the ipykernel package so we can avoid doing imports until

I'm not sure what I am doing wrong.

Solution

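You shouldn't replace them with 0, because np.log2(1) is already equal to 0. If you do, an original value of 1 and an original value of 0 both end up as 0 in your log-transformed data, and you can no longer tell them apart.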
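To make the collision concrete, here is a minimal sketch using a made-up column (the question does not show the real data, so the values below are an assumption). It also shows which inputs np.log2 turns into which results: a negative input gives NaN (that is what the "invalid value" warning refers to), while 0 gives -inf.

```python
import numpy as np
import pandas as pd

# Hypothetical sample column; the real data is not shown in the question.
s = pd.Series([-1.0, 0.0, 1.0, 2.0])

# Silence the RuntimeWarnings so the example output stays clean.
with np.errstate(divide="ignore", invalid="ignore"):
    logged = np.log2(s)

# logged is now: NaN (from -1), -inf (from 0), 0.0 (from 1), 1.0 (from 2)

# Naively filling the non-finite results with 0 ...
filled = logged.replace(-np.inf, np.nan).fillna(0)

# ... makes the rows that were -1, 0 and 1 indistinguishable: all are 0.0
print(filled.tolist())
```

If the column really does produce NaN rather than -inf, it is worth checking for negative values first, for example with (featuresdf[skew_cols] < 0).any().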

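Instead, add 1 to your data before taking the log. An original value of 0 then becomes log2(1) = 0, an original value of 1 becomes log2(2) = 1, and an original value of 2 becomes log2(3) ≈ 1.58. Note that this only helps if every value is greater than -1; anything at or below -1 still gives -inf or NaN.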

So the code would be:

log_train[skew_cols]=np.log2(featuresdf[skew_cols]+1)
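As a rough, self-contained sketch of the line above: the names featuresdf and skew_cols come from the question, but the sample values are invented for illustration, and the non-negativity check is an added assumption about what you would want to verify first.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real featuresdf, which is not shown.
featuresdf = pd.DataFrame({"a": [0, 1, 3, 7], "b": [0, 3, 15, 1]})
skew_cols = ["a", "b"]

# The +1 shift only avoids -inf/NaN when no value is negative,
# so check that before transforming.
if (featuresdf[skew_cols] < 0).any().any():
    raise ValueError("negative values present; log2(x + 1) may still give NaN")

# Shift by 1, then take log base 2: 0 -> 0, 1 -> 1, 3 -> 2, 7 -> 3, 15 -> 4
log_train = np.log2(featuresdf[skew_cols] + 1)

print(log_train)
```

If the base of the logarithm does not matter for you, np.log1p(x) computes the natural log of 1 + x in one call.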

Another option is to use a different transformation that can handle 0, such as the square root (np.sqrt).
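A small sketch of the square-root alternative, again on invented values. Zero maps to zero without any shift, although negative inputs would still come out as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical skewed column containing a zero.
col = pd.Series([0, 1, 4, 9, 100])

# The square root is defined at 0, so no +1 shift is needed.
root = np.sqrt(col)

print(root.tolist())  # [0.0, 1.0, 2.0, 3.0, 10.0]
```

The square root compresses large values less aggressively than a log does, so it may leave more skew in a heavily skewed column.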