
Transform left skewed data in R


I have a column that is left-skewed and I need to transform it, so I tried this:

library(car)
vect <- c(1516201202, 1526238001, 1512050372, 1362933719, 1516342174,
          1526502557, 1523548827, 1512241202, 1526417785, 1517846464)
powerTransform(vect)

The values in the vector are numeric Unix epoch timestamps (the ones shown here have 10 digits). I have a few thousand values and am pasting 10 of them here; I do the same operation on the entire column. This gave me an error:

Error in qr.resid(xqr, w * fam(Y, lambda, j = TRUE, ...)) : NA/NaN/Inf in foreign function call (arg 5)

I was expecting the transformed column back. Any idea how to do this in R?

Thanks, Raj


Solution

  • Generally, car::powerTransform does not return transformed data. It returns a powerTransform object, which is a list containing, among other things, the estimated Box-Cox transformation parameter(s). To get the transformed values, you need bcPower, which applies the estimated parameter(s), extracted from the powerTransform object with coef(), to the original data.

    Unfortunately you don't provide sample data, so here's an example based on the iris dataset.

    library(car)
    
    # Box-Cox transformation of `Sepal.Length`
    df <- iris
    trans <- powerTransform(df$Sepal.Length)
    # Or the same using formula syntax:
    # trans <- powerTransform(Sepal.Length ~ 1, data = df)
    
    # Add the transformed `Sepal.Length` data to the original `data.frame`
    df <- cbind(
        df,
        Sepal.Length_trans = bcPower(
            with(df, cbind(Sepal.Length)), coef(trans))[, 1])
    
    # Show a histogram of the Box-Cox-transformed data
    library(ggplot2)
    ggplot(df, aes(Sepal.Length_trans)) +
        geom_histogram(bins = 30)
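
For reference, the transformation that bcPower applies with its default arguments can also be written out by hand. The following is only a minimal base-R sketch: `box_cox` and its `tol` argument are hypothetical names chosen for illustration, not part of car, and the sketch ignores bcPower's optional arguments.

```r
# Hand-written Box-Cox transform (illustrative helper, NOT part of `car`).
# Assumes strictly positive input, as the Box-Cox family requires.
box_cox <- function(x, lambda, tol = 1e-6) {
  stopifnot(is.numeric(x), all(x > 0, na.rm = TRUE))
  if (abs(lambda) < tol) {
    log(x)                    # limiting case as lambda approaches 0
  } else {
    (x^lambda - 1) / lambda   # general case
  }
}

x <- c(1, 2, 4, 8)
box_cox(x, 0)  # identical to log(x)
box_cox(x, 1)  # x - 1: a pure shift, the shape of the data is unchanged
box_cox(x, 2)  # (x^2 - 1) / 2: stretches the upper tail
```

A lambda above 1 stretches large values more than small ones, which is why an estimate above 1 is what you would expect for left-skewed data.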
    
