I want to convert raw data to Gaussian (mean = 0, sd = 1) using the qqnorm function. What I notice, though, is that identical raw values get different Gaussian values. E.g.:
mydata = c(2.4, 3.7, 2.1, 3, 1.6, 2.5, 2.9, 2.9)
myquant = qqnorm(mydata)
myquant$x
-0.4727891 1.4342002 -0.8524950 0.8524950 -1.4342002 -0.1525060 0.1525060 0.4727891
Moreover, I have used the following code to transform each column of my data to a normal distribution:
for (i in 1:ncol(sampledataSubGaus)) {
  sampledataSubGaus[, i] <- qqnorm(as.matrix(sampledataSub[, i]))$x
}
where I face the same issue again. Is there an explanation for that? For your information, I have used another function called score.transform, which behaves properly.
I am not quite sure what you mean by "convert" your data to N(0,1) using qqnorm. The qqnorm() function returns a list whose x component holds the normal quantiles associated with the corresponding quantiles from your data. The guts of qqnorm() are doing the following:
mydata <- c(2.4, 3.7, 2.1, 3, 1.6, 2.5, 2.9, 2.9)
y <- mydata
n <- length(y)
# normal quantiles at the n plotting positions, rearranged to follow the rank of each y
x <- qnorm(ppoints(n))[order(order(y))]
plot(x, y)
If you took a subset of these values, you would get different values of x, because it would be using a different number of points to generate the normal quantiles (i.e., the values of ppoints(n) would be different).
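To make this concrete, here is a minimal sketch that reuses the one-line computation above. The helper name normal_scores is hypothetical, introduced only for illustration. It also shows why the two 2.9 values in the question come out different: order(order(y)) assigns tied values consecutive ranks (5 and 6 here), so each lands on its own plotting position. The last part, which averages the ranks of ties before applying the same plotting-position formula as ppoints(), is an assumption about what one might want; it is not something qqnorm() does.

```r
mydata <- c(2.4, 3.7, 2.1, 3, 1.6, 2.5, 2.9, 2.9)

# Hypothetical helper: the same computation as the core of qqnorm()
normal_scores <- function(y) qnorm(ppoints(length(y)))[order(order(y))]

full <- normal_scores(mydata)        # all 8 values
sub  <- normal_scores(mydata[1:4])   # only the first 4 values

full[1:4]   # scores of the first four values within the full sample
sub         # scores of the same four values within the subset (different)

# The tied 2.9s sit at positions 7 and 8 and receive ranks 5 and 6,
# so they are mapped to two different normal quantiles
full[7:8]

# Assumption, not qqnorm() behaviour: average the ranks of ties first,
# then apply the plotting-position formula used by ppoints()
n <- length(mydata)
a <- if (n <= 10) 3/8 else 1/2
avg_rank <- rank(mydata, ties.method = "average")
tied <- qnorm((avg_rank - a) / (n + 1 - 2 * a))
tied[7:8]   # both 2.9s now share one score
```

Whether averaging tied ranks is appropriate depends on what the scores are used for; it is shown here only to explain where the differing values come from.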
I could be wrong, but I have never heard of someone using qqnorm() to transform data - it is a diagnostic for normality, but not a remedy. Something like a Box-Cox transformation could, under the right circumstances, help transform a skewed variable into something that had a more normal distribution.
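As a rough sketch of the Box-Cox idea, the following uses boxcox() from the MASS package on simulated data. The variable y and the exponential sample are made up for illustration, and picking lambda at the grid point with the highest profile log-likelihood is just one simple choice. Box-Cox requires strictly positive values, and it will not necessarily produce a normal-looking result for every dataset.

```r
library(MASS)

set.seed(1)
y <- rexp(200)   # hypothetical right-skewed, strictly positive data

# Profile log-likelihood over a grid of lambda values for an intercept-only model
bc <- boxcox(y ~ 1, lambda = seq(-2, 2, by = 0.05), plotit = FALSE)
lambda <- bc$x[which.max(bc$y)]   # grid value with the highest log-likelihood

# Apply the Box-Cox transform; lambda = 0 corresponds to the log transform
y_bc <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda

# qqnorm() used as intended: a visual check before and after
qqnorm(y);    qqline(y)
qqnorm(y_bc); qqline(y_bc)
```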