If you have a dataset and trim 2% from both the top and bottom, for a 4% total trim, you're left with the middle 96% of scores. Does this mean the remaining scores range from the .02 quantile to the .98 quantile of the original dataset?
If this is incorrect, how would I trim so as to have only data from the .02 quantile to the .98 quantile?
I am using R and want to trim outliers this way.
Indeed, the 0.02 probability quantile, or second percentile, is the value below which 2% of your data is found.
To obtain the data between the 2nd and the 98th percentiles, you can use the quantile function:
# Random samples from a normal distribution
x <- rnorm(1000)
# Quantiles
q <- quantile(x, probs = c(2, 98)/100)
# Samples between quantiles
x2 <- x[x > q[1] & x < q[2]]
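As a rough check on the snippet above, the sketch below repeats it with a fixed seed (the seed value is an arbitrary assumption, not part of the original answer), reports what fraction of the data survives, and compares the result with base R's mean(x, trim = 0.02), which drops the lowest and highest 2% of observations before averaging.

```r
set.seed(42)                        # arbitrary seed, only for reproducibility
x <- rnorm(1000)
q <- quantile(x, probs = c(0.02, 0.98))
x2 <- x[x > q[1] & x < q[2]]

# Fraction of the original observations that survive the trim
kept_fraction <- length(x2) / length(x)
print(kept_fraction)

# mean(x, trim = 0.02) drops the lowest and highest 2% before averaging,
# so it should agree with the mean of the quantile-filtered data
print(all.equal(mean(x, trim = 0.02), mean(x2)))
```

With tied values or small sample sizes, the two approaches may keep slightly different numbers of observations, because quantile() interpolates between order statistics by default while the trim argument removes a whole number of points from each end.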
Edit: regarding the cleaning of outliers, you might want to check the comments on this answer to a similar question. The gist is: simply removing a fixed percentage of your data to get rid of outliers is probably wrong.
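To illustrate why a fixed percentage can be a poor fit, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the helper trim_2_98, the 10% contamination rate, the shifted mean of 10, and the cutoff of 5 used to count contaminating points. It suggests that a 2%/98% trim throws away good data when there are no outliers, and leaves most of the outliers in place when there are more than 2% of them on one side.

```r
set.seed(1)                                   # arbitrary seed
clean <- rnorm(1000)                          # no outliers by construction
dirty <- c(rnorm(900), rnorm(100, mean = 10)) # assumed: 10% gross outliers

# Hypothetical helper: keep values strictly between the 2nd and 98th percentiles
trim_2_98 <- function(x) {
  q <- quantile(x, probs = c(0.02, 0.98))
  x[x > q[1] & x < q[2]]
}

clean_kept <- trim_2_98(clean)
dirty_kept <- trim_2_98(dirty)

# Observations discarded from the outlier-free sample
print(length(clean) - length(clean_kept))

# Contaminating observations (values above 5) that survive the trim
print(sum(dirty_kept > 5))
```

The trim removes the same share of points in both cases, regardless of how many of them are actually unusual.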