Find 10% highest and lowest values, trim all columns in R


I have a dataset with several columns. I want to find the 10% highest and 10% lowest values in the res column and delete the rows that contain them.

So far I have this:

    library(DescTools)
    par(mfcol=c(1,2))
    Tdata = Trim(data$res, trim=0.1)

    Tdata = as.data.frame(Tdata)
    hist(data$res); hist(Tdata)
    View(cbind(data$res,Tdata))

This seems to remove the 10% highest and lowest values, but it does so by creating a new variable that holds only the trimmed res values. Instead, I want to delete the corresponding rows from the whole dataset, so that all columns are trimmed together.

In this GoogleDrive folder you can find the dataset.

Thank you in advance


Solution

  • Not tested, but this should work:

    library(data.table)
    #set to data.table object
    dt <- as.data.table(data)
    
    #select only rows between 0.1 and 0.9 quantiles
    dt <- dt[res >= quantile(res, 0.1) & res <= quantile(res, 0.9)]
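
For anyone who would rather not load data.table, the same row filter can be sketched in base R. This is only an illustration: the toy data frame below (with a made-up id column and random res values) stands in for the dataset from the question, and the names bounds, keep and trimmed are hypothetical. Note that the comparison is inclusive, so values exactly equal to a quantile are kept, which may differ slightly from what Trim() drops.

```r
# Toy data frame standing in for the asker's dataset (hypothetical values)
set.seed(1)
data <- data.frame(id = 1:100, res = rnorm(100))

# 10th and 90th percentiles of res
bounds <- quantile(data$res, probs = c(0.1, 0.9), na.rm = TRUE)

# Flag rows whose res lies within the bounds (inclusive); rows with NA are dropped
keep <- !is.na(data$res) & data$res >= bounds[1] & data$res <= bounds[2]

# Subset whole rows, so every column is trimmed along with res
trimmed <- data[keep, , drop = FALSE]

nrow(trimmed)
```

Because whole rows are subset, the other columns stay aligned with res, which is what the question asks for.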