
R: delete columns from data.frame if condition fulfilled


I have a data.frame with approx. 20,000 columns. From this data.frame I want to remove the columns for which the following vector has a value of 1.

u.snp <- apply(an[25:19505], 2, mean)

I am sure there must be a straightforward way to accomplish this, but I can't see it right now. Any hints would be greatly appreciated. Thanks.

Update: Thanks for your help. Now I tried the following:

cm <- colMeans(an.mdr[25:19505])
tail(sort(cm), n=40)

With the tail function I see that 22 columns out of 19481 columns of an.mdr have mean=1. Next I remove these columns using the code as suggested.

an.mdr.s <- an.mdr
an.mdr.s[colMeans(an.mdr.s[25:19505])==1] <- NULL

As anticipated, an.mdr.s has 22 fewer columns than an.mdr. But when I calculate the column means for all but the first 24 columns, I again have 22 columns with column mean=1 in an.mdr.s.

cmm <- colMeans(an.mdr.s[25:19483])
tail(sort(cmm), n=40)

Honestly, I cannot see what is going on here right now.


Solution

  • That should be quite easily accomplished with the following command:

    df[colMeans(df)==1] <- NULL
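
Note that the one-liner above assumes every column of df is numeric and that the means are computed over all columns. In the update to the question, the means are computed only for columns 25:19505, but the resulting logical vector is then used to index the whole data frame. Its k-th entry is therefore matched against column k rather than column k + 24, which would explain why 22 columns disappear while the columns with mean 1 are still there afterwards. A minimal sketch of one way around this, using a made-up toy data frame (the column names, the values and the offset of 3 are assumptions for illustration only; in the question the offset would be 25):

```r
# Toy data: two leading non-SNP columns, then four numeric columns.
df <- data.frame(
  id  = 1:3,
  grp = c("a", "b", "c"),
  s1  = c(1, 1, 1),   # mean is 1 -> should be dropped
  s2  = c(0, 1, 0),
  s3  = c(1, 1, 1),   # mean is 1 -> should be dropped
  s4  = c(2, 0, 2)
)

first <- 3                 # first column to test (25 in the question)
cols  <- first:ncol(df)    # positions of the tested columns in the full data frame

# Translate the matches back into positions in the full data frame.
# which() also discards NA results, e.g. from columns containing NA.
drop <- cols[which(colMeans(df[cols]) == 1)]

df.s <- df
if (length(drop) > 0) df.s[drop] <- NULL

names(df.s)
```

Indexing by position through cols keeps the tested subset and the full data frame aligned, so the same pattern should carry over to an.mdr with first <- 25.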