I have 72 samples in my gene expression dataset datExprSTLMS, and I ran clustering on it with the code below:
new_hclust = hclust(dist(datExprSTLMS), method = "average")
Cutreecluster_Sample <- cutreeDynamic(dendro = new_hclust, minClusterSize = 5,
method = "tree")
and then I got the table below:
table(Cutreecluster_Sample)
Cutreecluster_Sample
 0  1  2  3  4
 1 24 22 18  7
Now, the sample in cluster 0 is an outlier and I would like to remove it from my dataset. So I ran the code below to keep all samples except the one in cluster 0:
keepSamples = (Cutreecluster_Sample==!0)
but when I run table on keepSamples I get the result below:
> table(keepSamples)
keepSamples
FALSE  TRUE
   48    24
As you can see, keepSamples has only 24 TRUE values instead of the 71 samples I expected.
I would appreciate it if anybody could show me how to fix this in code.
Change keepSamples = (Cutreecluster_Sample==!0)
to keepSamples = (Cutreecluster_Sample!=0)
Why? Evaluating your command from right to left: !0 is a logical negation of 0, which is equivalent to !FALSE in R. Thus !0 is equal to TRUE. You then check if Cutreecluster_Sample equals TRUE. TRUE coerced to numeric is 1 in R. Thus your check is actually TRUE iff samples are in cluster 1, not cluster 0. That is why keepSamples has exactly 24 TRUE values, which is the size of cluster 1 in your table.
Try (!0) == 1 and FALSE == 0 in the console: both return TRUE.
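To make the difference concrete, here is a minimal sketch on toy data. The cluster labels are rebuilt with rep() to match the sizes in your table, and datExpr_toy is a hypothetical stand-in for datExprSTLMS (I am assuming samples are in rows, since dist() clusters rows), so adapt the names to your own objects.

```r
# Toy cluster labels with the same cluster sizes as in the question
# (hypothetical data, not the real cutreeDynamic output)
Cutreecluster_Sample <- rep(0:4, times = c(1, 24, 22, 18, 7))

# !0 is TRUE, and TRUE is coerced to 1 in the comparison,
# so the original test only selects cluster 1
wrong <- (Cutreecluster_Sample == !0)
identical(wrong, Cutreecluster_Sample == 1)

# The corrected test keeps every sample that is not in cluster 0
keepSamples <- (Cutreecluster_Sample != 0)
table(keepSamples)

# Stand-in expression matrix: 72 samples in rows, 3 made-up genes in columns
datExpr_toy <- matrix(rnorm(72 * 3), nrow = 72)

# Drop the outlier sample by row; drop = FALSE keeps the result a matrix
datExpr_filtered <- datExpr_toy[keepSamples, , drop = FALSE]
dim(datExpr_filtered)
```

With the real data the last step would presumably be datExprSTLMS[keepSamples, ], which should leave you with 71 samples.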