I am stuck on a problem: I am trying to apply SOM analysis using the kohonen package in R. The dataset I am using is a gene expression data frame. I am using the code given below,
dim(bink)
[1] 401 1198
Here bink is a data frame holding expression values for 1198 genes, with the gene names as column names, as shown below,
The rest of the code is given below,
grid <- somgrid(xdim = 5, ydim = 5, topo = "hexagonal")
som.wines <- som(scale(bink), grid = grid)
str(som.wines)
plot(som.wines, type = "mapping")
After applying the code above I get the plot shown below,
However, I am not able to get the names of the genes clustered in each circle. I tried to use the answer given here, for which the code is given below,
x= attr(som.wines$data,"scaled:center")
y= attr(som.wines$data,"scaled:scale")
for (i in 1:ncol(som.wines$data)){
z[,i] = som.wines$data[,i][som.wines$unit.classif==1] * y[i]+x[i]
}
Then I am getting the error given below,
# Error in 1:ncol(som.wines$data) : argument of length 0
I also tried changing the way to access the data by using som.wines$data[[1]], but it does not work.
Is there any way to solve this problem?
Thanks
Using data(wines) as an example:
som.wines <- som(scale(wines), grid = somgrid(5, 5, "hexagonal"))
Each big circle in your plot is a cluster of samples, i.e. of rows in the data.
The profiles of the clusters are stored in som.wines$codes. Each row there is a cluster, V1 to Vx, so the number of rows equals the number of big circles. The cluster assigned to each row of the original data is stored in som.wines$unit.classif.
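A minimal sketch for inspecting both objects on the wines data. It assumes nothing about your kohonen version: in kohonen 3.x, as far as I know, both som.wines$data and som.wines$codes are lists with one matrix per data layer, so ncol(som.wines$data) returns NULL, which would explain the "argument of length 0" error in the question. The seed is only there to make the run repeatable.

```r
library(kohonen)

data(wines)
set.seed(1)  # SOM training is stochastic; the seed only makes the run repeatable
som.wines <- som(scale(wines), grid = somgrid(5, 5, "hexagonal"))

# In kohonen 3.x `codes` is a list with one matrix per data layer;
# older versions stored a plain matrix, so handle both cases.
codes <- if (is.list(som.wines$codes)) som.wines$codes[[1]] else som.wines$codes

dim(codes)                      # one row per unit, one column per variable
length(som.wines$unit.classif)  # one unit number per row of the input
table(som.wines$unit.classif)   # how many samples fall into each unit
```

Units that received no samples simply do not appear in the table.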
Associate the clusters with your original data with
cbind(wines, cluster=som.wines$unit.classif)
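Because the SOM clusters rows, a table with genes in the columns (like bink in the question) gives clusters of samples, not of genes. One possible way to get gene names per circle is to transpose the scaled matrix so that genes become rows. The sketch below uses a small random matrix named expr as a hypothetical stand-in for bink; the gene names, the matrix size and the 3 x 3 grid are made up for illustration.

```r
library(kohonen)

set.seed(1)
# Hypothetical stand-in for `bink`: 40 samples (rows) x 30 genes (columns)
expr <- matrix(rnorm(40 * 30), nrow = 40,
               dimnames = list(NULL, paste0("gene", 1:30)))

# Scale each gene across samples, then transpose so that genes become rows
genes.in.rows <- t(scale(expr))
som.genes <- som(genes.in.rows, grid = somgrid(3, 3, "hexagonal"))

# Gene names grouped by the unit (circle) they were assigned to
genes.by.unit <- split(rownames(genes.in.rows), som.genes$unit.classif)
genes.by.unit
```

Whether scaling per gene before transposing is the right normalisation depends on the data; it is only one reasonable choice here.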
The arrangement of big circles in the plot corresponds to the row order in som.wines$codes: the bottom left big circle is V1 and the top right is Vx, i.e. the last cluster.
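The numbering can be checked from the grid coordinates stored in the fitted object. The sketch below reads som.wines$grid$pts and then writes the unit numbers onto the mapping plot; I am assuming here that plot() draws the circles at those same coordinates, so the text() call should land on the circles, but treat that part as unverified.

```r
library(kohonen)

data(wines)
set.seed(1)
som.wines <- som(scale(wines), grid = somgrid(5, 5, "hexagonal"))

pts <- som.wines$grid$pts   # one row of (x, y) coordinates per unit
n.units <- nrow(pts)

pts[1, ]        # first unit: smallest y, i.e. the bottom row
pts[n.units, ]  # last unit: largest y, i.e. the top row

# Overlay the unit numbers on the mapping plot
plot(som.wines, type = "mapping")
text(pts[, 1], pts[, 2], labels = seq_len(n.units), col = "red")
```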