
R: Retrieve the optimal number of clusters from NbClust() according to the majority rule without looking at the console


I am running NbClust() on one-dimensional data:

nc <- NbClust(df, distance="euclidean", min.nc=2, max.nc=10, method="complete")

and get the following output on my console:

[1] "Frey index : No clustering structure in this data set"
*** : The Hubert index is a graphical method of determining the number of clusters.
                In the plot of Hubert index, we seek a significant knee that corresponds to a 
                significant increase of the value of the measure i.e the significant peak in Hubert
                index second differences plot. 

*** : The D index is a graphical method of determining the number of clusters. 
                In the plot of D index, we seek a significant knee (the significant peak in Dindex
                second differences plot) that corresponds to a significant increase of the value of
                the measure. 

******************************************************************* 
* Among all indices:                                                
* 1 proposed 4 as the best number of clusters 
* 1 proposed 8 as the best number of clusters 
* 2 proposed 9 as the best number of clusters 
* 2 proposed 10 as the best number of clusters 

                   ***** Conclusion *****                            

* According to the majority rule, the best number of clusters is  9 


*******************************************************************

How can I retrieve the value 9 (from the last line of the output above) programmatically, without reading it off the console?

Thank you!

The normalized data look as follows:

df <- structure(list(V1 = c(-0.142196220923589, 4.3271395706369, 5.00420146139183, 
    -0.292948282536991, -0.292948282536991, -0.292948282536991, -0.191455118249021, 
    -0.292948282536991, -0.292948282536991, -0.292948282536991, 1.04365387777657, 
    0.150712390018241, -0.275757257967042, -0.292948282536991, -0.292948282536991, 
    0.00392748792098075, -0.0235120320656692, 0.150712390018241, 
    -0.292948282536991, 0.22278245456149, -0.292948282536991, -0.292948282536991, 
    0.0888908208916921, -0.292948282536991, -0.269806518692829, -0.292948282536991, 
    -0.292948282536991, -0.292948282536991, -0.292948282536991, -0.287328139889123, 
    -0.030454561218918, 0.25980927671215, -0.292948282536991, -0.223192394378158, 
    -0.292948282536991, -0.292948282536991, -0.292948282536991, 0.0657490570475295, 
    -0.292948282536991, -0.292948282536991, -0.292948282536991, -0.215258075345874, 
    0.0862460478809306, 0.0862460478809306, -0.522051744594201, -0.518084585078059, 
    -0.496595804365622, -0.522051744594201, -0.516431601946333, -0.518084585078059
    )), .Names = "V1", row.names = c(NA, -50L), class = "data.frame")

Solution

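  • Thanks to zx8754, I found that the following returns the value printed on the console. It counts the distinct cluster labels in nc$Best.partition, the partition that NbClust() reports for the best number of clusters: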

    length(unique(nc$Best.partition))
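As a cross-check, the majority vote can also be recomputed from the per-index proposals. The sketch below is only an illustration under stated assumptions: that the first row of nc$Best.nc holds the number of clusters proposed by each index, that the graphical indices (Hubert, D index) are recorded there as 0, and that ties go to the smaller number of clusters (which matches the output above, where 9 and 10 each received two votes). The helper name majority_nc and the small demo matrix are hypothetical; the demo matrix merely mimics the vote counts shown on the console and is not the actual nc$Best.nc.

```r
# Hypothetical helper: majority vote over a Best.nc-like matrix.
# Row 1 is assumed to hold the number of clusters proposed by each index.
majority_nc <- function(best_nc) {
  votes <- as.numeric(best_nc[1, ])
  # Drop indices that made no proposal (NA, or 0 for graphical indices).
  votes <- votes[!is.na(votes) & votes > 0]
  tab <- table(votes)  # levels are sorted in ascending numeric order
  # which.max() returns the first maximum, so ties go to the smaller value.
  as.integer(names(tab)[which.max(tab)])
}

# Illustrative stand-in for nc$Best.nc, built from the vote counts printed
# above (1 x 4, 1 x 8, 2 x 9, 2 x 10) plus two graphical indices coded as 0.
best_nc <- rbind(
  Number_clusters = c(4, 8, 9, 9, 10, 10, 0, 0),
  Value_Index     = rep(NA_real_, 8)
)

majority_nc(best_nc)
```

With a real result, the call would be majority_nc(nc$Best.nc); if the assumptions hold, it should agree with length(unique(nc$Best.partition)).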