Could someone explain what "isotropic Gaussian blobs", as generated by sklearn.datasets.make_blobs(), actually means? I don't understand the term, and the only description I found in the sklearn documentation is "Generate isotropic Gaussian blobs for clustering". I have also gone through this question.
So, here's my doubt:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_blobs
# generate the data set
X, y = make_blobs(n_samples = 100000, n_features = 2, centers = 2, random_state = 2, cluster_std = 1.5)
# scatter plot of blobs
plt.scatter(X[:, 0], X[:, 1], c = y, s = 50, cmap = 'RdBu')
# distribution of first feature
sns.histplot(x = X[:, 0], kde = True)
The distribution of this feature is approximately normal.
# distribution of second feature
sns.histplot(x = X[ :, 1], kde = True, color = "green", alpha = 0.2 )
The distribution of the second feature is bimodal, so it is not normal.
# overall distribution of values
sns.histplot(x = X.flatten(), color = "red", kde = True, alpha = .5)
This is also not normal!
# variance-covariance matrix of the features
np.cov(X[:, 0], X[:, 1])
Output:
array([[ 3.55546911,  4.70526192],
       [ 4.70526192, 19.00023664]])
So what does "Gaussian" actually mean here? It might be a silly question, so apologies in advance.
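One thing I could still check is whether the Gaussian shape holds within each blob rather than in the pooled data. Below is a minimal sketch of that check, assuming the label array y returned by make_blobs identifies the blob each sample came from (the names per_cluster_cov and cluster are just mine, for illustration):

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same settings as in the question
X, y = make_blobs(n_samples=100000, n_features=2, centers=2,
                  random_state=2, cluster_std=1.5)

# Covariance matrix of each blob on its own, keyed by its label
per_cluster_cov = {}
for label in np.unique(y):
    cluster = X[y == label]                      # samples of one blob only
    per_cluster_cov[int(label)] = np.cov(cluster, rowvar=False)
    print(label, cluster.mean(axis=0))           # estimated blob center
    print(per_cluster_cov[int(label)])           # 2 x 2 covariance of this blob
```

If "isotropic" describes each blob separately, I would expect every per-cluster matrix to be close to cluster_std**2 times the identity (2.25 on the diagonal here, roughly 0 off the diagonal), with the much larger pooled variances coming from the distance between the two centers.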
To sum everything up in a nutshell, the notebook I used to explore make_blobs() is here: make_blobs_notebook