I am using Shogun 6.1.3 and Python 3.6.
I am trying to get replicable results in shogun for testing purposes, but I do not see a way to control the random seed.
I have tried:
import random
import numpy
from shogun import KMeans
random.seed(0)
numpy.random.seed(0)
km = KMeans(seed=0)
I want to do this for many Shogun algorithms, but here is a simple example using KMeans:
from shogun import KMeans, RealFeatures, MulticlassLabels, EuclideanDistance
import numpy
trainX = numpy.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]] * 3).astype(float)
trainY = numpy.array([[0], [1], [2]] * 3).astype(float).flatten()
testX = numpy.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]]).astype(float)
Xtrain = RealFeatures(trainX.T)
Ytrain = MulticlassLabels(trainY)
Xtest = RealFeatures(testX.T)
km = KMeans()
km.set_distance(EuclideanDistance(Xtrain, Xtrain))
km.train(Xtrain)
labs = km.apply_multiclass(Xtest)
labs.get_labels()
The output of labs.get_labels() is different each time, but I believe setting the random seed should yield a consistent result. Is there an attribute I can set, or some other way to control the randomness and get a consistent result?
In Shogun 6.1.3 (and earlier versions), you can use the global static call Math.init_random(seed).
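Applied to the example from the question, that could look roughly like the sketch below. It assumes that Math can be imported from the shogun module in the same way as KMeans, and that seeding before the model is built and trained is sufficient; predict_with_seed is a hypothetical helper name, not part of Shogun. The import guard only lets the snippet load on machines without Shogun installed.

```python
try:
    import numpy
    # Shogun 6.1.3 class names, as used in the question and the answer.
    from shogun import KMeans, RealFeatures, EuclideanDistance, Math
    HAVE_SHOGUN = True
except ImportError:
    HAVE_SHOGUN = False


def predict_with_seed(seed):
    """Seed Shogun's global RNG, train KMeans, and return the test labels."""
    train_x = numpy.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]] * 3).astype(float)
    test_x = numpy.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]]).astype(float)
    x_train = RealFeatures(train_x.T)
    x_test = RealFeatures(test_x.T)

    # Assumption: seeding here, before construction and training, covers
    # every random draw that KMeans makes.
    Math.init_random(seed)

    km = KMeans()
    km.set_distance(EuclideanDistance(x_train, x_train))
    km.train(x_train)
    return km.apply_multiclass(x_test).get_labels()


if HAVE_SHOGUN:
    first = predict_with_seed(0)
    second = predict_with_seed(0)
    # If the global seed controls all randomness, the two runs should agree.
    print(numpy.array_equal(first, second))
```

Because the seed is global, it has to be reset before every run you want to compare; any other Shogun call that draws random numbers in between will shift the stream.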
Since having a global seed leads to reproducibility issues in multi-threaded settings, we have recently removed this in the develop branch of Shogun. Instead, you can set the seed (recursively) of particular objects using obj.put("seed", my_seed). Or, even simpler, use kwargs-style initializers in Python: km = sg.machine("KMeans", k=2, distance=d, seed=1).
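To illustrate the difference between a global seed and a per-object seed, here is a small sketch in plain Python. It is not Shogun code: ToyInitializer is a hypothetical stand-in for an algorithm that draws random starting points, and the standard-library random module plays the role of the shared generator.

```python
import random


class ToyInitializer:
    """Hypothetical algorithm that owns its random generator."""

    def __init__(self, seed):
        # Per-object generator: no other code can advance its state.
        self._rng = random.Random(seed)

    def draw(self, n=3):
        return [self._rng.randint(0, 99) for _ in range(n)]


def draw_from_global(n=3):
    """Draws from the module-level generator shared by the whole process."""
    return [random.randint(0, 99) for _ in range(n)]


# Global seed: results depend on everything else that touches the generator.
random.seed(0)
undisturbed = draw_from_global()
random.seed(0)
random.random()  # stands in for another thread or library consuming the stream
disturbed = draw_from_global()  # drawn from a shifted stream

# Per-object seed: unrelated draws elsewhere do not matter.
a = ToyInitializer(seed=1).draw()
random.random()
b = ToyInitializer(seed=1).draw()
print(a == b)
```

With the global generator, the second set of draws comes from a different position in the stream and will generally not match the first, even though the seed was the same. The per-object version gives the same draws regardless of what happens to the shared generator, which is the property the seed parameter is meant to provide.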
Both of those are documented in the generated meta examples, using the 6.1.3 and develop branch respectively. The website examples will be updated with the next release.