Can the random seed be controlled from shogun's python interface?


Using Shogun 6.1.3 and Python 3.6.

I am trying to get replicable results in shogun for testing purposes, but I do not see a way to control the random seed.

I have tried:

import random
import numpy
from shogun import KMeans

random.seed(0)
numpy.random.seed(0)
km = KMeans(seed=0)

I want to do this for many Shogun algorithms, but here is a simple example using KMeans:

from shogun import KMeans, RealFeatures, MulticlassLabels, EuclideanDistance
import numpy

trainX = numpy.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]] * 3).astype(float)
trainY = numpy.array([[0], [1], [2]] * 3).astype(float).flatten()
testX = numpy.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]]).astype(float)

Xtrain = RealFeatures(trainX.T)
Ytrain = MulticlassLabels(trainY)
Xtest = RealFeatures(testX.T)

km = KMeans()
km.set_distance(EuclideanDistance(Xtrain, Xtrain))
km.train(Xtrain)
labs = km.apply_multiclass(Xtest)
labs.get_labels()

labs.get_labels() is different each time, but I believe setting the random seed should yield a consistent result. Is there an attribute I can set, or some other way to control the randomness and get a consistent result?


Solution

  • In Shogun 6.1.3 (and earlier versions), you can set the global seed with the static call Math.init_random(seed).

    Because a global seed leads to reproducibility issues in multi-threaded settings, we have recently removed it in the develop branch of Shogun. Instead, you can set the seed of a particular object (recursively) using obj.put("seed", my_seed) or, even simpler, using kwargs-style initializers in Python: km = sg.machine("KMeans", k=2, distance=d, seed=1).

    Both approaches are documented in the generated meta examples, for 6.1.3 and the develop branch respectively. The website examples will be updated with the next release.
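
As a rough sketch of the 6.1.3 route, the question's example could be wrapped so that the global seed is set before training. This assumes Math is importable from the shogun module alongside the other classes, that KMeans accepts (k, distance) in its constructor, and that k=3 suits the toy data; the function name kmeans_labels is purely illustrative and not part of Shogun.

```python
def kmeans_labels(seed):
    """Train KMeans on the question's toy data after seeding Shogun's global RNG."""
    # Imported inside the function so the file still loads where Shogun is absent.
    import numpy
    from shogun import KMeans, RealFeatures, EuclideanDistance, Math

    # Global seed call named in the answer (Shogun 6.1.3 and earlier).
    # Set it before constructing or training any machine.
    Math.init_random(seed)

    train_x = numpy.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]] * 3).astype(float)
    test_x = numpy.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]]).astype(float)

    # Shogun expects one sample per column, hence the transposes.
    x_train = RealFeatures(train_x.T)
    x_test = RealFeatures(test_x.T)

    # Assumption: the (k, distance) constructor, with k=3 chosen for this data.
    km = KMeans(3, EuclideanDistance(x_train, x_train))
    km.train(x_train)
    return km.apply_multiclass(x_test).get_labels()
```

If the seed takes effect as described, calling kmeans_labels(0) twice should give the same labels. On the develop branch this call is not available, so the per-object seed described above would be used instead.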