
mxnet custom activation function / op in numpy


I have a question regarding the syntax used in creating a custom activation function / op in mxnet. I was looking at this example: https://github.com/dmlc/mxnet/blob/master/example/numpy-ops/custom_softmax.py

Specifically, this part:

class Softmax(mx.operator.CustomOp):
    def forward(self, is_train, req, in_data, out_data, aux):
        x = in_data[0].asnumpy()
        y = np.exp(x - x.max(axis=1).reshape((x.shape[0], 1)))
        y /= y.sum(axis=1).reshape((x.shape[0], 1))
        self.assign(out_data[0], req[0], mx.nd.array(y))

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        l = in_data[1].asnumpy().ravel().astype(int)  # np.int was removed in NumPy 1.24
        y = out_data[0].asnumpy()
        y[np.arange(l.shape[0]), l] -= 1.0
        self.assign(in_grad[0], req[0], mx.nd.array(y))

What's up with in_data[0] vs in_data[1], and out_data[0] vs out_data[1]? What do the indices correspond to?

Thanks!


Solution

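  • in_data = [input, label] and out_data = [output]. So in_data[0] is the input, in_data[1] is the label, and out_data[0] is the only output; there is no out_data[1] for this op.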

    Take a look at the SoftmaxOutput operator source, which checks for the same layout: https://github.com/dmlc/mxnet/blob/master/src/operator/softmax_output-inl.h

    CHECK_EQ(in_data.size(), 2U) << "SoftmaxOutput Input: [data, label]";
    CHECK_EQ(out_data.size(), 1U) << "SoftmaxOutput Output: [output]";
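
To make the indexing concrete, here is a rough NumPy-only sketch of the same forward/backward logic, with plain Python lists standing in for in_data, out_data and in_grad. It does not use MXNet at all; the function names softmax_forward and softmax_backward and the sample arrays are made up for illustration. In the linked example the order of the lists appears to come from what the op's property class returns from list_arguments and list_outputs, but treat that as my reading of the example rather than a full description of the API.

```python
import numpy as np


def softmax_forward(in_data):
    """Hypothetical stand-in for CustomOp.forward: only in_data[0] is used."""
    x = in_data[0]                       # index 0 -> the input batch
    y = np.exp(x - x.max(axis=1, keepdims=True))
    y /= y.sum(axis=1, keepdims=True)
    return [y]                           # out_data has a single entry


def softmax_backward(in_data, out_data):
    """Hypothetical stand-in for CustomOp.backward: uses label and output."""
    labels = in_data[1].ravel().astype(int)   # index 1 -> the labels
    grad = out_data[0].copy()                 # index 0 -> the softmax output
    grad[np.arange(labels.shape[0]), labels] -= 1.0
    return [grad]                        # plays the role of in_grad[0]


# Sample data (assumed for this sketch): 2 rows, 3 classes.
data = np.array([[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]])
label = np.array([2, 0])

in_data = [data, label]                  # same order as [input, label]
out_data = softmax_forward(in_data)
in_grad = softmax_backward(in_data, out_data)

print(len(in_data), len(out_data))       # two inputs, one output
print(out_data[0].sum(axis=1))           # each row of the softmax sums to 1
```

The forward pass never touches in_data[1]; the label is only needed in backward, where it picks which entry of each output row gets 1 subtracted.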