
Custom MXNet Operators and kAddTo


I'm writing a C++ custom operator in MXNet and am having trouble finding documentation on when kAddTo is set in an operator invocation. As a minimal example, let's say that my new operator is called foo() and I want to perform the following calculation:

A = mx.sym.Variable('A')
B = mx.sym.Variable('B')
T = mx.sym.foo(A)
T += mx.sym.foo(B)

In general, how do I ensure that the fourth line above accumulates into T as opposed to creating a new temporary storage for the result of mx.sym.foo(B) and then performing the T = T + temp calculation?

(Using the Kernighan-Ritchie debugger, aka print statements, I found that kWriteTo is set on both lines three and four. The enum kAddTo is never set.)

A bit more detail concerning my specific problem: in my current implementation foo() zeroes out the output memory before performing a calculation which populates it with the appropriate values. I definitely only want to perform this zeroing out when creating a new output location, not when accumulating into an existing one.

Update

Offline, a colleague suggested using

mx.sym.elemwise_add(lhs=T, rhs=mx.sym.foo(B), out=T)

in place of line 4, above. However, I still saw that kWriteTo was being set in both lines of computation. I then received the following response:

“Memory planning and inplace operations are automatic. It will be done automatically. Users don’t need to worry about it.”, which probably means that req[0] is not an accurate indicator in this case. If you want to verify whether it’s an inplace addTo, you can print out the value of outputs[0].dptr_ and lhs.dptr_ to see whether they are equal.

I haven't checked this yet.


Solution

  • An operator cannot control the mode in which it is executed. Only the graph optimizer knows the context in which the operator is used, so only it can decide whether the operator should run in kWriteTo or kAddTo mode. More precisely, this decision is made in the DetectInplaceAddTo method. Even if an operator is executed in kAddTo mode in some case today, that behavior may change in the future if the logic that optimizes the computational graph changes.

    “Memory planning and inplace operations are automatic. It will be done automatically. Users don’t need to worry about it.”

    This means that an operator cannot control the mode in which it is executed; however, the operator MUST strictly obey the mode that has been requested (kWriteTo or kAddTo). For example, if the mode is kWriteTo and the operator adds its result to the output instead of overwriting what is there, the result is unpredictable, since the output buffer might contain garbage. Conversely, if the mode is kAddTo but the operator does not support it, the outcome can be even worse: instead of adding its result to the output, the operator silently overwrites it, and cases like this are usually very hard to debug. From time to time this leads to bugs like this one.

    So, in short:

    In general, how do I ensure that the fourth line above accumulates into T as opposed to creating a new temporary storage for the result of mx.sym.foo(B) and then performing the T = T + temp calculation?

    You cannot; the operator does not decide which mode it is executed in. Even if a given configuration uses kAddTo today, that may not hold in future versions of MXNet. In the future there might be new APIs for sending a hint (or suggestion) to the graph optimizer to use a particular mode, but I'm not aware of any such development.

    Now the question: in which particular cases will MXNet 0.10/0.11 use kAddTo?

    This is tricky. Looking at the following code:

      for (uint32_t nid = 0; nid < idx.num_nodes(); ++nid) {
        const auto& inode = idx[nid];
        if (inode.source->op() != ewise_plus_op) continue; // <= HERE
        int sid = storage_id[idx.entry_id(inode.inputs[0])];
    

    It looks like kAddTo is used only for _grad_add, which is sad. This might also be a bug, since maybe instead of:

    static const Op* ewise_plus_op = Op::Get("_grad_add");
    

    the actual intention was:

    static const Op* ewise_plus_op = Op::Get("elemwise_add");
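
To make the "obey the requested mode" rule concrete, here is a minimal sketch of a forward pass that branches on the request type. It is not MXNet code: the enum is a local stand-in that only mirrors the names of MXNet's OpReqType, FooForward is a hypothetical function, and foo(x) = 2 * x is an arbitrary placeholder computation. Plain std::vector buffers are used so the sketch compiles without any MXNet headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in mirroring the names of MXNet's OpReqType enum
// (an assumption made so this sketch needs no MXNet headers).
enum OpReqType { kNullOp, kWriteTo, kWriteInplace, kAddTo };

// Hypothetical forward pass for foo(); the placeholder math is foo(x) = 2 * x.
// The operator does not choose the mode; it only reacts to `req`.
void FooForward(const std::vector<float>& in, OpReqType req,
                std::vector<float>* out) {
  if (req == kNullOp) return;  // no output was requested
  assert(in.size() == out->size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    const float value = 2.0f * in[i];  // read the input before touching out
    if (req == kAddTo) {
      (*out)[i] += value;  // accumulate into what is already stored
    } else {
      (*out)[i] = value;   // kWriteTo / kWriteInplace: overwrite old contents
    }
  }
}
```

Writing each element directly, rather than zeroing the whole output first and then accumulating, avoids a separate zeroing pass and stays correct if the output buffer happens to alias the input, as it can under kWriteInplace.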