How to specify regularization parameter (L1 or L2) for a feed forward neural network in R using the mxnet package?

I am using R mxnet package. Here is the code block that I am currently using. But I am not sure how to specify regularization.

dpLnModel <- mx.model.FeedForward.create(symbol             = out,
                                         X                  = trainX,
                                         y                  = trainY,
                                         ctx                = mx.cpu(),
                                         num.round          = numIter,
                                         eval.metric        = mx.metric.rmse,
                                         array.batch.size   = 50,
                                         array.layout       = "rowmajor",
                                         verbose            = TRUE,
                                         optimizer          = "rmsprop",
                                         eval.data          = list(data  = testX,
                                                                   label = testY
                                         ),
                                         initializer        = mx.init.normal(initValVar),
                                         epoch.end.callback = mx.callback.log.train.metric(5, logger)
)

Solution

As @leezu's answer says, you need to set weight decay to get L2 regularisation. In the R API, the argument you need is wd e.g.

dpLnModel <- mx.model.FeedForward.create(symbol             = out,
                                         X                  = trainX,
                                         y                  = trainY,
                                         ctx                = mx.cpu(),
                                         num.round          = numIter,
                                         eval.metric        = mx.metric.rmse,
                                         array.batch.size   = 50,
                                         array.layout       = "rowmajor",
                                         verbose            = TRUE,
                                         optimizer          = "rmsprop",
                                         wd                 = 0.00001)

I think you can include any arguments from mx.opt.rmsprop. Note that the documentation there says that the default value of wd is zero i.e. no regularisation.