Tags: python, amazon-web-services, machine-learning, amazon-sagemaker, mxnet

What is the correct input format for MXNet's Linear Learner in AWS SageMaker?


I am trying to create a simple linear learner in AWS SageMaker with MXNet. I have never worked with SageMaker or MXNet before. Fitting the model fails with the following runtime error and shuts down the instance:

UnexpectedStatusException: Error for Training job linear-learner-2020-02-11-06-13-22-712: Failed. Reason: ClientError: Unable to read data channel 'train'. Requested content-type is 'application/x-recordio-protobuf'. Please verify the data matches the requested content-type. (caused by MXNetError)

I think the data should be converted to protobuf format before it is passed as training data. Could someone please explain what the correct input format for MXNet models is? What is the best way to convert a simple data frame into protobuf?


Solution

  • This end-to-end demo shows how to use Linear Learner with input data that is pre-processed in pandas DataFrames and then converted to protobuf using the SDK. But note that:

    • There is no need to use protobuf. You can also pass CSV data with the target variable in the first column of the files, as indicated here.
    • There is no need to know MXNet in order to use the SageMaker Linear Learner. Just use the SDK of your choice, bring your data to S3, and orchestrate training and inference :)
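
As a rough illustration of the two options above, here is a minimal sketch that prepares a pandas DataFrame either as CSV (target first, no header, no index) or as RecordIO-protobuf. The toy data, the column names, and both helper function names are made up for this example. The protobuf path assumes that the installed SageMaker Python SDK provides `sagemaker.amazon.common.write_numpy_to_dense_tensor`; check this against your SDK version.

```python
import io

import pandas as pd


def to_linear_learner_csv(df: pd.DataFrame, target: str) -> str:
    """Return CSV text with the target in the first column, no header, no index."""
    ordered = [target] + [c for c in df.columns if c != target]
    return df[ordered].to_csv(header=False, index=False)


def to_recordio_protobuf(df: pd.DataFrame, target: str) -> io.BytesIO:
    """Serialize the frame to RecordIO-protobuf (assumes the SageMaker Python SDK)."""
    # Imported lazily so the CSV helper still works without the SDK installed.
    from sagemaker.amazon.common import write_numpy_to_dense_tensor

    # Cast features and labels to float32 before serializing.
    features = df.drop(columns=[target]).to_numpy(dtype="float32")
    labels = df[target].to_numpy(dtype="float32")
    buf = io.BytesIO()
    write_numpy_to_dense_tensor(buf, features, labels)
    buf.seek(0)  # rewind so the buffer can be uploaded to S3 afterwards
    return buf


# Hypothetical toy data: two features and a numeric target called "y".
df = pd.DataFrame(
    {"x1": [1.0, 2.0, 3.0], "x2": [0.5, 0.25, 0.125], "y": [0, 1, 0]}
)
csv_text = to_linear_learner_csv(df, target="y")
print(csv_text)
```

Whichever format you upload to S3, the content type declared for the `train` channel has to match it (`text/csv` for the CSV variant). The error in the question suggests that the channel was declared as `application/x-recordio-protobuf` while the data was in a different format.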