python amazon-sagemaker scikit-learn-pipeline

Debug and deploy a featurizer (data processor for model inference) of a SageMaker endpoint

I am looking at this example to implement the processing of incoming raw data for a SageMaker endpoint prior to model inference/scoring. This is all great, but I have 2 questions:

How can one debug this (e.g., can I invoke the endpoint without it being exposed as a RESTful API and then use SageMaker Debugger)?
SageMaker can be used "remotely", e.g. via VSC. Can such a script be uploaded programmatically?

Thanks.

Solution

SageMaker Debugger is only for monitoring training jobs.

https://docs.aws.amazon.com/sagemaker/latest/dg/train-debugger.html

I don't think you can use it on endpoints.

The script that you have provided is used for both training and inference. The container used by the estimator decides which functions to run, so it is not possible to debug the script directly in the container. But what are you debugging in the code? The training part or the inference part?
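If it is the inference part, one workaround is to call the script's handler functions from a plain Python session, where an ordinary debugger and breakpoints work. The sketch below assumes the script follows the SageMaker scikit-learn container convention of input_fn, predict_fn and output_fn; the handler bodies and the run_locally helper are hypothetical stand-ins, not the code from the example.

```python
import csv
import io
import json


# Hypothetical stand-ins for the handlers a real featurizer script defines.
def input_fn(request_body, request_content_type):
    """Parse a CSV payload into rows of floats."""
    if request_content_type != "text/csv":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    reader = csv.reader(io.StringIO(request_body))
    return [[float(value) for value in row] for row in reader if row]


def predict_fn(input_data, model):
    """Apply the 'model' (here just a scale factor) to every value."""
    return [[value * model["scale"] for value in row] for row in input_data]


def output_fn(prediction, accept):
    """Serialize the transformed rows as JSON."""
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps({"instances": prediction})


def run_locally(body, content_type, accept, model):
    """Chain the handlers in the order input_fn -> predict_fn -> output_fn."""
    data = input_fn(body, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, accept)
```

With the real script you would import its handlers instead of the stand-ins, load the fitted model yourself (or through the script's model_fn), and set a breakpoint inside whichever handler misbehaves.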

When creating the estimator, you give an entry_point and, optionally, a source_dir. If you use only "entry_point", the value should be the path to the local script file. If you also use "source_dir", entry_point is the script's path relative to that directory, and source_dir can be an S3 path, in which case it must point to a tar.gz archive. So before running the estimator, you can programmatically tar the files, upload the archive to S3, and then use the S3 path in the estimator.
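A minimal sketch of the tar-and-upload step, assuming boto3 is installed and AWS credentials are configured. The helper names, bucket, key and file names are placeholders made up for illustration.

```python
import tarfile
from pathlib import Path


def make_source_tarball(source_dir, out_path):
    """Pack every file under source_dir into a gzip-compressed tarball.

    Files are stored relative to source_dir, so the entry-point script
    sits at the root of the archive.
    """
    source_dir = Path(source_dir)
    out_path = Path(out_path)
    with tarfile.open(out_path, "w:gz") as tar:
        for path in sorted(source_dir.rglob("*")):
            # Skip directories and the archive itself if it lives in source_dir.
            if path.is_file() and path.resolve() != out_path.resolve():
                tar.add(path, arcname=path.relative_to(source_dir).as_posix())
    return out_path


def upload_to_s3(local_path, bucket, key):
    """Upload a local file and return its s3:// URI (needs boto3 and credentials)."""
    import boto3  # imported lazily so the packing step works without boto3

    boto3.client("s3").upload_file(str(local_path), bucket, key)
    return f"s3://{bucket}/{key}"


# Hypothetical usage (all names below are placeholders):
# make_source_tarball("src", "sourcedir.tar.gz")
# s3_uri = upload_to_s3("sourcedir.tar.gz", "my-bucket", "code/sourcedir.tar.gz")
# Then pass source_dir=s3_uri and entry_point="featurizer.py" to the estimator.
```

Since this is plain Python, it can run from any machine with credentials, including a remote editor session.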