Runnning the dataeng-machine-learning
codelab on step 9. 4. Feature Engineering
.
The notebook step for running a tarin job is:
%%bash
OUTDIR=gs://${BUCKET}/taxifare/ch4/taxi_trained
JOBNAME=lab4a_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
--region=$REGION \
--module-name=trainer.task \
--package-path=${REPO}/courses/machine_learning/feateng/taxifare/trainer \
--job-dir=$OUTDIR \
--staging-bucket=gs://$BUCKET \
--scale-tier=BASIC \
--runtime-version=1.0 \
-- \
--train_data_paths="gs://$BUCKET/taxifare/ch4/taxi_preproc/train*" \
--eval_data_paths="gs://${BUCKET}/taxifare/ch4/taxi_preproc/valid*" \
--output_dir=$OUTDIR \
--num_epochs=100
That works great no matter how many time I run it.
However if I run:
%%bash
OUTDIR=gs://${BUCKET}/taxifare/ch4/taxi_trained
JOBNAME=lab4a_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
--region=$REGION \
--module-name=trainer.task \
--package-path=${REPO}/courses/machine_learning/feateng/taxifare/trainer \
--job-dir=$OUTDIR \
--staging-bucket=gs://$BUCKET \
--scale-tier=BASIC \
--runtime-version=1.0 \
-- \
--train_data_paths="gs://$BUCKET/taxifare/ch4/taxi_preproc/train*" \
--eval_data_paths="gs://${BUCKET}/taxifare/ch4/taxi_preproc/valid*" \
--output_dir=$OUTDIR \
--num_epochs=100 \
--verbosity DEBUG
Job fails after about 40 sec. with this in the logs:
The replica master 0 exited with a non-zero status of 2. Termination reason: Error.
I've found this usage in here: https://cloud.google.com/ml-engine/docs/how-tos/getting-started-training-prediction#cloud-train-single
So I guesss it's ok to use.
What am I doing wrong?
Note that every argument after the "-- \" line is a pass through to the tensorflow code and is therefore dependent on the individual sample code.
In this case, the "--verbosity" flag isn't supported by the sample you are running. Looking at the samples repo, it looks like the only sample that has that flag is the census estimator sample.