I am running a model on my own dataset with 2 classes (the project was implemented for training/testing with ImageNet). I have made all the changes (in the config files etc.), but after training finishes successfully, I get the following error when testing starts:
wrote gt roidb to ./data/cache/ImageNetVID_DET_val_gt_roidb.pkl
Traceback (most recent call last):
  File "experiments/dff_rfcn/dff_rfcn_end2end_train_test.py", line 20, in <module>
    test.main()
  File "experiments/dff_rfcn/../../dff_rfcn/test.py", line 53, in main
    args.vis, args.ignore_cache, args.shuffle, config.TEST.HAS_RPN, config.dataset.proposal, args.thresh, logger=logger, output_path=final_output_path)
  File "experiments/dff_rfcn/../../dff_rfcn/function/test_rcnn.py", line 68, in test_rcnn
    roidbs_seg_lens[gpu_id] += x['frame_seg_len']
KeyError: 'frame_seg_len'
I cleaned the cache before running. From what I have read in previous topics, this might be caused by .pkl files from previous datasets left in the cache. What may have caused this error? I should also mention that I changed the names of the .txt files that feed the network (in case this matters), and that training finishes without errors. This is my first time running a deep learning project, so please bear with me.
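One way to narrow this down is to look inside the cached roidb file named in the log and see whether its entries carry the 'frame_seg_len' key at all. The snippet below is only a rough sketch: it assumes the .pkl holds a list of dicts (one per image/frame), and the helper name entries_missing_key is made up for illustration. Run it with the same Python version the project uses, since pickles written by one major version may not load cleanly in another.

```python
import pickle


def entries_missing_key(roidb_path, key="frame_seg_len"):
    """Return the indices of roidb entries that do not contain `key`.

    Assumption: the pickle file stores a list of dicts, one per image/frame.
    """
    with open(roidb_path, "rb") as f:
        roidb = pickle.load(f)
    return [i for i, entry in enumerate(roidb) if key not in entry]
```

For example, calling entries_missing_key("./data/cache/ImageNetVID_DET_val_gt_roidb.pkl") returns the indices of the entries without that key. If every index comes back, the roidb was probably built by a code path that never sets the key, rather than left over from an old cache.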
MXNet typically uses methods other than pickle directly for serialization of the model architecture and trained weights.
With the Gluon API, you can save the weights of a model (i.e. a Block) to a file with .save_params() and then load the weights from that file with .load_params(). (Newer MXNet releases name these .save_parameters() and .load_parameters().) You 'save' the model architecture by keeping the code used to define the model. See an example of this here.
With the Module API, you can create checkpoints at the end of each epoch, which save the symbol (i.e. the model architecture) and the parameters (i.e. the model weights). See here.
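import mxnet as mx
model_prefix = 'mymodel'  # hypothetical prefix for the checkpoint files
# net (a Symbol) and train_iter (a DataIter) are assumed to be defined already
checkpoint = mx.callback.do_checkpoint(model_prefix)
mod = mx.mod.Module(symbol=net)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)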
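You can then load the model from a given checkpoint (epoch 42 in this example; the epoch must be one that was actually saved, so with num_epoch=5 as above it could be at most 5):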
sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 42)
mod.set_params(arg_params, aux_params)
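If you are not sure which epochs were actually checkpointed, you can list them from the files on disk before calling load_checkpoint. This is a small sketch using only the standard library. It assumes the usual naming scheme of '<prefix>-symbol.json' plus '<prefix>-NNNN.params' (one params file per epoch), and saved_epochs is a hypothetical helper name, not part of MXNet.

```python
import glob
import re


def saved_epochs(model_prefix):
    """Return the sorted epoch numbers that have a '<prefix>-NNNN.params' file.

    Assumption: checkpoints follow the '<prefix>-%04d.params' naming scheme.
    """
    pattern = re.compile(re.escape(model_prefix) + r"-(\d+)\.params$")
    epochs = []
    # glob.escape keeps special characters in the prefix from acting as wildcards
    for path in glob.glob(glob.escape(model_prefix) + "-*.params"):
        match = pattern.search(path)
        if match:
            epochs.append(int(match.group(1)))
    return sorted(epochs)
```

Pass one of the returned epoch numbers as the second argument to mx.model.load_checkpoint. Asking for an epoch that is not in the list will fail because the corresponding .params file does not exist.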