I am using the code below with the tsai library to train a time-series binary classification model, but I cannot reproduce the validation-set accuracy when I load the best saved model and run the validation data through it. Perhaps my method of loading the best model at test time is wrong. Can someone please tell me what I am doing wrong, and what the correct way is to load the saved best model and test it?
Train code:
from tsai.all import *   # TSClassifier, TSClassification, TSStandardize, metrics and callbacks
import numpy as np
from pickle import load  # assumption: the train/test arrays were saved with pickle

def train(dataset_idx):
    print("starting process:", dataset_idx)
    # load the pickled train/test arrays for this dataset index
    y_test = load(open(r"y_test_" + str(dataset_idx) + ".pkl", 'rb'))
    X_test = load(open(r"X_test_" + str(dataset_idx) + ".pkl", 'rb'))
    y_train = load(open(r"y_train_" + str(dataset_idx) + ".pkl", 'rb'))
    X_train = load(open(r"X_train_" + str(dataset_idx) + ".pkl", 'rb'))
    l = X_train.shape[0]
    print("data loaded")
    # concatenate train and test so a single splits tuple can index both parts
    X_train = np.concatenate([X_train, X_test], axis=0)
    y_train = np.concatenate([y_train, y_test], axis=0)
    del X_test, y_test
    splits = [i for i in range(l)], [i for i in range(l, X_train.shape[0])]
    print("dataset generated")
    tfms = [None, TSClassification()]
    batch_tfms = TSStandardize()
    precision = Precision()
    recall = Recall()
    save_callback = SaveModelCallback(monitor='valid_loss', comp=None, fname=str(dataset_idx) + '_best_model',
                                      every_epoch=False, at_end=False, with_opt=False, reset_on_fit=True)
    early_stopping = EarlyStoppingCallback(monitor='valid_loss', patience=3)
    clf = TSClassifier(X_train, y_train, splits=splits, arch="InceptionTimePlus", tfms=tfms,
                       batch_tfms=batch_tfms, bs=[1024], metrics=[precision, recall],
                       cbs=[save_callback, early_stopping])
    clf.fit_one_cycle(50, 2.5e-4)
    clf.export(str(dataset_idx) + ".pkl")
Test code:
learner = TSClassifier(X_sample, y_sample, splits=splits, arch="InceptionTimePlus", tfms=tfms, batch_tfms=batch_tfms)
learner.load(str(dataset_idx) + '_best_model')
# get_X_preds returns (probabilities, targets, decoded predictions)
y_pred, _, _ = learner.get_X_preds(X_test)
# take the probability of the first class and threshold it at 0.5
y_pred = y_pred.numpy()[:, 0]
y_pred = np.asarray(y_pred > 0.5).astype(int)
At test time I get all zeros, whereas during training the model reported about 70% precision and 20% recall on the validation split.
Also, if I load two different exported learners and then load the same best-model checkpoint into both, the outputs for the same data differ. Ideally they should be identical, since the same best model is loaded for testing:
clf = load_learner("models/clf1.pkl")
clf.load(str(dataset_idx) + '_best_model')
probas, target, preds = clf.get_X_preds(X[splits[1]], y[splits[1]])

clf = load_learner("models/clf2.pkl")
clf.load(str(dataset_idx) + '_best_model')
probas, target, preds = clf.get_X_preds(X[splits[1]], y[splits[1]])
In the above two scenarios the output for the same data is different, and I don't understand why. Can someone please let me know the correct method?
Use at_end=False in the SaveModelCallback, save (export) the model after training, and then load that exported model:
save_callback = SaveModelCallback(monitor='valid_loss', comp=None, fname='sample_best_model', every_epoch=False, at_end=False, with_opt=False, reset_on_fit=True)
early_stopping = EarlyStoppingCallback(monitor='valid_loss', patience=3)
clf = TSClassifier(X_sample, y_sample, splits=splits, arch="InceptionTimePlus", tfms=tfms, batch_tfms=batch_tfms, bs=[1024], metrics=[precision, recall], cbs=[save_callback, early_stopping])
clf.fit_one_cycle(10, 2.5e-4)
clf.export("sample_trained_model.pkl")
from tsai.inference import load_learner
mv_clf = load_learner("sample_trained_model.pkl")
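For inference with the exported learner, the same get_X_preds call used above should work. A minimal sketch, assuming X_new is a hypothetical NumPy array shaped like the training samples ([n_samples, n_vars, seq_len]):

# X_new is a placeholder for your own data with the same shape as X_sample
probas, _, preds = mv_clf.get_X_preds(X_new)
print(probas)  # class probabilities from the best weights restored at the end of training
print(preds)   # decoded class labels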