Tags: python, scikit-learn, crfsuite

My sklearn_crfsuite model does not learn anything


I'm trying to build an annotation prediction model, following the tutorial here, but my model doesn't learn anything. Here is a sample of my training data and labels:

```
[{'bias': 1.0, 'word.lower()': '\nreference\nissue\ndate\ndgt86620\n4\n \n19-dec-05\nfalcon\n7x\ntype\ncertification\n27_4-100\nthis\ndocument\nis\nthe\nintellectual\nprop...nairbrakes\nhandle\nposition\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n0\ntable\n1\n:\nairbrake\ncas\nmessages\n', 'word[-3:]': 'es\n', 'word[-2:]': 's\n', 'word.isupper()': False, 'word.istitle()': False, 'word.isdigit()': False, 'postag': 'POS', 'postag[:2]': 'PO', 'w_emb_0': 0.03418987928976114, 'w_emb_1': 0.6173382811066742, 'w_emb_2': 0.004420982990809508, 'w_emb_3': 0.08293022662242588, 'w_emb_4': 0.22162269482070363, 'w_emb_5': 0.4334545347397811, 'w_emb_6': 0.7844891779932379, 'w_emb_7': 0.028043262790094503, 'w_emb_8': 0.5233847386564157, 'w_emb_9': 0.9685677133128328, 'w_emb_10': 0.19379126558708126, 'w_emb_11': 0.2809608896964926, 'w_emb_12': 0.384759230815804, 'w_emb_13': 0.15385904662767336, 'w_emb_14': 0.5206500040610533, 'w_emb_15': 0.009148526006733215, 'w_emb_16': 0.5894118695171416, 'w_emb_17': 0.7356989708459056, 'w_emb_18': 0.5576774100159024, 'w_emb_19': 0.2185294430010376, 'BOS': True, '+1:word.lower()': 'reference', '+1:word.istitle()': False, '+1:word.isupper()': True, '+1:postag': 'POS', '+1:postag[:2]': 'PO'},
 {'bias': 1.0, 'word.lower()': 'reference', 'word[-3:]': 'NCE', 'word[-2:]': 'CE', 'word.isupper()': True, 'word.istitle()': False, 'word.isdigit()': False, 'postag': 'POS', 'postag[:2]': 'PO', 'w_emb_0': -0.390038, 'w_emb_1': 0.30677223, 'w_emb_2': -1.010975, 'w_emb_3': 0.3656154, 'w_emb_4': 0.5319459, 'w_emb_5': 0.45572615, 'w_emb_6': -0.46090943, 'w_emb_7': 0.87250936, 'w_emb_8': 0.036648277, 'w_emb_9': -0.3057043, 'w_emb_10': 0.33427167, 'w_emb_11': -0.19664396, 'w_emb_12': -0.64899784, 'w_emb_13': -0.1785065, 'w_emb_14': -0.117423356, 'w_emb_15': 0.16247013, 'w_emb_16': 0.11694676, 'w_emb_17': -0.30693895, 'w_emb_18': -1.0026807, 'w_emb_19': 0.9946743, '-1:word.lower()': '\nreference...n \n \n \n \n \n \n \n \n0\ntable\n1\n:\nairbrake\ncas\nmessages\n', '-1:word.istitle()': False, '-1:word.isupper()': False, '-1:postag': 'POS', '-1:postag[:2]': 'PO', '+1:word.lower()': 'issue', '+1:word.istitle()': False, '+1:word.isupper()': True, '+1:postag': 'POS', '+1:postag[:2]': 'PO'},
 {'bias': 1.0, 'word.lower()': 'issue', 'word[-3:]': 'SUE', 'word[-2:]': 'UE', 'word.isupper()': True, 'word.istitle()': False, 'word.isdigit()': False, 'postag': 'POS', 'postag[:2]': 'PO', 'w_emb_0': -1.2204882, 'w_emb_1': 0.8920707, 'w_emb_2': -3.8380668, 'w_emb_3': 1.5641377, 'w_emb_4': 2.1918254, 'w_emb_5': 1.8509868, 'w_emb_6': -2.0664182, 'w_emb_7': 3.1591077, 'w_emb_8': -0.33126026, 'w_emb_9': -1.4278139, 'w_emb_10': 0.9291533, 'w_emb_11': -0.6761407, 'w_emb_12': -2.9582167, 'w_emb_13': -0.5395561, 'w_emb_14': -0.8363763, 'w_emb_15': 0.25568742, 'w_emb_16': 0.4932978, 'w_emb_17': -1.6198335, 'w_emb_18': -4.183924, 'w_emb_19': 4.281094, '-1:word.lower()': 'reference', '-1:word.istitle()': False, '-1:word.isupper()': True, '-1:postag': 'POS', '-1:postag[:2]': 'PO', '+1:word.lower()': 'date', '+1:word.istitle()': False, '+1:word.isupper()': True, '+1:postag': 'POS', '+1:postag[:2]': 'PO'}, ...]

y_train = ['O', 'O', 'O', ..., 'I-data-c-a-s_message-type', ..., 'B-data-c-a-s_message-type']
```
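These dicts follow the tutorial's `word2features` pattern. For context, here is a minimal sketch of the kind of feature function that produces them; the `embeddings(word)` lookup used for the `w_emb_*` keys is a hypothetical stand-in for however the 20-dimensional word embeddings were actually computed:

```python
def word2features(sent, i):
    """Build the feature dict for token i of a sentence of (word, postag) pairs.

    Sketch following the sklearn-crfsuite tutorial; `embeddings` is a
    hypothetical word -> vector lookup, not part of the original post.
    """
    word, postag = sent[i]
    features = {
        'bias': 1.0,
        'word.lower()': word.lower(),
        'word[-3:]': word[-3:],
        'word[-2:]': word[-2:],
        'word.isupper()': word.isupper(),
        'word.istitle()': word.istitle(),
        'word.isdigit()': word.isdigit(),
        'postag': postag,
        'postag[:2]': postag[:2],
    }
    # One feature per embedding dimension, matching the w_emb_* keys above.
    for j, value in enumerate(embeddings(word)):
        features[f'w_emb_{j}'] = float(value)
    if i > 0:
        prev_word, prev_postag = sent[i - 1]
        features.update({
            '-1:word.lower()': prev_word.lower(),
            '-1:word.istitle()': prev_word.istitle(),
            '-1:word.isupper()': prev_word.isupper(),
            '-1:postag': prev_postag,
            '-1:postag[:2]': prev_postag[:2],
        })
    else:
        features['BOS'] = True  # beginning of sequence
    if i < len(sent) - 1:
        next_word, next_postag = sent[i + 1]
        features.update({
            '+1:word.lower()': next_word.lower(),
            '+1:word.istitle()': next_word.istitle(),
            '+1:word.isupper()': next_word.isupper(),
            '+1:postag': next_postag,
            '+1:postag[:2]': next_postag[:2],
        })
    else:
        features['EOS'] = True  # end of sequence
    return features
```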

and here is the model definition and training:

```python
import sklearn_crfsuite
from sklearn_crfsuite import metrics

crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1,                          # coefficient for L1 regularization
    c2=0.1,                          # coefficient for L2 regularization
    max_iterations=100,
    all_possible_transitions=True
)
crf.fit(X_train, y_train)

y_pred = crf.predict(X_test)

# Evaluate on the entity labels only ('O' excluded), grouped by entity type
labels = [label for label in crf.classes_ if label != 'O']
sorted_labels = sorted(labels, key=lambda name: (name[1:], name[0]))

msg = metrics.flat_classification_report(y_test, y_pred, labels=sorted_labels, digits=4)
print(msg)
```

Unfortunately, the model doesn't learn anything:

```
                           precision    recall  f1-score   support

B-data-c-a-s_message-type     0.0000    0.0000    0.0000        23
I-data-c-a-s_message-type     0.0000    0.0000    0.0000        90

                micro avg     0.0000    0.0000    0.0000       113
                macro avg     0.0000    0.0000    0.0000       113
             weighted avg     0.0000    0.0000    0.0000       113
```
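Since every score is exactly zero, it is worth checking whether the CRF learned any weights at all. Below is a minimal diagnostic sketch using the `transition_features_` and `state_features_` attributes that sklearn_crfsuite exposes after fitting (as in the tutorial); the helper function name is my own:

```python
from collections import Counter

def print_top_features(crf, n=10):
    # transition_features_: {(label_from, label_to): weight}
    # state_features_:      {(attribute, label): weight}
    print("Most likely transitions:")
    for (src, dst), w in Counter(crf.transition_features_).most_common(n):
        print(f"{src:>30} -> {dst:<30} {w:.4f}")
    print("Strongest state features:")
    for (attr, label), w in Counter(crf.state_features_).most_common(n):
        print(f"{w:8.4f} {label:<30} {attr}")

print_top_features(crf)
```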

Solution

  • The problem is solved. As you can see above, the support (the number of evaluation samples) totals 113, while the training set contained only about 14 samples, which is far too small; I simply hadn't noticed the mismatch. The training and test datasets had been swapped. After inverting them back (a quick size check, sketched below, would have caught this), the performance looks like this:

    ```
                               precision    recall  f1-score   support

    B-data-c-a-s_message-type     0.0000    0.0000    0.0000         0
    I-data-c-a-s_message-type     0.6364    1.0000    0.7778        14

                    micro avg     0.6364    1.0000    0.7778        14
                    macro avg     0.3182    0.5000    0.3889        14
                 weighted avg     0.6364    1.0000    0.7778        14
    ```
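For reference, a minimal sketch of the size check that would have caught the inverted split; `X` and `y` (the full dataset) and the 80/20 split are assumptions, not part of the original code:

```python
from sklearn.model_selection import train_test_split

# Sanity-check the split before training: here it would have shown
# ~14 training samples against 113 evaluation samples.
print(f"train: {len(X_train)}, test: {len(X_test)}")

# Hypothetical rebuild of the split from the full dataset (X, y),
# keeping ~80% of the sequences for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```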