keras deep-learning nlp text-classification elmo

Keras prediction result (getting the score, use of argmax)


I am trying to use the ELMo model for text classification on my own dataset. Training is complete, and there are 4 classes (I used a Keras model with ELMo embeddings). At prediction time I get a NumPy array. The sample code and the result are below.

import time

import numpy as np
import tensorflow as tf
import keras.backend as K

# Excerpt: data, classes, model, elmo_model and build_model are defined elsewhere,
# and the return at the end implies this sits inside a function.
# Shape (n, 1): one row per element of data.
new_text_pr = np.array(data, dtype=object)[:, np.newaxis]
with tf.Session() as session:
    K.set_session(session)
    session.run(tf.global_variables_initializer())
    session.run(tf.tables_initializer())
    model_elmo = build_model(classes)
    model_elmo.load_weights(model+"/"+elmo_model)
    t = time.time()
    predicted = model_elmo.predict(new_text_pr)
    print("time: ", time.time() - t)
    print(predicted)
    # print(predicted[0][0])
    # argmax over the first row of the prediction array only
    print("result:",np.argmax(predicted[0]))
    return np.argmax(predicted[0])

When I print the predicted variable, I get this:

time:  1.561854362487793
 [[0.17483692 0.21439584 0.24001297 0.3707543 ]
 [0.15607062 0.24448264 0.4398888  0.15955798]
 [0.06494818 0.3439018  0.42254424 0.16860574]
 [0.08343349 0.37218323 0.32528472 0.2190985 ]
 [0.14868192 0.25948635 0.32722548 0.2646063 ]
 [0.0365712  0.4194748  0.3321385  0.21181548]
 [0.05350104 0.18225929 0.56712115 0.19711846]
 [0.08343349 0.37218323 0.32528472 0.2190985 ]
 [0.09541835 0.19085276 0.41069734 0.30303153]
 [0.03930932 0.40526104 0.45785302 0.09757669]
 [0.06377257 0.33980298 0.32396355 0.27246094]
 [0.09784496 0.2292052  0.44426462 0.22868524]
 [0.06089798 0.31685832 0.47317514 0.14906852]
 [0.03956613 0.46605557 0.3502095  0.14416872]
 [0.10513227 0.26166025 0.36598155 0.26722598]
 [0.15165758 0.22900137 0.50939053 0.10995051]
 [0.06377257 0.33980298 0.32396355 0.27246094]
 [0.11404029 0.21311268 0.46880838 0.2040386 ]
 [0.07556026 0.20502563 0.52019936 0.19921473]
 [0.11096822 0.23295449 0.36192006 0.29415724]
 [0.05018891 0.16656907 0.60114646 0.18209551]
 [0.08880813 0.2893545  0.44374797 0.1780894 ]
 [0.14868192 0.25948635 0.32722548 0.2646063 ]
 [0.09596984 0.18282187 0.5053091  0.2158991 ]
 [0.09428936 0.13995855 0.62395805 0.14179407]
 [0.10513227 0.26166025 0.36598155 0.26722598]
 [0.08244281 0.15743142 0.5462735  0.21385226]
 [0.07199708 0.2446867  0.44568574 0.23763043]
 [0.1339082  0.27288827 0.43478844 0.15841508]
 [0.07354636 0.24499843 0.44873005 0.23272514]
 [0.08880813 0.2893545  0.44374797 0.1780894 ]
 [0.14868192 0.25948635 0.32722548 0.2646063 ]
 [0.08924995 0.36547357 0.40014726 0.14512917]
 [0.05132649 0.28190497 0.5224545  0.14431408]
 [0.06377257 0.33980292 0.32396355 0.27246094]
 [0.04849219 0.36724472 0.39698333 0.1872797 ]
 [0.07206573 0.31368822 0.4667826  0.14746341]
 [0.05948553 0.28048623 0.41831577 0.2417125 ]
 [0.07582933 0.18771031 0.54879296 0.18766735]
 [0.03858965 0.20433436 0.5596278  0.19744818]
 [0.07443814 0.20681688 0.3933627  0.32538226]
 [0.0639974  0.23687115 0.5357675  0.16336392]
 [0.11005415 0.22901568 0.4279426  0.23298755]
 [0.12625505 0.22987585 0.31619486 0.32767424]
 [0.08893713 0.14554602 0.45740074 0.30811617]
 [0.07906891 0.18683094 0.5214609  0.21263924]
 [0.06316617 0.30398315 0.4475617  0.185289  ]
 [0.07060979 0.17987429 0.4829593  0.26655656]
 [0.0720717  0.27058697 0.41439256 0.24294883]
 [0.06377257 0.33980292 0.32396355 0.27246094]
 [0.04745338 0.25831962 0.46751252 0.22671448]
 [0.06624557 0.20708969 0.54820716 0.17845756]]
 result:3

Does anyone know why only the 0th index is taken? Treating the output as a list of lists, index 0 selects the first list, and argmax returns the index of the maximum value in that list. What, then, are the other lists for, and why aren't they considered? Also, is it possible to get the score from this output? I hope the question is clear. Is this the correct approach, or is it wrong?

I have found the issue and am posting it for others who run into the same problem.

Answer: When predicting, the ELMo model expects a list of strings. In my code the prediction data had been split, so the model predicted for each word separately. That is why I got this huge array. I have used a temporary fix: the data is appended to a list, and an empty string is appended to the same list. The model predicts for both list entries, but I take only the first prediction. This is not the correct way; it is a quick fix, and I hope to find a proper one later.
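
The shape difference described above can be reproduced with NumPy alone. This is a minimal sketch, not the original code: the sentence and the names `words`, `sentences`, `per_word` and `per_sentence` are made up for illustration, and no model is loaded. It only shows how many rows the input array has, since predict returns one output row per input row.

```python
import numpy as np

text = "the service was slow but the food was great"  # hypothetical input

# Input that has been split into words: one row per word.
words = text.split()
per_word = np.array(words, dtype=object)[:, np.newaxis]
print(per_word.shape)      # (9, 1)

# Whole sentence wrapped in a list: one row for the sentence.
sentences = [text]
per_sentence = np.array(sentences, dtype=object)[:, np.newaxis]
print(per_sentence.shape)  # (1, 1)
```

With the second form the model receives a single sample, so the prediction array would have one row of 4 class probabilities instead of one row per word.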


Solution

  • To find the predicted class for each test example, you need to use axis=1. So, in your case the predicted classes will be:

    >>> predicted_classes = predicted.argmax(axis=1)
    >>> predicted_classes
    [3 2 2 1 2 1 2 1 2 2 1 2 2 1 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2
     2 2 2 2 2 2 3 2 2 2 2 2 1 2 2]
    

    This means that the first test example belongs to class index 3 (the fourth class, since indices start at 0), the second test example belongs to class index 2 (the third class), and so on.

    The previous part answers your question (I think). Note that np.argmax(predicted[0]) in your code looks only at the first row, which is why the other rows are ignored. Now let's see what np.argmax(predicted) does. Calling np.argmax() without specifying an axis flattens the predicted matrix and returns the index of the maximum value in the flattened array.

    Here is a simple example of what I mean:

    >>> x = np.matrix(np.arange(12).reshape((3,4)))
    >>> x
    matrix([[ 0,  1,  2,  3],
            [ 4,  5,  6,  7],
            [ 8,  9, 10, 11]])
    >>> x.argmax()
    11
    

    Here 11 is the flat index (row 2, column 3, counted row by row) of the largest element in the whole matrix, which happens to be the value 11 as well.
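
To tie this back to the question, the sketch below puts the different calls side by side on the first three rows of the output printed in the question. The name `scores` is illustrative, and treating the largest probability in a row as that row's score is an assumption about what "score" means here.

```python
import numpy as np

# First three rows of the prediction array from the question.
predicted = np.array([
    [0.17483692, 0.21439584, 0.24001297, 0.3707543],
    [0.15607062, 0.24448264, 0.4398888, 0.15955798],
    [0.06494818, 0.3439018, 0.42254424, 0.16860574],
])

# Per-row class index: one prediction per input row.
predicted_classes = predicted.argmax(axis=1)
print(predicted_classes)            # [3 2 2]

# Per-row score: the probability assigned to the chosen class.
scores = predicted.max(axis=1)
print(scores)

# What the question's code does: argmax of the first row only.
print(np.argmax(predicted[0]))      # 3

# No axis: index into the flattened array (6 here, i.e. row 1, column 2).
flat = predicted.argmax()
print(flat, np.unravel_index(flat, predicted.shape))
```

np.unravel_index converts the flat index back into a (row, column) pair, which makes it easier to see which input row and which class the overall maximum came from.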