python tensorflow lstm sentence-similarity

TypeError: Fetch argument array has invalid type numpy.ndarray, must be a string or Tensor. (Can not convert a ndarray into a Tensor or Operation.)


I am trying to reproduce the results of the Siamese LSTM for comparing the semantic similarity of two sentences, from this repository: https://github.com/dhwajraj/deep-siamese-text-similarity

I am using TensorFlow 1.4 and Python 2.7.

train.py works properly. To evaluate the model, I created a match_valid.tsv file, which is a subset of the "train_snli.txt" file available there, and I modified the getTsvTestData function in input_helpers.py:

def getTsvTestData(self, filepath):
    print("Loading testing/labelled data from "+filepath+"\n")
    x1=[]
    x2=[]
    y=[]
    # read labelled sentence pairs from file; the with block closes it afterwards
    with open(filepath) as f:
        for line in f:
            l=line.strip().split("\t")
            if len(l)<3:
                continue # skip lines that lack a label column
            x1.append(l[1].lower()) # text
            x2.append(l[0].lower()) # text
            y.append(int(l[2])) # similarity score 0 or 1
    return np.asarray(x1),np.asarray(x2),np.asarray(y)

The error comes from this part of the code in eval.py:

for db in batches:
            x1_dev_b,x2_dev_b,y_dev_b = zip(*db)
            #x1_dev_b = tf.convert_to_tensor(x1_dev_b,)
            print("type x1_dev_b {}".format(type(x1_dev_b))) # tuple
            print("type x2_dev_b {}".format(type(x2_dev_b))) # tuple
            print("type y_dev_b {}\n".format(type(y_dev_b))) # tuple

            feed = {input_x1: x1_dev_b, 
                    input_x2: x2_dev_b, 
                    input_y:y_dev_b, 
                    dropout_keep_prob: 1.0}

            batch_predictions, batch_acc, sim = sess.run([predictions,accuracy,sim], feed_dict=feed)

            print("type batch_predictions {}".format(type(batch_predictions))) # numpy.ndarray
            print("type batch_acc {}".format(type(batch_acc))) # numpy.float32
            print("type sim {}".format(type(sim))) # numpy.ndarray

            all_predictions = np.concatenate([all_predictions, batch_predictions])

            print("\n printing batch predictions {} \n".format(batch_predictions))

            all_d = np.concatenate([all_d, sim])

            print("DEV acc {} \n".format(batch_acc))

This is the error I get. I tried using a print statement in sess.run() to find the type, but it didn't work.

Traceback (most recent call last):
  File "eval.py", line 92, in <module>
    batch_predictions, batch_acc, sim = sess.run([predictions,accuracy,sim], feed_dict=feed)
  File "/home/joe/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/home/joe/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1105, in _run
    self._graph, fetches, feed_dict_tensor, feed_handles=feed_handles)
  File "/home/joe/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 414, in __init__
    self._fetch_mapper = _FetchMapper.for_fetch(fetches)
  File "/home/joe/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 234, in for_fetch
    return _ListFetchMapper(fetch)
  File "/home/joe/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 341, in __init__
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "/home/joe/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 242, in for_fetch
    return _ElementFetchMapper(fetches, contraction_fn)
  File "/home/joe/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 275, in __init__
    % (fetch, type(fetch), str(e)))
TypeError: Fetch argument array([ 1.,  1.,  0.,  0.,  0.,  1.,  1.,  0.,  1.,  0.,  0.,  1.,  0.,
        0.,  0.,  1.,  1.,  0.,  0.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,
        0.,  1.,  0.,  1.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  1.,
        0.,  0.,  1.,  1.,  1.,  0.,  1.,  1.,  0.,  1.,  1.,  1.,  1.,
        1.,  0.,  0.,  0.,  0.,  1.,  0.,  1.,  1.,  0.,  0.,  1.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.,  0.,
        0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,
        0.,  0.,  1.,  1.,  0.,  0.,  0.,  1.,  1.,  1.,  0.,  0.,  0.,
        0.,  0.,  0.,  1.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,
        0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        1.,  0.,  0.,  1.,  0.,  0.,  1.,  0.,  1.,  1.,  0.,  1.,  0.,
        0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  0.,  1.,  0.,  0.,  0.,
        1.,  1.,  1.,  1.,  0.,  1.,  1.,  0.,  0.,  1.,  0.,  0.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  0.,  0.,  0.,  1.,  0.,
        0.,  1.,  0.,  0.,  1.,  0.,  0.,  1.,  1.,  0.,  0.,  1.,  0.,
        0.,  0.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
        0.,  1.,  0.,  0.,  1.,  0.,  1.,  1.,  0.,  1.,  0.,  1.,  0.,
        0.,  0.,  0.,  1.,  0.,  0.,  0.,  1.,  0.,  1.,  0.,  0.,  1.,
        1.,  0.,  0.,  1.,  0.,  1.,  0.,  0.,  0.], dtype=float32) has invalid type <type 'numpy.ndarray'>, must be a string or Tensor. (Can not convert a ndarray into a Tensor or Operation.)

What I am actually trying to do is query similarity: comparing a query vector to all document vectors in my corpus and ranking the sentences by similarity score. I know that the LSTM currently just compares two sentences with each other and outputs the similarity as 0 or 1. How can I do that?


Solution

  • The problem is that you overwrite sim with the result of evaluating it. Before the loop, sim (I suppose) refers to a TensorFlow tensor or operation; after the first sess.run call it refers to a NumPy array. The second iteration then fails because sim is no longer something TensorFlow can fetch.

    Store the result under a different name, for example batch_sim:

    for db in batches:
                x1_dev_b,x2_dev_b,y_dev_b = zip(*db)
                #x1_dev_b = tf.convert_to_tensor(x1_dev_b,)
                print("type x1_dev_b {}".format(type(x1_dev_b))) # tuple
                print("type x2_dev_b {}".format(type(x2_dev_b))) # tuple
                print("type y_dev_b {}\n".format(type(y_dev_b))) # tuple
    
                feed = {input_x1: x1_dev_b, 
                        input_x2: x2_dev_b, 
                        input_y:y_dev_b, 
                        dropout_keep_prob: 1.0}
    
                batch_predictions, batch_acc, batch_sim = sess.run([predictions,accuracy,sim], feed_dict=feed)
    
                print("type batch_predictions {}".format(type(batch_predictions))) # numpy.ndarray
                print("type batch_acc {}".format(type(batch_acc))) # numpy.float32
                print("type batch_sim {}".format(type(batch_sim))) # numpy.ndarray
    
                all_predictions = np.concatenate([all_predictions, batch_predictions])
    
                print("\n printing batch predictions {} \n".format(batch_predictions))
    
                all_d = np.concatenate([all_d, batch_sim])
    
                print("DEV acc {} \n".format(batch_acc))
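
The same failure mode can be reproduced without TensorFlow. The sketch below is only an illustration under stated assumptions: FakeTensor and FakeSession are hypothetical stand-ins (they are not TensorFlow classes) that mimic how sess.run rejects a fetch that is not a graph element. It contrasts the shadowing loop with the renamed version.

```python
class FakeTensor(object):
    """Hypothetical stand-in for a graph tensor; not part of TensorFlow."""

    def __init__(self, values):
        self.values = values


class FakeSession(object):
    """Hypothetical stand-in for a session: it only accepts FakeTensor fetches."""

    def run(self, fetches):
        results = []
        for fetch in fetches:
            if not isinstance(fetch, FakeTensor):
                raise TypeError("Fetch argument %r has invalid type %r"
                                % (fetch, type(fetch)))
            # "Evaluating" a fake tensor just returns a copy of its values.
            results.append(list(fetch.values))
        return results


def run_batches_shadowed(sess, sim, n_batches):
    """Buggy pattern: the evaluated result is assigned back to the name sim."""
    for _ in range(n_batches):
        # After the first pass, sim is a plain list, so the next run() fails.
        sim, = sess.run([sim])
    return sim


def run_batches_fixed(sess, sim, n_batches):
    """Fixed pattern: the result gets its own name, sim keeps the tensor."""
    all_d = []
    for _ in range(n_batches):
        batch_sim, = sess.run([sim])
        all_d.extend(batch_sim)
    return all_d
```

With one batch both functions succeed, which is why the bug only shows up on the second iteration of the loop in eval.py.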