Tags: python, tensorflow, keras, batch_size

How to identify the shape of a tensor while using batch_size in Keras (initial_value must have a shape specified)


I have a custom layer. In one line of this custom layer, I do the following:

out = tf.Variable(tf.zeros(shape=tf.shape(tf_a1), dtype=tf.float32))

When I run the code, I receive this error:

ValueError: initial_value must have a shape specified: Tensor("lambda_1/zeros_2:0", shape=(?, 20), dtype=float32)
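The root of this error is the difference between the static shape (partially unknown when the batch size is not fixed) and the dynamic shape returned by `tf.shape`. A minimal sketch of the distinction, written against the TF 2.x API (the question uses 1.13, where the same rule applies in graph mode):

```python
import tensorflow as tf

# Sketch: with an unspecified batch dimension, x.shape is statically
# (None, 20), while tf.shape(x) is a tensor known only at run time.
# tf.Variable needs a fully defined static shape for its initializer,
# which is why tf.Variable(tf.zeros(tf.shape(x))) fails in graph mode.
@tf.function(input_signature=[tf.TensorSpec([None, 20], tf.float32)])
def static_vs_dynamic(x):
    return x.shape[0] is None, tf.shape(x)[0]

batch_unknown, runtime_batch = static_vs_dynamic(tf.zeros((5, 20)))
```

Here `batch_unknown` is True (the static batch dimension is undefined during tracing) while `runtime_batch` carries the actual batch size of the call.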

I searched and found out that I can use validate_shape=False.

So I changed the code to:

out = tf.Variable(tf.zeros(shape=tf.shape(tf_a1), dtype=tf.float32), validate_shape=False)

Then it raises this error:

ValueError: Input 0 is incompatible with layer repeater: expected ndim=2, found ndim=None

Update1

When I try this:

out = tf.Variable(tf.zeros_like(tf_a1, dtype=tf.float32))

It raises the error again:

initial_value must have a shape specified: Tensor("lambda_1/zeros_like:0", shape=(?, 20), dtype=float32)

Also, when I give the shape explicitly, like this:

out = tf.Variable(tf.zeros(shape=(BATCH_SIZE, LATENT_SIZE), dtype=tf.float32))

It raises this error:

ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
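This last error is typical of variable-assignment ops (such as `tf.scatter_nd_update`) sitting on the path from inputs to loss: the assignment is not differentiable, so the gradient is None. A functional scatter, by contrast, returns a new tensor and stays differentiable. A hypothetical sketch using the TF 2.x names:

```python
import tensorflow as tf

# Sketch: tf.tensor_scatter_nd_update is a pure function of its inputs,
# so gradients flow through it, unlike an in-place Variable update.
updates = tf.constant([2.0, 3.0])
with tf.GradientTape() as tape:
    tape.watch(updates)
    out = tf.tensor_scatter_nd_update(
        tf.zeros((2, 3)), tf.constant([[0, 1], [1, 2]]), updates)
    loss = tf.reduce_sum(out * out)

# Well-defined gradient: d(loss)/d(updates) = 2 * updates
grad = tape.gradient(loss, updates)
```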

In case it helps to figure out where this error comes from, this is the lambda layer, which just changes the matrix a little bit:

def score_cooccurance(tf_a1):
    N = tf.shape(tf_a1)[0]
    n = 2
    input_tf = tf.concat([tf_a1, tf.zeros((1, tf_a1.shape[1]), tf_a1.dtype)], axis=0)
    tf_a2 = tf.sort(sent_wids, axis=1)
    first_col_change = tf.zeros([tf_a2.shape[0], 1], dtype=tf.int32)
    last_cols_change = tf.cast(tf.equal(tf_a2[:, 1:], tf_a2[:, :-1]), tf.int32)
    change_bool = tf.concat([first_col_change, last_cols_change], axis=-1)
    not_change_bool = 1 - change_bool
    tf_a2_changed = tf_a2 * not_change_bool + change_bool * N #here

    idx = tf.where(tf.count_nonzero(tf.gather(input_tf, tf_a2_changed, axis=0), axis=1) >= n)
    y, x = idx[:, 0], idx[:, 1]
    rows_tf = tf.gather(tf_a2, y, axis=0)

    columns_tf = tf.cast(x[:, None], tf.int32)

    out = tf.Variable(tf.zeros(shape=(BATCH_SIZE, LATENT_SIZE), dtype=tf.float32))

    rows_tf = tf.reshape(rows_tf, shape=[-1, 1])

    columns_tf = tf.reshape(
        tf.tile(columns_tf, multiples=[1, tf.shape(tf_a2)[1]]),
        shape=[-1, 1])

    sparse_indices = tf.reshape(
        tf.concat([rows_tf, columns_tf], axis=-1),
        shape=[-1, 2])
    v = tf.gather_nd(input_tf, sparse_indices)
    v = tf.reshape(v, [-1, tf.shape(tf_a2)[1]])

    scatter = tf.scatter_nd_update(out, tf.cast(sparse_indices, tf.int32), tf.reshape(v, shape=[-1]))
    return scatter

Actually, when I print the shape of out, it prints <unknown>.
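As an aside, the index-construction part of the function (flattening the gathered rows, tiling the column index across the row width, and pairing them) can be sketched with NumPy on toy values to see what `sparse_indices` ends up containing. The numbers below are hypothetical, not taken from the question:

```python
import numpy as np

# Toy stand-ins for the tensors in score_cooccurance:
rows = np.array([[3, 5], [1, 2]])   # like tf.gather(tf_a2, y), shape (2, 2)
cols = np.array([[0], [4]])         # like x[:, None], shape (2, 1)

# Mirrors tf.reshape(rows_tf, [-1, 1]) and tf.tile + tf.reshape on columns_tf:
rows_flat = rows.reshape(-1, 1)
cols_tiled = np.tile(cols, (1, rows.shape[1])).reshape(-1, 1)

# Each entry of a row gets paired with that row's column index:
sparse_indices = np.concatenate([rows_flat, cols_tiled], axis=-1)
```

So every word id in a gathered row becomes a `(row, column)` pair pointing into the output matrix, which is what the scatter update then writes to.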

Any ideas or tricks for how I can fix this?

I am using TensorFlow 1.13.

Thanks for your help:)


Solution

  • The workaround in my case was to remove the tf.Variable and keep only tf.zeros. With that change alone, tf.scatter_nd_update raises an error, since it cannot be applied to plain tensors.

    It turns out there is tf.tensor_scatter_update (called tf.tensor_scatter_nd_update in later TensorFlow versions), which I did not know about before. So I changed that line as well, and now the code works fine, though I never found the root cause of the error. I just changed it this way to get it running successfully.

    out = tf.zeros(shape=tf.shape(tf_a1), dtype=tf.float32)
    scatter = tf.tensor_scatter_update(out, tf.cast(sparse_indices, tf.int32), tf.reshape(v, shape=[-1]))
    

    Thanks to @Daniel Moller for pointing out the concept of trainable variables... :)
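The key property of the fix above is that the functional scatter returns a new tensor rather than mutating a variable in place. Its semantics can be sketched with a NumPy stand-in (a hypothetical helper, not the TensorFlow implementation):

```python
import numpy as np

def tensor_scatter_update(tensor, indices, updates):
    # Functional scatter: copy the input, write updates at the given
    # (row, col) indices, and return the new array. The original input
    # is left untouched, so no mutable variable is needed.
    result = tensor.copy()
    result[tuple(indices.T)] = updates
    return result

out = np.zeros((3, 4), dtype=np.float32)
indices = np.array([[0, 1], [2, 3]])
updates = np.array([5.0, 7.0], dtype=np.float32)
result = tensor_scatter_update(out, indices, updates)
```

After the call, `result` holds 5.0 at (0, 1) and 7.0 at (2, 3), while `out` is still all zeros, mirroring how `tf.tensor_scatter_update` leaves its input tensor unchanged.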