Tags: python, tensorflow, keras, quantization

Map each tensor value to the closest value in a list


I have a tensor A of shape [batchSize, 2, 2, 2], where batchSize is a placeholder. In a custom layer, I would like to map each value of this tensor to the closest value in a list c of length n. The list is my codebook, and I want to quantize the tensor against it: for each tensor value, find the closest value in the list and replace the tensor value with it. I could not figure out a 'clean' tensor operation that does this quickly, and I cannot loop over batchSize. Is there a way to do this in TensorFlow?


Solution

  • If I understand correctly, this is doable with tf.contrib.lookup.HashTable (TensorFlow 1.x). As an illustration, I sampled the input from a normal distribution with mean=0 and stddev=4.

    a = tf.random.normal(
      shape = [batch, 2, 2, 2],
      mean=0.0,
      stddev=4
    )
    

    I used a quantization with only 5 buckets (labeled 0, 1, 2, 3, 4 in the figure). This extends to any length n. Note that I intentionally gave the buckets different widths.

    [Figure: the five buckets, labeled 0 to 4]

    My codebook is therefore:

    a <= -2          -> bucket 4
    -2 < a < -0.5    -> bucket 3
    -0.5 <= a < 0.5  -> bucket 0
    0.5 <= a < 2.5   -> bucket 1
    a >= 2.5         -> bucket 2
    

    The idea is to pre-create a key/value mapping from a scaled a to the bucket number. The number of <key, value> pairs depends on the input granularity you need; here I scaled by 10. Below is the code that initializes the mapping table, followed by the mapping it produces (input scaled by 10).

    # The key range follows from clipping the input to [-4, 4]:
    # after scaling by 10, the keys run from -40 to 40 inclusive.
    keys = list(range(-40, 41))
    values = []
    for k in keys:
      if k <= -20:      # a <= -2
        values.append(4)
      elif k < -5:      # -2 < a < -0.5
        values.append(3)
      elif k < 5:       # -0.5 <= a < 0.5
        values.append(0)
      elif k < 25:      # 0.5 <= a < 2.5
        values.append(1)
      else:             # a >= 2.5
        values.append(2)
    # Print the full key -> bucket mapping.
    for (k, v) in zip(keys, values):
      print("%2d -> %2d" % (k, v))
    
    
    -40 ->  4
    -39 ->  4
    ...
    -22 ->  4
    -21 ->  4
    -20 ->  4
    -19 ->  3
    -18 ->  3
    ...
    -7 ->  3
    -6 ->  3
    -5 ->  0
    -4 ->  0
    ...
     3 ->  0
     4 ->  0
     5 ->  1
     6 ->  1
     ...
    23 ->  1
    24 ->  1
    25 ->  2
    26 ->  2
    ...
    40 ->  2
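To make the lookup step concrete before the TensorFlow version, here is a minimal plain-Python sketch of what happens to a single value: clip, scale, truncate to an integer key, then look the key up. The names `table` and `quantize` are hypothetical and only for illustration; `int()` is used because it truncates toward zero, which is also what casting a float tensor to an integer type does.

```python
SCALE = 10

# Same mapping as the table above: scaled integer key -> bucket number.
table = {}
for k in range(-40, 41):
    if k <= -20:
        table[k] = 4
    elif k < -5:
        table[k] = 3
    elif k < 5:
        table[k] = 0
    elif k < 25:
        table[k] = 1
    else:
        table[k] = 2


def quantize(a, default=-1):
    """Clip to [-4, 4], scale, truncate toward zero, look up the bucket."""
    clipped = max(-4.0, min(4.0, a))
    key = int(clipped * SCALE)  # truncates toward zero, e.g. -5.5 -> -5
    return table.get(key, default)


for x in (-0.60313278, -0.55, 1.01861215, 8.45582008):
    print(x, "->", quantize(x))
```

One consequence of truncating toward zero is that values just below a negative boundary can land one bucket closer to zero than the codebook states. For example, -0.55 becomes key -5 and therefore bucket 0, although the codebook assigns -2 < a < -0.5 to bucket 3. A larger scale factor narrows this gap at the cost of a bigger table.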
    
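    import tensorflow as tf  # TensorFlow 1.x (uses tf.contrib and tf.Session)

    batch = 3
    a = tf.random.normal(
        shape=[batch, 2, 2, 2],
        mean=0.0,
        stddev=4,
        dtype=tf.dtypes.float32
    )
    # Clip to the range covered by the table, then scale to integer keys.
    clip_a = tf.clip_by_value(a, clip_value_min=-4, clip_value_max=4)
    SCALE = 10
    # Casting to int32 truncates toward zero.
    scaled_clip_a = tf.cast(clip_a * SCALE, tf.int32)
    
    # Keys that are missing from the table map to the default value -1.
    table = tf.contrib.lookup.HashTable(
        tf.contrib.lookup.KeyValueTensorInitializer(keys, values), -1)
    # Flatten for the lookup, then restore the input shape. Using tf.shape
    # keeps this working when the batch size is only known at run time.
    quantized_a = tf.reshape(
        table.lookup(tf.reshape(scaled_clip_a, [-1])),
        tf.shape(scaled_clip_a))
    
    with tf.Session() as sess:
      table.init.run()  # the table must be initialized before any lookup
      a, clip_a, scaled_clip_a, quantized_a = sess.run([a, clip_a, scaled_clip_a, quantized_a])
      print('a\n%s' % a)
      print('clip_a\n%s' % clip_a)
      print('scaled_clip_a\n%s' % scaled_clip_a)
      print('quantized_a\n%s' % quantized_a)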
    

    Result:

    a
    [[[[-0.26980758 -5.56331968]
       [ 5.04240322 -7.18292665]]
    
      [[-7.11545467 -3.24369478]
       [ 1.01861215 -0.04510783]]]
    
    
     [[[-0.28768024  0.2472897 ]
       [ 2.17780781 -5.79106379]]
    
      [[ 8.45582008  4.53902292]
       [ 0.138162   -6.19155598]]]
    
    
     [[[-7.5134449   4.56302166]
       [-0.30592337 -0.60313278]]
    
      [[-0.06204566  3.42917275]
       [-1.14547718  3.31167102]]]]
    clip_a
    [[[[-0.26980758 -4.        ]
       [ 4.         -4.        ]]
    
      [[-4.         -3.24369478]
       [ 1.01861215 -0.04510783]]]
    
    
     [[[-0.28768024  0.2472897 ]
       [ 2.17780781 -4.        ]]
    
      [[ 4.          4.        ]
       [ 0.138162   -4.        ]]]
    
    
     [[[-4.          4.        ]
       [-0.30592337 -0.60313278]]
    
      [[-0.06204566  3.42917275]
       [-1.14547718  3.31167102]]]]
    scaled_clip_a
    [[[[ -2 -40]
       [ 40 -40]]
    
      [[-40 -32]
       [ 10   0]]]
    
    
     [[[ -2   2]
       [ 21 -40]]
    
      [[ 40  40]
       [  1 -40]]]
    
    
     [[[-40  40]
       [ -3  -6]]
    
      [[  0  34]
       [-11  33]]]]
    quantized_a
    [[[[0 4]
       [2 4]]
    
      [[4 4]
       [1 0]]]
    
    
     [[[0 0]
       [1 4]]
    
      [[2 2]
       [0 4]]]
    
    
     [[[4 2]
       [0 3]]
    
      [[0 2]
       [3 2]]]]
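The code above relies on tf.contrib and tf.Session, both of which were removed in TensorFlow 2.x. As a rough sketch of how the same idea might look there, the following uses tf.lookup.StaticHashTable in eager mode. The function name `quantize`, the choice of int64 for keys and values, and the four hand-picked input values (taken from the run above) are my assumptions, not part of the original answer.

```python
import tensorflow as tf  # assumes TensorFlow 2.x

SCALE = 10

# Same key/value table as in the answer: scaled integer -> bucket number.
keys = list(range(-40, 41))
values = []
for k in keys:
    if k <= -20:
        values.append(4)
    elif k < -5:
        values.append(3)
    elif k < 5:
        values.append(0)
    elif k < 25:
        values.append(1)
    else:
        values.append(2)

# The table is initialized on construction; missing keys map to -1.
table = tf.lookup.StaticHashTable(
    tf.lookup.KeyValueTensorInitializer(
        tf.constant(keys, dtype=tf.int64),
        tf.constant(values, dtype=tf.int64)),
    default_value=-1)


def quantize(a):
    """Clip to [-4, 4], scale, truncate toward zero, look up the bucket."""
    clipped = tf.clip_by_value(a, -4.0, 4.0)
    scaled = tf.cast(clipped * SCALE, tf.int64)  # truncates toward zero
    flat = tf.reshape(scaled, [-1])
    # tf.shape(a) is evaluated at run time, so the batch size may be unknown.
    return tf.reshape(table.lookup(flat), tf.shape(a))


a = tf.constant([[-0.26980758, -5.56331968],
                 [2.17780781, -0.60313278]])
quantized_a = quantize(a)
print(quantized_a.numpy())
```

Note that, like the original, this returns bucket numbers rather than codebook values. If the codebook values themselves are needed, one more lookup from bucket number to value (for example with tf.gather on a tensor of representative values) would be required.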