I have a tensor A with shape [batchSize, 2, 2, 2], where batchSize is a placeholder. In a custom layer, I would like to map each value of this tensor to the closest value in a list c of length n. The list is my codebook, and I want to quantize each value in the tensor based on it; i.e., find the codebook value closest to each tensor value and replace the tensor value with it.
I could not figure out a 'clean' tensor operation that does this quickly, and I cannot loop over batchSize. Is there a way to do this in TensorFlow?
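For concreteness, here is a plain-Python toy of the mapping I mean; the codebook values below are just an example:

c = [-2.0, -0.5, 0.0, 0.5, 2.0]             # example codebook of length n = 5
x = 0.7                                     # one scalar taken from A
nearest = min(c, key=lambda v: abs(v - x))  # closest codebook value
print(nearest)                              # 0.5 -- every element of A should be replaced like this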
If I understand correctly, this is doable with a hash table lookup (tf.contrib.lookup.HashTable). As an illustration, I used a normal distribution with mean=0 and stddev=4:
a = tf.random.normal(
    shape=[batch, 2, 2, 2],
    mean=0.0,
    stddev=4
)
I used a quantization with only 5 buckets (numbered 0, 1, 2, 3, 4); the approach extends to any codebook length n. Note that I intentionally gave the buckets different widths.
My codebook is therefore:
a <= -2 -> bucket 4
-2 < a < -0.5 -> bucket 3
-0.5 <= a < 0.5 -> bucket 0
0.5 <= a < 2.5 -> bucket 1
a >= 2.5 -> bucket 2
The idea is to pre-create a key/value mapping from a scaled a to the bucket number. (The number of <key, value> pairs depends on the input granularity you need; here I scale by 10, i.e. a resolution of 0.1.) Below is the code that initializes the mapping table, followed by the mapping it produces (inputs scaled by 10).
# The key range is chosen because we clip to [-4, 4];
# after scaling by 10, the boundaries become -40 and 40.
keys = list(range(-40, 41))
values = []
for k in keys:
    if k <= -20:
        values.append(4)
    elif k < -5:
        values.append(3)
    elif k < 5:
        values.append(0)
    elif k < 25:
        values.append(1)
    else:
        values.append(2)

for (k, v) in zip(keys, values):
    print("%2d -> %2d" % (k, v))
-40 -> 4
-39 -> 4
...
-22 -> 4
-21 -> 4
-20 -> 4
-19 -> 3
-18 -> 3
...
-7 -> 3
-6 -> 3
-5 -> 0
-4 -> 0
...
3 -> 0
4 -> 0
5 -> 1
6 -> 1
...
23 -> 1
24 -> 1
25 -> 2
26 -> 2
...
40 -> 2
With the key/value lists built, the full pipeline is: clip, scale, cast to int32, look up in the hash table, and reshape back to the original shape.

batch = 3
a = tf.random.normal(
    shape=[batch, 2, 2, 2],
    mean=0.0,
    stddev=4,
    dtype=tf.dtypes.float32
)
# Clip to [-4, 4] so every scaled value falls inside the key range of the table.
clip_a = tf.clip_by_value(a, clip_value_min=-4, clip_value_max=4)
SCALE = 10
scaled_clip_a = tf.cast(clip_a * SCALE, tf.int32)
table = tf.contrib.lookup.HashTable(
    tf.contrib.lookup.KeyValueTensorInitializer(keys, values), -1)
# Flatten, look up each element, and restore the original shape.
quantized_a = tf.reshape(
    table.lookup(tf.reshape(scaled_clip_a, [-1])),
    [batch, 2, 2, 2])

with tf.Session() as sess:
    table.init.run()
    a, clip_a, scaled_clip_a, quantized_a = sess.run(
        [a, clip_a, scaled_clip_a, quantized_a])
    print('a\n%s' % a)
    print('clip_a\n%s' % clip_a)
    print('scaled_clip_a\n%s' % scaled_clip_a)
    print('quantized_a\n%s' % quantized_a)
Result:
a
[[[[-0.26980758 -5.56331968]
[ 5.04240322 -7.18292665]]
[[-7.11545467 -3.24369478]
[ 1.01861215 -0.04510783]]]
[[[-0.28768024 0.2472897 ]
[ 2.17780781 -5.79106379]]
[[ 8.45582008 4.53902292]
[ 0.138162 -6.19155598]]]
[[[-7.5134449 4.56302166]
[-0.30592337 -0.60313278]]
[[-0.06204566 3.42917275]
[-1.14547718 3.31167102]]]]
clip_a
[[[[-0.26980758 -4. ]
[ 4. -4. ]]
[[-4. -3.24369478]
[ 1.01861215 -0.04510783]]]
[[[-0.28768024 0.2472897 ]
[ 2.17780781 -4. ]]
[[ 4. 4. ]
[ 0.138162 -4. ]]]
[[[-4. 4. ]
[-0.30592337 -0.60313278]]
[[-0.06204566 3.42917275]
[-1.14547718 3.31167102]]]]
scaled_clip_a
[[[[ -2 -40]
[ 40 -40]]
[[-40 -32]
[ 10 0]]]
[[[ -2 2]
[ 21 -40]]
[[ 40 40]
[ 1 -40]]]
[[[-40 40]
[ -3 -6]]
[[ 0 34]
[-11 33]]]]
quantized_a
[[[[0 4]
[2 4]]
[[4 4]
[1 0]]]
[[[0 0]
[1 4]]
[[2 2]
[0 4]]]
[[[4 2]
[0 3]]
[[0 2]
[3 2]]]]
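Note that quantized_a contains bucket indices, not reconstructed values. If, as in the question, each element should be replaced by a representative codebook value, a tf.gather over a tensor of per-bucket representatives does that final step. On TF 2.x, where tf.contrib is gone, tf.lookup.StaticHashTable is the equivalent table. A minimal sketch under those assumptions (the centers values are placeholders I chose for illustration, not something fixed by the bucketing above):

import tensorflow as tf  # TF 2.x

# Same scaled-key -> bucket-index mapping as above.
keys = list(range(-40, 41))
values = [4 if k <= -20 else 3 if k < -5 else 0 if k < 5 else 1 if k < 25 else 2
          for k in keys]

table = tf.lookup.StaticHashTable(
    tf.lookup.KeyValueTensorInitializer(keys, values),
    default_value=-1)  # -1 flags out-of-range keys; clipping prevents them here

# Hypothetical per-bucket representative values, indexed by bucket number 0..4.
centers = tf.constant([0.0, 1.5, 3.0, -1.25, -3.0])

a = tf.random.normal([3, 2, 2, 2], mean=0.0, stddev=4.0)
scaled = tf.cast(tf.clip_by_value(a, -4.0, 4.0) * 10, tf.int32)
bucket_idx = tf.reshape(table.lookup(tf.reshape(scaled, [-1])), tf.shape(a))
quantized_values = tf.gather(centers, bucket_idx)  # same shape as a, values taken from centers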