
Keras for N-tuple Network (sparse input)


I'm trying to train an N-tuple network using Keras. An N-tuple network is just a sparse array of one-hot activated patterns. Imagine a chess board with 64 squares, each square containing one of N possible piece types: there will always be exactly 64 activated ones out of 64*N possible parameters, stored as a 2D array [64][N]. Or consider every possible 2x2 square, giving N^4 possible configurations for each such square. Such a network is linear and outputs a single value. Training is good old SGD and the likes.
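To make the encoding concrete, here is a minimal sketch of the index-based representation described above. The value of N and the helper name are illustrative, not from any library:

```python
# Sketch: index-based encoding of a board, assuming N piece types per square.
# A square s holding piece p activates exactly one slot: s * N + p.
N = 13  # hypothetical number of piece types (including "empty")

def board_to_indices(board):
    """board: list of 64 piece codes in [0, N).
    Returns the 64 active indices into the flat 64*N parameter vector."""
    return [square * N + piece for square, piece in enumerate(board)]

board = [0] * 64          # empty board: piece code 0 on every square
board[0] = 4              # put piece type 4 on square 0
indices = board_to_indices(board)
# The linear network's output is just the sum of the weights at these indices,
# which is what the C++ lookup-table implementation exploits.
```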

I successfully trained the network with my own C++ code, using lookup tables and summing. But I wanted to do it in Keras, since Keras allows for different optimization algorithms, use of GPUs, etc. For starters I flattened the 2D array into one big vector, but that soon became impractical: there are thousands of possible parameters, of which only a handful (a fixed number) are ones and the rest are zeros.

I was wondering whether in Keras (or a similar library) it is possible to use training data like this: 13,16,11,11,5,...,3, where those numbers are indexes, instead of one big vector of 0,0,0,1,0,0,......,1,0,0,0,....,1,0,0,0,...


Solution

  • You could use tf.sparse.SparseTensor(...) and then set sparse=True for tf.keras.Input(...).


    import tensorflow as tf

    def sparse_one_hot(y):

      # y: int64 column vector of shape (m, 1) holding class indices
      m = len(y)
      # number of classes; assumes every class appears at least once in y —
      # use a fixed value instead if you know it up front
      n_classes = len(tf.unique(tf.squeeze(y))[0])

      # SparseTensor indices are (row, column) pairs: row = sample, column = class
      rows = tf.range(m, dtype='int64')[:, None]
      indices = tf.concat([rows, y], axis=1)

      ones = tf.ones(shape=(m, ), dtype='float32')

      sparse_y = tf.sparse.SparseTensor(indices, ones, dense_shape=(m, n_classes))

      return sparse_y


    y = tf.random.uniform(shape=(10, 1), minval=0, maxval=4, dtype=tf.int64)

    sparse_y = sparse_one_hot(y)  # sparse_y.values, sparse_y.indices

    # set sparse=True for the Input layer:
    # tf.keras.Input(..., sparse=True, ...)
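
To close the loop, here is a minimal sketch of wiring such a sparse one-hot input into a linear Keras model, as the question describes (one Dense(1) layer, SGD). The feature count and the sample indices below are illustrative:

```python
# Sketch: a linear N-tuple-style model fed with a SparseTensor.
# Assumes TF 2.x, where tf.keras.layers.Dense accepts sparse inputs.
import tensorflow as tf

n_features = 64 * 13  # e.g. 64 squares times 13 hypothetical piece types

inp = tf.keras.Input(shape=(n_features,), sparse=True)
out = tf.keras.layers.Dense(1, use_bias=False)(inp)  # linear: sum of active weights
model = tf.keras.Model(inp, out)
model.compile(optimizer='sgd', loss='mse')

# One training sample with three active indices (arbitrary for the demo).
x = tf.sparse.SparseTensor(indices=[[0, 5], [0, 70], [0, 131]],
                           values=[1.0, 1.0, 1.0],
                           dense_shape=(1, n_features))
y = tf.constant([[1.0]])

model.fit(x, y, epochs=1, verbose=0)
pred = model.predict(x, verbose=0)  # shape (1, 1)
```

This keeps the training data in index form end to end: only the (row, index) pairs are ever materialized, never the full 0/1 vector.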