With the tf.contrib module gone from TensorFlow, and tf.train.Saver() gone as well, I cannot find a way to store a set of embeddings and their corresponding thumbnails so that the TensorBoard projector can read them.
The TensorBoard documentation for TensorFlow 2.0 explains how to create plots and summaries, and how to use the summary tool in general, but says nothing about the projector tool. Has anyone found out how to store datasets for visualization?
If possible, I would appreciate a (minimal) code example.
It seems there are still some issues left in TensorBoard. However, there are some workarounds (for now) for preparing embeddings for the projector with TensorFlow 2 (bug report at: https://github.com/tensorflow/tensorboard/issues/2471).
TensorFlow 1 code would look something like this:
import tensorflow as tf
from tensorboard.plugins import projector

embeddings = tf.compat.v1.Variable(latent_data, name='embeddings')
CHECKPOINT_FILE = TENSORBOARD_DIR + '/model.ckpt'

# Write the embedding variable to a checkpoint for TensorBoard
with tf.compat.v1.Session() as sess:
    saver = tf.compat.v1.train.Saver([embeddings])
    sess.run(embeddings.initializer)
    saver.save(sess, CHECKPOINT_FILE)

# Point the projector at the saved tensor and its metadata file
config = projector.ProjectorConfig()
embedding = config.embeddings.add()
embedding.tensor_name = embeddings.name
embedding.metadata_path = TENSORBOARD_METADATA_FILE
projector.visualize_embeddings(tf.compat.v1.summary.FileWriter(TENSORBOARD_DIR), config)
When using eager mode in TensorFlow 2, this should (presumably) look something like this:
import tensorflow as tf
from tensorboard.plugins import projector

embeddings = tf.Variable(latent_data, name='embeddings')
CHECKPOINT_FILE = TENSORBOARD_DIR + '/model.ckpt'

# Save the embedding variable with a TF2 checkpoint instead of tf.train.Saver
ckpt = tf.train.Checkpoint(embeddings=embeddings)
ckpt.save(CHECKPOINT_FILE)

config = projector.ProjectorConfig()
embedding = config.embeddings.add()
embedding.tensor_name = embeddings.name
embedding.metadata_path = TENSORBOARD_METADATA_FILE
writer = tf.summary.create_file_writer(TENSORBOARD_DIR)
projector.visualize_embeddings(writer, config)
However, there are two issues:

1. The writer created with tf.summary.create_file_writer does not have the function get_logdir() required by projector.visualize_embeddings. A simple workaround is to patch the visualize_embeddings function to take the logdir as a parameter (a sketch of such a patch is shown after the code below).
2. The tensor name in the checkpoint changes from embeddings to something like embeddings/.ATTRIBUTES/VARIABLE_VALUE (there are also additional variables in the map extracted by get_variable_to_shape_map(), but they are empty anyway).

The second issue was solved with the following quick-and-dirty workaround (and logdir is now a parameter of visualize_embeddings()):
import tensorflow as tf
from tensorboard.plugins import projector

embeddings = tf.Variable(latent_data, name='embeddings')
CHECKPOINT_FILE = TENSORBOARD_DIR + '/model.ckpt'

ckpt = tf.train.Checkpoint(embeddings=embeddings)
ckpt.save(CHECKPOINT_FILE)

# Look up the actual tensor name in the checkpoint, since tf.train.Checkpoint
# stores the variable under something like 'embeddings/.ATTRIBUTES/VARIABLE_VALUE'
reader = tf.train.load_checkpoint(TENSORBOARD_DIR)
shape_map = reader.get_variable_to_shape_map()
key_to_use = ""
for key in shape_map:
    if "embeddings" in key:
        key_to_use = key

config = projector.ProjectorConfig()
embedding = config.embeddings.add()
embedding.tensor_name = key_to_use
embedding.metadata_path = TENSORBOARD_METADATA_FILE
writer = tf.summary.create_file_writer(TENSORBOARD_DIR)
projector.visualize_embeddings(writer, config, TENSORBOARD_DIR)
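For completeness, here is a minimal sketch of what the patched visualize_embeddings (first issue) might look like, whether you patch it in place or define your own replacement. It assumes the projector plugin still reads its configuration from a projector_config.pbtxt file inside the log directory, which is what the unpatched function writes via the writer's get_logdir():

import os
from google.protobuf import text_format

def visualize_embeddings(summary_writer, config, logdir):
    # Instead of calling summary_writer.get_logdir() (which no longer exists
    # on writers created with tf.summary.create_file_writer), use the logdir
    # passed in explicitly and write the projector configuration where the
    # TensorBoard projector plugin expects to find it.
    config_path = os.path.join(logdir, 'projector_config.pbtxt')
    with open(config_path, 'w') as f:
        f.write(text_format.MessageToString(config))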
I did not find any examples of how to use TensorFlow 2 to directly write the embeddings for TensorBoard, so I am not sure whether this is the right way, but if it is, then those two issues would need to be addressed; at least for now there is a workaround.
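As a side note, the metadata file referenced by embedding.metadata_path is just a TSV read by the projector, with one label per row of latent_data (a header row is only needed when there are multiple columns). A minimal sketch, assuming a list labels aligned with the rows of latent_data and that TENSORBOARD_METADATA_FILE is a path inside TENSORBOARD_DIR (both names are placeholders):

# Hypothetical: 'labels' holds one label per row of latent_data
with open(TENSORBOARD_METADATA_FILE, 'w') as f:
    for label in labels:
        f.write(f'{label}\n')

After that, running tensorboard --logdir on TENSORBOARD_DIR and opening the Projector tab should pick up the checkpoint, the projector_config.pbtxt and the metadata.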