Tags: python, tensorflow, tensorboard

Cleaning up TensorFlow summaries


I have trained a model for a very long time (200,000 iterations). At each iteration I saved lots of data, such as the loss, accuracy, and weights, through the tf.summary.FileWriter() class. Yes, I know: that was stupid. As a result, I generated a huge summary file of almost 50 GB. Now I would like to drop most of the information and keep, say, one entry every 50 iterations. That would save a lot of disk space and speed up TensorBoard visualization, without significantly affecting the quality of the summary. Is it possible to do so?


Solution

  • The function that lets you read event files (the files where summaries are stored) is tf.train.summary_iterator. You could try something like this:

    import tensorflow as tf
    
    tfevents_filepath = 'path_to_existing_event_file'      # your existing event file
    tfevents_folder_new = 'path_to_new_event_file_folder'  # a fresh output folder
    
    writer = tf.summary.FileWriter(tfevents_folder_new)
    for e in tf.train.summary_iterator(tfevents_filepath):
      # Keep one event every 50 steps (or use any other criterion), but make
      # sure you keep the events from step 0: the first event in the file
      # carries metadata such as the file version that TensorBoard needs.
      if e.step == 0 or e.step % 50 == 0:
        writer.add_event(e)
    writer.close()
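
  • If you are on TensorFlow 2.x, note that tf.summary.FileWriter and tf.train.summary_iterator no longer exist under those names; the same approach should still work through the tf.compat.v1 namespace. A minimal sketch under that assumption, reusing the placeholder paths from above:

    import tensorflow as tf
    
    tfevents_filepath = 'path_to_existing_event_file'      # your existing event file
    tfevents_folder_new = 'path_to_new_event_file_folder'  # a fresh output folder
    
    # The TF1 symbols are kept under tf.compat.v1 in TensorFlow 2.x.
    writer = tf.compat.v1.summary.FileWriter(tfevents_folder_new)
    for e in tf.compat.v1.train.summary_iterator(tfevents_filepath):
      if e.step == 0 or e.step % 50 == 0:
        writer.add_event(e)
    writer.close()

    Once the new file is written, point TensorBoard at the new folder (tensorboard --logdir path_to_new_event_file_folder) and delete the original only after checking that the thinned summary loads as expected.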