
Identical TensorFlow/Keras models with different-sized weight/variable files


I have a pretrained model built by a colleague. I have an identical model (same network architecture) that I built and trained myself. By identical I mean the model summaries are the same: they have exactly the same number of trainable and non-trainable variables. I can load weights interchangeably between the two models.

Weirdly, the variables file in their model is about 50% of the size of mine. If I load and re-save their model, the weights file stays the same size (still about 50% of mine).

Possibly related, the performance of my model sucks compared to the pretrained model.

Any idea how 2 identical models can have weights files of different sizes?


Solution

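  • Turns out we used different optimizers. Optimizer state is saved along with the model's network and weights, and different optimizers keep different amounts of per-weight state, so the saved variables files end up different sizes even when the networks are identical.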
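
To make the size difference concrete, here is a rough back-of-the-envelope sketch in plain Python. It is not TensorFlow code: `SLOTS_PER_WEIGHT` and `estimated_variable_bytes` are hypothetical names made up for illustration, the model size is an assumed figure, and the estimate ignores file metadata and small scalar state such as the iteration counter. The slot counts reflect the usual per-weight state of each optimizer (plain SGD keeps none, momentum and RMSprop keep one tensor, Adam keeps two).

```python
# Hypothetical helper for illustration; not part of TensorFlow or Keras.
# Each entry is the number of extra per-weight tensors the optimizer keeps.
SLOTS_PER_WEIGHT = {
    "sgd": 0,           # plain SGD keeps no per-weight state
    "sgd_momentum": 1,  # one velocity tensor per weight
    "rmsprop": 1,       # moving average of squared gradients (no momentum)
    "adam": 2,          # first and second moment estimates
}


def estimated_variable_bytes(num_weights, optimizer, bytes_per_value=4):
    """Rough size of weights plus optimizer slots, ignoring file metadata."""
    copies = 1 + SLOTS_PER_WEIGHT[optimizer]  # the weights themselves + slots
    return num_weights * copies * bytes_per_value


num_weights = 1_000_000  # assumed model size, for illustration only
for name in SLOTS_PER_WEIGHT:
    size_mb = estimated_variable_bytes(num_weights, name) / 1e6
    print(f"{name:>12}: {size_mb:.1f} MB")
```

Under these assumptions, a model saved after training with plain SGD would have a variables file about half the size of the same model saved with a one-slot optimizer, which is one combination that would match the 50% figure in the question. If the optimizer state is not needed (for inference only, say), TF 2.x Keras lets you leave it out with `model.save(path, include_optimizer=False)`.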