Is it better in Keras to save a model or to save only the weights?

I have a list of classes and subclasses. Since they amount to around 700 classes for the last layer alone, I figured it would be better to train each set of classes separately. After training them I was able to use them together, but I found that the more models I load, the more memory they consume. As far as I can tell, Keras offers only two options: load the whole model, or load only the model weights. The problem is that I saw almost no change in RAM usage; both use about the same amount of memory. How can I load a model so that it consumes less memory?

Solution

Memory consumption depends heavily on the architecture of the model, that is, on how many parameters it has.

The loading method makes almost no difference, because the step that actually consumes memory is loading the weights, which can amount to millions of floats.
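To make that concrete, here is a rough back-of-the-envelope sketch of how much RAM the weights alone need. The helper names and the 2048-feature input are hypothetical, chosen only for illustration; only the 700-class output size comes from the question. It ignores activations, optimizer state, and framework overhead.

```python
# Rough estimate of the RAM needed just to hold a model's weights.
# All names and layer sizes here are illustrative assumptions.
BYTES_PER_DTYPE = {"float32": 4, "float16": 2, "int8": 1}


def dense_params(n_inputs, n_units):
    """Parameter count of a fully connected layer: kernel plus bias."""
    return n_inputs * n_units + n_units


def weight_memory_mib(num_params, dtype="float32"):
    """Approximate size of num_params weights in MiB for the given dtype."""
    return num_params * BYTES_PER_DTYPE[dtype] / (1024 ** 2)


# Hypothetical head: 2048 input features feeding a 700-class output layer.
head = dense_params(2048, 700)
print(head)                                # 1434300
print(round(weight_memory_mib(head), 2))   # 5.47
```

The same parameter count is held in memory whichever loading call put it there, which is why the two approaches look identical in RAM.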

In practice, load_model just saves you from writing separate code to load the architecture of the model (for example, from a JSON file that describes it).

Therefore, in terms of memory, load_model ~ load_weights: the load_model operation is roughly equivalent to loading the JSON architecture and then calling load_weights.

Neither load_model nor load_weights will make the model consume less memory.

What you can do is apply model pruning or post-training quantization, which can reduce the memory consumption.
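As a minimal sketch of the idea behind post-training quantization, the snippet below stores float32 weights as int8 codes plus a single scale factor. This is not the Keras or TensorFlow quantization tooling; it is a plain-Python illustration using the standard array module, with made-up weight values and hypothetical function names. In a real project you would use the quantization tools that ship with your framework.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float32 weights from the int8 codes."""
    return array("f", (c * scale for c in codes))


# Made-up float32 weights, for illustration only.
weights = array("f", [0.5, -1.27, 0.003, 1.0, -0.75])
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

float_bytes = weights.itemsize * len(weights)  # 4 bytes per weight
int8_bytes = codes.itemsize * len(codes)       # 1 byte per weight
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(float_bytes, int8_bytes)
print(max_error <= scale / 2 + 1e-6)
```

The weights take a quarter of the space, at the cost of a rounding error of at most about half the scale per weight. Whether that loss is acceptable depends on the model and has to be checked on validation data.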