tensorflow-federated

Other compression methods for Federated Learning


I noticed that the Gradient Quantization compression method is already implemented in the TFF framework. How about non-traditional compression methods where we select a sub-model by dropping some parts of the global model? I came across the "Federated Dropout" compression method in the paper "Expanding the Reach of Federated Learning by Reducing Client Resource Requirements" (https://arxiv.org/abs/1812.07210). Is the Federated Dropout method already supported in TensorFlow Federated? If not, any insights on how to implement it? (The main idea of the method is to drop a fixed percentage of the activations and filters in the global model, so that a smaller sub-model is exchanged and trained.)


Solution

  • Currently, there is no implementation of this idea available in the TFF code base.

    But here is an outline of how you could do it; I recommend starting from examples/simple_fedavg.

    1. Modify top-level build_federated_averaging_process to accept two model_fns -- one server_model_fn for the global model, one client_model_fn for the smaller sub-model structure actually trained on clients.
    2. Modify build_server_broadcast_message to extract only the relevant sub-model from server_state.model_weights. This is the mapping from the server model to the client model (a sketch for a single dense layer follows this list).
    3. The client_update may actually not need to change (I am not 100% sure), as long as client_update_fn is given the client_model_fn.
    4. Modify server_update - the weights_delta will be an update to the client sub-model, so you will need to map it back to the larger global model (see the second sketch below).
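
    A minimal sketch of step 2 for a single dense layer, assuming the global model's weights are available as plain arrays (e.g. from model_weights.trainable in simple_fedavg). The helper name extract_dense_submodel is hypothetical, and the fixed seed simply stands in for whatever scheme the server uses to record which units were kept so the mapping can be undone in step 4.

        import numpy as np

        def extract_dense_submodel(kernel, bias, keep_fraction=0.75, seed=0):
            """Keep a fixed fraction of a dense layer's output units.

            kernel: [in_dim, out_dim] array from the global model.
            bias:   [out_dim] array from the global model.
            Returns the smaller kernel/bias plus the kept column indices,
            which the server must remember to undo the mapping later.
            """
            out_dim = kernel.shape[1]
            num_keep = int(round(keep_fraction * out_dim))
            rng = np.random.RandomState(seed)  # fixed seed so the selection is reproducible
            kept = np.sort(rng.choice(out_dim, size=num_keep, replace=False))
            return kernel[:, kept], bias[kept], kept

    Note that dropping the output units of one layer also removes the corresponding input rows of the next layer's kernel; that is exactly the connectivity issue mentioned below.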

    In general, steps 2 and 4 are tricky, as they depend not only on which layers are in the model, but also on how they are connected. So it will be hard to create an easy-to-use general solution, but it should be fine to write these for a specific model structure you know in advance.
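
    For step 4, the inverse mapping scatters the client's sub-model delta back into the global layer's shape, leaving the dropped units untouched. Again a sketch for one dense layer; the helper name expand_dense_delta is hypothetical.

        import numpy as np

        def expand_dense_delta(sub_kernel_delta, sub_bias_delta, kept, global_shape):
            """Scatter a sub-model dense-layer delta back into the global shape.

            Units the client never trained receive a zero delta, i.e. this
            client's update leaves them unchanged on the server.
            """
            in_dim, out_dim = global_shape
            kernel_delta = np.zeros((in_dim, out_dim), dtype=sub_kernel_delta.dtype)
            bias_delta = np.zeros((out_dim,), dtype=sub_bias_delta.dtype)
            kernel_delta[:, kept] = sub_kernel_delta
            bias_delta[kept] = sub_bias_delta
            return kernel_delta, bias_delta

    server_update would then apply the expanded delta with the usual federated averaging logic; whether the dropped positions should also be excluded from the per-position averaging denominator is a design decision you would need to make for your aggregation.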