python, deep-learning, tensorflow-federated

How to prepare my dataset (not images) to implement FedAvg on TensorFlow Federated?


I want to train a federated model with the FedAvg algorithm on TFF (TensorFlow Federated) using a 3-channel (X, Y, Z) accelerometer dataset, where each sample is a window of 128 time steps.

My goal is to train a federated model using

tff.learning.from_keras_model

The guides on the TensorFlow Federated website mostly deal with datasets that already come in the desired format for the model:

tensorflow_federated.python.simulation.hdf5_client_data.HDF5ClientData

I'm quite lost on how to convert my raw dataset to the desired format for TFF.

The dataset I am using has the following shape:

X: (-1, 128, 3) and Y: (-1,)

X contains floats; Y contains the integer labels of my dataset, ranging from 0 to 6.

Can anybody give me some pointers/examples on how I can tackle this?


Solution

  • First, for federated learning the dataset needs to be partitioned by user/participant. Does the dataset partition the accelerometer readings and labels by user? If not, this is probably a task better suited to standard centralized learning than to federated learning.

    If there is a user partitioning, the following questions explain how to set up a tff.simulation.ClientData to model this distributed dataset. Whether or not the data is images shouldn't matter; the techniques apply to any supervised learning on X, Y datasets:
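
    As a rough illustration of the partitioning step, here is a minimal sketch. It assumes you have (or can build) a `user_ids` array giving the participant for each sample; that array, the helper names `partition_by_user` and `to_client_data`, and the synthetic data are hypothetical and not part of the question. It also assumes a TFF release that exposes `tff.simulation.datasets.TestClientData` (older releases named this `tff.simulation.FromTensorSlicesClientData`), so check the API of the version you have installed.

```python
import collections

import numpy as np


def partition_by_user(x, y, user_ids):
    """Group samples into one {'x': ..., 'y': ...} mapping per user.

    `user_ids` is a hypothetical 1-D array with one participant ID per sample.
    """
    partitions = collections.OrderedDict()
    for uid in np.unique(user_ids):
        mask = user_ids == uid
        # Client IDs are strings; each value is a structure that
        # tf.data.Dataset.from_tensor_slices can consume.
        partitions[str(uid)] = collections.OrderedDict(
            x=x[mask].astype(np.float32),
            y=y[mask].astype(np.int32),
        )
    return partitions


def to_client_data(partitions):
    """Wrap the per-user arrays in a ClientData object (requires TFF).

    Assumption: the installed TFF version provides
    tff.simulation.datasets.TestClientData.
    """
    import tensorflow_federated as tff  # imported lazily; optional dependency

    return tff.simulation.datasets.TestClientData(partitions)


# Synthetic stand-in for the accelerometer data described in the question:
# 10 windows of 128 time steps x 3 channels, labels in 0..6, 3 users.
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 128, 3))
y = rng.integers(0, 7, size=10)
user_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 2])

partitions = partition_by_user(x, y, user_ids)
```

    With TFF installed, `to_client_data(partitions).create_tf_dataset_for_client('0')` should give a `tf.data.Dataset` for the first user, which you would then shuffle and batch before training. The `x`/`y` key names follow the convention used in the TFF tutorials for models built with tff.learning.from_keras_model.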