I want to create a malicious dataset for CIFAR-100 to test a Federated Learning Attack similar to this malicious dataset for EMNIST:
from scipy import io
import tensorflow as tf

url_malicious_dataset = 'https://storage.googleapis.com/tff-experiments-public/targeted_attack/emnist_malicious/emnist_target.mat'
filename = 'emnist_target.mat'
path = tf.keras.utils.get_file(filename, url_malicious_dataset)
emnist_target_data = io.loadmat(path)
I tried the following to flip the label 0 to 4 in the extracted example dataset, but this method isn't working:
import tensorflow_federated as tff

cifar_train, cifar_test = tff.simulation.datasets.cifar100.load_data(cache_dir=None)
example_dataset = cifar_train.create_tf_dataset_for_client(cifar_train.client_ids[0])
for example in example_dataset:
    if example['label'].numpy() == 0:
        example['label'] = tf.constant(4, dtype=tf.int64)
Any idea how to create a similar version of the malicious dataset for CIFAR-100 instead of EMNIST by correctly flipping labels?
In general, tf.data.Dataset objects can be modified using their .map method. So, for example, a simple label flipping could be done as follows:
def flip_label(example):
    return {'image': example['image'], 'label': 99 - example['label']}

flipped_dataset = example_dataset.map(flip_label)
This reverses the labels 0-99. You could do something similar to send 0 to 4 and leave all other labels unchanged.
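For instance, a minimal sketch of such a targeted flip might look like the following (the name flip_zero_to_four is just illustrative, and it assumes the same 'image'/'label' element keys used above):

def flip_zero_to_four(example):
    # Send label 0 to 4 and leave every other label unchanged.
    new_label = tf.where(tf.equal(example['label'], 0),
                         tf.constant(4, dtype=example['label'].dtype),
                         example['label'])
    return {'image': example['image'], 'label': new_label}

poisoned_dataset = example_dataset.map(flip_zero_to_four)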
Note that if you'd like to apply this to all client datasets in cifar_train, you'd have to use the .preprocess method of tff.simulation.datasets.ClientData. That is, you could do something like cifar_train.preprocess(lambda x: x.map(flip_label)).
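As a rough end-to-end sketch using the hypothetical flip_zero_to_four helper above, you could poison every client dataset this way and then spot-check one client's labels:

# Apply the targeted label flip to every client dataset.
poisoned_cifar_train = cifar_train.preprocess(lambda ds: ds.map(flip_zero_to_four))

# Spot-check one client: label 0 should no longer appear.
check_dataset = poisoned_cifar_train.create_tf_dataset_for_client(
    poisoned_cifar_train.client_ids[0])
assert all(example['label'].numpy() != 0 for example in check_dataset)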