Adversarially robust GoogLeNet model

How can I train a GoogLeNet model adversarially on my own image classification dataset?

For example, with the CleverHans library, the datasets that come with batches to run the attacks on are MNIST and CIFAR.

I trained an image classifier (GoogLeNet) on my own data using TensorFlow, and now I want to train the model on adversarial examples. Any ideas on how I can do this with the CleverHans library? Thanks.

Solution

The easiest approach is probably to start from your own GoogLeNet training code and modify its loss. The CleverHans tutorial contains an example of such a modification: it adds a penalty so that the model is also trained on adversarial examples. It uses the loss implementation found here to define a weighted average between the cross-entropy on clean images and the cross-entropy on adversarial images.
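To make the weighted-average idea concrete, here is a minimal NumPy sketch. It is not the CleverHans implementation: a toy linear softmax classifier stands in for GoogLeNet, the adversarial images come from a hand-written fast gradient sign step, and the names `fgsm`, `adversarial_training_loss`, `eps` and `clean_weight` are all made up for illustration. In your TensorFlow code the same structure would apply, with your network producing the logits and the attack of your choice producing the adversarial batch.

```python
import numpy as np


def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(logits, labels):
    # Mean cross-entropy for integer class labels.
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))


def fgsm(x, labels, W, b, eps):
    # Toy model (assumption): logits = x @ W + b, so the gradient of the
    # cross-entropy with respect to x is (softmax(logits) - onehot) @ W.T.
    probs = softmax(x @ W + b)
    onehot = np.eye(W.shape[1])[labels]
    grad_x = (probs - onehot) @ W.T
    # Step in the direction of the gradient sign, then keep pixels in [0, 1].
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)


def adversarial_training_loss(x, labels, W, b, eps=0.1, clean_weight=0.5):
    # Weighted average of the loss on clean and on adversarial inputs.
    x_adv = fgsm(x, labels, W, b, eps)
    clean = cross_entropy(x @ W + b, labels)
    adv = cross_entropy(x_adv @ W + b, labels)
    total = clean_weight * clean + (1.0 - clean_weight) * adv
    return total, clean, adv


# Hypothetical data: 8 "images" of 16 pixels each, 3 classes.
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(8, 16))
labels = rng.integers(0, 3, size=8)
W = rng.normal(size=(16, 3))
b = np.zeros(3)

total, clean, adv = adversarial_training_loss(x, labels, W, b)
print("clean:", clean, "adversarial:", adv, "combined:", total)
```

Minimizing `total` instead of the clean cross-entropy alone is what turns ordinary training into adversarial training. Setting `clean_weight` to 1.0 recovers your original loss, so you can use it to check that the modified training code still behaves like the unmodified one.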