python tensorflow keras model cleverhans

Attacking Tensorflow model with Cleverhans' CarliniWagnerL2 resulting in NotImplementedError


I'm trying to get familiar with TensorFlow and cleverhans, but I seem to be mixing up functionality from the two libraries.

I set up a simple model with TensorFlow, train it, and then want to craft an adversarial image with cleverhans' CarliniWagnerL2 attack. I read through the TensorFlow and cleverhans code and documentation and tried to understand what's happening, but I just don't get which function from which library I have to use.

This is my simplified example code. As far as I understand, I have to wrap a callable with the CallableModelWrapper so that cleverhans can use it. Is that right? Or is my model not a callable? Is it actually possible to train a model with TensorFlow and then attack it with cleverhans? The error occurs when I try to generate the adversarial image.

# TensorFlow and tf.keras
import tensorflow as tf

# Cleverhans
import cleverhans as ch
from cleverhans import attacks
from cleverhans import model

# Others
import numpy as np

sess = tf.Session()

# load data set
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

class_names = ['0', '1', '2', '3', '4',
               '5', '6', '7', '8', '9']

train_images = train_images / 255.0
test_images = test_images / 255.0

#set up model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='SGD',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# train
model.fit(train_images, train_labels, epochs=3)

# wrap 
wrap = ch.model.CallableModelWrapper(model, 'probs')

cw = ch.attacks.CarliniWagnerL2(wrap, sess=sess)

#set params and targeted image
cw_params = {'batch_size': 1,
             'confidence': 10,
             'learning_rate': 0.1,
             'binary_search_steps': 5,
             'max_iterations': 1000,
             'abort_early': True,
             'initial_const': 0.01,
             'clip_min': 0,
             'clip_max': 1}

image = np.array([test_images[0]])

# and here i get the error!!!
adv_cw = cw.generate_np(image, **cw_params)

I actually want to get the adversarial image, but whatever I try, it seems that I'm mixing the two libraries in a way that doesn't work. I get:

NotImplementedError: must implement get_logits or must define a logits output in fprop

Can anyone help?

Basically I just want to understand what models I can use for cleverhans.attacks ! :)

Thanks in advance.

Rolle

Edit

This is my Traceback:

Traceback (most recent call last):
  File "/usr/lib/python3.6/code.py", line 91, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/home/<me>/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/191.6605.12/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home/<me>/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/191.6605.12/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/<path_to_project>/tensorflow/untitled/minexample.py", line 57, in <module>
    adv_cw = cw.generate_np(image, **cw_params)
  File "/home/<me>/.local/lib/python3.6/site-packages/cleverhans/attacks/__init__.py", line 189, in generate_np
    self.construct_graph(fixed, feedable, x_val, hash_key)
  File "/home/<me>/.local/lib/python3.6/site-packages/cleverhans/attacks/__init__.py", line 161, in construct_graph
    x_adv = self.generate(x, **new_kwargs)
  File "/home/<me>/.local/lib/python3.6/site-packages/cleverhans/attacks/__init__.py", line 1196, in generate
    x.get_shape().as_list()[1:])
  File "/home/<me>/.local/lib/python3.6/site-packages/cleverhans/attacks_tf.py", line 628, in __init__
    self.output = model.get_logits(self.newimg)
  File "/home/<me>/.local/lib/python3.6/site-packages/cleverhans/model.py", line 70, in get_logits
    " output in `fprop`")
NotImplementedError: <class 'cleverhans.model.CallableModelWrapper'>must implement `get_logits` or must define a logits output in `fprop`

I replaced my internal directory structure with <path_to_project> and <me>, respectively.


Solution

  • The code snippet you shared defines and trains the model using Keras, so it would be easier to use KerasModelWrapper, the wrapper we provide specifically for Keras models. You can find a tutorial for doing that here.
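
To see why the choice of wrapper matters, here is a small dependency-free sketch of the lookup that the traceback ends in: get_logits calls fprop and looks for a "logits" entry in the result. ToyModel, ToyCallableWrapper and pre_softmax_scores are hypothetical stand-ins written for illustration; they are not cleverhans classes, and the real implementation may differ in detail. The assumption is only that a wrapper created with 'probs' reports its output under that name, so an attack that asks for logits finds nothing.

```python
# Simplified stand-in for the lookup shown in the traceback.
# ToyModel and ToyCallableWrapper are hypothetical names, not cleverhans code.


class ToyModel:
    def fprop(self, x):
        """Return a dict mapping output names to values."""
        raise NotImplementedError

    def get_logits(self, x):
        # Attacks such as Carlini-Wagner ask the model for its logits.
        outputs = self.fprop(x)
        if "logits" in outputs:
            return outputs["logits"]
        raise NotImplementedError(
            f"{type(self).__name__} must implement get_logits "
            "or must define a logits output in fprop"
        )


class ToyCallableWrapper(ToyModel):
    def __init__(self, callable_fn, output_layer):
        self.callable_fn = callable_fn
        self.output_layer = output_layer  # e.g. "probs" or "logits"

    def fprop(self, x):
        # The only output exposed is the one named at construction time.
        return {self.output_layer: self.callable_fn(x)}


def pre_softmax_scores(x):
    # Placeholder for a network's forward pass (assumption for the demo).
    return [2.0 * v for v in x]


probs_wrapper = ToyCallableWrapper(pre_softmax_scores, "probs")
logits_wrapper = ToyCallableWrapper(pre_softmax_scores, "logits")

try:
    probs_wrapper.get_logits([1.0, 2.0])
    probs_error = None
except NotImplementedError as exc:
    probs_error = str(exc)  # same kind of failure as in the question

logits_value = logits_wrapper.get_logits([1.0, 2.0])
```

In the question's code the model ends in a softmax layer and is wrapped as 'probs', which matches the failing case above. With the real library, wrapping the trained Keras model in KerasModelWrapper (found in cleverhans.utils_keras in the versions I am aware of) and passing that wrapper to CarliniWagnerL2 should give the attack access to the logits; you will likely also need the attack to use the same session that Keras trained in.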