Tags: r, tensorflow, keras, conv-neural-network, transfer-learning

Error specifying convolutional base from pre-trained XCeption-architecture with keras in R under Windows


Is there something really obvious that I am doing wrong when trying to use the convolutional base from the XCeption architecture pre-trained on ImageNet? Here is my code, which produces the error shown at the end of the question:

require(keras)

conv_base1 <- application_xception(
  weights = "imagenet",
  include_top = FALSE,
  pooling = FALSE,
  input_shape = c(300, 300, 3)
)

model51 <- keras_model_sequential() %>%
  conv_base1 %>%
  layer_flatten() %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")

In contrast, the almost identical code below using application_vgg16 works just fine:

require(keras)

conv_base2 <- application_vgg16(
  weights = "imagenet",
  include_top = FALSE,
  pooling = FALSE,
  input_shape = c(300, 300, 3)
)

model52 <- keras_model_sequential() %>%
  conv_base2 %>%
  layer_flatten() %>%
  layer_dense(units = 2048, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")

I get the following errors (using R version 3.4.0 (2017-04-21) on Windows 10 x86_64-w64-mingw32/x64 (64-bit) using the keras_2.1.5 R package):

Error in py_call_impl(callable, dots$args, dots$keywords) :
  ValueError: Variable block1_conv1_bn_1/moving_mean/biased already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1269, in __init__
    self._traceback = _extract_stack()
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
    op_def=op_def)

Detailed traceback:
  File "D:\Anaconda3\lib\site-packages\keras\models.py", line 467, in add
    layer(x)
  File "D:\Anaconda3\lib\site-packages\keras\engine\topology.py", line 617, in __call__
    output = self.call(inputs, **kwargs)
  File "D:\Anaconda3\lib\site-packages\keras\engine\topology.py", line 2081, in call
    output_tensors, _, _ = self.run_internal_graph(inputs, masks)
  File "D:\Anaconda3\l

Further background in case it matters: I ran into this when trying to replace VGG16 with XCeption in the "Feature extraction with data augmentation" subsection of Section 5.3.1 of Chollet and Allaire's most excellent book. Everything in "Fast feature extraction without data augmentation" works just fine with both VGG16 and XCeption.
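For reference, the "fast" path that works can be sketched roughly as below. It only calls predict() on the convolutional base, so the base is never added to another model. This is a minimal sketch rather than the book's code: the names conv_base, batch and features are placeholders, and the random array merely stands in for a batch of preprocessed images.

```r
library(keras)

# Same convolutional base as above (downloads the ImageNet weights on first use).
conv_base <- application_xception(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(300, 300, 3)
)

# Hypothetical stand-in for a batch of two preprocessed 300x300 RGB images.
batch <- array(runif(2 * 300 * 300 * 3), dim = c(2, 300, 300, 3))

# Run the base on its own; the result is a 4D array of feature maps
# (samples, height, width, channels) that a separate small classifier can use.
features <- conv_base %>% predict(batch)
dim(features)
```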


Solution

  • I don't understand the source of this error, but I suspect it's related to using a model inside another model (adding the base model to a sequential model).

    I suggest trying a functional API model instead. Unfortunately, I don't know R well enough to be sure of its notation.

    The idea is as follows (adapted from another example; I hope the syntax is OK, and anyone who knows R better is welcome to fix this code).

    First define the xception model normally:

    conv_base1 <- application_xception(
    weights = "imagenet",
    include_top = FALSE,
    pooling=FALSE,
    input_shape = c(300, 300, 3)
    )
    

    Try 1:

    Now let's get the output tensor of this model and pass it to further layers:

    # the input to the following layers is the output of the Xception model
    # (I'm not sure of the R syntax here; these two lines may be wrong)
    base_inputs <- conv_base1$input
    base_outputs <- conv_base1$output
    
    # base_outputs is the input tensor for the layers that follow
    # predictions is the output tensor of those layers
    predictions <- base_outputs %>%
        layer_flatten() %>% 
        layer_dense(units = 256, activation = "relu") %>% 
        layer_dense(units = 1, activation = 'sigmoid') 
    
    # create the model: it starts at base_inputs and ends at predictions
    model <- keras_model(inputs = base_inputs, outputs = predictions)
    

    Try 2:

    Alternatively, if defining base_inputs and base_outputs is not possible the way it was intended in the code above:

    inputs <- layer_input(shape = c(300,300,3))
    
    # the output is the input passed through the layers, where conv_base1 should behave like a single layer
    predictions <- inputs %>%
        conv_base1 %>%
        layer_flatten() %>% 
        layer_dense(units = 256, activation = "relu") %>% 
        layer_dense(units = 1, activation = 'sigmoid') 
    
    # create the model
    model <- keras_model(inputs = inputs, outputs = predictions)
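Neither attempt above actually freezes the base or compiles the model. A minimal sketch of those remaining steps for the "Try 1" variant might look like the following. The layer-by-layer freezing loop, the "rmsprop" optimizer and the binary cross-entropy loss are assumptions chosen to match the single sigmoid output; they are not taken from the question, and none of this has been verified against the keras 2.1.5 setup described there.

```r
library(keras)

# Same convolutional base as in the question (downloads the ImageNet weights on first use).
conv_base1 <- application_xception(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(300, 300, 3)
)

# Freeze every layer of the base so that only the new classifier head is trained.
for (layer in conv_base1$layers) {
  layer$trainable <- FALSE
}

# Functional-API head built on the base's output tensor, as in "Try 1".
predictions <- conv_base1$output %>%
  layer_flatten() %>%
  layer_dense(units = 256, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")

model <- keras_model(inputs = conv_base1$input, outputs = predictions)

# Assumed training configuration for a binary classifier.
model %>% compile(
  optimizer = "rmsprop",
  loss = "binary_crossentropy",
  metrics = c("accuracy")
)
```

Freezing before compiling matters: changes to a layer's trainable flag generally only take effect at the next compile() call.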