I am using a AlexNet with pretrained weights(heuritech/convnets-keras) for a classification problem with 8 classes instead of 1000. After initialising the network with Model(input=..,output=..)
and loading the initial weights, I drop the last two layers, Dense(1000) and Activation(softmax), and add my own two layers: Dense(8) and Activation(softmax).
But then, after running I get an error
Error when checking model target: expected softmax to have shape (None, 1000) but got array with shape (32, 8)
The 32
is the batch size from the generator I guess, but I don't understand why the softmax still expects 1000 dimensions from the previous layer.
Can someone help me? I think it has something to do the output parameter of the Model, but this is only semi-wild guessing after trying and googling around. Thanks!
Code:
import ...
pp = os.path.dirname(os.path.abspath(__file__))
##### Define Model #####
inputs = Input(shape=(3,227,227))
conv_1 = Convolution2D(96, 11, 11,subsample=(4,4),activation='relu', name='conv_1')(inputs)
...
...
...
dense_1 = MaxPooling2D((3, 3), strides=(2,2),name="convpool_5")(conv_5)
dense_1 = Flatten(name="flatten")(dense_1)
dense_1 = Dense(4096, activation='relu',name='dense_1')(dense_1)
dense_2 = Dropout(0.5)(dense_1)
dense_2 = Dense(4096, activation='relu',name='dense_2')(dense_2)
dense_3 = Dropout(0.5)(dense_2)
dense_3 = Dense(1000,name='dense_3')(dense_3)
prediction = Activation("softmax",name="softmax")(dense_3)
model = Model(input=inputs, output=prediction)
for layer in model.layers[:27]:
print layer.name
layer.trainable = False
model.load_weights(pp+"/weights/alexnet_weights.h5")
print model.output_shape
print model.layers[-1]
model.layers.pop()
print model.output_shape
model.layers.pop()
print model.layers[-1]
print model.output_shape
model.layers.append(Dense(8, activation='softmax',name='dense_4'))
print model.layers[-1]
##### Get Data #####
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.2,
zoom_range=0.2)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
pp+'/dataset/training',
target_size=(227,227),
class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
pp+'/dataset/test',
target_size=(227,227),
class_mode='categorical')
##### Compile and Fit ####
sgd = SGD(lr=1e-4, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='mse')
model.fit_generator(
train_generator,
samples_per_epoch=500,
nb_epoch=5,
validation_data=validation_generator,
nb_val_samples=150)
model.save_weights('first_try.h5')
Ok, it seems that I can't just change the network definition, because even after popping/putting new layers in, nothing seems to change. So I did this:
1) Load the default AlexNet
2) Load the pre-trained weights
3) Pop the 2 top layers
4) Add two new top layers
5) Save the weights
6) Change Network definition to use the two new layers
7) Load the new AlexNet with the saved weights
8) Profit!
Although I would still like to know how to change a loaded network defined by the functional api.