
Combine models into one in Keras


I have to train two models, modelA and modelB, with different optimizers and hidden layers. I would like the final output to be a weighted combination of their outputs:

# w = weight I give to each model
output_modelC = output_modelA * w + output_modelB * (1 - w)

Both models share the same Input, but after compiling each of them I don't know how to proceed. My code is this:

Input = keras.layers.Input(shape=(2,))

#modelA
Hidden_A_1 = keras.layers.Dense(units=20)(Input)
Hidden_A_2 = keras.layers.Dense(units=20)(Hidden_A_1)
Output_A = keras.layers.Dense(units=1, activation='sigmoid')(Hidden_A_2)
optimizer_A = keras.optimizers.SGD(lr=0.00001, momentum=0.09, nesterov=True)
model_A = keras.Model(inputs=Input, outputs=Output_A)
model_A.compile(loss="binary_crossentropy",
                   optimizer=optimizer_A,
                   metrics=['accuracy'])

#modelB
Hidden1_B = keras.layers.Dense(units=10, activation='relu')(Input)
Output_B = keras.layers.Dense(units=1, activation='sigmoid')(Hidden1_B)
model_B = keras.Model(inputs=Input, outputs=Output_B)
optimizer_B = keras.optimizers.Adagrad()
model_B.compile(loss="binary_crossentropy",
                   optimizer=optimizer_B,
                   metrics=['accuracy'])

Solution

  • Assuming that you will be providing the value of w yourself, the following code might help you:

    import keras 
    
    Input = keras.layers.Input(shape=(784,))
    
    #modelA
    Hidden_A_1 = keras.layers.Dense(units=20)(Input)
    Hidden_A_2 = keras.layers.Dense(units=20)(Hidden_A_1)
    Output_A = keras.layers.Dense(units=1, activation='sigmoid')(Hidden_A_2)
    optimizer_A = keras.optimizers.SGD(lr=0.00001, momentum=0.09, nesterov=True)
    model_A = keras.Model(inputs=Input, outputs=Output_A)
    model_A.compile(loss="binary_crossentropy",
                       optimizer=optimizer_A,
                       metrics=['accuracy'])
    
    #modelB
    Hidden1_B = keras.layers.Dense(units=10, activation='relu')(Input)
    Output_B = keras.layers.Dense(units=1, activation='sigmoid')(Hidden1_B)
    model_B = keras.Model(inputs=Input, outputs=Output_B)
    optimizer_B = keras.optimizers.Adagrad()
    model_B.compile(loss="binary_crossentropy",
                       optimizer=optimizer_B,
                       metrics=['accuracy'])
    
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    
    x_train = x_train.reshape(60000, 784)
    x_test = x_test.reshape(10000, 784)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    
    model_A.fit(x_train, y_train)
    model_B.fit(x_train, y_train)
    
    w = 0.8
    output_modelC = model_A.predict(x_test) * w + model_B.predict(x_test) * (1 - w)
    

    Sample Output:

    array([[0.98165023],
           [0.9918817 ],
           [0.93426293],
           ...,
           [0.99940777],
           [0.9960805 ],
           [0.9992139 ]], dtype=float32)
    

    The sample data may not be the right fit for these models (the MNIST labels are not binary), but it is just to show how to combine the outputs of both networks.
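As a follow-up, if you want the combination itself to be a single Keras model (one `predict` call instead of blending two NumPy arrays), one possible sketch is to wire both branches into a weighted sum with a `Lambda` layer. This is an illustration, not part of the original answer, and note one trade-off: a compiled model has only one optimizer, so the separate SGD/Adagrad optimizers from the question no longer apply here.

```python
import keras

w = 0.8  # fixed blending weight, same value as in the answer above

inputs = keras.layers.Input(shape=(784,))

# branch A: two hidden layers, sigmoid output
hidden_a = keras.layers.Dense(units=20)(inputs)
hidden_a = keras.layers.Dense(units=20)(hidden_a)
output_a = keras.layers.Dense(units=1, activation='sigmoid')(hidden_a)

# branch B: one hidden layer, sigmoid output
hidden_b = keras.layers.Dense(units=10, activation='relu')(inputs)
output_b = keras.layers.Dense(units=1, activation='sigmoid')(hidden_b)

# weighted combination: output_a * w + output_b * (1 - w)
combined = keras.layers.Lambda(
    lambda t: t[0] * w + t[1] * (1 - w))([output_a, output_b])

model_c = keras.Model(inputs=inputs, outputs=combined)
model_c.compile(loss="binary_crossentropy",
                optimizer="adam",
                metrics=['accuracy'])
```

Training `model_c` end to end trains both branches jointly under one optimizer; if you specifically need a different optimizer per model, keep the approach above (train each model separately, then blend the predictions).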