I'm building a classifier chain for a multi-label problem, using a Keras binary classifier as the base model in the chain. I have 17 labels as classification targets; the shape of X_train is (111300, 107) and the shape of y_train is (111300, 17). After training, I get the following error from the predict method:
*could not broadcast input array from shape (27839,1) into shape (27839)*
My code is here:
def create_model():
    input_size = length_long_sentence
    embedding_size = 128
    lstm_size = 64
    output_size = len(unique_tag_set)
    #----------------------------Model--------------------------------
    current_input = Input(shape=(input_size,))
    emb_current = Embedding(vocab_size, embedding_size, input_length=input_size)(current_input)
    out_current = Bidirectional(LSTM(units=lstm_size))(emb_current)
    #out_current = Reshape((1,2*lstm_size))(out_current)
    output = Dense(units=1, activation='sigmoid')(out_current)
    #output = Dense(units=1, activation='softmax')(out_current)
    model = Model(inputs=current_input, outputs=output)
    #-------------------------------compile-------------
    model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
model = KerasClassifier(build_fn=create_model, epochs=1,batch_size=256, shuffle = True, verbose = 1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
history=chain.fit(X_train, y_train)
The result of chain.classes_ is given below:
[array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8),
array([0, 1], dtype=uint8)]
Then I try to predict on the test data:
Y_pred_chain = chain.predict(X_test)
The summary of the model is given below:
The full traceback of the error is here:
109/109 [==============================] - 22s 202ms/step
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-28-34a25ad06cd4> in <module>()
----> 1 Y_pred_chain = chain.predict(X_test)
/usr/local/lib/python3.6/dist-packages/sklearn/multioutput.py in predict(self, X)
523 else:
524 X_aug = np.hstack((X, previous_predictions))
--> 525 Y_pred_chain[:, chain_idx] = estimator.predict(X_aug)
526
527 inv_order = np.empty_like(self.order_)
ValueError: could not broadcast input array from shape (27839,1) into shape (27839)
Can anyone help me fix this error?
Going by the model summary posted in the question, I'll start with an input size of 107 and an output size of 1 (a binary classification task). Let's break the problem into pieces and work through them.
# Assumed imports for the Keras 2.x API used in this answer
import numpy as np
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense
from keras.models import Model
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.multioutput import ClassifierChain
input_size = 107
# define the model
def create_model():
    global input_size
    embedding_size = 128
    lstm_size = 64
    output_size = 1
    vocab_size = 100
    current_input = Input(shape=(input_size,))
    emb_current = Embedding(vocab_size, embedding_size, input_length=input_size)(current_input)
    out_current = Bidirectional(LSTM(units=lstm_size))(emb_current)
    output = Dense(units=output_size, activation='sigmoid')(out_current)
    model = Model(inputs=current_input, outputs=output)
    model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
X = np.random.randint(0, 100, (111, 107))
y = np.random.randint(0, 2, (111, 1))  # NOTE: The y should have two dimensions
model = KerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1, validation_split=0.2)
model.fit(X, y)
y_hat = model.predict(X)
print(y_hat.shape)  # prints the shape shown at the end of the output below
Output:
Train on 88 samples, validate on 23 samples
Epoch 1/1
88/88 [==============================] - 2s 21ms/step - loss: 0.6951 - accuracy: 0.4432 - val_loss: 0.6898 - val_accuracy: 0.5652
111/111 [==============================] - 0s 2ms/step
(111, 1)
Ta-da! it works
model=KerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
print (chain.predict(X).shape)
Oops! It trains, but prediction fails with the same error the OP reported:
ValueError: could not broadcast input array from shape (111,1) into shape (111)
The error comes from the following line in sklearn's multioutput.py:
--> 525 Y_pred_chain[:, chain_idx] = estimator.predict(X_aug)
This happens because ClassifierChain runs the estimators one at a time and stores each estimator's predictions in Y_pred_chain at that estimator's index in the chain (determined by the order parameter). It assumes that each estimator returns its predictions as a 1D array, but Keras models return output of shape batch_size x output_size, which in our case is 111 x 1.
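The shape mismatch can be reproduced with NumPy alone, without Keras or sklearn. This is a minimal sketch: `Y_pred_chain` and `keras_style_pred` are stand-ins I made up for the chain's prediction buffer and for the 2D array the Keras wrapper returned here, and the sizes are arbitrary.

```python
import numpy as np

n_samples, n_labels = 5, 3
# Stand-in for sklearn's Y_pred_chain: one column per estimator in the chain.
Y_pred_chain = np.zeros((n_samples, n_labels))

# Stand-in for what the Keras wrapper returned here: a (n_samples, 1) array.
keras_style_pred = np.ones((n_samples, 1))

try:
    # A (5, 1) array cannot be assigned into a column slice of shape (5,).
    Y_pred_chain[:, 0] = keras_style_pred
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Flattening to 1D makes the same assignment valid.
Y_pred_chain[:, 0] = keras_style_pred.ravel()
print(Y_pred_chain[:, 0])
```

So the fix only has to make each estimator's predict return a flat array of length batch_size.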
We need a way to reshape predictions of shape 111 x 1 to 111, or in general batch_size x 1 to batch_size. Let's lean on OOP and override the predict method of KerasClassifier:
class MyKerasClassifier(KerasClassifier):
    def __init__(self, **args):
        super().__init__(**args)
    def predict(self, X):
        return super().predict(X).reshape(len(X))  # Here we are flattening 2D array to 1D
model=MyKerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
print (chain.predict(X).shape)
Output:
Epoch 1/1
88/88 [==============================] - 2s 19ms/step - loss: 0.6919 - accuracy: 0.5227 - val_loss: 0.6892 - val_accuracy: 0.5652
111/111 [==============================] - 0s 3ms/step
(111, 1)
Ta-da! it works
Let's look deeper into the ClassifierChain class. From its documentation:
A multi-label model that arranges binary classifiers into a chain.
Each model makes a prediction in the order specified by the chain using all of the available features provided to the model plus the predictions of models that are earlier in the chain.
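To make that description concrete, here is a simplified sketch of the prediction loop, modelled on the lines visible in the traceback (np.hstack of X with the previous predictions, then one column of Y_pred_chain per estimator, then inv_order). It is not sklearn's actual source. `RuleStub` and `chain_predict` are hypothetical names, and the stub "classifiers" are fixed rules standing in for fitted models.

```python
import numpy as np

class RuleStub:
    """Hypothetical stand-in for a fitted binary classifier."""
    def __init__(self, rule):
        self.rule = rule

    def predict(self, X):
        # Returns a 1D array with one prediction per row of X.
        return self.rule(X)

def chain_predict(estimators, order, X):
    order = np.asarray(order)
    Y_pred_chain = np.zeros((X.shape[0], len(estimators)), dtype=int)
    for chain_idx, estimator in enumerate(estimators):
        # Original features plus the predictions of all earlier estimators.
        previous_predictions = Y_pred_chain[:, :chain_idx]
        X_aug = np.hstack((X, previous_predictions))
        Y_pred_chain[:, chain_idx] = estimator.predict(X_aug)
    # Put the columns back into the original label order.
    inv_order = np.empty_like(order)
    inv_order[order] = np.arange(len(order))
    return Y_pred_chain[:, inv_order]

X = np.array([[0.9, 0.1],
              [0.2, 0.8]])
estimators = [
    RuleStub(lambda X_aug: (X_aug[:, 0] > 0.5).astype(int)),  # uses a raw feature
    RuleStub(lambda X_aug: 1 - X_aug[:, -1].astype(int)),     # negates the previous prediction
]
order = [1, 0]  # first estimator handles label 1, second handles label 0

Y_pred = chain_predict(estimators, order, X)
print(Y_pred)
```

The point to take away is that the chain holds one binary estimator per label column, and each later estimator sees a wider input than the one before it.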
So what we really need is a y of shape 111 x 17, so that the chain contains 17 estimators. Let's try it:
y = np.random.randint(0,2,(111,17))
model=MyKerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
Output:
ValueError: Error when checking input: expected input_62 to have shape (107,) but got array with shape (108,)
The model cannot be trained, and the reason is simple. The chain trains the first estimator on the 107 input features, which works fine. It then trains the next estimator on those 107 features plus one extra column from the previous estimator in the chain (108 in total). Since our model has a fixed input size of 107, it fails with the error above. In general, each estimator receives the 107 input features plus one column for every estimator earlier in the chain.
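The growth in input width is easy to check with NumPy. This is a simplified sketch of the feature augmentation, not sklearn's code: it assumes one extra column is appended per earlier position in the chain, and it uses random data with the same shapes as above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 100, size=(111, 107))
Y = rng.integers(0, 2, size=(111, 17))
order = rng.permutation(Y.shape[1])  # plays the role of order='random'

widths = []
for chain_idx in range(len(order)):
    # Columns contributed by the estimators earlier in the chain.
    previous = Y[:, order[:chain_idx]]
    X_aug = np.hstack((X, previous))
    widths.append(X_aug.shape[1])

print(widths[0], widths[1], widths[-1])
```

The first estimator sees 107 columns, the second 108, and the last one 107 + 16 = 123, so a model with a fixed input size of 107 only fits the first position in the chain.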
We need a way to change the input_size of the models as ClassifierChain creates them. There seem to be no callbacks or hooks in ClassifierChain for this, so I have a hacky solution:
input_size = 107
# define the model
def create_model():
    global input_size
    embedding_size = 128
    lstm_size = 64
    output_size = 1
    vocab_size = 100
    current_input = Input(shape=(input_size,))
    emb_current = Embedding(vocab_size, embedding_size, input_length=input_size)(current_input)
    out_current = Bidirectional(LSTM(units=lstm_size))(emb_current)
    output = Dense(units=output_size, activation='sigmoid')(out_current)
    model = Model(inputs=current_input, outputs=output)
    model.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
    input_size += 1  # <-- This does the magic
    return model
X = np.random.randint(0,100,(111, 107))
y = np.random.randint(0,2,(111,17))
model=MyKerasClassifier(build_fn=create_model, epochs=1, batch_size=8, shuffle=True, verbose=1,validation_split=0.2)
chain=ClassifierChain(model, order='random', random_state=42)
chain.fit(X, y)
print (chain.predict(X).shape)
Output:
Train on 88 samples, validate on 23 samples
Epoch 1/1
88/88 [==============================] - 2s 22ms/step - loss: 0.6901 - accuracy: 0.6023 - val_loss: 0.7002 - val_accuracy: 0.4783
Train on 88 samples, validate on 23 samples
Epoch 1/1
88/88 [==============================] - 2s 22ms/step - loss: 0.6976 - accuracy: 0.5000 - val_loss: 0.7070 - val_accuracy: 0.3913
Train on 88 samples, validate on 23 samples
Epoch 1/1
----------- [Output truncated] ----------------
111/111 [==============================] - 0s 3ms/step
111/111 [==============================] - 0s 3ms/step
(111, 17)
As expected, it trains 17 estimators, and the predict method returns output of shape 111 x 17, with each column holding the predictions made by the corresponding estimator.
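One caveat with the global counter: it only lines up with the chain if create_model is called exactly once per estimator, starting from 107. After one chain.fit with 17 labels the global sits at 124, so calling chain.fit again without first resetting input_size to 107 would build models with the wrong width. Below is a minimal sketch of a resettable counter that could replace the global. `InputWidthCounter` is a hypothetical helper of mine, not part of Keras or sklearn, and the sketch assumes create_model would call `counter.next_width()` where it currently reads input_size.

```python
class InputWidthCounter:
    """Hypothetical helper: hands out 107, 108, ... and can be reset."""

    def __init__(self, start):
        self.start = start
        self.current = start

    def next_width(self):
        # Return the width for the model being built, then advance by one.
        width = self.current
        self.current += 1
        return width

    def reset(self):
        self.current = self.start

counter = InputWidthCounter(start=107)

# Simulate the 17 create_model calls made during one chain.fit.
first_fit = [counter.next_width() for _ in range(17)]

# Without a reset, the next model would be built with this width.
stale_width = counter.next_width()

# Resetting before the next fit gives the same sequence again.
counter.reset()
second_fit = [counter.next_width() for _ in range(17)]

print(first_fit[0], first_fit[-1], stale_width)
```

It is still a hack, since it relies on the chain building its models in order, but it makes the reset step explicit.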