I was reading a project that handles both image and text-sequence input at the same time, and I was wondering why we use the same dimension in the Keras add() function when combining the Dense-layer outputs of different networks.
Q1: Is there any benefit to doing this?
Q2: If we use non-equal dimensions in add() or merge(), will it affect model performance?
Q3: Can we also treat this as another hyper-parameter and tune it to get the best-fitting model?
`Add()` requires tensors of the same shape, so you simply can't apply an add() operation to tensors with different dimensions.
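A minimal sketch of what this looks like in practice (the layer sizes, vocabulary size, and input shapes here are made up for illustration): both branches are projected to the same width with a `Dense` layer before `Add()`, because the element-wise sum only works when the two tensors have identical shapes.

```python
from tensorflow.keras import layers, Input, Model

# Image branch: flatten a small image and project it to 64 units
img_in = Input(shape=(32, 32, 3))
img_feat = layers.Flatten()(img_in)
img_feat = layers.Dense(64, activation="relu")(img_feat)   # shape (None, 64)

# Text branch: embed a token sequence, pool, and project to the SAME 64 units
txt_in = Input(shape=(20,))
txt_feat = layers.Embedding(input_dim=1000, output_dim=32)(txt_in)
txt_feat = layers.GlobalAveragePooling1D()(txt_feat)
txt_feat = layers.Dense(64, activation="relu")(txt_feat)   # shape (None, 64)

# Add() works only because both tensors are (None, 64);
# using Dense(64) in one branch and Dense(128) in the other would raise a shape error.
merged = layers.Add()([img_feat, txt_feat])
out = layers.Dense(1, activation="sigmoid")(merged)

model = Model(inputs=[img_in, txt_in], outputs=out)
model.summary()
```

The common width (64 here) is the part you can treat as a hyper-parameter: you can tune its value, but both branches must be projected to whatever value you pick.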
Adding a matrix of shape (N, A) to a matrix of shape (N, B) only makes sense if A == B (for A, B > 1). When A or B == 1, broadcasting rules apply.
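A tiny NumPy illustration of that rule (the shapes are made up for the example):

```python
import numpy as np

a = np.ones((4, 3))   # shape (N, A) with N=4, A=3
b = np.ones((4, 3))   # same shape: element-wise addition is well defined
c = np.ones((4, 1))   # shape (N, 1): the single column is broadcast across A
d = np.ones((4, 2))   # shape (N, B) with B=2, B != A and B > 1

print((a + b).shape)  # (4, 3)
print((a + c).shape)  # (4, 3) -- broadcasting repeats the column
# a + d raises: "operands could not be broadcast together"
```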