Please help me understand what a unit is in a neural network. From the book I understood that a unit in the input layer represents an attribute of a training tuple. However, it is left unclear how exactly it does so.
Here is the diagram:
There are two ways I can think about the input units. The first is that X1 stands for attr1, X2 stands for attr2, and so on. The second is that X1, X2, and X3 together represent attr1, where X1 stands for Value.VALUE_ONE, ... , X3 stands for Value.VALUE_THREE. In the latter case, if attr1 = Value.VALUE_TWO, then it is weighted and fed simultaneously to the second layer.
public class Tuple
{
    private Value attr1;
    private Value attr2;
    private Value attr3;
}
public enum Value
{
VALUE_ONE,
VALUE_TWO,
VALUE_THREE
}
The second question is about the hidden layer units. How is it decided how many units there should be in the hidden layer, and what do they represent in the model?
The "units" are just floating point values.
Nearly all of the computation happening there is vector multiplication (dot products of inputs and weights), and thus can be parallelized well using matrix multiplications and GPU hardware.
The general computation looks like this:
// Output of a single unit: weighted sum of the inputs plus bias theta, squashed by tanh.
double phi(double[] x, double[] w, double theta) {
    double sum = theta;
    for (int i = 0; i < x.length; i++)
        sum += x[i] * w[i];
    return Math.tanh(sum);
}
except that you don't want to do this in Java code yourself. You want to do this on a GPU in a parallelized way, because this can be dramatically faster (on the order of 100x).
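To make the link between a single unit and a whole layer concrete, here is a minimal plain-Java sketch. It assumes one row of weights and one bias per unit in the layer. The names `LayerSketch`, `layer`, `weights` and `thetas`, as well as the numbers in `main`, are made up for illustration and do not come from any library or from the book. Applying `phi` once per row is the same as multiplying a weight matrix by the input vector, and that is the part a GPU library would parallelize.

```java
import java.util.Arrays;

public class LayerSketch {
    // Output of one unit: tanh of the weighted input sum plus the bias theta.
    public static double phi(double[] x, double[] w, double theta) {
        double sum = theta;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * w[i];
        }
        return Math.tanh(sum);
    }

    // Output of a whole layer. Row j of weights holds the weights of unit j,
    // so this loop is a matrix-vector product followed by tanh on each entry.
    public static double[] layer(double[] x, double[][] weights, double[] thetas) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            out[j] = phi(x, weights[j], thetas[j]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical values: three input units feeding two hidden units.
        double[] x = {1.0, 0.0, -1.0};
        double[][] hiddenWeights = {{0.5, 0.1, 0.5}, {0.2, -0.3, 0.2}};
        double[] hiddenThetas = {0.0, 0.1};
        double[] hidden = layer(x, hiddenWeights, hiddenThetas);
        System.out.println(Arrays.toString(hidden));
    }
}
```

The number of rows in `hiddenWeights` is the number of hidden units, so in this sketch changing the size of the hidden layer only means adding or removing rows, each with its own bias.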