Thank you for the CNTK tool; the examples run quite fast. For several days I have been trying to set up a simple network, but I can't get it to work. I need a network with 2 inputs and 3 outputs, for example:
|features 0.3 0.5 |labels 0.2 0.7 0.9
The output is not a one-hot vector; the network has to learn the label values 0.2 0.7 0.9. But most examples use a one-hot vector as output, so it is not clear to me how to solve this. I tried to adapt the tutorial for 3-class classification, but it does not work: the network does not learn the output correctly. The network I tried is:
BrainScriptNetworkBuilder = {
SDim = 2 # feature dimension
H1Dim = 50 # hidden dimension
H2Dim = 50 # hidden dimension
LDim = 3 # number of classes (labels)
model (features) = {
W0 = ParameterTensor {(H1Dim:SDim)} ; b0 = ParameterTensor {H1Dim}
W1 = ParameterTensor {(H2Dim:H1Dim)} ; b1 = ParameterTensor {H2Dim}
W2 = ParameterTensor {(LDim:H2Dim)} ; b2 = ParameterTensor {LDim}
r1 = ReLU(W0 * features + b0) # hidden layer 1
r2 = ReLU(W1 * r1 + b1) # hidden layer 2
z = ReLU(W2 * r2 + b2)
}.z
# define inputs
features = Input {SDim, sparse = false}
labels = Input {LDim, sparse = false}
# apply model to features
z = model (features)
# define criteria and output(s)
ce = SquareError(labels, z) # criterion (loss)
err = SquareError(labels, z) # additional metric
# connect to the system. These five variables must be named exactly like this.
featureNodes = (features)
inputNodes = (labels)
criterionNodes = (ce)
evaluationNodes = (err)
outputNodes = (z)
}
So my question is: how do I set up a network in CNTK so that the output is not a one-hot vector?
Thank you for your help.
When your label is not a one-hot vector, SquareError is a good loss function to minimize. Even if some examples have a one-hot label, you can still use SquareError. So I think you are doing everything right; you might just have to tune the learning rate to get it to work well.
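As a rough sketch of what tuning the learning rate could look like, the SGD block that sits alongside BrainScriptNetworkBuilder in the training section might be adjusted along these lines. All numeric values here are illustrative assumptions, not settings taken from the question; the schedule syntax 0.01*10:0.001 means the first rate for 10 epochs, then the second rate for the remainder.

```
SGD = {
    epochSize = 0                            # 0 = one pass over the full training set per epoch
    minibatchSize = 32                       # assumed value; adjust to the data set size
    learningRatesPerSample = 0.01*10:0.001   # assumed schedule: 0.01 for 10 epochs, then 0.001
    maxEpochs = 50                           # assumed value
}
```

If the reported SquareError stays flat or diverges, trying rates an order of magnitude smaller or larger is a reasonable next step. Separately, and going beyond what the answer above states: a ReLU on the output layer restricts z to non-negative values and passes no gradient while its input is negative, so it may also be worth trying a linear output, z = W2 * r2 + b2, for this kind of regression target.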