I am trying to make a neural network using softmax regression. I am using the following regression formula:

y_n = e^(x^n) / sum_k e^(x^k)
Let's say I have an input of 1000x100; in other words, 1000 images, each of dimensions 10x10 (100 pixels when flattened). The images are letters from A, B, C, D, E, F, G, H, I, J, and I'm trying to predict which letter each image shows. My design is the following: 100 inputs (one per pixel) and 10 outputs (one per letter).
I have the following doubts. Given that n is a superscript in x^n, with regard to the numerator: should I take the dot product of w (the weight matrix, of dimensions 10x100 - 10 for the number of outputs and 100 for the number of inputs) with a single x (a single image), or with all the images combined (1000x100)? I am coding in Python with numpy, and if I take the dot product of w and x^T (10x100 dot 100x1000), I am not sure how to raise e to the power of the result. I am having a hard time wrapping my mind around how these matrices can end up in an exponent.
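A minimal numpy sketch of what the batched computation could look like, assuming the shapes from the question (1000 images, 100 pixels, 10 classes); the variable names and random data are illustrative, not from the original post:

```python
import numpy as np

# Hypothetical data matching the question's shapes
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 100))   # all 1000 images at once, one row per image
W = rng.standard_normal((10, 100))     # weights: 10 outputs x 100 inputs

# Scores for every image in one shot: (1000, 100) @ (100, 10) -> (1000, 10)
logits = X @ W.T

# np.exp exponentiates each entry individually; nothing is ever raised
# to a "matrix power" here.
shifted = logits - logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
exp_scores = np.exp(shifted)
probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)

print(probs.shape)        # (1000, 10): one probability distribution per image
print(probs.sum(axis=1))  # each row sums to 1
```

Whether you process one image (a 100-vector) or the whole batch (1000x100) only changes the shape of the result; the exponential is always applied entry by entry.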
If you are training neural networks, it might be worthwhile to check out the Theano library. It provides various output activation functions like tanh, softmax, etc. and allows training neural networks on a GPU.
Also, x^n in the formula above is the n-th output of the last layer, i.e. a single scalar, not the input raised to some exponent. You can't put a matrix in an exponent; the exponential is applied to each entry individually.
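To make that concrete, here is a tiny sketch for a single image, with made-up last-layer outputs; np.exp acts elementwise on the vector:

```python
import numpy as np

z = np.array([2.0, 1.0, 0.1])  # hypothetical last-layer outputs for one image
e = np.exp(z)                  # elementwise: [e^2.0, e^1.0, e^0.1]
softmax = e / e.sum()          # normalize so the entries sum to 1

print(softmax)                 # a probability distribution over the classes
```

Each x^n in the formula is one scalar component of z, so the "exponent" is always a number, never a matrix.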
You should read more about softmax regression. This might be of help.