First off, here's my code:
"""Softmax."""
scores = [3.0, 1.0, 0.2]
import numpy as np
def softmax(x):
"""Compute softmax values for each sets of scores in x."""
num = np.exp(x)
score_len = len(x)
y = np.array([0]*score_len)
sum_n = np.sum(num)
#print sum_n
for index in range(1,score_len):
y[index] = (num[index])/sum_n
return y
print(softmax(scores))
The error comes up at the line:
    y[index] = (num[index])/sum_n
I run the code with:
    # Plot softmax curves
    import matplotlib.pyplot as plt
    x = np.arange(-2.0, 6.0, 0.1)
    scores = np.vstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
    plt.plot(x, softmax(scores).T, linewidth=2)
    plt.show()
What exactly is going wrong here?
Just adding a print statement as a makeshift "debugger" reveals what is happening:
    import numpy as np

    def softmax(x):
        """Compute softmax values for each sets of scores in x."""
        num = np.exp(x)
        score_len = len(x)
        y = np.array([0]*score_len)
        sum_n = np.sum(num)
        #print sum_n
        for index in range(1,score_len):
            print((num[index])/sum_n)
            y[index] = (num[index])/sum_n
        return y

    x = np.arange(-2.0, 6.0, 0.1)
    scores = np.vstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
    softmax(scores).T
This prints:
[ 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504 0.00065504 0.00065504 0.00065504 0.00065504
0.00065504 0.00065504]
So num[index] is not a single number but a whole row of 80 values, and you are trying to assign that array to one scalar element of another array, which is not allowed.
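The same failure can be reproduced in isolation; a minimal sketch (the variable names here are made up for illustration):

    import numpy as np

    y = np.zeros(3)                    # 1-D array: each slot holds one scalar
    row = np.array([0.1, 0.2, 0.3])    # a whole row of values

    try:
        y[0] = row                     # a length-3 array cannot fit in one scalar slot
    except ValueError as err:
        print(err)                     # NumPy rejects the assignment

This is exactly what happens inside the loop: num[index]/sum_n is a length-80 array, but y[index] is a single slot.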
There are several ways to make it work. Just changing

    y = np.array([0]*score_len)

to a multidimensional (and floating-point) array would work:

    y = np.zeros(x.shape)

(x is the function's parameter; using score here would raise a NameError). That should do the trick, but I'm not sure it's what you intended. Note that range(1, score_len) also skips index 0, so the first row would stay all zeros; range(score_len) visits every row.
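For completeness, a fully vectorized version avoids the loop and the pre-allocated y entirely; a sketch that sums over axis 0, so it handles both a 1-D list of scores and the vstacked 2-D input:

    import numpy as np

    def softmax(x):
        """Softmax over axis 0: works for a 1-D score vector
        and for a 2-D array whose rows were stacked with np.vstack."""
        num = np.exp(x)
        return num / np.sum(num, axis=0)

    print(softmax([3.0, 1.0, 0.2]))   # three probabilities summing to 1

For a 2-D input this normalizes each column, which is what the plotting code expects.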
EDIT:
It seems you did not want multidimensional input so you just need to change:
scores = np.vstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
to
scores = np.hstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
Verifying the shape of these arrays by printing scores.shape really helps you find such errors yourself: np.vstack stacks the arrays as rows of a 2-D array, while np.hstack concatenates them end-to-end into a single 1-D array (which is what you want here).
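Printing the shapes makes the difference concrete:

    import numpy as np

    x = np.arange(-2.0, 6.0, 0.1)      # 80 sample points
    v = np.vstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
    h = np.hstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])

    print(v.shape)   # (3, 80) -- three stacked rows
    print(h.shape)   # (240,)  -- one long 1-D array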