I'm training a Neural Network, but that's not relevant. The only relevant info is this line:
print(f"{it*100/iters}%, Pos: ({Xcg:.4},{Ycg:.4}), Ideal position: ({Xcg:.4},{Xcg**2-Xcg:.4}), Rew: {c_reward:.3}, Ang: {ang:.3}, ε: {self.eps:.3}")
where Xcg, Ycg and ang are floats (either 0. or something like 0.5-0.5, which is also a float);
c_reward is calculated with this expression:
r = math.exp(-abs(posy-posx**2+posx))*math.sqrt(posx**2+posy**2)*abs(posx)/posx if posx!= 0 else 0
and self.eps is calculated using this expression:
self.eps = 0.8 + 0.8 * math.exp(-0.001 * self.steps)
where steps goes from 0 to 4000. So, eps is also a float.
None of those can possibly be ints. What's going on here? It doesn't happen at a specific value of self.steps either; right now it looks random.
I can see at least one int already, without seeing all the details: your else branch returns 0, which is an int. It needs to be 0.0 to be a float.
r = math.exp(-abs(posy-posx**2+posx))*math.sqrt(posx**2+posy**2)*abs(posx)/posx if posx!= 0 else 0.0
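To see the difference for yourself, here is a minimal sketch. The reward and try_format helpers and the zero parameter are hypothetical names made up for this illustration; only the expression itself comes from the question. A precision-only format spec such as .3 is rejected for an int but accepted for a float, so the error would only show up on iterations where posx is exactly 0, which may be why it looks random.

```python
import math

def reward(posx, posy, zero=0):
    """Reward expression from the question; `zero` is the fallback value
    returned when posx == 0 (hypothetical parameter, for illustration)."""
    if posx != 0:
        return (math.exp(-abs(posy - posx**2 + posx))
                * math.sqrt(posx**2 + posy**2) * abs(posx) / posx)
    return zero

def try_format(value):
    """Apply the same ':.3' spec used in the print call and report failures."""
    try:
        return f"{value:.3}"
    except ValueError as exc:
        return f"ValueError: {exc}"

# int fallback: the precision spec is rejected
print(type(reward(0.0, 0.0)), try_format(reward(0.0, 0.0)))
# float fallback: formats without error
print(type(reward(0.0, 0.0, zero=0.0)), try_format(reward(0.0, 0.0, zero=0.0)))
# non-zero posx always yields a float
print(type(reward(0.5, 0.25)), try_format(reward(0.5, 0.25)))
```

Another option, if you'd rather not touch the expression, is to wrap the value in float(...) before formatting it.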