I have been playing around with the MIT DeepTraffic Challenge, and also watching the lecture and reading the slides.
After getting a general understanding of the architecture, I was wondering what exactly the reward function given by the environment is.
I also found this JavaScript codebase, but it does not really help my understanding either.
The reward is the average speed, scaled to the interval [-3, 3].
The implementation of the DeepTraffic environment is located in this file: https://selfdrivingcars.mit.edu/deeptraffic/gameopt.js
I'm working on a more readable version of it. Here's the work in progress: https://github.com/mljack/deeptraffic/blob/master/gameopt.js
The reward is computed by this line:
var reward = (avgSpeedMeasurement - 60) / 20;
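To make the scaling concrete, here is a minimal sketch that wraps that one line in a function and evaluates it for a few speeds. The function name `rewardFromAvgSpeed` is my own and is not part of gameopt.js. The speed values are illustrative assumptions: working backwards from the formula, the interval [-3, 3] corresponds to average speeds from 0 to 120, but I have not checked which speeds the environment can actually produce.

```javascript
// Hypothetical helper (not in gameopt.js): wraps the reward line quoted above.
function rewardFromAvgSpeed(avgSpeedMeasurement) {
  // Center the average speed on 60, then divide by 20.
  return (avgSpeedMeasurement - 60) / 20;
}

// Illustrative speeds only; the bounds 0 and 120 are inferred from the
// formula and the stated interval [-3, 3], not read from the environment.
var sampleSpeeds = [0, 60, 80, 120];
var sampleRewards = sampleSpeeds.map(rewardFromAvgSpeed);
console.log(sampleRewards);
```

So, if this reading is right, an average speed of 60 gives a reward of 0, anything faster is rewarded, and anything slower is penalized.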