I am using a basic genetic algorithm (GA) on different test functions with different hyper parameters to determine the effects of hyper parameters on the algorithm.
Criteria: The minimum guessed by the GA is close enough to the real answer.
"Close enough" is determined by a "level of accuracy", or LOA:
if |guessed answer - real answer| < LOA, then the guessed answer is considered correct.
Problem: Different functions have different input ranges, and using a static LOA for all test functions does not seem right.
Question: How should I decide on LOA value? Should it be related to input range of the function being tested?
Example: The Schwefel test function has an input range of (-500, 500) for all inputs and a minimum of 0. If the minimum guessed by the GA is 0.08 and the LOA is 0.1, then this guessed answer is correct because |0 - 0.08| < 0.1, but if the guessed answer is 0.12, it is considered wrong.
The Rastrigin test function has an input range of (-5.12, 5.12) for all inputs. Using the same LOA for Rastrigin does not seem right, since it has a much smaller range and the same GA hyper parameters will do better here with the same LOA.
Should the LOA be related to the range? For example, should an LOA of 0.001 be used for Rastrigin, since its range is about 1/100 of the range of Schwefel?
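To make the proposal concrete, a range-proportional LOA could be computed as in the sketch below. The function name `scaled_loa` and its parameters are hypothetical, and the Schwefel setup (range width 1000, LOA 0.1) is taken as the reference case. This only illustrates the rule being asked about; it does not claim that the rule is the right choice.

```python
def scaled_loa(lower, upper, reference_loa=0.1, reference_width=1000.0):
    """Scale a reference tolerance linearly with the width of the input range.

    Assumption: Schwefel (width 1000, LOA 0.1) is the reference case.
    """
    return reference_loa * (upper - lower) / reference_width


schwefel_loa = scaled_loa(-500.0, 500.0)   # the reference case itself
rastrigin_loa = scaled_loa(-5.12, 5.12)    # width 10.24, roughly 1/100 of Schwefel
print(schwefel_loa, rastrigin_loa)
```

Under this rule the Rastrigin LOA comes out close to the 0.001 mentioned above, because 10.24 / 1000 is approximately 1/100.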
PS: Stopping condition is "maximum number of generations" and number of dimensions is 45 for all cases.
As mentioned in the answers, since it is the goodness of the results that is being compared, a simple LOA is not a suitable approach.
Edited goal (example): On average, how good is the result of using the GA on 10 different test functions when using a "population size" of 500 (all hyper parameters are kept consistent across the tests)?
Given that goodness is determined by running the same GA on the same function 100 times and counting how many times the guessed answer is closer to the real answer than epsilon, I still need to adjust this epsilon so that an average over all ten test functions leads to a broader conclusion.
PS: The epsilon itself is supposed to be defined by me.
This is a good observation. The thing you're really getting at here is that, when optimizing, the convergence criterion you choose carries some implications.
How should I decide on LOA value? Should it be related to input range of the function being tested?
A static epsilon criterion (in your case, the LOA) will be comparatively easier or harder to achieve depending on both the size of the space (the number of input combinations) and its shape (are there many inputs, or few inputs, that achieve that epsilon?).
There are some alternatives to this:
Do what you suggest, and relate the epsilon to the size of the space. This can cause issues, though, on spaces of different sizes but similar shapes. Consider a space (roughly) shaped like -v- (a disk with a depression; if you want, you can let the disk slope gently inward outside the v, and drastically inward inside the v), the optimally minimal point being the bottom of the v. If I also have a space ----v---- with the same size v but a much larger disk, should I use a bigger epsilon to account for the larger space? At some point, e.g. ----------v----------, any point on the disk is within epsilon of v, if epsilon keeps growing with size.
Use a different metric (not epsilon) to define convergence. Very popular are (1) time and (2) rate of improvement. I.e., if the result at time t is r, and the result at t-1 is r', let x be the % change from r' to r. If x < n%, stop. Otherwise, continue. This can be extended to consider e.g. the average rate of improvement over a trailing series t-1, t-2, t-3, ... if you expect your result to be non-monotonically better over time.
... determine the effects of hyper parameters on the algorithm
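The rate-of-improvement stopping rule from the second alternative above can be sketched roughly as follows. This is a minimal illustration, assuming a minimization problem and a list of best-so-far values recorded once per generation; `should_stop`, `threshold_pct` (the n% above) and `window` (the length of the trailing series) are hypothetical names, not part of any library.

```python
def should_stop(history, threshold_pct=0.1, window=1):
    """Return True when the average % improvement over the last `window`
    generation-to-generation steps falls below `threshold_pct`.

    `history` holds one objective value per generation, most recent last
    (minimization, so a positive change means the value went down).
    """
    if len(history) < window + 1:
        return False  # not enough generations to measure a change yet
    recent = history[-(window + 1):]
    changes = []
    for previous, current in zip(recent, recent[1:]):
        scale = max(abs(previous), 1e-12)  # guard against division by zero
        changes.append((previous - current) / scale * 100.0)
    return sum(changes) / len(changes) < threshold_pct


best_per_generation = [100.0, 90.0, 91.0, 80.0]
# One bad step (90 -> 91) triggers the plain rule ...
print(should_stop(best_per_generation[:3], threshold_pct=1.0, window=1))  # True
# ... while averaging over a trailing window tolerates it.
print(should_stop(best_per_generation, threshold_pct=1.0, window=3))      # False
```

The trailing window is what keeps a single non-improving generation from ending the run prematurely.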
This isn't very well-defined. Are you trying to find a set of hyper-parameters that's "optimal" in some way across all your functions? A set that's optimal for each function? These are optimization tasks.
If you're trying to make a broader assessment, as stated by your question, of "what are the effects", you're probably going to want to just run each function many times with different hyper-parameter values, and see what the correlations are between hyper-parameter values and goodness of result (distance from the real answer). Explicitly, you probably don't want to analyze hyper-parameters by whether or not they produce a result within epsilon; rather, you want to analyze them by the extent to which they improve (or fail to improve) the result.
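As a rough illustration of that last point, the sketch below compares a pass/fail success rate with the mean error for two settings of a single knob. Everything here is an assumption made for the example: plain random search stands in for the GA, the evaluation budget stands in for a hyper-parameter such as population size, and the dimension is reduced to 5 to keep the run short. It compares mean errors directly rather than computing a correlation.

```python
import math
import random
import statistics


def rastrigin(x):
    """Rastrigin function; its global minimum is 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def random_search(budget, dim, rng, bound=5.12):
    """Stand-in optimizer (NOT a GA): best value among `budget` random points."""
    best = math.inf
    for _ in range(budget):
        point = [rng.uniform(-bound, bound) for _ in range(dim)]
        best = min(best, rastrigin(point))
    return best


def compare_budgets(budgets=(50, 500), runs=20, dim=5, epsilon=0.1):
    """For each budget, return (success rate within epsilon, mean error)."""
    summary = {}
    for budget in budgets:
        # The true minimum is 0, so the final value is itself the error.
        errors = [random_search(budget, dim, random.Random(seed)) for seed in range(runs)]
        success_rate = sum(e < epsilon for e in errors) / runs
        summary[budget] = (success_rate, statistics.mean(errors))
    return summary


for budget, (rate, mean_error) in compare_budgets().items():
    print(f"budget={budget}: success rate={rate:.2f}, mean error={mean_error:.2f}")
```

With a weak optimizer and a tight epsilon like this, one would expect the success rate to be zero for both budgets, which says nothing about which setting is better, while the mean error still separates them.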