while finding a square-root, we perform number**0.5, but what is the meaning of - sign before 0.5.
I was looking at a code (I was specificially looking at ViT Code, and here for scaling, they have added:
self.scale = self.head_dim ** -0.5
Please help me understanding it meaning. I did few experiments on my terminal but did not understand what's happening:
>>> a = 4
>>> a**0.5
2.0
>>>
>>> a**-0.5
0.5
Reciprocal. x**-n == 1/(x**n)
.......