I'm attempting to understand how bias can be measured using word embeddings. Reading the article https://towardsdatascience.com/gender-bias-word-embeddings-76d9806a0e17
What is the bias being identified in the article? Is the bias here that the embedding associates women less strongly with "doctor" than it does men?
Would a neutral (unbiased) embedding be one where there is only a small difference between the woman/doctor and man/doctor combinations, represented as vectors: $woman + doctor \approx man + doctor$?
In an unbiased embedding you would expect that

$woman + doctor - man \approx doctor$

i.e. replacing the "man" component with "woman" should leave a gender-neutral profession word like "doctor" essentially unchanged.
But since the nearest word to that result in the embedding space is 'nurse', that is an indicator of a bias towards women in healthcare being perceived as nurses. Doctors are associated more with men in the corpus from which the embeddings were trained, so it can be concluded that the corpus (and the learned word embedding) has a gender bias.
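As a minimal sketch of that analogy test, here is the nearest-neighbour computation on a tiny set of hand-made 2-D toy vectors (the values are hypothetical, chosen only to mimic the reported bias; real experiments use pretrained embeddings, e.g. via gensim's `most_similar`):

```python
import numpy as np

# Hypothetical 2-D toy embeddings: the first axis roughly encodes gender,
# the second encodes "medical profession". Values are illustrative only.
vecs = {
    "man":      np.array([ 1.0, 0.0]),
    "woman":    np.array([-1.0, 0.0]),
    "doctor":   np.array([ 0.8, 1.0]),
    "nurse":    np.array([-0.8, 1.0]),
    "hospital": np.array([ 0.0, 0.9]),
}

def nearest(query, exclude):
    """Return the vocabulary word most cosine-similar to `query`,
    skipping the words used to form the query (standard analogy setup)."""
    best, best_sim = None, -2.0
    for word, v in vecs.items():
        if word in exclude:
            continue
        sim = query @ v / (np.linalg.norm(query) * np.linalg.norm(v))
        if sim > best_sim:
            best, best_sim = word, sim
    return best

# woman + doctor - man: in an unbiased space this would stay near "doctor",
# but with these biased toy vectors it lands on "nurse".
analogy = vecs["woman"] + vecs["doctor"] - vecs["man"]
print(nearest(analogy, exclude={"woman", "doctor", "man"}))  # → nurse
```

With real pretrained vectors the equivalent query would be something like `model.most_similar(positive=["woman", "doctor"], negative=["man"])` in gensim, which performs the same arithmetic and cosine ranking.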