Gensim offers an evaluate_word_pairs function for evaluating semantic similarity.
Here is an example from its page:
model.wv.evaluate_word_pairs(datapath('wordsim353.tsv'))
Out:
((0.1014236962315867, 0.44065378924434523), SpearmanrResult(correlation=0.07441989763914543, pvalue=0.5719973648460552), 83.0028328611898)
I would like to know which metric produces each value (0.1014236962315867, 0.44065378924434523, ...) in the output.
Per the documentation for evaluate_word_pairs():
Returns
- pearson (tuple of (float, float)) – Pearson correlation coefficient with 2-tailed p-value.
- spearman (tuple of (float, float)) – Spearman rank-order correlation coefficient between the similarities from the dataset and the similarities produced by the model itself, with 2-tailed p-value.
- oov_ratio (float) – The ratio of pairs with unknown words.
Per your output, the Pearson result is a plain tuple, while the Spearman result is reported as a named tuple (SpearmanrResult). In each case, the correlation coefficient comes first, then the p-value.
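As a minimal sketch of how that return value can be unpacked, the snippet below reuses the exact numbers from your output. The SpearmanrResult defined here is only a stand-in namedtuple so the code runs without gensim or SciPy installed; in real use you would assign the return value of model.wv.evaluate_word_pairs(...) to result instead.

```python
from collections import namedtuple

# Stand-in for the named result type seen in the question's output
# (an assumption made so this snippet runs on its own).
SpearmanrResult = namedtuple("SpearmanrResult", ["correlation", "pvalue"])

# The values printed in the question.
result = (
    (0.1014236962315867, 0.44065378924434523),
    SpearmanrResult(correlation=0.07441989763914543, pvalue=0.5719973648460552),
    83.0028328611898,
)

# Three top-level items: Pearson, Spearman, OOV ratio.
pearson, spearman, oov_ratio = result

# Both correlation results unpack as (coefficient, p-value),
# whether they are plain tuples or named tuples.
pearson_r, pearson_p = pearson
spearman_rho, spearman_p = spearman

print(f"Pearson r    = {pearson_r:.4f} (p = {pearson_p:.4f})")
print(f"Spearman rho = {spearman_rho:.4f} (p = {spearman_p:.4f})")
print(f"OOV ratio    = {oov_ratio:.1f}")
```

Positional unpacking works for both results, so the code does not need to care which of the two happens to be a named tuple.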
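Note that the oov_ratio is reported as a percentage rather than a fraction: it is the share of test pairs containing at least one word that wasn't known to the model. Your value of about 83.0 is consistent with 293 of the 353 pairs in wordsim353 being skipped, so the correlations above rest on only a small part of the dataset.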
Consult other references for definitions/explanations of the Pearson and Spearman coefficients/p-values.
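For a rough intuition about the two coefficients, here is a pure-Python sketch on made-up toy data. The function names and the sample scores are illustrative assumptions, not gensim's code; gensim relies on SciPy for these statistics, and this sketch computes only the coefficients, not the p-values. Pearson measures linear agreement between the human scores and the model's similarities, while Spearman is the same formula applied to the ranks of the values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: human similarity judgments vs. model cosine similarities.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
model_scores = [0.10, 0.20, 0.30, 0.40, 0.90]

print("Pearson: ", pearson(human_scores, model_scores))
print("Spearman:", spearman(human_scores, model_scores))
```

In this toy data the model orders the pairs exactly as the humans do, so the rank-based coefficient is at its maximum even though the relationship is not perfectly linear. That is why Spearman is often the number people look at for word-similarity benchmarks.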