I am getting this error: TypeError: '<' not supported between instances of 'str' and 'float'

I have a table in which I compare a list of articles (Article_body) with 4 baseline articles using cosine similarity:

Article_body  articleScores1  articleScores2  articleScores3  articleScores4  articleScores5
a*****        0.6             0.2             0.7             0.9             0.2
a*****        0.3             0.8             0.1             0.5             0.1

I want to add a column that names the column with the largest cosine similarity out of the 5, on the condition that it is at least 0.5. If none of CosineSim(i) is at least 0.5, it should return False, as in the second table:

Article_body  articleScores1  articleScores2  articleScores3  articleScores4  Most_similar_to
a*****        0.6             0.2             0.7             0.9             CosineSim4
a*****        0.3             0.8             0.1             0.5             CosineSim2
a******       0.1             0.2             0.3             0.4             False

I am using this code to achieve this:

cos_cols = [f"articleScores{i}" for i in range(1, 6)]    
def n_lar(text):
    if (df[cos_cols].idxmax(axis=1)) <0.5:
        return False
        df['Max'] = (df[cos_cols].idxmax(axis=1))

df['Most_similar_to'] = df.apply(n_lar)

However, I am getting this error:

TypeError: '<' not supported between instances of 'str' and 'float'

How can I resolve this?
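
The likely cause is that idxmax(axis=1) returns the column labels (strings), not the scores, so comparing its result with 0.5 compares 'str' with 'float'. The threshold has to be tested against max(axis=1) instead. Note also that df.apply(n_lar) without axis=1 calls the function once per column, and n_lar ignores its argument. A minimal sketch of the difference, using the two rows from the first table (the variable names labels and values are only illustrative):

```python
import pandas as pd

# Data copied from the first table in the question.
df = pd.DataFrame({
    "Article_body": ["a*****", "a*****"],
    "articleScores1": [0.6, 0.3],
    "articleScores2": [0.2, 0.8],
    "articleScores3": [0.7, 0.1],
    "articleScores4": [0.9, 0.5],
    "articleScores5": [0.2, 0.1],
})
cos_cols = [f"articleScores{i}" for i in range(1, 6)]

# idxmax gives the NAME of the column holding the row maximum (a string).
labels = df[cos_cols].idxmax(axis=1)

# max gives the maximum score itself (a float), which can be compared with 0.5.
values = df[cos_cols].max(axis=1)
below_threshold = values < 0.5
```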


To get the column with the largest cosine similarity, returning False when none of the CosineSim(i) values is at least 0.5 (as in the second table), apply a row-wise function to the score columns:


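    # iloc[:, 1:-1] skips the first column (Article_body) and the last column
    df.iloc[:, 1:-1].apply(
        # x is one row: take the last character of the best column's name
        lambda x: ('CosineSim' + x.idxmax()[-1]) if x.max() >= 0.5 else False,
        axis=1)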


    0    CosineSim4
    1    CosineSim2
    2         False

    Assign this result to the Most_similar_to column.
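
For reference, here is a self-contained sketch of the same idea without a row-wise lambda, assuming the data from the second table (four score columns). It selects the score columns by name instead of by position, and the helper names best_col and best_val are my own, not from the question:

```python
import pandas as pd

# Data copied from the second table in the question (without the result column).
df = pd.DataFrame({
    "Article_body": ["a*****", "a*****", "a******"],
    "articleScores1": [0.6, 0.3, 0.1],
    "articleScores2": [0.2, 0.8, 0.2],
    "articleScores3": [0.7, 0.1, 0.3],
    "articleScores4": [0.9, 0.5, 0.4],
})
cos_cols = [f"articleScores{i}" for i in range(1, 5)]

best_col = df[cos_cols].idxmax(axis=1)  # name of the best column per row
best_val = df[cos_cols].max(axis=1)     # best score per row

# Rename articleScoresN to CosineSimN, then keep the label only where the
# best score reaches the 0.5 threshold; all other rows become False.
labels = best_col.str.replace("articleScores", "CosineSim", regex=False)
df["Most_similar_to"] = labels.astype(object).where(best_val >= 0.5, False)

print(df["Most_similar_to"])
```

Selecting by cos_cols rather than iloc keeps the code working if columns are added or reordered, at the cost of having to maintain the list of score column names.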