I'm quite new to HuggingFace pipelines, and I've run into something I can't figure out. I have searched for an answer but haven't found one yet, so any help would be great. I am trying to get just the score from the HF sentiment-analysis pipeline, not the label, because I want to add the scores to a dataframe that contains many cells of text. I know how to do this for a single sentence:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("This is a positive sentence")[0]
print(result['score'])
This gives me the following output:
0.9994597434997559
I know how to apply the classifier to my dataframe. However, when I adapt the code above to the dataframe, like so:
result = df['text'].apply(lambda x: classifier(x[:512]))[0]
df['sentiment'] = result['score']
My code fails on the second line, with the following error:
TypeError: list indices must be integers or slices, not str
Does anyone know how to fix this? I have tried a few things, but I haven't been able to figure it out so far. Any help would be immensely appreciated!
If your classifier output looks like this:
[{'label': '1', 'score': 0.9999555349349976}]
then you could extract the score with the following:
df['sentiment'] = df['text'].apply(lambda x: classifier(x[:512])).str[0].str['score']
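As for why the code in the question fails: `.apply(...)` returns a Series in which every cell holds a one-element list, so the trailing `[0]` picks out the whole list for the first row rather than the first element of each list. Indexing that list with `'score'` then raises the TypeError. A minimal sketch of this, assuming a default integer index and using `fake_classifier` as a hypothetical stand-in that only mimics the pipeline's output shape (the score is made up):

```python
import pandas as pd


def fake_classifier(text):
    # Hypothetical stand-in for the HF pipeline: same output shape, made-up score.
    return [{"label": "POSITIVE", "score": 0.9}]


df = pd.DataFrame({"text": ["good", "bad", "fine"]})

# Each cell of per_row is a one-element list containing a dict.
per_row = df["text"].apply(lambda x: fake_classifier(x[:512]))

# With the default integer index, [0] returns the list stored in the first row.
first_row = per_row[0]

try:
    first_row["score"]  # a list cannot be indexed with a string
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(type(first_row).__name__)
print(error_message)
```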
Alternatively:
Get the classifier output:
df['result'] = df['text'].apply(lambda x: classifier(x[:512]))
Extract the score from the output:
df['sentiment'] = df['result'].str[0].str['score']
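To try the two steps without downloading a model, here is a rough sketch that swaps the real pipeline for a hypothetical `fake_classifier` returning the same list-of-dict shape (its scores are invented and depend only on text length). It also shows an equivalent extraction that indexes inside the lambda instead of using `.str`; the column name `sentiment_alt` is just for illustration:

```python
import pandas as pd


def fake_classifier(text):
    # Hypothetical stand-in for the HF pipeline; the score is invented.
    return [{"label": "POSITIVE", "score": 1.0 / (1 + len(text))}]


df = pd.DataFrame({"text": ["good", "terrible", "ok"]})

# Step 1: store the raw output (a one-element list of dicts) per row.
df["result"] = df["text"].apply(lambda x: fake_classifier(x[:512]))

# Step 2: take the first list element, then the 'score' key, for every row.
df["sentiment"] = df["result"].str[0].str["score"]

# Same extraction with plain indexing inside the lambda.
df["sentiment_alt"] = df["text"].apply(
    lambda x: fake_classifier(x[:512])[0]["score"]
)

print(df[["text", "sentiment", "sentiment_alt"]])
```

With the real pipeline, replacing `fake_classifier` with `classifier` should behave the same way, as long as the output has the shape shown above.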