Tags: python, python-3.x, pandas, dataframe, pandas-apply

DataFrame apply/append a function that returns a dict to each row


I'm looking to apply get_sentiment to each row in a dataframe and have the keys of the returned dict appended to that row as new columns. Is there a good way of doing this?

def get_sentiment(txt: str) -> dict:
    response = client.detect_sentiment(Text=txt, LanguageCode='en')

    sentiment_data = dict()
    sentiment_data['Sentiment'] = response['Sentiment']
    sentiment_data['Sentiment_Score_Positive'] = response['SentimentScore']['Positive']
    sentiment_data['Sentiment_Score_Neutral'] = response['SentimentScore']['Neutral']
    sentiment_data['Sentiment_Score_Negative'] = response['SentimentScore']['Negative']
    return sentiment_data


def analyze_txt(df: DataFrame):
    df[] = df['Text'].apply(get_sentiment) #<- what I'm trying to do

Basically, I want the df to go from

id  Text
1   hello world
2   this is something here

to

id  Text                    Sentiment  Sentiment_Score_Positive  Sentiment_Score_Neutral  Sentiment_Score_Negative
1   hello world             Neutral    .5                        .5                       .5
2   this is something here  Neutral    .5                        .5                       .5

Solution

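  • Applying get_sentiment to the Text column returns a Series of dicts. One way to get the desired output is to convert that Series to a list of dicts, construct a DataFrame from it, and join the result to df. Because join aligns on the index, this version assumes df has the default integer index (0, 1, 2, ...):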

    new_df = df.join(pd.DataFrame(df['Text'].apply(get_sentiment).tolist()))
    

    If df has a specific index that needs to be retained, you could assign it when constructing the DataFrame to be joined, so that both sides of the join share the same index:

    s = df['Text'].apply(get_sentiment)
    new_df = df.join(pd.DataFrame(s.tolist(), index=s.index))
    

    A possibly faster method is to map get_sentiment directly over the Text column values (again assuming df has the default integer index, since the new DataFrame gets one):

    new_df = df.join(pd.DataFrame(map(get_sentiment, df['Text'].tolist())))
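To try the index-preserving variant without calling the real sentiment service, here is a minimal sketch. The get_sentiment below is a hypothetical stand-in that returns fixed placeholder scores (it is not the client.detect_sentiment call from the question), and the sample df uses id as a non-default index purely for illustration.

```python
import pandas as pd


def get_sentiment(txt: str) -> dict:
    # Hypothetical stub replacing the real client.detect_sentiment call;
    # it returns fixed placeholder values so the example runs offline.
    return {
        'Sentiment': 'Neutral',
        'Sentiment_Score_Positive': 0.5,
        'Sentiment_Score_Neutral': 0.5,
        'Sentiment_Score_Negative': 0.5,
    }


# Sample frame with a non-default index (id = 1, 2).
df = pd.DataFrame(
    {'Text': ['hello world', 'this is something here']},
    index=pd.Index([1, 2], name='id'),
)

# Series of dicts, one per row, carrying the same index as df.
s = df['Text'].apply(get_sentiment)

# Expand the dicts into columns and reuse s.index so the join lines up.
new_df = df.join(pd.DataFrame(s.tolist(), index=s.index))
print(new_df)

# Without index=s.index the right-hand frame is indexed 0, 1, so the
# join no longer lines up with df's index of 1, 2.
misaligned = df.join(pd.DataFrame(s.tolist()))
print(misaligned)
```

In this sketch new_df keeps the original id index and gains the four sentiment columns, while misaligned ends up with missing values for id 2 because no row with that index label exists on the right-hand side.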