Hi :) There are two columns: sentiment and comment. How do I filter only the duplicate comments in the dataset? Thank you for your help :)
It depends on which columns you want to use to identify duplicate records.
Example 1 - based on all columns in a data frame called df
duplicates = df[df.duplicated(keep=False)]  # keep=False marks every occurrence of a duplicated row, so all copies are returned
Example 2 - based on a certain column or columns
duplicate = dictionary_df[dictionary_df[0].duplicated(keep=False)]  # checks only the column labelled 0 (the first column when the frame has default integer column names)
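For the two-column dataset in the question, a minimal sketch might look like the following. It assumes the data frame is called df and the columns are literally named sentiment and comment; the sample rows are made up purely for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the real dataset
df = pd.DataFrame({
    "sentiment": ["pos", "neg", "pos", "neg"],
    "comment": ["great", "bad", "great", "awful"],
})

# Look for duplicates in the comment column only;
# keep=False flags every occurrence of a repeated comment
dup_comments = df[df.duplicated(subset=["comment"], keep=False)]
print(dup_comments)
```

If only the repeated copies are wanted (leaving out the first occurrence of each comment), dropping keep=False should do it, since the default is keep="first".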