Find repeated sentences within text

I would like to know how to detect a sentence that is repeated within the same string. I have a list of strings like this:

my_list=["do you want pizza for dinner? Do you want pizza for dinner?", "I like pizza", "I have no money I have no money"]

I would like to create a pandas dataframe that assigns 1 if a sentence is repeated within the same string, and 0 otherwise.

Something like this:

Text                                                              Repeated?
do you want pizza for dinner? Do you want pizza for dinner?            1
I like pizza                                                           0
I have no money I have no money                                        1

I was thinking of something like this:

from collections import Counter

for sentence in my_list:
    # Count how often each word occurs in this sentence
    word_counts = Counter(sentence.split())
    for word in sorted(word_counts):
        print(f'"{word}" is repeated {word_counts[word]} time(s).')

My idea was then to compare the total number of words with the number of unique words in each sentence. But I don't think that would be good code. Is there another way to get the expected result?

Solution

You can use a regular expression for the task (regex101):

import re
import pandas as pd

my_list=["do you want pizza for dinner? Do you want pizza for dinner?", "I like pizza", "I have no money I have no money"]
df = pd.DataFrame({'Text': my_list})

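# (.+) captures a chunk, \s* allows optional whitespace, \1 requires the same chunk again up to the end
r = re.compile(r'(.+)\s*\1$', flags=re.I)
df['Repeated'] = df['Text'].apply(lambda x: bool(r.match(x))).astype(int)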
print(df)

Prints:

                                                Text  Repeated
0  do you want pizza for dinner? Do you want pizz...         1
1                                       I like pizza         0
2                    I have no money I have no money         1
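
To see what the pattern does and does not catch, here is a minimal sketch that wraps the same regex in a helper and runs it on a few extra inputs. The helper name is_repeated, the strip() call and the last two sample strings are my own additions for illustration, not part of the original data.

```python
import re

# Same pattern as above: group 1 captures a leading chunk, \s* allows optional
# whitespace, and \1 requires that chunk again, right up to the end of the string.
pattern = re.compile(r'(.+)\s*\1$', flags=re.I)

def is_repeated(text):
    """Return 1 if the whole string is one chunk followed by itself, else 0."""
    # strip() is an added assumption: ignore stray leading/trailing whitespace
    return int(bool(pattern.match(text.strip())))

samples = [
    "do you want pizza for dinner? Do you want pizza for dinner?",
    "I like pizza",
    "I have no money I have no money",
    "I have no money, I have no money",  # hypothetical: the comma breaks the exact repeat
    "haha",                              # hypothetical: "ha" followed by "ha" also counts
]

for s in samples:
    print(is_repeated(s), s)
```

Because match() anchors at the start and $ anchors at the end, the whole string has to consist of one chunk written twice. Punctuation between the two copies is therefore not tolerated, and a short doubled word such as "haha" is flagged as well, so the pattern may need tightening depending on your data.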