This is the current format of my list:
["'There's no going back', 'pop'", "'Mark my words', 'pop'", "'This love will make you levitate', 'pop'", "'Like a bird, like a bird without a cage', 'pop'"]
I want to convert it to the following format:
[('There\'s no going back', 'pop'), ('Mark my words', 'pop'), ('This love will make you levitate', 'pop'), ('Like a bird, like a bird without a cage', 'pop')]
So I need to tokenize each input string into a tuple. But I'm not sure how this can be done, since the outer "" are there because each element is a single string.
If additional context is required: I'm scraping a large chunk of data in the above-mentioned format, and to process it with a Naive Bayes classifier I need it in the tuple format. I'm open to trying a different approach if it's more efficient.
Use replace and split:
lst = ["'There's no going back', 'pop'", "'Mark my words', 'pop'", "'This love will make you levitate', 'pop'", "'Like a bird, like a bird without a cage', 'pop'"]
print([tuple(x.replace('\'', '').split(',')) for x in lst])
Output:
[('Theres no going back', ' pop'), ('Mark my words', ' pop'), ('This love will make you levitate', ' pop'), ('Like a bird', ' like a bird without a cage', ' pop')]
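Note that the output above drops the apostrophe in "There's", leaves a leading space on ' pop', and breaks the last lyric into two pieces at its internal comma, so it does not quite match the format requested in the question. A minimal sketch of one alternative follows. It assumes every element has exactly the form 'text', 'label', with the two parts separated by the sequence `', '`, and that the label itself never contains that sequence. The helper name `parse_pair` is hypothetical, chosen here for illustration.

```python
lst = [
    "'There's no going back', 'pop'",
    "'Mark my words', 'pop'",
    "'This love will make you levitate', 'pop'",
    "'Like a bird, like a bird without a cage', 'pop'",
]


def parse_pair(item):
    """Split one "'text', 'label'" string into a (text, label) tuple.

    Assumption: the text and label are separated by the exact sequence
    "', '" and the label does not contain that sequence.
    """
    # Split only once, starting from the right, so commas and apostrophes
    # inside the text are left untouched.
    text, label = item.rsplit("', '", 1)
    # Drop the single outer quote that remains on each side.
    return text[1:], label[:-1]


pairs = [parse_pair(x) for x in lst]
print(pairs)
```

With the sample list, this should give tuples such as ("There's no going back", 'pop') and ('Like a bird, like a bird without a cage', 'pop'). Python prints the first string with double quotes because it contains an apostrophe; the value is the same as 'There\'s no going back'. If the scraped data can deviate from the assumed pattern, rsplit will raise a ValueError on unpacking or return the wrong pieces, so some validation would be needed.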