python string list text-processing naivebayes

Replace double quotations with brackets in a Python List of Strings


This is the current format of my list:

["'There's no going back', 'pop'", "'Mark my words', 'pop'", "'This love will make you levitate', 'pop'", "'Like a bird, like a bird without a cage', 'pop'"]

I want to convert it to the following format:

[('There\'s no going back', 'pop'), ('Mark my words', 'pop'), ('This love will make you levitate', 'pop'), ('Like a bird, like a bird without a cage', 'pop')]

So I need to convert each input string into a tuple, but I'm not sure how to do this, since each element is a single string wrapped in double quotes.

For additional context: I'm scraping a large amount of data in the format shown above, and to process it with a Naive Bayes classifier I need it as a list of tuples. I'm open to trying a different approach if it's more efficient.


Solution

  • Use replace and split:

    lst = ["'There's no going back', 'pop'", "'Mark my words', 'pop'", "'This love will make you levitate', 'pop'", "'Like a bird, like a bird without a cage', 'pop'"]
    
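    # Remove every single quote, then split each string on commas.
    print([tuple(x.replace("'", "").split(',')) for x in lst])

    Output:

    [('Theres no going back', ' pop'), ('Mark my words', ' pop'), ('This love will make you levitate', ' pop'), ('Like a bird', ' like a bird without a cage', ' pop')]

    Note that this does not quite match the requested format: the apostrophe in "There's" is removed, each label keeps a leading space (' pop'), and the last string becomes a 3-tuple because its text contains a comma.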
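
To get exactly the tuples the question asks for, one possible approach is to split each string once, from the right, on the `', '` separator between text and label. This is only a sketch. It assumes every element has the shape `'text', 'label'` and that the label never contains `', '`. The helper name `to_pair` is made up for illustration.

```python
def to_pair(item):
    """Turn a string like "'some text', 'label'" into a (text, label) tuple."""
    # Split only once, starting from the right, on the quote-comma-space-quote
    # separator. Commas and apostrophes inside the text are left untouched.
    # Raises ValueError if the separator is missing (assumed not to happen).
    text, label = item.rsplit("', '", 1)
    # Drop the opening quote of the text and the closing quote of the label.
    return text[1:], label[:-1]


lst = [
    "'There's no going back', 'pop'",
    "'Mark my words', 'pop'",
    "'This love will make you levitate', 'pop'",
    "'Like a bird, like a bird without a cage', 'pop'",
]

pairs = [to_pair(x) for x in lst]
print(pairs)
# [("There's no going back", 'pop'), ('Mark my words', 'pop'), ('This love will make you levitate', 'pop'), ('Like a bird, like a bird without a cage', 'pop')]
```

Python prints the first text with double quotes because it contains an apostrophe. The value itself is the same string as 'There\'s no going back' in the question.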