python, file, text, nlp

Text manipulation of a comma-separated string in Python


I want to read a text file, test.txt, whose contents are in the format

'Jon, Stacy, Simon, ..., Maverick'

I want to save the string to test2.txt as

'Jon AS t1_Jon, Stacy AS t1_Stacy, Simon AS t1_Simon, ..., Maverick AS t1_Maverick'

There may be a line break every now and then, which I want to ignore. How can I do this in an efficient and easy way?

PS: I couldn't come up with a more fitting title. What would you call it?


Solution

  • One nice approach is to use the re module: split the string on commas and line breaks, then join the renamed items back together.

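    import re
    
    s_in = 'apple, banana, orange,\n mango, guava'
    # Split on a comma or a line break, plus any whitespace that follows it.
    words = re.split(r'[,\n]\s*', s_in)
    # Skip empty strings, which appear if the text ends with a line break.
    s_out = ', '.join(f'{word} AS t1_{word}' for word in words if word)
    print(s_out)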
    

    Result:

    apple AS t1_apple, banana AS t1_banana, orange AS t1_orange, mango AS t1_mango, guava AS t1_guava
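
The snippet above works on a string literal, while the question asks about reading test.txt and writing test2.txt. A minimal sketch of that file round trip might look like the following. The function names `add_aliases` and `convert_file` and the `prefix` parameter are hypothetical, UTF-8 encoding is an assumption, and this variant swaps the regex for a plain `str.split(",")` followed by `strip()`, so line breaks are only ignored when they sit next to a comma or at the ends of the text.

```python
from pathlib import Path


def add_aliases(text: str, prefix: str = "t1_") -> str:
    """Turn 'a, b' into 'a AS t1_a, b AS t1_b' (hypothetical helper)."""
    # Split on commas; strip() drops spaces and line breaks around each name.
    names = [part.strip() for part in text.split(",")]
    # Empty entries (e.g. from a trailing comma) are skipped.
    return ", ".join(f"{name} AS {prefix}{name}" for name in names if name)


def convert_file(src: str = "test.txt", dst: str = "test2.txt") -> None:
    """Read src, rewrite its comma-separated names, and save them to dst."""
    # Assumption: both files are UTF-8 encoded.
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(add_aliases(text), encoding="utf-8")
```

Calling `convert_file()` with no arguments reads test.txt and writes test2.txt in the current working directory. If the names can be separated by a line break alone, with no comma, the `re.split` approach above is the better fit.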