python regex stripping

How to remove the text between \r characters, including the \r characters themselves, from a text file? Any quoted ' ' entry that contains \r should also be removed.


'Aadhirai' 'A special star' '6' 'Boy' '' "\rgoogletag.cmd.push(function() { googletag.display('div-gpt-ad-1445572280350-0'); });\r" 'Aadhiren' 'Dark' '6' 'Boy' '' 'Aadhish' 'King Commanded Counselled' '5' 'Boy' '' 'Aadhyatm' 'Dhyan' '1' 'Boy' '' 'Aadi' 'First Most important Beginning Ornament Adornment' '6' 'Boy' '' 'Aadia' 'Being a gift' '7' 'Boy' '' 'Aadidev' 'The first God' '1' 'Boy' '' 'Aadijay' 'The first victory' '6' 'Boy' '' 'Aadim' 'Entire universe' '1' 'Boy' '' 'Aadinath' 'The first Lord Lord Vishnu' '4' 'Boy' '' 'Aadipta' 'Bright' '7' 'Boy' '' 'Aadish' 'Full of wisdom Intelligent' '6' 'Boy' '' 'Aadishankar' 'Sri shankaracharya Founder of Adwaitha philosophy' '6' 'Boy' '' 'Aadit' 'Peak Lord of Sun' '8' 'Boy' '' 'Aaditey' 'Son of Aditi' '11' 'Boy' '' '\r        (adsbygoogle = window.adsbygoogle || ).push({});\r    '

Solution

  • You want to remove everything between one \r and the next \r, along with the \r characters themselves. A regular expression is a good fit for this.

    Code:

    import re
    check="""'Aadhirai' 'A special star' '6' 'Boy' '' "\rgoogletag.cmd.push(function() { googletag.display('div-gpt-ad-1445572280350-0'); });\r" 'Aadhiren' 'Dark' '6' 'Boy' '' 'Aadhish' 'King Commanded Counselled' '5' 'Boy' '' 'Aadhyatm' 'Dhyan' '1' 'Boy' '' 'Aadi' 'First Most important Beginning Ornament Adornment' '6' 'Boy' '' 'Aadia' 'Being a gift' '7' 'Boy' '' 'Aadidev' 'The first God' '1' 'Boy' '' 'Aadijay' 'The first victory' '6' 'Boy' '' 'Aadim' 'Entire universe' '1' 'Boy' '' 'Aadinath' 'The first Lord Lord Vishnu' '4' 'Boy' '' 'Aadipta' 'Bright' '7' 'Boy' '' 'Aadish' 'Full of wisdom Intelligent' '6' 'Boy' '' 'Aadishankar' 'Sri shankaracharya Founder of Adwaitha philosophy' '6' 'Boy' '' 'Aadit' 'Peak Lord of Sun' '8' 'Boy' '' 'Aaditey' 'Son of Aditi' '11' 'Boy' '' '\r        (adsbygoogle = window.adsbygoogle || ).push({});\r    '"""
    # Replace each \r...\r span (the \r characters included) with a single space
    print(re.sub(r"\r.*?\r", " ", check))
    

    Output:

    'Aadhirai' 'A special star' '6' 'Boy' '' " " 'Aadhiren' 'Dark' '6' 'Boy' '' 'Aadhish' 'King Commanded Counselled' '5' 'Boy' '' 'Aadhyatm' 'Dhyan' '1' 'Boy' '' 'Aadi' 'First Most important Beginning Ornament Adornment' '6' 'Boy' '' 'Aadia' 'Being a gift' '7' 'Boy' '' 'Aadidev' 'The first God' '1' 'Boy' '' 'Aadijay' 'The first victory' '6' 'Boy' '' 'Aadim' 'Entire universe' '1' 'Boy' '' 'Aadinath' 'The first Lord Lord Vishnu' '4' 'Boy' '' 'Aadipta' 'Bright' '7' 'Boy' '' 'Aadish' 'Full of wisdom Intelligent' '6' 'Boy' '' 'Aadishankar' 'Sri shankaracharya Founder of Adwaitha philosophy' '6' 'Boy' '' 'Aadit' 'Peak Lord of Sun' '8' 'Boy' '' 'Aaditey' 'Son of Aditi' '11' 'Boy' '' '     '
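The output above still has the emptied quotes left behind (" " and '     '). If the whole quoted entry should disappear, as the question seems to ask, one possible variant also matches the surrounding quotes. This is only a sketch: it assumes every such entry is wrapped in a matching pair of ' or ", it uses a shortened sample instead of the full data, and the names CR_ENTRY and drop_cr_entries are made up for illustration.

```python
import re

# An opening quote, the first \r, everything up to a later \r that is followed
# (after optional whitespace) by the same quote character, plus any whitespace
# after the entry.
CR_ENTRY = re.compile(r"""(['"])\s*\r.*?\r\s*\1\s*""", re.DOTALL)

def drop_cr_entries(text):
    """Remove whole quoted entries that wrap a \\r ... \\r span."""
    # strip() only tidies whitespace left at the very start or end of the text.
    return CR_ENTRY.sub("", text).strip()

# Shortened stand-in for the data in the question (assumption, not the full input).
sample = (
    "'Aadhirai' 'Boy' '' "
    "\"\rgoogletag.display('x');\r\" "
    "'Aadhiren' 'Dark' "
    "'\r   ads \r    '"
)
print(drop_cr_entries(sample))
```

The backreference \1 makes the closing quote match the opening one, so the single quotes inside the googletag call do not end the first entry early.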
    

    Notes:

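    • The re module provides Python's regular-expression matching.
    • \r.*?\r is the pattern being matched: it starts at a \r and matches everything up to the next \r. The ? makes .* lazy, so the match stops at the nearest \r rather than the last one.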
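To illustrate the notes above, here is a small sketch comparing the lazy pattern with a greedy one, followed by a way the same substitution might be applied to a file. Two things here are additions rather than part of the answer: the re.DOTALL flag (so a span may also contain line breaks) and the helper clean_file, which is a hypothetical name. The helper assumes the file is UTF-8 and really contains carriage-return characters, not a literal backslash followed by r.

```python
import re

# Lazy quantifier: each \r ... \r pair is matched separately.
LAZY = re.compile(r"\r.*?\r", re.DOTALL)
# Greedy quantifier, for comparison: one match from the first \r to the last.
GREEDY = re.compile(r"\r.*\r", re.DOTALL)

sample = "keep1 \rjunk1\r keep2 \rjunk2\r keep3"
lazy_result = LAZY.sub(" ", sample)      # keep2 survives
greedy_result = GREEDY.sub(" ", sample)  # keep2 is swallowed

def clean_file(path):
    """Hypothetical helper: read a file and strip its \\r ... \\r spans."""
    # newline="" returns \r untranslated; the default text mode would
    # turn a lone \r into \n before the regex ever sees it.
    with open(path, newline="", encoding="utf-8") as handle:
        return LAZY.sub(" ", handle.read())
```

Opening the file with newline="" matters because Python's default universal-newline handling would otherwise translate the \r characters away, and the pattern would find nothing to remove.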