python, python-3.x, text-mining, spacy, removing-whitespace

How to remove white spaces within a word using python?


Given the input John plays chess and l u d o, I want the output in the following format:

John plays chess and ludo.

I tried a regular expression to remove the spaces, but it doesn't work for me.

import re
sentence='John plays chess and l u d o'
sentence = re.sub(r"\s+", "", sentence, flags=re.UNICODE)

print(sentence)

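I expected the output John plays chess and ludo,
but the output I got is Johnplayschessandludo, because the pattern \s+ matches every run of whitespace in the sentence, not just the spaces inside the last word.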


Solution

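  • This should work for the given input. In essence, the solution extracts the single characters from the sentence, joins them into one word, and appends that word to the remaining words. Note that the joined word is always placed at the end, so this fits sentences where the spaced-out word comes last.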

    s = 'John plays chess and l u d o'
    
    chars = []
    idx = 0
    
    #Get the word which is divided into single characters
    while idx < len(s)-1:
    
        #This will get the single characters around single spaces
        if s[idx-1] == ' ' and s[idx].isalpha() and s[idx+1] == ' ':
            chars.append(s[idx])
    
        idx+=1
    
    #This gets the single character if it is present as the last item
    if s[len(s)-2] == ' ' and s[len(s)-1].isalpha():
        chars.append(s[len(s)-1])
    
    #Create the word out of single character
    join_word = ''.join(chars)
    
    #Get the other words
    old_words = [item for item in s.split() if len(item) > 1]
    
    #Form the final string
    res = ' '.join(old_words + [join_word])
    
    print(res)
    

    The output is then:

    John plays chess and ludo
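
As an alternative to the loop above, the same idea can probably be expressed with a single regular expression that joins the letters in place, so the word keeps its position in the sentence. This is only a minimal sketch, not the answer's method: the helper name join_spaced_letters and the pattern are assumptions made for illustration, and the pattern only covers ASCII letters separated by exactly one space. It would also merge genuine one-letter words that happen to sit next to each other (for example "a b" in "plan a b").

```python
import re

# Hypothetical pattern: a run of two or more single ASCII letters,
# each separated by exactly one space, bounded by word boundaries.
SPACED_WORD = re.compile(r"\b(?:[A-Za-z] )+[A-Za-z]\b")


def join_spaced_letters(text):
    """Remove the spaces inside each run of single, space-separated letters."""
    return SPACED_WORD.sub(lambda m: m.group(0).replace(" ", ""), text)


print(join_spaced_letters("John plays chess and l u d o"))
```

Because the substitution happens where the match is found, a spaced-out word at the start or in the middle of the sentence is handled too, and ordinary one-letter words such as "I" or "a" are left alone when they stand next to longer words.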