python regex list sys pathlib

Counting a word such as colour's as 2 words


I would like my code to treat [colour's] as 2 words, [colour] and [s], and include both in the word count in Python. I tried it this way, but it causes many errors:

import sys
from pathlib import Path
import re

text_file = Path(sys.argv[1])

if text_file.exists() and text_file.is_file():
    read = text_file.read_text()
    length = len(read.split())
    addi = len(re.search(r'*.[["a-zA-Z"]]', text_file))
    length += addi
    print(f'{text_file} has', length, 'words')
else:
    print(f'File not found: {text_file}')

Solution

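  • Perhaps you could use re.findall instead of .split() for this. .split() separates only on whitespace, so [Color's] stays one word. re.findall(r'[a-zA-Z]+', ...) returns every run of letters, so the apostrophe splits [Color's] into [Color] and [s]. The word count is then simply the length of the returned list, with no need to search for individual words and add them on. For example: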

    import re
    
    read = "today is Color's birthday"
    print(read.split())
    print(len(read.split()))
    
    read2 = re.findall(r'[a-zA-Z]+', read)
    print(read2)
    print(len(read2))
    

    Output:

    ['today', 'is', "Color's", 'birthday']
    4
    ['today', 'is', 'Color', 's', 'birthday']
    5
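
    Applied to the script in the question, the same idea might look like the sketch below. The original line fails for several separate reasons: the pattern starts with `*`, which has nothing to repeat; `re.search` returns a single match object (or None), which has no length; and it was given the Path object rather than the text that was read. The helper names `count_words` and `count_words_in_file` are hypothetical, chosen only for this illustration.

```python
import re
from pathlib import Path


def count_words(text):
    """Count runs of ASCII letters, so "colour's" counts as 2 words."""
    return len(re.findall(r"[a-zA-Z]+", text))


def count_words_in_file(path):
    """Return the word count of a file, or None if it is not a regular file."""
    text_file = Path(path)
    if not text_file.is_file():
        return None
    # Search the file's text, not the Path object itself.
    return count_words(text_file.read_text())


print(count_words("today is Color's birthday"))  # 5
```

    To use it the way the question does, pass `sys.argv[1]` to `count_words_in_file` and print the result, or a "File not found" message when it returns None. Note that this pattern matches ASCII letters only, so digits and accented letters are not counted as words; whether that is acceptable depends on the text being counted.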