I use Python 2.7 and I want to find the frequencies of the words in a text file. I wrote the following code using this regular expression, but there is no output:
import nltk
import os
import re
import string
path="C:\Python27\Lib"
os.chdir(path)
frequency = {}
document_text = open('1.txt', 'r')
text_string = document_text.read().lower()
match_pattern = re.findall(r'^[\u0621-\u064A\u0660-\u0669 ]+$',
text_string)
for word in match_pattern:
    count = frequency.get(word,0)
    frequency[word] = count + 1
frequency_list = frequency.keys()
for words in frequency_list:
    print words, frequency[words]
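This is because your pattern does not match all the characters in the text. The anchors ^ and $ force the whole string to consist only of the listed characters, so a file that contains line breaks or any other character yields no match at all. If you remove the anchors, you will get a match. See demo.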
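To illustrate the point about anchors, here is a minimal sketch rather than a drop-in fix. It makes a few assumptions that are not in the question: the sample text is made up, the helper name word_frequencies is hypothetical, and the space is dropped from the character class so that each match is a single word. It also uses unicode literals, because as far as I know Python 2's re does not interpret \uXXXX inside a byte-string pattern. The code is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function, unicode_literals

import re
from collections import Counter

# Arabic letters and Arabic-Indic digits, same ranges as in the question.
# Assumption: the space is left out of the class so each match is one word.
ARABIC_WORD = re.compile('[\u0621-\u064A\u0660-\u0669]+')


def word_frequencies(text):
    """Count Arabic words in a unicode string (hypothetical helper)."""
    return Counter(ARABIC_WORD.findall(text))


# Made-up sample: two lines, with the first word repeated on the second line.
sample = ('\u0633\u0644\u0627\u0645 \u0639\u0644\u064a\u0643\u0645\n'
          '\u0633\u0644\u0627\u0645')

# The anchored pattern from the question must match the whole string at once,
# so the newline in the middle makes it fail and findall returns nothing.
anchored = re.findall('^[\u0621-\u064A\u0660-\u0669 ]+$', sample)

# Without anchors, every run of Arabic characters is found and counted.
frequencies = word_frequencies(sample)
for word, count in frequencies.most_common():
    print(word, count)
```

When reading from a file instead of a literal, the bytes would have to be decoded first, for example with io.open('1.txt', encoding='utf-8'), assuming the file is saved as UTF-8.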