I'm trying to iterate through a text file and a list to find the frequency of each character. I understand that I could use count() for this, but count() counts everything, including spaces, periods, and so on, and it does not show the frequencies in alphabetical order. I found an approach that partly works, which I explain below. I also get a KeyError when I try to store the frequencies in a dictionary, which I explain below as well.
I don't want to post my whole project, so here is some context first. I have a separate list called alphabet_list that contains the letters of the alphabet. The text file has already been read and converted to uppercase; the result is stored in new_text.
Character frequency Code:
for i in range(len(alphabet_list)):
    for c in new_text:
        if c == alphabet_list[i]:
            count += 1
        else:
            count = 0
        print(alphabet_list[i] + " " + str(count))
    i += 1
Output
A 0
A 0
.
.
.
A 1
A 0
.
.
.
B 0
.
.
.
B 1
B 2
B 0
.
.
.
Z 0
P.S. The str(count) is only there temporarily so I can see what the printed output looks like; I actually need to store the result in a dictionary.
That is my output. As I said, it partly works: it iterates, but it prints a result for every character it checks instead of going through the whole text and printing only the final count. The count only increases when the same letter appears again immediately afterwards. For example, with (... bb ...) it prints B 1, B 2, as shown in my output. Also, for some reason return doesn't work here: it returns nothing and the program just ends.
Second Code with KeyError:
for i in range(len(alphabet_list)):
    for c in new_text:
        if c == alphabet_list[i]:
            count += 1
        else:
            count = 0
        c_freq[alphabet_list[i]] == count
        print(c_freq)
    i += 1
This one is simpler: I get KeyError: 'A'. I tried running just the following:
i = 3 #just random number to test
count = 50
c_freq[alphabet_list[i]] == count
print(c_freq)
and it works, so I think this problem may be related to the one above. Any help would be great. Thanks!
Sorry for the long question, but I really need help.
This should help you:
lst = ['A', 'Z', 'H', 'A', 'B', 'N', 'H', 'Y', '.', ',', 'Z']  # Initial list. Note: it also includes characters such as commas and full stops.
alpha_dict = {}
for ch in lst:
    if ch.isalpha():  # Checks whether the character is a letter
        if ch in alpha_dict:
            alpha_dict[ch] += 1  # If the key already exists, increment its value by 1
        else:
            alpha_dict[ch] = 1  # If the key does not exist, create it with value 1
print(alpha_dict)
Output:
{'A': 2, 'Z': 2, 'H': 2, 'B': 1, 'N': 1, 'Y': 1}
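As an alternative to the manual if/else bookkeeping above, the standard library's collections.Counter can do the same tallying. This is only a minimal sketch: sample_text is a made-up string standing in for your new_text, which was not shown.

```python
from collections import Counter

# Hypothetical sample text standing in for the question's new_text.
sample_text = "HELLO, WORLD. ABBA!"

# Count only alphabetic characters; spaces and punctuation are skipped.
letter_counts = Counter(ch for ch in sample_text if ch.isalpha())

print(letter_counts["L"])
print(letter_counts["Q"])  # a letter that never occurs gives 0, not a KeyError
```

A Counter behaves like a dictionary, so the sorting step shown next applies to it unchanged.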
Since you want the output to be sorted in alphabetical order, add these lines to your code:
key_list = list(alpha_dict.keys())  # Creates a list of all the keys in the dict
key_list.sort()  # Sorts the list in alphabetical order
final_dict = {}
for key in key_list:
    final_dict[key] = alpha_dict[key]
print(final_dict)
Output:
{'A': 2, 'B': 1, 'H': 2, 'N': 1, 'Y': 1, 'Z': 2}
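The sorting step can also be written more compactly. The sketch below assumes Python 3.7 or later, where dictionaries keep their insertion order; it starts from the same alpha_dict contents as the output shown earlier.

```python
# Same counts as the unsorted output shown earlier.
alpha_dict = {'A': 2, 'Z': 2, 'H': 2, 'B': 1, 'N': 1, 'Y': 1}

# sorted() orders the (key, value) pairs by key; dict() rebuilds a dictionary
# in that order.
final_dict = dict(sorted(alpha_dict.items()))

print(final_dict)
```

This gives the same result as the key_list loop; which one to use is mostly a matter of taste.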
Thus, here is the final code:
lst = ['A', 'Z', 'H', 'A', 'B', 'N', 'H', 'Y', '.', ',', 'Z']
alpha_dict = {}
for ch in lst:
    if ch.isalpha():
        if ch in alpha_dict:
            alpha_dict[ch] += 1
        else:
            alpha_dict[ch] = 1
key_list = list(alpha_dict.keys())
key_list.sort()
final_dict = {}
for key in key_list:
    final_dict[key] = alpha_dict[key]
print(final_dict)
Output:
{'A': 2, 'B': 1, 'H': 2, 'N': 1, 'Y': 1, 'Z': 2}
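If you would rather keep your original nested loops, here is a rough sketch of how they might be adjusted. The contents of alphabet_list and new_text below are assumptions, since you did not post them. Two things change: count is reset once per letter rather than on every non-matching character, and the result is stored with a single = (assignment). With ==, Python only compares, so it has to look up c_freq['A'] first, and that lookup raises the KeyError when the key does not exist yet.

```python
import string

# Stand-ins for the question's variables (the real contents were not shown).
alphabet_list = list(string.ascii_uppercase)
new_text = "ABBA. HAZY BAN!"

c_freq = {}
for letter in alphabet_list:
    count = 0  # reset once per letter, not on every mismatch
    for c in new_text:
        if c == letter:
            count += 1
    # Store the total only after the whole text has been scanned.
    # "=" assigns; "==" would just compare and fail on a missing key.
    c_freq[letter] = count

print(c_freq["A"], c_freq["B"], c_freq["Q"])
```

Because the outer loop walks alphabet_list in order, c_freq already comes out alphabetical, and letters that never occur are stored with a count of 0.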