The code is supposed to create a dictionary that counts the number of occurrences of each word in a body of text. This is done using a while loop, a try block that catches the EOFError exception, and Ctrl + D to signal that all of the text has been entered.
from collections import Counter
print("Enter/Paste your multiline / paragraph of text. Hit Ctrl + D to run it ")
print("DO NOT PRESS ENTER OR CTRL+V AFTER THE FIRST TIME YOU DROP IN AND RUN YOUR TEXT ")
# DO NOT PUT IN TEXT IN input() TEXT WILL JUST APPEAR OVER AND OVER AGAIN AND LOOK ANNOYING
work_on_this_string_built_from_loop = ""  # this initializes out of the loop and adds
                                          # text from input as a single lined string
while True:  # the loop will always = True so this creates an infinite loop
    print("DO NOT PRESS ENTER OR CTRL+V AFTER YOU HAVE INITIALLY RAN YOUR TEXT ")
    print("INSTEAD HIT CTRL + D TO RUN IT AFTER TEXT IS INPUTTED")
    print()
    print("THE OUTPUT ABOVE IS NOT THE FINAL RESULT IT IS NOT DONE YET")
    print("IT IS ONLY OUTPUT FROM THE FIRST PARAGRAPH OR LINE BREAK ")
    try:
        multi_lined_input = input("\n")  # blank screen of input will appear after text is pasted in
        work_on_this_string_built_from_loop += multi_lined_input
        print(f"line is equal to {work_on_this_string_built_from_loop} ")  # INPUTS FILE/USER INPUT AS A SINGLE LINED STRING
        print()
        print()
        print(work_on_this_string_built_from_loop.split('\n'))
        frequency_of_words_dict = dict(Counter(work_on_this_string_built_from_loop.split()))
        print()
    except EOFError as error:  # this allows for quitting the program using CTRL + D
        # once the error is excepted, the program ends and this
        # causes the break statement to execute and the loop ends
        print("ITERATING THROUGH EXCEPTION")
        print(frequency_of_words_dict)
        for i in frequency_of_words_dict.items():
            print(i)
        break
print("PROGRAM COMPLETE")
print(f"the length of words in entry is : {len(work_on_this_string_built_from_loop.split())}")
But when I copy and paste text into the program, it joins the last word of a line with the first word of the line after it.
Example input:
You may charge a reasonable fee for copies of or *providing
access* to or distributing Project Gutenberg-tm electronic works
provided that
Example Output:
{'You': 1, 'may': 1, 'charge': 1, 'a': 1, 'reasonable': 1, 'fee': 1,
'for': 1, 'copies': 1, 'of': 1, 'or': 2,
'*providingaccess*': 1, 'to': 1,
'distributing': 1, 'Project': 1,
'Gutenberg-tm': 1, 'electronic': 1, '*worksprovided*': 1, 'that': 1}
('You', 1)
('may', 1)
('charge', 1)
('a', 1)
('reasonable', 1)
('fee', 1)
('for', 1)
('copies', 1)
('of', 1)
('or', 2)
('*providingaccess*', 1)
('to', 1)
('distributing', 1)
('Project', 1)
('Gutenberg-tm', 1)
('electronic', 1)
('*worksprovided*', 1)
('that', 1)
How can I split up the last word of the previous line and the first word of the next line?
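When you process the input, you're removing the line feeds: input() returns each line without its trailing newline, so concatenating the lines directly fuses the last word of one line with the first word of the next. As you append each line to work_on_this_string_built_from_loop, append a space. If you're worried about a superfluous space, then check for that condition.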
# add a separating space unless the line already ends with one
work_on_this_string_built_from_loop += \
    multi_lined_input + ('' if multi_lined_input.endswith(' ') else ' ')
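As an alternative to patching the concatenation, here is a minimal sketch that collects the lines in a list and joins them with a space before counting. The function names count_words and read_lines_until_eof are made up for this illustration and are not part of the original program. Because str.split() with no arguments treats any run of whitespace as a single separator, an extra space from the join should not create empty words.

```python
from collections import Counter


def count_words(lines):
    """Count words across lines, treating each line break as a word separator."""
    # Joining with a space restores the separator that input() stripped off.
    text = " ".join(lines)
    # split() with no arguments collapses runs of whitespace,
    # so doubled spaces do not produce empty strings.
    return Counter(text.split())


def read_lines_until_eof():
    """Collect lines from standard input until Ctrl + D (EOF) is received."""
    lines = []
    while True:
        try:
            lines.append(input())
        except EOFError:
            break
    return lines


# Demonstration with the sample text from the question (emphasis markers removed).
sample = [
    "You may charge a reasonable fee for copies of or providing",
    "access to or distributing Project Gutenberg-tm electronic works",
    "provided that",
]
counts = count_words(sample)
print(counts["providing"], counts["access"], counts["or"])
```

To use it interactively, call count_words(read_lines_until_eof()) and press Ctrl + D once the text has been pasted. Initializing the list before the loop also avoids a NameError if EOF arrives before any line has been read.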