I have a .rtf file containing a unique string on each line, 1000 lines or so. I want to remove the last character from each of those lines and write the result to a new .csv file so that I can later load it into a database.
I tried to do it in Python by opening both files and looping through each line, but the output file stayed empty and I got no errors at all. How can I implement another solution in Python? Thanks in advance!
input_file = "usernames.rtf"
output_file = "outputfile.csv"
character_to_remove = "."
with open(input_file, "r") as input_f, open(output_file, "w") as output_f:
    for line in input_f:
        if line[-1] == character_to_remove:
            modified_line = line.rstrip(line[-1])
            output_f.write(modified_line + "\n")
        else:
            output_f.write(line + "\n")
input_f.close()
output_f.close()
print("Character removal completed. Check the output file:", output_file)
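Each line read from the file ends with '\n', so line[-1] is the line break rather than the "." and the if branch never runs. So there are two things to change:
replace character_to_remove = "." with character_to_remove = ".\n" and compare it using line.endswith(character_to_remove)
replace line.rstrip(line[-1]) with line[:-len(character_to_remove)]
The slice removes exactly the suffix, whereas rstrip removes every trailing occurrence of the characters it is given. Also drop the + "\n" in the else branch: those lines still carry their own line break, so adding another one produces blank lines.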
This is only a fix for YOUR code. A more robust approach is to strip the line break from each line first, for example with line.rstrip("\n"); note that .readlines() on its own keeps the trailing '\n' on every line. Also, use line.endswith(<string>) for your if statement. It returns True if the string line ends with the given string. The equivalent for checking the beginning of a string is line.startswith(<string>).
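The points above can be checked in isolation. The following is a minimal sketch, not the asker's original code: remove_suffix_once is a hypothetical helper name, and the sample strings are made up for illustration.

```python
def remove_suffix_once(line: str, suffix: str) -> str:
    """Drop the line break, then remove `suffix` once if the text ends with it."""
    text = line.rstrip("\r\n")  # strip the line break that file iteration keeps
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]  # slice off exactly one suffix
    return text


raw = "alice.\n"  # what iterating over a text file yields for a line "alice."
print(raw[-1] == ".")                     # False: the last character is "\n"
print(remove_suffix_once(raw, "."))       # alice
print("bob...".rstrip("."))               # bob   (rstrip removes all trailing dots)
print(remove_suffix_once("bob...", "."))  # bob.. (the slice removes only one)
```

On Python 3.9 and later, str.removesuffix does the same job as the slice.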
Edit: This should fit your problem. As mentioned in some comments, you have to use the right format for a csv file. The csv module, which is part of the Python standard library, should be enough. Pandas also provides a good csv writing function.
import csv  # to create a csv file

input_file = "usernames.rtf"
output_file_txt = "outputfile.txt"
output_file_csv = "outputfile.csv"
character_to_remove = "."
# length of the suffix, in case you want to remove multiple characters
char_len = len(character_to_remove)

# newline="" lets the csv module control the line endings itself
with open(input_file, "r") as input_f, \
        open(output_file_txt, "w") as output_f_txt, \
        open(output_file_csv, "w", newline="") as output_f_csv:
    csvwriter = csv.writer(output_f_csv)
    for line in input_f.readlines():
        line = line.strip()  # remove surrounding whitespace and line breaks
        if line.endswith(character_to_remove):
            line = line[:-char_len]  # remove the suffix exactly once
        output_f_txt.write(line + "\n")
        # writerow expects a sequence of fields; a bare string would be split into characters
        csvwriter.writerow([line])  # for further options read the csv manual

print("Character removal completed. Check the output files:", output_file_txt, output_file_csv)
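One detail worth checking when writing a single value per row: csv.writer.writerow expects a sequence of fields, and a plain string counts as a sequence of its characters. The sketch below shows the difference using an in-memory buffer; rows_to_csv and string_as_row are hypothetical helper names used only for this illustration.

```python
import csv
import io


def rows_to_csv(values):
    """Write each value as its own single-field row and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for value in values:
        writer.writerow([value])  # wrap the string in a list: one field per row
    return buffer.getvalue()


def string_as_row(value):
    """Pass the bare string to writerow to show how it is interpreted."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(value)  # each character becomes a field
    return buffer.getvalue()


print(repr(rows_to_csv(["alice", "bob"])))  # 'alice\r\nbob\r\n'
print(repr(string_as_row("bob")))           # 'b,o,b\r\n'
```

The '\r\n' line endings come from the csv module's default dialect, which is why the csv documentation recommends opening real output files with newline="".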