I have a text file. Let's say it contains these 10 lines:
Bye
Hi
2
3
4
5
Hi
Bye
7
Hi
Every time a line says "Hi" or "Bye", I want that line removed, except for the first time it appears.
My current code is below (filename does point to an actual file; I just didn't include that part here):
text_file = open(filename)
for i, line in enumerate(text_file):
    if i == 0:
        var_Line1 = line
    if i == 1:
        var_Line2 = line
    if i > 1:
        if line == var_Line2:
            del line
text_file.close()
It does detect the duplicates, but it takes a very long time given how many lines there are, and I'm not sure how to delete them and save the result.
You could use dict.fromkeys to remove duplicates and preserve order efficiently:
with open(filename, "r") as f:
    lines = dict.fromkeys(f.readlines())
with open(filename, "w") as f:
    f.writelines(lines)
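Two caveats may be worth handling, depending on your file. First, dict.fromkeys compares whole lines including the newline, so a final "Hi" with no trailing newline would not match an earlier "Hi\n" and would be kept. Second, readlines() loads the whole file into memory. The following is only a sketch of one way to address both, not a drop-in replacement: the names unique_lines and dedupe_file are made up for this example, and it assumes it is acceptable to write a temporary file next to the original.

```python
import os
import tempfile


def unique_lines(lines):
    """Yield each line the first time it appears, ignoring the trailing newline."""
    seen = set()
    for line in lines:
        key = line.rstrip("\n")  # so "Hi" and "Hi\n" count as the same line
        if key not in seen:
            seen.add(key)
            yield key + "\n"


def dedupe_file(path):
    """Rewrite the file at `path`, keeping only first occurrences (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Stream into a temporary file in the same directory, then swap it in,
    # so the original is not truncated if something fails midway.
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        dst.writelines(unique_lines(src))
    os.replace(dst.name, path)
```

Calling dedupe_file(filename) on the sample file from the question should leave Bye, Hi, 2, 3, 4, 5, 7, one per line. Like dict.fromkeys, this removes every repeated line, not only repeats of "Hi" and "Bye".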