I'm creating a CSV file by reading multiple text files I have created, as below:
Col1, Col2, Col3, Col4
name1, copy, create, copy
cut paste
name2, data, null , data
cut cut
I want to remove duplicates from Col4 by comparing against Col2 before writing to the CSV. In the example above, Col4 in row 1 should only be "paste"; likewise, Col4 in row 2 should be empty.
Desired output would be like:
Col1, Col2, Col3, Col4
name1, copy, create, paste
cut
name2, data, null ,
cut
I have something like below:
import os

stat2 = 'Col1,Col2,Col3,Col4\n'
text_files = os.listdir('./data/')
for pack in text_files:
    file = open("./data/" + pack, "r")
    perp = file.read()
    stat2 += pack + ',"'
    # package and data are different lists I build by matching against all of the files
    for word in package:
        stat2 += word + "\n"
    stat2 += '","'
    for word in data:
        stat2 += word + "\n"
    stat2 += '","'
    # file is already consumed by read() above, so iterate over the text kept in perp
    for word in perp.splitlines():
        stat2 += word + "\n"
    stat2 += '"' + "\n"
    file.close()

f = open("data/csv_file.csv", "w")
f.write(stat2)
f.close()
I want to remove the duplicates before writing to the CSV. Can anyone please suggest an update to this? Thanks.
The question is not very clear. However, what you can generally do is compare the elements of one list against another and remove the duplicates from the target list. Suppose, in this instance, col2 is the target list:
col1 = ['copy','create','cut']
col2 = ['copy','create','cut','delete']
You can use a list comprehension to build a new list that keeps only the values not already present in col1:
col2 = [i for i in col2 if i not in col1]
If you then print the result, you'll get this for col2:
['delete']
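Applied to your loop, a minimal sketch might look like the one below. I'm assuming package and data are the per-file word lists from your snippet (filled with placeholders here), and that the Col4 words come one per line from each text file (adjust the split if not); the csv module takes care of quoting the multi-line cells:

import csv
import os

# placeholder word lists; in your code these come from matching against the files
package = ["copy", "cut"]
data = ["create"]

# write outside ./data/ so the output isn't picked up as an input on reruns
with open("csv_file.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["Col1", "Col2", "Col3", "Col4"])
    for pack in os.listdir("./data/"):
        with open("./data/" + pack, "r") as f:
            col4_words = f.read().splitlines()
        col2_set = set(package)  # set membership tests are O(1) per word
        # keep only the Col4 words that do not already appear in Col2
        col4_words = [w for w in col4_words if w not in col2_set]
        writer.writerow([pack,
                         "\n".join(package),
                         "\n".join(data),
                         "\n".join(col4_words)])

Converting the Col2 words to a set before the comparison keeps the lookup cheap once the lists grow, and csv.writer quotes any cell that spans multiple lines, so you don't have to manage the '"' characters by hand.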