I am trying to split a large CSV file into several smaller files with Python. As a first try, I read the first 261579 lines from the CSV dataset file using this part of the code:
    for c in range(261579):
        line = datasetFile.readline()
        if len(line) == 0:
            print("empty line detected at : " ,c)
        lines.append(line)
    print("SAVING LINES ......")
    split = open(outputDirectoryName+"spilt" + str(x+1) +".csv","w")
    split.writelines(lines)
    print("SPLIT " + str(x+1) + " END with " ,str(len(lines)) , "lines .")
So far the code seems to work, and it prints:
"SPLIT 1 END with 261579 lines."
But the problem is that when I open my file "Split1.csv" with Notepad++, I find only 261575 lines instead of 261579, so 4 lines are lost somewhere in the file.
Given this, I want to know what exactly happens in the split.writelines(lines) call when I use it to save my data to a split file.
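I had the same issue, and then I found out that I should have closed my file. Python buffers writes, so the last chunk of data can stay in memory and not reach the disk until the file is flushed or closed. In your case, add this after writelines:
    split.close()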
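
If it helps, here is a minimal sketch of the same fix using a with block, which closes (and therefore flushes) the file automatically, even if an exception is raised in between. The helper names write_split and count_lines, the temporary output path, and the sample rows are made up for illustration; they are not from the question's code.

```python
import os
import tempfile


def write_split(path, lines):
    """Write all lines to path; the with block closes the file on exit."""
    with open(path, "w") as split:
        split.writelines(lines)
    # At this point the file is closed, so buffered data has been written out.


def count_lines(path):
    """Count the lines actually present in the file on disk."""
    with open(path) as f:
        return sum(1 for _ in f)


# Hypothetical data standing in for the lines read from the real dataset.
lines = [f"{i},value{i}\n" for i in range(261579)]

out_path = os.path.join(tempfile.mkdtemp(), "split1.csv")
write_split(out_path, lines)

written = count_lines(out_path)
print("lines in memory:", len(lines), "lines on disk:", written)
```

Checking the count by reading the file back, rather than printing len(lines), is what would reveal a missing close, since len(lines) only describes the list in memory.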