I have a Python script that parses a big CSV file. There are around 1 million rows, so processing takes some time.
import csv
import sys

def ParserFunction(row):
    # Some logic with row
    pass

with open('csvfeed.csv', newline='', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile, delimiter=';', quotechar='|')
    for row in reader:
        ParserFunction(row)
Is there a way to multi-thread this loop to lower the execution time?
Thanks
You can process each row in its own thread, instead of having the main thread wait for the previous row to finish before moving on to the next one:
import csv
import sys
import threading

def ParserFunction(row):
    # Some logic with row
    pass

with open('csvfeed.csv', newline='', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile, delimiter=';', quotechar='|')
    for row in reader:
        # args must be a tuple, hence the trailing comma
        threading.Thread(target=ParserFunction, args=(row,)).start()
But the best approach depends on exactly what logic you run on each row, and on whether rows depend on one another.
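One caveat worth noting: if the per-row logic is CPU-bound, Python's GIL means threads won't run it in parallel, so a process pool is usually the faster option. Below is a minimal sketch assuming the rows are independent and `ParserFunction` is a top-level (picklable) function; the field-counting body is just a placeholder for your real logic:

```python
import csv
from concurrent.futures import ProcessPoolExecutor

def ParserFunction(row):
    # Placeholder for the real per-row logic: count the fields
    return len(row)

def parse_file(path):
    with open(path, newline='', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile, delimiter=';', quotechar='|')
        with ProcessPoolExecutor() as pool:
            # chunksize batches many rows per worker task, which cuts
            # inter-process overhead on a million-row file
            return list(pool.map(ParserFunction, reader, chunksize=1000))
```

`pool.map` preserves input order, so the results line up with the rows in the file; if your logic writes to shared state instead of returning values, you would need to aggregate in the main process rather than in the workers.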