I have a csv file named input.csv
with the following columns:
row_num | start_date_time | id | json_message |
---|---|---|---|
120 | 2024-02-02 00:01:00.001+00 | 1020240202450 | {'amount': 10000, 'currency': 'NZD','seqnbr': 161 } |
121 | 2024-02-02 00:02:00.001+00 | 1020240202451 | {'amount': 20000, 'currency': 'AUD','seqnbr': 162 } |
122 | 2024-02-02 00:03:00.001+00 | 1020240202452 | {'amount': 30000, 'currency': 'USD','seqnbr': None } |
123 | 2024-02-02 00:04:00.001+00 | 1020240202455 | {'amount': 40000, 'currency': 'INR','seqnbr': 163 } |
I'm using python3 to read this csv file and i need to replace seqnbr
field under json_message
column with a random integer digit for each row. If the seqnbr contains None then that row should not be replaced with random seqnbr rather it should be retained as is. Delimiter for my csv file is pipe (|) symbol. I'm using the below python code to replace the file with the random generated integer value but still it doesn't overwrite it and here is my code:
def update_seqnbr(cls):
filename = 'input.csv'
seqnbr_pattern = "'seqnbr': ([\s\d]+)"
with open(filename, 'r') as csvfile:
datareader = (csv.reader(csvfile, delimiter="|"))
next(datareader, None) # skip the headers
for row in datareader:
json_message = row[3]
match = re.findall(seqnbr_pattern, json_message)
if len(match) != 0:
replaced_json_message = json_message.replace(match[0], str(random.randint(500, 999)))
row[3] = replaced_json_message
x = open(filename, "a")
x.writelines(row)
x.close()
Below is how my file should look like:
row_num | start_date_time | id | json_message |
---|---|---|---|
120 | 2024-02-02 00:01:00.001+00 | 1020240202450 | {'amount': 10000, 'currency': 'NZD','seqnbr': 555 } |
121 | 2024-02-02 00:02:00.001+00 | 1020240202451 | {'amount': 20000, 'currency': 'AUD','seqnbr': 897 } |
122 | 2024-02-02 00:03:00.001+00 | 1020240202452 | {'amount': 30000, 'currency': 'USD','seqnbr': None } |
123 | 2024-02-02 00:04:00.001+00 | 1020240202455 | {'amount': 40000, 'currency': 'INR','seqnbr': 768 } |
Can someone please help me on this?
I made a few changes to your code, using csv to write the new csv too, and most importantly unidenting the write
part, that was ignoring the None
lines:
def update_seqnbr(cls):
filename = 'input.csv'
seqnbr_pattern = "'seqnbr': ([\s\d]+)"
with open(FILENAME, 'a', newline='') as targetfile:
targetcsv = csv.writer(targetfile, delimiter='|')
with open(filename, 'r') as csvfile:
datareader = (csv.reader(csvfile, delimiter="|"))
targetcsv.writerow(next(datareader, None)) # skip the headers
for row in datareader:
json_message = row[3]
match = re.findall(seqnbr_pattern, json_message)
if len(match) != 0:
replaced_json_message = json_message.replace(match[0], str(random.randint(500, 999)))
row[3] = replaced_json_message
targetcsv.writerow(row)
Here's the output:
row_num|start_date_time|id|json_message
120|2024-02-02 00:01:00.001+00|1020240202450|{'amount': 10000, 'currency': 'NZD','seqnbr': 961}
121|2024-02-02 00:02:00.001+00|1020240202450|{'amount': 20000, 'currency': 'NZD','seqnbr': 863}
122|2024-02-02 00:03:00.001+00|1020240202450|{'amount': 30000, 'currency': 'NZD','seqnbr': None }
123|2024-02-02 00:04:00.001+00|1020240202450|{'amount': 40000, 'currency': 'NZD','seqnbr': 666}