I have a script that converts the JSONL files in a selected directory to CSV files in another specified location. However, each resulting CSV file keeps the .jsonl extension before the .csv (think file.jsonl.csv).
Any ideas on how to remove the .jsonl extension before appending the .csv extension? I would like to get rid of it because it may be confusing in the future. Thank you!
Sample CSV file created:
20210531_CCXT_FTX_DOGEPERP.jsonl.csv
My script:
import glob
import json
import csv
import time
start = time.time()
#import pandas as pd
from flatten_json import flatten
#Path of jsonl file
File_path = (r'C:\Users\Natthanon\Documents\Coding 101\Python\JSONL')
#reading all jsonl files
files = [f for f in glob.glob( File_path + "**/*.jsonl", recursive=True)]
i = 0
for f in files:
    with open(f, 'r') as F:
        #creating csv files
        with open(r'C:\Users\Natthanon\Documents\Coding 101\Python\CSV\\' + f.split("\\")[-1] + ".csv", 'w' , newline='') as csv_file:
            thewriter = csv.writer(csv_file)
            thewriter.writerow(["symbol", "timestamp", "datetime","high","low","bid","bidVolume","ask","askVolume","vwap","open","close","last","previousClose","change","percentage","average","baseVolume","quoteVolume"])
            for line in F:
                #flatten json files
                data = json.loads(line)
                data_1 = flatten(data)
                #headers should be the Key values from json files that make Column header
                thewriter.writerow([data_1['symbol'],data_1['timestamp'],data_1['datetime'],data_1['high'],data_1['low'],data_1['bid'],data_1['bidVolume'],data_1['ask'],data_1['askVolume'],data_1['vwap'],data_1['open'],data_1['close'],data_1['last'],data_1['previousClose'],data_1['change'],data_1['percentage'],data_1['average'],data_1['baseVolume'],data_1['quoteVolume']])
The problem is that you are not stripping the .jsonl extension from the file name before appending .csv. Replacing the line that opens the CSV file with something like this should fix it:
file_name = f.rsplit("\\", 1)[-1].replace('.jsonl', '')
with open(r'C:\Users\Natthanon\Documents\Coding 101\Python\CSV\\' + file_name + ".csv", 'w' , newline='') as csv_file:
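As an alternative to the string replace, here is a minimal sketch that lets pathlib swap the extension. The helper name csv_name_for and the C:\data path are made up for illustration and are not part of the original script. Unlike .replace('.jsonl', ''), this only touches the final extension, so a ".jsonl" appearing elsewhere in the name is left alone.

```python
from pathlib import PureWindowsPath


def csv_name_for(jsonl_path):
    """Return just the file name of jsonl_path with its last extension
    swapped for .csv (hypothetical helper, not from the original script)."""
    # PureWindowsPath splits on both "\\" and "/" regardless of the OS the
    # script runs on, which matches the Windows paths used in the question.
    return PureWindowsPath(jsonl_path).with_suffix(".csv").name


# Illustrative path only; any path ending in .jsonl behaves the same way.
print(csv_name_for(r"C:\data\20210531_CCXT_FTX_DOGEPERP.jsonl"))
# 20210531_CCXT_FTX_DOGEPERP.csv
```

In the loop, the name could then be built as file_name = csv_name_for(f) and appended to the CSV directory without adding ".csv" again.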