I have an script in python that basically have this workflow
here a fragment of code of step 3
file_object = open(file_csv)
cur = connection.cursor()
copy_sql = """
COPY %sFROM stdin WITH CSV HEADER
DELIMITER as '""" + delimiter +"'"
cur.copy_expert(sql=copy_sql % table,file = file_object)
connection.commit()
cur.close()
This script works ok, but some csv inputs have the last columns without header and the code above fails
File "copy_to_psql.py", line 18, in load_csv_psql
cur.copy_expert(sql=copy_sql % table,file = file_object)
psycopg2.DataError: extra data after last expected column
Is there any way to only select the the columns with headers from csv?
Is there any solution using only PostgreSQL?
Any other suggestion?
Thanks in advance
As mention @ABAbhi my best option was clean the csv.
So to the workflow of my algorithm, I add a step to remove columns without
And here the code of step 3:
def remove_empty_colums(input_csv="in.csv", output_csv="out.csv", delimiter=','):
reader = csv.DictReader(open(input_csv), delimiter=delimiter)
headers = reader.fieldnames
writer = csv.DictWriter(open(output_csv, 'wb'),
fieldnames=headers, delimiter=delimiter)
writer.writeheader()
for row in reader:
row_dict = {}
for header in headers[:-1]:
row_dict[header] = row[header]
writer.writerow(row_dict)