I have a piece of code in python that reads from a excel file and saves into redshift database.
import psycopg2
def from_redshift():
book = xlrd.open_workbook("excelfile.xlsx")
sheet = book.sheet_by_index(0)
con = psycopg2.connect(dbname='dbname', host='something.com', port=portnum, user='username', password='password')
cursor=con.cursor()
query = """INSERT INTO table_name (col1, col2, col3, start_date, update_date) VALUES (%s, %s, %s, %s, %s)"""
for r in range(1, sheet.nrows):
col1 = sheet.cell(r,0).value
col2 = sheet.cell(r,1).value
col3 = sheet.cell(r,2).value
start_date = sheet.cell(r,3).value
update_date = sheet.cell(r,4).value
# Assign values from each row
values = (col1, col2, col3, start_date, update_date)
# Execute sql Query
cursor.execute(query, values)
print("Executed")
# Close the cursor
cursor.close()
The code works fine in reading and inserting into the database, but my question is that the 'start_date
' and the 'update_date
' fields are of datetime
in the database, so when I try to insert then, it gives me error that the values from these two columns are not in the right format, and when I changed these two columns to varchar
in the database, it inserts these values are some weird number like 23.12345
(something like that).
The values in these two column look like YYYY-MM-DD HH:MM:[SS]
(custom format).
How do I get these date time values in the database correctly?
# Commit the transaction
con.commit()
con.close()
From the documentation on xlrd
To read the date values you can use the xldate_as_tuple function
Because the dates are stored as numbers within the excel file format
I have't tested this, but with your code:
def from_redshift():
book = xlrd.open_workbook("excelfile.xlsx")
sheet = book.sheet_by_index(0)
for r in range(1, sheet.nrows):
start_date = xldate_as_tuple(sheet.cell(r,3).value, book.datemode)
start_date = datetime.datetime(*start_date)
BTW, if your method name is any indication of what you are doing. If you are loading this data into AWS Redshift, copying from a CSV file is always faster, easier and generally recommended over executing inserts from excel data like this.