I have developed python
and used pandas module
to write excel
.
While executing command print(df1.columns), I get dtype as 'Object'.
and using same excel
to load in Teradata
table using TPT script
and getting below error
FILE_READER[1]: TPT19108 Data Format 'DELIMITED' requires all 'VARCHAR/JSON/JSON BY NAME/CLOB BY NAME/BLOB BY NAME/XML BY NAME/XML/CLOB' schema.
Using Description in TPT:-
DEFINE SCHEMA Teradata__DATA
DESCRIPTION 'SCHEMA OF Teradata data'
(
Issue_Key VARCHAR(255),
Log_Date VARDATE(10) FORMATIN ('YYYY-MM-DD') FORMATOUT ('YYYY-MM-DD'),
User_Name VARCHAR(255),
Time_Spent NUMBER(10,2)
Please help in resolving the failure message. Error might be due different Datatype or due to defined delimeter as "TAB". Please suggest if any other reason is causing this failure.
df = pd.read_excel('Time_Log_Source_2019-05-30.xlsx', sheet_name='Sheet1', dtype=str)
print("Column headings:")
print(df.columns)
df = pd.DataFrame(df,columns=['Issue Key', 'Log Date', 'User', 'Time Spent(Sec)'])
df['Log Date'] = df['Log Date'].str[:10]
df['Time Spent(Sec)'] = df['Time Spent(Sec)'].astype(int)/3600
print(df)
df.to_excel("Time_Log_Source_2019-05-30_output.xlsx")
df1 = pd.read_excel('Time_Log_Source_2019-05-30_output.xlsx', sheet_name='Sheet1',dtype=str)
df1['Issue Key'] = df1['Issue Key'].astype('str')
df1['Log Date'] = df1['Log Date'].astype('str')
df1['User'] = df1['User'].astype('str')
df1['Time Spent(Sec)'] = df1['Time Spent(Sec)'].astype('str')
df1.to_excel("Time_Log_Source_2019-05-30_output.xlsx",startrow=0, startcol=0, index=False)
print(type(df1['Time Spent(Sec)']))
print(df.columns)
print(df1.columns)
Result
Index([u'Issue Key', u'Log Date', u'User', u'Time Spent(Sec)'], dtype='object')
Index([u'Issue Key', u'Log Date', u'User', u'Time Spent(Sec)'], dtype='object')
A TPT Schema describes fields in client-side records, not columns in the database table. You would need to change the schema to say the (input) Time_Spent is VARCHAR.
But TPT does not natively read .xlsx files. Consider using to_csv
instead of to_excel
.