I am importing an excel file into a pandas dataframe with the pandas.read_excel()
function.
One of the columns is the primary key of the table: it's all numbers, but it's stored as text (the little green triangle in the top left of the Excel cells confirms this).
However, when I import the file into a pandas dataframe, the column gets imported as a float. This means that, for example, '0614' becomes 614.
Is there a way to specify the datatype when importing a column? I understand this is possible when importing CSV files but couldn't find anything in the syntax of read_excel()
.
The only solution I can think of is to add an arbitrary letter at the beginning of the text (converting '0614' into 'A0614') in Excel, to make sure the column is imported as text, and then chopping off the 'A' in python, so I can match it to other tables I am importing from SQL.
You just specify converters. I created an excel spreadsheet of the following structure:
names ages
bob 05
tom 4
suzy 3
Where the "ages" column is formatted as strings. To load:
import pandas as pd
df = pd.read_excel('Book1.xlsx',sheetname='Sheet1',header=0,converters={'names':str,'ages':str})
>>> df
names ages
0 bob 05
1 tom 4
2 suzy 3