I have a input file which contain three columns: "start date", "end date" and "interval". Input file:
start date end date interval
01/01/2020 10/01/2020 15
i want to use these values in pandas date_range
function. my
import pandas as pd
timeDF = pd.read_excel('inputFile.xlsx')
startDate = timeDF['start date']
endDate = timeDF['end date']
interval = timeDF['interval']
timeStamp = pd.date_range(start = startDate, end = endDate, freq = str(interval) + 'min')
print(timeStamp)
error that I am getting:
Name: Interval, dtype: int64min
You're running into this problem because you're concatenating the string "min" onto interval, which is a pandas.core.series.Series object.
When you call pd.read_excel('inputFile.xlsx')
, you get a dictionary of Series objects, so startDate, endDate, and interval are all Series objects you should extract data from instead of using directly.
In order to get around this error, you can use timeDF['name'].values.item() to get your values for all three variables like so:
import pandas as pd
timeDF = pd.read_excel('inputFile.xlsx')
startDate = timeDF['start date'].values.item()
endDate = timeDF['end date'].values.item()
interval = timeDF['interval'].values.item()
timeStamp = pd.date_range(start = startDate, end = endDate, freq = str(interval) + 'min')
print(timeStamp)
Output:
DatetimeIndex(['2020-01-01 00:00:00', '2020-01-01 00:15:00',
'2020-01-01 00:30:00', '2020-01-01 00:45:00',
'2020-01-01 01:00:00', '2020-01-01 01:15:00',
'2020-01-01 01:30:00', '2020-01-01 01:45:00',
'2020-01-01 02:00:00', '2020-01-01 02:15:00',
...
'2020-09-30 21:45:00', '2020-09-30 22:00:00',
'2020-09-30 22:15:00', '2020-09-30 22:30:00',
'2020-09-30 22:45:00', '2020-09-30 23:00:00',
'2020-09-30 23:15:00', '2020-09-30 23:30:00',
'2020-09-30 23:45:00', '2020-10-01 00:00:00'],
dtype='datetime64[ns]', length=26305, freq='15T')