I have this big set of data where date and time have separate columns. I want to convert them to string and subsequently convert them to datetime time. However, this value error always pops up.
ValueError: time data '200101030' does not match format '%Y%m%d%H%M%S' (match)
Here is the data format:
<DTYYYYMMDD> <TIME> Adj. Open Adj. High Adj. Low Adj. Close volume
20010102 230100 0.5617 0.5617 0.5617 0.5617 4
20010102 230400 0.5616 0.5616 0.5616 0.5616 4
Below is the code:
df=pd.read_csv('AUDUSD.txt')
df['DATE']=df['<DTYYYYMMDD>'].map(str)+df['<TIME>'].map(str)
df['DATE']=pd.to_datetime(df['DATE'],format="%Y%m%d%H%M%S")
print(df.head(15))
Any help is greatly appreciated!!
There is problem some bad data - in column <TIME>
is zero
.
Need parameter errors='coerce'
for convert bad data to NaT
:
print (df)
<DTYYYYMMDD> <TIME> Adj. Open Adj. High Adj. Low Adj. Close volume
0 20010102 230100 0.5617 0.5617 0.5617 0.5617 4
1 20010103 0 0.5616 0.5616 0.5616 0.5616 4
df['DATE']=df['<DTYYYYMMDD>'].astype(str)+df['<TIME>'].astype(str)
df['DATE']=pd.to_datetime(df['DATE'],format="%Y%m%d%H%M%S", errors='coerce')
print (df)
<DTYYYYMMDD> <TIME> Adj. Open Adj. High Adj. Low Adj. Close volume \
0 20010102 230100 0.5617 0.5617 0.5617 0.5617 4
1 20010103 0 0.5616 0.5616 0.5616 0.5616 4
DATE
0 2001-01-02 23:01:00
1 NaT