I am trying to read in a dataset that has a column containing dates. By default, this column is read in as character, but I want it read in as a date.
If I read the file with read_csv using the defaults, the dates in the column display like this in the viewer: 12/04/2019 (i.e., dmy).
However, when using the following I get parsing failures:
data <- read_csv("file.csv",
  col_types = cols(dob = col_date("%d-%m-%Y")))
Warning: 4160 parsing failures.
row col expected actual
1 dob date like %d-%m-%Y 12/04/2019
At first, I thought this was because I had specified hyphens (-) in col_date. But I get the same errors if I change the hyphen to a forward slash:
data <- read_csv("file.csv",
  col_types = cols(dob = col_date("%d/%m/%Y")))
Warning: 4160 parsing failures.
row col expected actual
1 dob date like %d/%m/%Y 12-04-2019
Using problems() just expands on this message. I'm struggling to know how to proceed because whatever I change in col_date() doesn't seem to help. In fact, as can be seen above, it then reports that the opposite separator was found in the file.
EDIT: Trying suggestion from Bernhard
I read in the column as character and ran the following:
head(data$dob, 20)
[1] "12/04/20" "20/04/2020" "20/04/2020" "20/04/2020" "20/04/2020" "20/04/2020" "20/04/2020" "20/04/2020" "20/04/2020" "20/04/2020" "20/04/2020"
[12] "12/04/2019" "12/04/2019" "12/04/2019" "12/04/2019" "12/04/2019" "12/04/2019" "12/04/2019" "12/04/2019" "12/04/2019"
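Since head() only shows the first 20 values, mixed separators further down the column can go unnoticed. Below is a minimal sketch of one way to check for them, assuming the column has been read in as character. The dob vector is made-up sample data standing in for data$dob, and sep_pattern and sep_counts are names chosen only for illustration:

```r
# Hypothetical sample values standing in for data$dob (read as character)
dob <- c("12/04/2019", "20/04/2020", "12-04-2019", "20-04-2020", "12/04/2019")

# Strip the digits so only the separator pattern of each value remains,
# e.g. "12/04/2019" becomes "//" and "12-04-2019" becomes "--"
sep_pattern <- gsub("[0-9]", "", dob)

# Count how many rows use each separator pattern
sep_counts <- table(sep_pattern)
print(sep_counts)
```

If more than one pattern shows up in the counts, no single format string passed to col_date() can match every row.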
After suggestions from Bernhard, I found that the input date format was inconsistent: different rows used different separators. I fixed this by converting them all to the same separator.
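For anyone hitting the same problem, here is a rough sketch of that kind of fix in base R. It assumes the column was read in as character and that the only two separators are "/" and "-". The dob vector is made-up sample data standing in for data$dob, and dob_clean and dob_date are illustrative names, not the exact code I ran:

```r
# Hypothetical sample values standing in for data$dob (read as character)
dob <- c("12/04/2019", "20-04-2020", "12-04-2019")

# Convert every hyphen to a forward slash so all rows share one separator
dob_clean <- gsub("-", "/", dob, fixed = TRUE)

# Parse the now-consistent strings as day/month/year dates
dob_date <- as.Date(dob_clean, format = "%d/%m/%Y")

# Any value that still fails to parse becomes NA, so stop if one slipped through
stopifnot(!anyNA(dob_date))
```

Doing the cleanup on the character column first and converting afterwards avoids having to pick one format string in col_date() at read time.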