I am using PyPDF2 to extract document info from a PDF. The result looks something like this, but I am unable to interpret the creation date format. What do the last few digits represent?
pdf.documentInfo
[Output]: {'/Creator': 'Rave (http://www.nevrona.com/rave)',
'/Producer': 'Nevrona Designs',
'/CreationDate': 'D:20060301072826' }
At one point I also saw this:
'/CreationDate': "D:20170920114835+02'00'"
How can I read it or convert it into a normal, readable datetime format?
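The value follows the PDF date format D:YYYYMMDDHHmmSS, optionally followed by a UTC offset such as +02'00' (hours and minutes). So the trailing digits are the time of day and, if present, the time zone offset. You can strip the apostrophes and parse the string with strptime: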
from datetime import datetime
CreationDate = "D:20170920114835+02'00'"
dt = datetime.strptime(CreationDate.replace("'", ""), "D:%Y%m%d%H%M%S%z")
# UTC offset is set correctly:
print(dt)
# 2017-09-20 11:48:35+02:00
print(repr(dt))
# datetime.datetime(2017, 9, 20, 11, 48, 35, tzinfo=datetime.timezone(datetime.timedelta(seconds=7200)))
...which I think is more straightforward than what the answer to this related question shows.
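If you need to handle both forms from the question (with and without the offset), a small helper along these lines might do. This is only a sketch: `parse_pdf_date` is a name I made up, not part of PyPDF2, and it assumes the full `D:YYYYMMDDHHmmSS` prefix is always present. It also treats a trailing `Z` as UTC, which the PDF date format allows.

```python
from datetime import datetime


def parse_pdf_date(value):
    """Parse a PDF date string (hypothetical helper, not a PyPDF2 function).

    Accepts e.g. "D:20060301072826" or "D:20170920114835+02'00'".
    Returns an aware datetime if an offset is present, else a naive one.
    """
    # Drop the apostrophes so "+02'00'" becomes "+0200", which %z understands.
    cleaned = value.replace("'", "")

    # "Z" (optionally followed by 00'00') marks UTC; normalise it to "+0000".
    head, sep, _ = cleaned.partition("Z")
    if sep:
        cleaned = head + "+0000"

    # Try the offset-aware format first, then fall back to the naive one.
    for fmt in ("D:%Y%m%d%H%M%S%z", "D:%Y%m%d%H%M%S"):
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised PDF date: {value!r}")


print(parse_pdf_date("D:20060301072826"))
print(parse_pdf_date("D:20170920114835+02'00'"))
```

Note that the first example comes back as a naive datetime because the string carries no offset, so you would have to decide yourself which time zone to assume for it.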