I use Python 2.7, and it turns out that datetime.strftime produces different output in different environments (both Unix-based) with the same locale settings.
import datetime
import locale
locale.setlocale(locale.LC_ALL, ('RU', 'utf-8'))
print locale.getlocale()
print datetime.date.today().strftime("%Y %d %B, %A")
On the first environment I got:
('ru_RU', 'UTF-8')
2016 21 января, четверг (month name is in genitive form)
On the second:
('ru_RU', 'UTF-8')
2016 21 Январь, Четверг (month name is in nominative form)
As you can see, there are also some differences in upper/lowercase letters. PYTHONIOENCODING is set to utf_8 in both cases.
What is the reason for this behavior and, more importantly, is there a way to make the second environment behave like the first?
You are looking at the output of the C strftime() call here; Python delegates to it. That function picks those strings from locale files stored outside the control of Python.
See the locale man page for a description of the file format; you are looking for the LC_TIME mon and day lists here.
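To see which strings a given machine has on file, you can ask the C library directly from Python. The following is a minimal sketch, assuming a Unix system (locale.nl_langinfo is not available everywhere) and assuming that the MON_* and DAY_* items reflect the same mon and day lists that strftime() reads; the helper name time_names is made up for this illustration.

```python
import locale

def time_names(locale_name):
    """Return (month names, day names) as reported by the C library.

    time_names is a hypothetical helper; it switches the locale
    temporarily and restores the previous setting afterwards.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        # Raises locale.Error if the locale is not installed.
        locale.setlocale(locale.LC_ALL, locale_name)
        months = [locale.nl_langinfo(getattr(locale, "MON_%d" % i))
                  for i in range(1, 13)]
        # DAY_1 is Sunday in this interface.
        days = [locale.nl_langinfo(getattr(locale, "DAY_%d" % i))
                for i in range(1, 8)]
    finally:
        locale.setlocale(locale.LC_ALL, saved)
    return months, days

# The "C" locale exists everywhere; try "ru_RU.UTF-8" on each environment
# you want to compare.
months, days = time_names("C")
print("%s, %s" % (months[0], days[0]))
```

Running this with the Russian locale on both environments should show whether the two systems simply ship different locale data.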
Mac OS X stores things slightly differently. The files live under /usr/share/locale/; for the ru_RU time definitions there is a file called /usr/share/locale/ru_RU.UTF-8/LC_TIME, which lists values one per line in a fixed order. The first 24 lines are months (abbreviated and full), for example, with the full month names defined as:
января
февраля
марта
апреля
мая
июня
июля
августа
сентября
октября
ноября
декабря
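If you want to check such a file yourself, a rough sketch follows. It assumes the layout described above (one value per line, months in the first 24 lines) and additionally assumes that the 12 abbreviated names come before the 12 full names; split_month_lines is a name invented for this example.

```python
# -*- coding: utf-8 -*-
import io
import os

def split_month_lines(text):
    """Split LC_TIME text into (abbreviated, full) month name lists.

    Assumption: the first 12 lines are abbreviated names and the
    next 12 are full names.
    """
    lines = text.splitlines()
    return lines[:12], lines[12:24]

path = "/usr/share/locale/ru_RU.UTF-8/LC_TIME"
if os.path.isfile(path):  # only present on systems that use this layout
    with io.open(path, encoding="utf-8") as f:
        abbreviated, full = split_month_lines(f.read())
    print(u", ".join(full))
```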
Because this is OS and system specific, you'd have to use a different system altogether to format your dates if you need these strings to be consistent across different platforms.
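One way to get consistent strings is to keep your own name tables and skip the locale lookup entirely. This is only a sketch of that idea, not a library API: the function format_ru and both tables are made up for this example, with the month names copied from the list above and the weekday names written out by hand in lowercase.

```python
# -*- coding: utf-8 -*-
import datetime

# Genitive month names, January first.
MONTHS_GENITIVE = [
    u"января", u"февраля", u"марта", u"апреля", u"мая", u"июня",
    u"июля", u"августа", u"сентября", u"октября", u"ноября", u"декабря",
]
# Weekday names, Monday first to match date.weekday().
WEEKDAYS = [
    u"понедельник", u"вторник", u"среда", u"четверг",
    u"пятница", u"суббота", u"воскресенье",
]

def format_ru(d):
    """Mimic strftime("%Y %d %B, %A") without consulting the system locale."""
    return u"{0} {1:02d} {2}, {3}".format(
        d.year, d.day, MONTHS_GENITIVE[d.month - 1], WEEKDAYS[d.weekday()])

print(format_ru(datetime.date(2016, 1, 21)))  # 2016 21 января, четверг
```

Because the tables live in your code, the output is the same on every platform, at the cost of maintaining the strings yourself.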
If you are trying to parse a date string, you won't get far with the datetime or time modules. Try the dateparser project instead, which understands the different Russian forms:
>>> import dateparser
>>> dateparser.parse(u'2016 21 января, четверг')
datetime.datetime(2016, 1, 21, 0, 0)
>>> dateparser.parse(u'2016 21 Январь, Четверг')
datetime.datetime(2016, 1, 21, 0, 0)