I'm currently working on a complex documentation project with Sphinx (Python). My next step is to enable internationalization.
Project overview (simplified):
doc\
build\ # contains sphinx build output
images\ # contains image resources
locales\ # gnu gettext structure (simplified)
en\LC_MESSAGES\index.po+mo
en\LC_MESSAGES\articles\connect.po+mo
de\LC_MESSAGES\index.po+mo
de\LC_MESSAGES\articles\connect.po+mo
source\
_static\
articles\
connect.rst
commission.rst
troubleshoot\
bugs.rst
reference\
generated.rst
about.rst
conf.py # contains sphinx configuration
index.rst
terminology.rst
Makefile
Workbench\ # contains work contained in generated reference
Localization options in conf.py:
locale_dirs = [
'../locales/'
]
gettext_compact = False
Rule in Makefile to create HTML output:
html:
	sphinx-build -M html "source" "build" -Dlanguage="de" -v
Rule in Makefile to create *.pot files:
gettext:
	sphinx-build -b gettext "source" "build\gettext"
Rule in Makefile to update localizations:
update_po:
	sphinx-intl update -p "build\gettext" -Dlanguage="en" -Dlanguage="de"
As you can probably tell from the directory structure and path delimiter, I am using Windows 10.
Excerpt from the build output of make html, showing the localization messages:
Running Sphinx v4.2.0
loading translations [de]... done
loading pickled environment... done
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 1 source files that are out of date
updating environment: 0 added, 15 changed, 0 removed
My problem is the following:
Sphinx does not match localized strings in textdomains that are contained in a subdirectory of LC_MESSAGES.
I've configured Sphinx gettext with gettext_compact=False because I want separate translation files for each document. This makes it easier for our team to manage translations and track progress. When generating *.pot files with make gettext I'm using the same configuration.
Now when I generate HTML/PDF output, only the text domains of the top-level documents are processed correctly, and only there are the localized strings substituted in the resulting document. No errors are thrown while the translations are loaded (as you can see in the excerpt above), and the number of files matches the number of documents, so I assume everything works fine up to this point.
I am wondering whether this has something to do with Windows using a different path separator than Unix. Maybe gettext doesn't find the correct text domain because "articles/connect" != "articles\connect".
Or am I just missing something? I assumed that the make update_po command produces a valid file/directory structure under LC_MESSAGES that gettext is able to process. Is this assumption correct? I haven't found any information on this topic yet.
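One way to test this assumption is to compare the catalogs on disk with the layout implied by the source tree. The sketch below assumes that, with gettext_compact = False, each document gets one text domain named after its path relative to the source directory, written with forward slashes and without the .rst suffix. That naming rule is my assumption here, not something verified against the Sphinx source, and the helper names expected_domains and missing_catalogs are made up for this illustration.

```python
from pathlib import Path


def expected_domains(source_dir):
    """Return one text domain per .rst file, e.g. 'articles/connect'."""
    source = Path(source_dir)
    domains = []
    for rst in sorted(source.rglob("*.rst")):
        # as_posix() yields forward slashes on Windows as well,
        # so the result does not depend on the platform separator.
        domains.append(rst.relative_to(source).with_suffix("").as_posix())
    return domains


def missing_catalogs(source_dir, locale_dir, language):
    """List domains without a .po file under <locale_dir>/<language>/LC_MESSAGES."""
    base = Path(locale_dir) / language / "LC_MESSAGES"
    return [
        domain
        for domain in expected_domains(source_dir)
        if not (base / (domain + ".po")).is_file()
    ]
```

Run from the doc\ directory, missing_catalogs("source", "locales", "de") would list every document that has no German catalog at the expected location. An empty list suggests the directory structure itself is not the problem.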
Any help and/or ideas appreciated!
I have found the solution/cause.
My first assumption was that it might have to do with the locale_dirs entry in conf.py. I moved the directory with the *.po files containing the localized strings for sphinx-build to the location recommended in the sphinx-intl docs. Nothing changed.
When I inspected the generated *.po files again, I noticed something odd: some msgids were contained in multiple *.po files.
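A check like this can be scripted. The following is a minimal sketch that only recognises single-line msgid entries; it deliberately ignores multi-line strings and the empty header msgid, so a real catalog parser would be the safer choice for anything beyond a quick look. The function name duplicated_msgids is hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches single-line entries such as: msgid "Some text"
# The empty header entry (msgid "") does not match because of the '+'.
MSGID_RE = re.compile(r'^msgid "(.+)"\s*$', re.MULTILINE)


def duplicated_msgids(lc_messages_dir):
    """Map each msgid found in more than one .po file to those files."""
    seen = defaultdict(set)
    root = Path(lc_messages_dir)
    for po in root.rglob("*.po"):
        text = po.read_text(encoding="utf-8")
        for msgid in MSGID_RE.findall(text):
            seen[msgid].add(po.relative_to(root).as_posix())
    return {m: sorted(files) for m, files in seen.items() if len(files) > 1}
```

Calling duplicated_msgids on a language's LC_MESSAGES directory returns a dictionary whose keys are the repeated msgids and whose values are the catalogs that contain them.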
It turned out that Sphinx generates a *.po file for each *.rst document in the directory structure, or at least for each document that is part of the document hierarchy.
When one document pulls in another via the include directive, the texts of the included document are also treated as part of the including document. The text domain is matched the same way when the documentation is built for a specific language.
This kind of makes sense, because the include directive just inserts the contents of the included document into the current document.
To work around this, texts have to be translated in the *.po file of the including document. Texts translated in the *.po file of the included document itself are ignored.
I think this behavior applies to the whole recursive stack of documents including other documents, but I haven't tested that yet.
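To find out which catalog a given text belongs in, the include relationships can be traced with a small script. The sketch below is an illustration under several assumptions: it only looks for literal ".. include:: path" lines, resolves each path relative to the including file, and does not handle source-relative (absolute) paths or any other directive. The names direct_includes and including_documents are invented for this example. It follows the chain recursively, which reflects the untested guess above rather than confirmed Sphinx behaviour.

```python
import re
from pathlib import Path

INCLUDE_RE = re.compile(r"^\s*\.\.\s+include::\s+(\S+)\s*$", re.MULTILINE)


def direct_includes(rst_path):
    """Return the resolved paths named by '.. include::' lines in rst_path."""
    rst_path = Path(rst_path)
    text = rst_path.read_text(encoding="utf-8")
    return [(rst_path.parent / target).resolve()
            for target in INCLUDE_RE.findall(text)]


def including_documents(source_dir, included):
    """Return documents that pull in `included`, directly or transitively."""
    source = Path(source_dir).resolve()
    original = Path(included).resolve()
    wanted = {original}
    result = set()
    changed = True
    while changed:  # repeat until no new including document turns up
        changed = False
        for rst in source.rglob("*.rst"):
            rst = rst.resolve()
            if rst in result:
                continue
            if wanted.intersection(direct_includes(rst)):
                result.add(rst)
                wanted.add(rst)
                changed = True
    result.discard(original)  # guard against include cycles
    return sorted(p.relative_to(source).as_posix() for p in result)
```

For an included file, the returned documents are the ones whose *.po files would need to carry its translations if the behaviour described above holds.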
Hope someone else finds this useful.
I'm going to accept this answer as the correct answer.