Download an article with a Unicode title from Wikipedia in XML format using wget

I am currently downloading the XML of individual articles from Wikipedia. For this I use wget with the following URL format:

https://de.wiktionary.org/wiki/Special:Export/?title=Special:Export&pages=**<page>**&curonly=1&templates=1&action=submit

This works in general, but I have problems with, e.g., Cyrillic characters. In the page name they are percent-encoded (a lot of %), and that does not seem to work: I always get back only the schema definition. If I enter the address above in the browser, it works. I have already tried --remote-encoding=UTF-8. This happens on Windows.

Solution

It is not sufficient to specify only the encoding of the target server via

 --remote-encoding=UTF8

The encoding of the input (the URL passed to wget) must also be specified:

--local-encoding=UTF8

With this option, wget no longer replaces the characters with % escapes. Without it, wget assumes ASCII encoding for the input and applies the % replacement.
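
For illustration, here is a minimal sketch of how the two options might be combined with the URL format from the question. The page title, the output file name and the use of -O are assumptions made for this example, and it is written for a POSIX shell; on Windows the quoting rules of the command prompt differ. The script only prints the assembled command so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical page title with Cyrillic characters (assumption for this sketch).
page='слово'

# Export URL in the format from the question, with the raw, unencoded title.
url="https://de.wiktionary.org/wiki/Special:Export/?title=Special:Export&pages=${page}&curonly=1&templates=1&action=submit"

# Assemble the wget call with both encoding options from the answer.
# The output file name is an arbitrary choice for this sketch.
set -- wget --local-encoding=UTF-8 --remote-encoding=UTF-8 -O "article.xml" "$url"

# Print one argument per line for inspection; nothing is downloaded here.
printf '%s\n' "$@"
```

To actually download, run the assembled command with "$@" in place of the printf line.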