When publishing a post or page, WordPress auto-generates the permalink from the title. If the title contains Cyrillic characters, WordPress generates a permalink with the same Cyrillic characters. How can I fix this and make it use only Latin characters in the permalink (replacing Cyrillic characters with Latin ones)?
It's really odd that WP allows non-Latin characters in the permalink. Do I need to hack its core to fix it? Any ideas... Thanks!!
I assume you already know about the Settings > Permalinks options in WP-Admin, so...
The URL specification requires non-ASCII characters to be percent-encoded (there is a good answer with links in the question "can't open unicode url with python"), and it looks like this is a long-standing issue with WP (see http://core.trac.wordpress.org/ticket/10690).
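To make the encoding requirement concrete, here is a minimal Python sketch of what percent-encoding does to a Cyrillic slug. The slug itself is a made-up example, not something taken from WordPress.

```python
from urllib.parse import quote, unquote

# Hypothetical slug, such as one derived from a Cyrillic post title.
slug = "привет-мир"

# quote() percent-encodes the UTF-8 bytes of every non-ASCII character;
# ASCII letters, digits and the hyphen are left as they are.
encoded = quote(slug)
print(encoded)

# The encoded form is pure ASCII, and decoding restores the original slug.
assert encoded.isascii()
assert unquote(encoded) == slug
```

Browsers typically show the decoded form in the address bar, but the encoded form is what actually travels in the HTTP request.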
I am not sure why it's not fixed, since a fix would only need to apply to the "post slug" part of the URL, which is already "cleaned" when it is generated from the title of the blog post. The same code is presumably run when the user edits the post slug. You may want to read the ticket to see whether there is a reason it is still open.
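If you do want Latin-only slugs, the "cleaning" step is where transliteration would go. The sketch below is in Python purely to illustrate the idea; it is not WordPress code, `latin_slug` is a hypothetical name, and the mapping table is a partial Russian-only assumption (real transliteration schemes differ).

```python
import re

# Partial, illustrative Russian-to-Latin table; this is an assumption,
# not an official standard, and other Cyrillic alphabets need more entries.
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def latin_slug(title: str) -> str:
    """Hypothetical helper: build a Latin-only slug from a post title."""
    lowered = title.lower()
    # Replace each mapped Cyrillic letter; leave other characters unchanged.
    latin = "".join(TRANSLIT.get(ch, ch) for ch in lowered)
    # Collapse every run of characters outside a-z and 0-9 into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", latin).strip("-")


print(latin_slug("Привет, мир!"))  # prints "privet-mir"
```

In WordPress itself, logic like this would normally live in a plugin hooked into slug generation (as far as I know, the `sanitize_title` filter is the usual place) rather than in a change to core.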
Of course, most web servers have no problem with these URLs (they may convert them internally anyway). That may be why the issue isn't resolved: even if WP doesn't adhere to the RFC spec for URLs, if the URLs work 99% of the time, then there's really no issue. Further, keeping the language-specific characters instead of stripping them makes the URLs prettier for users, and probably makes it easier for Google to match them to the content of the documents.
So perhaps the best answer is: if it ain't broke, don't fix it :-)