Tags: regex, bash, diacritics, file-rename

Trouble renaming files with diacritics


Why does this bash command try to replace é with ee and not with e?

$ rename 's/[éè]/e/g' tést                
Can't rename tést teest: No such file or directory

How can I get it to work the way I expect?


Solution

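  • Your terminal is set to UTF-8, but rename operates on bytes. It therefore sees the expression as s/[\303\251\303\250]/e/g, a character class that lists individual bytes. In UTF-8, é is the two bytes \303\251, so the name t\303\251st contains two bytes that match the class, and each one is replaced by its own e, which gives teest.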

    The expression can contain any Perl code, so you can make Perl read the regular expression as UTF-8 with use utf8, and decode the file name by decoding the topic variable $_:

    rename 'use utf8; use Encode; $_ = decode("UTF-8", $_); s/[éè]/e/g' tést