batch-file cmd jrepl

Renaming files whose names contain Ö ö Ç ç Ş ş İ ı Ğ ğ Ü ü


I am trying to rename files with a batch script. I want to replace the letters Ö ö Ç ç Ş ş İ ı Ğ ğ Ü ü with O o C c S s I i G g U u, but it is failing. What can I do to fix this problem?

@echo OFF
set TargetFolder=%~dp0target
setlocal enableDelayedExpansion
set srch=Ö ö Ç ç Ş ş İ ı Ğ ğ Ü ü
set rplc=O o C c S s I i G g U u
set /a n=0

for %%a in (!srch!) do set /a n+=1&set srch[!n!]=%%a
set /a n=0
for %%a in (!rplc!) do set /a n+=1&set rplc[!n!]=%%a

for /f "tokens=* delims=" %%a in ('dir /b /a-d "%TargetFolder%\*"') do (
  set NewFileName=%%~na
  for /l %%x in (1,1,!n!) do (
    for /f "tokens=* delims=" %%t in ('jrepl !srch[%%x]! !rplc[%%x]! /s NewFileName') do set "NewFileName=%%t"
  )
  ren "%TargetFolder%\%%~nxa" "!NewFileName!%%~xa"
)
endlocal
pause

PS: This code requires the JREPL.BAT utility from @dbenham.


Solution

  • The problem characters are Unicode characters that have no ASCII equivalent. The file system allows such characters, but the command line has only limited support for Unicode.

    It is possible to manipulate Unicode characters with JREPL by using the \uNNNN escape sequence. But even if you do it correctly, the command line corrupts the value when you attempt to rename the file.

    I have written another hybrid JScript/batch utility called JREN.BAT that renames files or folders via regular expression replacements. I didn't plan on this, but this is a perfect application for JREN.BAT. It works because JScript performs the rename itself, and JScript works natively with Unicode.

    To do the rename, you must first establish the Unicode code value of each problem character. I copied the characters into a Microsoft Word document and used the process described at http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=UTTUsingUnicodeMacros to figure out the code values.
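
If Word is not at hand, the code values can also be read off programmatically. The following is only a minimal sketch in plain JavaScript (assumed to run under Node.js, with the source file saved as UTF-8); it is not part of JREN or JREPL, and `codeUnits` is a hypothetical helper name introduced for illustration. It reports each character's UTF-16 code unit as four hex digits, which is the form the \uNNNN escape expects.

```javascript
// Hypothetical helper: return the 4-digit hex code unit of every
// character in a string, ready to be used as \uNNNN.
function codeUnits(text) {
  var result = [];
  for (var i = 0; i < text.length; i++) {
    var hex = text.charCodeAt(i).toString(16).toUpperCase();
    // Left-pad to four digits.
    while (hex.length < 4) {
      hex = "0" + hex;
    }
    result.push(hex);
  }
  return result;
}

// The characters from the question, pasted literally.
// This line depends on the file being read as UTF-8.
console.log(codeUnits("ÖöÇçŞşİıĞğÜü").join(" "));
```

For the twelve characters in the question this should list the same values that appear in the scripts below, starting with 00D6 and 00F6.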

    I wrote three solutions using JREN.

    1) This first version is fairly easy to follow, and it is easy to maintain: simply add an additional "find replace" line for each needed translation. The big disadvantage is slow performance, because it renames every file repeatedly, once for each character to be translated.

    @echo off
    for %%A in (
      "00D6 O"
      "00F6 o"
      "00C7 C"
      "00E7 c"
      "015E S"
      "015F s"
      "0130 I"
      "0131 i"
      "011E G"
      "011F g"
      "00DC U"
      "00FC u"
    ) do for /f "tokens=1,2" %%B in (%%A) do call jren "\u%%B" "%%C" %*
    

    2) This second version is a bit wicked to follow, and difficult to maintain. But it is much faster because it fully renames each file in one pass.

    @echo off
    call jren "(\u00D6)|(\u00F6)|(\u00C7)|(\u00E7)|(\u015E)|(\u015F)|(\u0130)|(\u0131)|(\u011E)|(\u011F)|(\u00DC)|(\u00FC)" ^
              "$1?'O':$2?'o':$3?'C':$4?'c':$5?'S':$6?'s':$7?'I':$8?'i':$9?'G':$10?'g':$11?'U':'u'" /j %*
    
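
To illustrate why a single pass is enough, here is a rough JavaScript sketch of the same idea: one alternation with a capture group per character, and a replacement chosen according to whichever group matched. This is not JREN's own code. `transliterate` and the `ascii` array are hypothetical names, and the sketch loops over the groups where the script above uses a chain of ?: operators.

```javascript
// One alternation, one capture group per problem character
// (same order as in the batch script).
var find = /(\u00D6)|(\u00F6)|(\u00C7)|(\u00E7)|(\u015E)|(\u015F)|(\u0130)|(\u0131)|(\u011E)|(\u011F)|(\u00DC)|(\u00FC)/g;

// Replacement for group 1, group 2, ... (hypothetical helper array).
var ascii = ["O", "o", "C", "c", "S", "s", "I", "i", "G", "g", "U", "u"];

function transliterate(name) {
  return name.replace(find, function () {
    // arguments[0] is the whole match, arguments[1..12] are the groups.
    // Exactly one group is non-empty for any given match.
    for (var i = 1; i <= ascii.length; i++) {
      if (arguments[i]) {
        return ascii[i - 1];
      }
    }
    return arguments[0];
  });
}

// File name written with escapes so the source encoding does not matter.
console.log(transliterate("\u00C7a\u011Fr\u0131 \u015Eim\u015Fek.txt")); // Cagri Simsek.txt
```

Every character of the name is visited once, so the cost does not grow with the number of translations the way it does in the first version.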

    3) This last version gives the best of both worlds. The translation list is easily maintained like the first version, but then it dynamically builds the search and replace expressions that are used by the second method. So it is able to rename all files in one pass.

    @echo off
    setlocal enableDelayedExpansion
    set "find="
    set "repl="
    set /a n=0
    for %%A in (
      "00D6 O"
      "00F6 o"
      "00C7 C"
      "00E7 c"
      "015E S"
      "015F s"
      "0130 I"
      "0131 i"
      "011E G"
      "011F g"
      "00DC U"
      "00FC u"
    ) do for /f "tokens=1,2" %%B in (%%A) do (
      set /a n+=1
      set "find=!find!|(\u%%B)"
      set "repl=!repl!$!n!?'%%C':"
    )
    call jren "!find:~1!" "!repl!$0" /j %*
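
The string building done by the loop above can be sketched in JavaScript as follows. This is an illustration under assumptions, not JREN's implementation: it assumes that with /j the replacement is evaluated as a JScript expression in which $0 holds the whole match and $1, $2, ... hold the capture groups, which is what the scripts here rely on. `buildExpressions` and `applyExpressions` are hypothetical names.

```javascript
// Translation table: [code value, ASCII replacement].
var table = [
  ["00D6", "O"], ["00F6", "o"], ["00C7", "C"], ["00E7", "c"],
  ["015E", "S"], ["015F", "s"], ["0130", "I"], ["0131", "i"],
  ["011E", "G"], ["011F", "g"], ["00DC", "U"], ["00FC", "u"]
];

// Build the same two strings the batch loop accumulates in find and repl.
function buildExpressions(pairs) {
  var find = "";
  var repl = "";
  for (var i = 0; i < pairs.length; i++) {
    find += "|(\\u" + pairs[i][0] + ")";
    repl += "$" + (i + 1) + "?'" + pairs[i][1] + "':";
  }
  // Drop the leading "|" (the batch script does this with !find:~1!)
  // and fall back to the whole match, $0, if no group matched.
  return { find: find.substring(1), repl: repl + "$0", count: pairs.length };
}

// Evaluate the built expression for every match. new Function stands in
// for the assumed /j behavior; it is not how JREN itself is written.
function applyExpressions(built, name) {
  var params = [];
  for (var i = 0; i <= built.count; i++) {
    params.push("$" + i);
  }
  var evaluate = new Function(params.join(","), "return " + built.repl + ";");
  return name.replace(new RegExp(built.find, "g"), function () {
    var args = Array.prototype.slice.call(arguments, 0, built.count + 1);
    return evaluate.apply(null, args);
  });
}

var built = buildExpressions(table);
console.log(built.find);
console.log(built.repl);
console.log(applyExpressions(built, "\u00DCmit \u00D6zg\u00FCr"));
```

Adding a translation then means adding one row to the table, while the rename still happens in a single pass.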
    

    Assuming you name any of the above scripts "fixUnicode.bat" and place it, along with JREN.BAT, somewhere in your PATH, you can use any of the following:

    Rename all files in the current directory

    fixUnicode
    

    Rename all files in the d:\test folder

    fixUnicode /p "d:\test"
    

    Recursively rename all files, and then all folders, on the c: drive (the /d option makes JREN rename folders instead of files)

    fixUnicode /s /p "c:\"
    fixUnicode /d /s /p "c:\"
    

    There are other options you can append to specify which files and/or paths to include or exclude. Use jren /? to get help on all the options that are available to JREN. Most of them can be used with fixUnicode.bat, because each script passes its arguments through to JREN via %*.