
How to create subfolder on RAR/ZIP extraction if archive does not have one?


I have a lot of RAR or ZIP archives to decompress. Some of the archives contain a single folder with all files in this folder. Some other archives have all files at root level.

Case 01

Archive01.rar
    MyFolder
       file01.txt
       file02.txt 
       file03.txt 
       etc.

Case 02

Archive02.rar
    -file01.txt
    -file02.txt
    -file031.txt
    etc.

I know how to extract all archives into a subfolder.

But how to create the subfolder only when there is none present in archive?

What I mean: in a batch process handling thousands of archives, no additional folder should be created on extraction if the archive file belongs to case 01. But if the archive file belongs to case 02, the extraction should be done into a subfolder named after the archive file.

Case 01 result

 MyFolder <- Folder
   file01.txt
   file02.txt 
   file03.txt 
   etc.

Case 02 result

 Archive02 <- Folder based on the archive name
   -file01.txt
   -file02.txt
   -file031.txt
   etc.

Solution

  • The console version Rar.exe has the command l to list archive file contents, as documented in the text file Rar.txt in the WinRAR program files folder, which is the manual for the console version. The list output of Rar.exe with command l (or L) could be processed in a batch file to determine whether the RAR archive file contains just a single directory at top level and nothing else. But Rar.exe, like the free UnRAR.exe, supports only RAR archives.

    To also support ZIP archives, it is necessary to use the GUI version WinRAR.exe, which supports extraction of RAR and ZIP archives and some other archive types.

    The manual for WinRAR.exe is the WinRAR help, which can be opened by running WinRAR and clicking in menu Help on menu item Help topics. On help tab Contents, the list item Command line mode references help pages with all the information needed for running WinRAR.exe from the command line.

    The list of Commands shows that WinRAR.exe does not support a command l to output archive file contents to a console window, because it is a graphical user interface application.

    So with WinRAR.exe it is not really possible to determine from the command line or within a batch file whether an archive file contains just a single directory at top level.

    However, that does not really matter, as it would be inefficient to first parse an archive file for file and directory names and then extract it with or without an extra extraction folder specified on the command line.

    It is much more efficient to first extract all *.rar (and then all *.zip) files with a single WinRAR call using switch -ad, which extracts each archive file into a subdirectory named after the archive file, and second to eliminate each extraction directory that is unnecessary because the corresponding archive file contained just a single directory at top level.
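
    The second step of this idea can also be sketched outside of batch. The following is a minimal Python sketch, not part of the batch file, assuming the archive was already extracted with -ad into a directory named after it. The function names single_top_level_dir and move_up_if_redundant are hypothetical and chosen only for this illustration.

```python
import shutil
from pathlib import Path


def single_top_level_dir(extract_dir: Path):
    """Return the only subdirectory of extract_dir, or None.

    None is returned when extract_dir contains any file or anything
    other than exactly one subdirectory.
    """
    entries = list(extract_dir.iterdir())
    if len(entries) == 1 and entries[0].is_dir():
        return entries[0]
    return None


def move_up_if_redundant(extract_dir: Path) -> bool:
    """Move a lone subdirectory up one level and remove the wrapper.

    Returns True when the wrapper directory was eliminated.
    """
    inner = single_top_level_dir(extract_dir)
    if inner is None:
        return False  # case 02: keep the directory named after the archive

    parent = extract_dir.parent
    wrapper = extract_dir
    if inner.name.lower() == extract_dir.name.lower():
        # Same name as the wrapper: rename the wrapper first to free the name.
        wrapper = extract_dir.with_name(extract_dir.name + "_tmp")
        if wrapper.exists():
            return False
        extract_dir.rename(wrapper)
        inner = wrapper / inner.name

    target = parent / inner.name
    if target.exists():
        # Name already taken in the parent: restore and keep the wrapper.
        if wrapper != extract_dir:
            wrapper.rename(extract_dir)
        return False

    shutil.move(str(inner), str(target))
    wrapper.rmdir()  # the wrapper is empty now
    return True
```

    Calling move_up_if_redundant(Path("Archive01")) after extraction would leave MyFolder next to the archive in case 01 and leave the directory Archive02 untouched in case 02.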

    This approach is used in the batch file below, which has the following additional features to make it useful for hopefully many WinRAR users:

    1. The working directory can be specified as the first argument on calling the batch file, and it can even be a UNC path.

    2. The batch file automatically finds out where WinRAR.exe is installed, which also works when 32-bit or 64-bit WinRAR is not installed in the default program files directory (as on all of my computers).
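
    The lookup order behind feature 2 can be illustrated with a short Python sketch. It is an illustration only and not what the batch file runs. The function names and the parameters env, exists and registry_lookup are assumptions made for this example. The helper app_paths_lookup reads the default value of the same App Paths registry key that the batch file queries with reg.exe, and it works on Windows only.

```python
import os


def winrar_candidates(env=None):
    """Yield likely WinRAR.exe locations built from the Program Files variables."""
    env = os.environ if env is None else env
    for var in ("ProgramFiles", "ProgramFiles(x86)"):
        base = env.get(var)
        if base:
            yield base + "\\WinRAR\\WinRAR.exe"


def app_paths_lookup():
    """Read the default value of the App Paths key for WinRAR.exe (Windows only)."""
    import winreg  # imported here so the module still loads on other systems

    key_path = r"Software\Microsoft\Windows\CurrentVersion\App Paths\WinRAR.exe"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "")
            return value
    except OSError:
        return None


def find_winrar(env=None, exists=os.path.isfile, registry_lookup=None):
    """Return the first existing WinRAR.exe path, or None if nothing is found."""
    for candidate in winrar_candidates(env):
        if exists(candidate):
            return candidate
    if registry_lookup is not None:
        path = registry_lookup()
        if path and exists(path):
            return path
    return None
```

    On Windows, find_winrar(registry_lookup=app_paths_lookup) would try both program files directories first and fall back to the registry, which is the same order as in the batch file.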

    Note: The commented batch file posted below does not check whether an archive file in the current or specified directory was already extracted before. So it is not advisable to run the batch file multiple times on a directory if archive files already processed have not been removed from that directory.

    @echo off
    rem Change working directory if batch file was started with an argument.
    if not "%~1" == "" (
        pushd "%~1" 2>nul
        if errorlevel 1 (
            echo Specified directory "%~1" does not exist.
            echo/
            pause
            goto :EOF
        )
    )
    
    setlocal EnableExtensions DisableDelayedExpansion
    
    rem Does WinRAR exist in default program files folder?
    set "WinRAR=%ProgramFiles%\WinRAR\WinRAR.exe"
    if exist "%WinRAR%" goto StartExtraction
    
    rem Does WinRAR exist in default program files folder for x86 applications?
    set "WinRAR=%ProgramFiles(x86)%\WinRAR\WinRAR.exe"
    if exist "%WinRAR%" goto StartExtraction
    
    rem Try to determine installation location of WinRAR.exe from registry.
    set "TypeToken=2"
    goto GetPathFromRegistry
    
    rem On Windows Vista and later REG.EXE outputs without version info:
    
    rem HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\App Paths\WinRAR.exe
    rem    (Default)    REG_SZ    Full path to WinRAR\WinRAR.exe
    
    rem There are only spaces used to separate value name, value type and value string.
    
    rem But REG.EXE version 3.0 outputs on Windows XP with version info:
    
    rem ! REG.EXE VERSION 3.0
    rem
    rem HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\App Paths\WinRAR.exe
    rem     <NO NAME>   REG_SZ  Full path to WinRAR\WinRAR.exe
    
    rem NOTE: There are 4 indent spaces and 2 separating tabs in REG 3.0 output line.
    
    rem So either token 2 or token 3 contains value type REG_SZ
    rem used to identify the line with the wanted information.
    
    :GetPathFromRegistry
    for /F "skip=1 tokens=%TypeToken%*" %%A in ('%SystemRoot%\System32\reg.exe QUERY "HKLM\Software\Microsoft\Windows\CurrentVersion\App Paths\WinRAR.exe" /ve 2^>nul') do (
        if "%%A" == "REG_SZ" (
            if exist "%%~fB" (
                set "WinRAR=%%~fB"
                goto StartExtraction
            )
        ) else if "%%A" == "NAME>" (
            set "TypeToken=3"
            goto GetPathFromRegistry
        )
    )
    
    endlocal
    if not "%~1" == "" popd
    echo Could not determine directory containing WinRAR.exe.
    echo/
    echo Please configure it manually in file: %~f0
    echo/
    pause
    goto :EOF
    
    
    rem WinRAR supports multiple archive types on extraction.
    rem Specify here the archive file extensions for extraction.
    
    :StartExtraction
    for %%I in (rar zip) do call :ExtractArchives %%I
    
    rem Restore previous command environment, restore previous current directory
    rem and exit this batch file without fall through to the subroutines below.
    endlocal
    if not "%~1" == "" popd
    goto :EOF
    
    
    rem The subroutine ExtractArchives processes all archive files in current
    rem directory with the file extension passed to subroutine as first argument.
    
    rem WinRAR is called once to extract all files with specified file extension
    rem for extraction into a subdirectory with name of the archive file.
    
    rem Then one more subroutine is called for each archive file to determine
    rem if it is safe to move the extracted archive file contents up one level.
    
    :ExtractArchives
    if not exist "*.%~1" goto :EOF
    "%WinRAR%" x -ad -cfg- -ibck -y -- "*.%~1"
    for %%A in ("*.%~1") do call :MoveUpExtracted "%%~nA"
    goto :EOF
    
    rem The subroutine MoveUpExtracted first checks if for the archive file
    rem passed to the subroutine as first argument a subdirectory exists at
    rem all, i.e. the extraction before was successful for that archive.
    
    rem Next it counts the subdirectories in the archive extraction directory.
    rem Nothing is moved up if there is more than 1 subdirectory in archive
    rem extraction directory.
    
    rem Also nothing is moved up if archive extraction directory contains
    rem 1 or more files.
    
    rem After verification of archive extraction directory really containing
    rem only a single subdirectory and nothing else, the name of the archive
    rem extraction directory is compared case-insensitive with the name of
    rem the single subdirectory in archive extraction directory. On equal
    rem directory names the archive extraction directory is renamed by
    rem appending _tmp to make it possible to move the subdirectory with same
    rem name up one level in directory hierarchy. There is hopefully by chance
    rem never a directory present in current directory with name of an archive
    rem file and _tmp appended.
    
    rem Next it is checked if in current directory there is not already existing
    rem a directory with name of the subdirectory from extracted archive in which
    rem case it is also not possible to move the directory up one level. In this
    rem special use case the archive extraction directory is kept containing just
    rem a single subdirectory with restoring original directory name.
    
    rem Then the single subdirectory in archive extraction directory is moved up
    rem one level which is very fast as just the file allocation table is updated
    rem and no data is really moved.
    
    rem The directory movement could fail if the extracted directory has hidden
    rem attribute set. In this case temporarily remove the hidden attribute,
    rem move the directory up one level in directory hierarchy and set the
    rem hidden attribute again on the directory.
    
    rem On a successful moving up of the extracted directory the (renamed)
    rem extraction directory being now empty is deleted as not further needed.
    
    
    :MoveUpExtracted
    if not exist "%~1\" (
        echo Error: No folder for archive %~1
        goto :EOF
    )
    
    echo Processing archive folder "%~1"
    set FolderCount=0
    set "FolderName="
    for /F "delims=" %%D in ('dir "%~1\*" /AD /B 2^>nul') do (
        if defined FolderName goto :EOF
        set /A FolderCount+=1
        set "FolderName=%%D"
    )
    if not %FolderCount% == 1 goto :EOF
    
    for /F "delims=" %%F in ('dir "%~1\*" /A-D /B 2^>nul') do goto :EOF
    
    set "ParentRenamed=0"
    set "ParentFolder=%~1"
    if /I "%~1" == "%FolderName%" (
        ren "%~1" "%~1_tmp" 2>nul
        if errorlevel 1 (
            echo Failed to rename "%~1" to "%~1_tmp".
            goto :EOF
        )
        set "ParentFolder=%~1_tmp"
        set "ParentRenamed=1"
    )
    
    if exist "%FolderName%" (
        if %ParentRenamed% == 1 ren "%~1_tmp" "%~1"
        echo Error: Folder exists "%FolderName%"
        goto :EOF
    )
    
    move "%ParentFolder%\%FolderName%" "%FolderName%" >nul 2>nul
    if not errorlevel 1 (
        rd "%ParentFolder%"
        goto :EOF
    )
    
    %SystemRoot%\System32\attrib.exe -h "%ParentFolder%\%FolderName%" >nul
    move "%ParentFolder%\%FolderName%" "%FolderName%" >nul
    if errorlevel 1 (
        if %ParentRenamed% == 1 ren "%ParentFolder%" "%~1"
        goto :EOF
    )
    
    %SystemRoot%\System32\attrib.exe +h "%FolderName%"
    rd "%ParentFolder%"
    goto :EOF
    

    I have been using 32-bit Windows since Windows 95, but I never ran into the MAX_PATH limitation myself, i.e. absolute file/folder names longer than 259 characters.

    So it was a really interesting and also very time-consuming challenge to rewrite the batch file so that it also works when archive file names are very long, for example exactly 256 characters for file name + file extension.

    During the development of the batch file below I found out the following:

    1. Some commands like DIR, FOR, RD and REN support short 8.3 names in path AND file/folder name, while other commands like ATTRIB and MOVE support them only in the path, but not in the file/folder name (at least on Windows XP).
      So it is not possible to move a folder or change its attributes using its short 8.3 name.

    2. All commands fail on relative folder names with relative folder path when the folder name with full path is longer than 259 characters. This means the Windows command interpreter first determines the folder name with complete path before executing any command. So the current directory should have a short path when processing archives with very long names or archives containing a directory with a very long name.

    3. I could not figure out how to get the short name of a folder or its path using %~fs1 as explained by call /? or %%~fsI (in a batch file) as explained by for /? when only a relative folder path is parsed by the Windows command interpreter, i.e. just the long name of a folder without its path.

    4. On running command DIR with option /X to get the short name of a directory, the third column contains the short name and the fourth column the long name. But the short name in the third column can be missing for very short folder names.
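
    The length limit from point 2 can be checked up front. The following is a minimal Python sketch under the assumption stated in this answer that 259 characters is the longest usable absolute name. The names MAX_PATH_CHARS and exceeds_max_path are hypothetical. The module ntpath is used so that Windows path rules apply on any system.

```python
import ntpath

MAX_PATH_CHARS = 259  # 260 (MAX_PATH) minus the terminating null character


def exceeds_max_path(current_dir: str, relative_name: str) -> bool:
    """Return True if the absolute name would be longer than 259 characters.

    The relative name is combined with the current directory first, because
    the command interpreter resolves the complete path before running a command.
    """
    full = ntpath.normpath(ntpath.join(current_dir, relative_name))
    return len(full) > MAX_PATH_CHARS
```

    For example, exceeds_max_path("C:\\Temp", name) could be evaluated for each archive name before extraction to report archives that need a shorter working directory.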

    Output of dir /AD /X on English Windows 7 SP1 x64 executed on an NTFS partition with Germany set in Windows Region and Language settings:

     Volume in drive C is System
     Volume Serial Number is 7582-4210
    
     Directory of C:\Temp\Test
    
    29.04.2017  22:39    <DIR>                       .
    29.04.2017  22:39    <DIR>                       ..
    29.04.2017  22:39    <DIR>          ARCHIV~1     archive_with_a_very_very_very_..._long_name_1
    29.04.2017  22:39    <DIR>                       Batch
    29.04.2017  22:39    <DIR>                       xyz
    

    Same command dir /AD /X executed on German Windows XP SP3 x86 on a FAT32 partition also with Germany set in Windows Region and Language settings:

     Datenträger in Laufwerk F: ist TEMP
     Volumeseriennummer: CAA5-41AA
    
     Verzeichnis von F:\Temp
    
    29.04.2017  22:39    <DIR>                       .
    29.04.2017  22:39    <DIR>                       ..
    29.04.2017  22:39    <DIR>          BATCH        Batch
    29.04.2017  22:39    <DIR>                       xxx
    29.04.2017  22:39    <DIR>          ARCHIV~1     archive_with_a_very_very_very_..._long_name_1
    

    Note: The very long directory name was truncated here by me with ... in name.

    Why directory Batch has the short name BATCH on the Windows XP computer but no short name on Windows 7 is something I cannot really explain.

    Here is the batch script that also supports long archive names and long directory names in archives, as long as the path of the current directory is short.
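
    The way the batch file below reads such lines with tokens=4* can be mirrored in a few lines of Python. This is a sketch under assumptions: date and time are taken to be exactly two whitespace-separated tokens as in the two listings above (other regional formats may add a token), and a directory without a short name is assumed to have no space in its long name. The function name parse_dir_x_line is hypothetical.

```python
def parse_dir_x_line(line: str):
    """Split one 'dir /AD /X' directory line into (short_name, long_name).

    Mirrors 'tokens=4*': the fourth whitespace-separated token and the rest
    of the line. Returns None for lines that are not directory entries.
    """
    parts = line.strip().split(None, 4)
    if len(parts) < 4 or parts[2] != "<DIR>":
        return None
    if len(parts) == 4:
        # Short-name column is empty: the fourth token is the long name itself.
        return (None, parts[3])
    return (parts[3], parts[4])
```

    A result with None as short name corresponds to the case where %%Y is empty in the batch file, so the long name can be used directly.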

    @echo off
    rem Change working directory if batch file was started with an argument.
    if not "%~1" == "" (
        pushd "%~1" 2>nul
        if errorlevel 1 (
            echo Specified directory "%~1" does not exist.
            echo/
            pause
            goto :EOF
        )
    )
    
    setlocal EnableExtensions DisableDelayedExpansion
    
    rem Does WinRAR exist in default program files folder?
    set "WinRAR=%ProgramFiles%\WinRAR\WinRAR.exe"
    if exist "%WinRAR%" goto StartExtraction
    
    rem Does WinRAR exist in default program files folder for x86 applications?
    set "WinRAR=%ProgramFiles(x86)%\WinRAR\WinRAR.exe"
    if exist "%WinRAR%" goto StartExtraction
    
    rem Try to determine installation location of WinRAR.exe from registry.
    set "TypeToken=2"
    goto GetPathFromRegistry
    
    rem On Windows Vista and later REG.EXE outputs without version info:
    
    rem HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\App Paths\WinRAR.exe
    rem    (Default)    REG_SZ    Full path to WinRAR\WinRAR.exe
    
    rem There are only spaces used to separate value name, value type and value string.
    
    rem But REG.EXE version 3.0 outputs on Windows XP with version info:
    
    rem ! REG.EXE VERSION 3.0
    rem
    rem HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\App Paths\WinRAR.exe
    rem     <NO NAME>   REG_SZ  Full path to WinRAR\WinRAR.exe
    
    rem NOTE: There are 4 indent spaces and 2 separating tabs in REG 3.0 output line.
    
    rem So either token 2 or token 3 contains value type REG_SZ
    rem used to identify the line with the wanted information.
    
    :GetPathFromRegistry
    for /F "skip=1 tokens=%TypeToken%*" %%A in ('%SystemRoot%\System32\reg.exe QUERY "HKLM\Software\Microsoft\Windows\CurrentVersion\App Paths\WinRAR.exe" /ve 2^>nul') do (
        if "%%A" == "REG_SZ" (
            if exist "%%~fB" (
                set "WinRAR=%%~fB"
                goto StartExtraction
            )
        ) else if "%%A" == "NAME>" (
            set "TypeToken=3"
            goto GetPathFromRegistry
        )
    )
    
    endlocal
    if not "%~1" == "" popd
    echo Could not determine directory containing WinRAR.exe.
    echo/
    echo Please configure it manually in file: %~f0
    echo/
    pause
    goto :EOF
    
    
    rem WinRAR supports multiple archive types on extraction.
    rem Specify here the archive file extensions for extraction.
    
    rem But first delete temporary folder from a previous interrupted execution.
    
    :StartExtraction
    rd /Q /S # 2>nul
    
    for %%I in (rar zip) do call :ExtractArchives %%I
    
    rem Restore previous command environment, restore previous current directory
    rem and exit this batch file without fall through to the subroutines below.
    endlocal
    if not "%~1" == "" popd
    goto :EOF
    
    
    rem The subroutine ExtractArchives processes all archive files in current
    rem directory with the file extension passed to subroutine as first argument.
    
    rem WinRAR is called once to extract all files with specified file extension
    rem for extraction into a subdirectory with name of the archive file.
    
    rem Then one more subroutine is called for each archive file to determine
    rem if it is safe to move the extracted archive file contents up one level.
    
    :ExtractArchives
    if not exist "*.%~1" goto :EOF
    "%WinRAR%" x -ad -cfg- -ibck -y -- "*.%~1"
    for %%A in ("*.%~1") do call :MoveUpExtracted "%%~nA" %1
    goto :EOF
    
    rem The subroutine MoveUpExtracted first checks if for the archive file
    rem passed to the subroutine as first argument a subdirectory exists at
    rem all, i.e. the extraction before was successful for that archive, and
    rem determines short 8.3 name of this directory.
    
    rem Next it counts the subdirectories in the archive extraction directory
    rem using short directory name. Nothing is moved up if there is more than
    rem 1 subdirectory in archive extraction directory.
    
    rem Also nothing is moved up if archive extraction directory contains
    rem 1 or more files.
    
    rem After verification of archive extraction directory really containing
    rem only a single subdirectory and nothing else, the current archive folder
    rem is renamed to # (single character folder name) using short folder name.
    
    rem This folder rename should work in general. The current archive folder
    rem is kept in case this folder rename fails unexpectedly, because it is
    rem not yet known whether the current directory already contains the
    rem single directory extracted from current archive or whether the rename
    rem failed because of a permission or a directory sharing access restriction.
    
    rem Next it is checked if in current directory there is not already existing
    rem a directory with name of the subdirectory from extracted archive in which
    rem case it is also not possible to move the directory up one level. In this
    rem special use case the archive extraction directory is kept containing just
    rem a single subdirectory with restoring original directory name. In case
    rem restoring the archive directory name fails unexpectedly, the directory
    rem with name # is deleted and the archive is extracted once again into a
    rem directory with name of archive file.
    
    rem It is clear at this point that the single directory in archive extraction
    rem directory can be moved up to current directory from the directory having
    rem now the temporary name #.
    
    rem Moving a directory with command MOVE is not possible if hidden attribute
    rem is set on directory. For that reason it is checked next if the directory
    rem to move up has hidden attribute set using its short directory name.
    
    rem In case the hidden attribute is indeed set on the directory, it is removed
    rem which is also verified. The verification can't be done with errorlevel
    rem evaluation as external command ATTRIB does not set errorlevel on failed
    rem attribute change. So the attribute check is done once again after the
    rem hidden attribute is removed with ATTRIB.
    
    rem ATTRIB also fails to change the attribute if absolute folder path is
    rem longer than 259 characters. In this case the current extraction folder
    rem with temporary name # is deleted completely and the current archive is
    rem extracted once again to current directory without creation of an
    rem additional directory with name of archive file.
    
    rem Then the single subdirectory in archive extraction directory having
    rem now name # is also renamed to # using short directory name to avoid
    rem a problem on next command MOVE with an absolute folder path longer
    rem than 259 characters as much as possible.
    
    rem The directory extracted from archive with name # in directory # is
    rem moved up to current directory with suppressing all errors which could
    rem occur for example if path of current directory plus name of directory
    rem as extracted from archive file is too long.
    
    rem The directory # in current directory with its subdirectory # is deleted
    rem on a moving error and the current archive file is extracted once again
    rem into current directory without creation of an additional directory with
    rem name of archive file.
    
    rem But on successful movement of the folder with correct name to current
    rem directory the hidden attribute is set on folder if the extracted folder
    rem has it also set before moving the folder and the finally empty folder #
    rem is deleted before exiting subroutine.
    
    
    :MoveUpExtracted
    set "FolderToCheck=%~f1"
    set "FolderToCheck=%FolderToCheck:~0,258%"
    for /F "skip=5 tokens=4*" %%X in ('dir "%FolderToCheck%*" /AD /X 2^>nul') do (
        if "%%Y" == "%~1" set "ArchiveFolder=%%X" & goto Subfolders
        if "%%Y" == "" if /I "%%X" == "%~1" set "ArchiveFolder=%%X" & goto Subfolders
    )
    echo Error: No folder for archive %~1
    goto :EOF
    
    :Subfolders
    @echo off
    echo Processing archive folder "%~1"
    set FolderCount=0
    set "FolderName="
    for /F "delims=" %%D in ('dir "%ArchiveFolder%\*" /AD /B 2^>nul') do (
        if defined FolderName goto :EOF
        set /A FolderCount+=1
        set "FolderName=%%D"
    )
    if not %FolderCount% == 1 goto :EOF
    
    for /F "delims=" %%F in ('dir "%ArchiveFolder%\*" /A-D /B 2^>nul') do goto :EOF
    
    ren "%ArchiveFolder%" # 2>nul
    if errorlevel 1 (
        echo Error: Failed to rename "%~1"
        goto :EOF
    )
    
    set "FolderToCheck=%~dp1%FolderName%"
    set "FolderToCheck=%FolderToCheck:~0,258%"
    for /F "skip=5 tokens=4*" %%X in ('dir "%FolderToCheck%*" /AD /X 2^>nul') do (
        if "%%Y" == "%FolderName%" goto FolderExist
        if "%%Y" == "" if /I "%%X" == "%FolderName%" goto FolderExist
    )
    
    set "HiddenFolder=0"
    set "FolderToCheck=%~dp1#\%FolderName%"
    set "FolderToCheck=%FolderToCheck:~0,258%"
    for /F "skip=5 tokens=4*" %%X in ('dir "%FolderToCheck%*" /AD /X 2^>nul') do (
        if "%%Y" == "%FolderName%" set "FolderToMove=%%X" & goto CheckHidden
        if "%%Y" == "" if /I "%%X" == "%FolderName%" set "FolderToMove=%%X" & goto CheckHidden
    )
    
    :CheckHidden
    for %%X in ("#\%FolderToMove%") do (
        for /F "tokens=2 delims=h" %%H in ("%%~aX") do (
            if %HiddenFolder% == 1 goto ArchiveExtract
            set "HiddenFolder=1"
            %SystemRoot%\System32\attrib.exe -h "#\%FolderName%"
            goto CheckHidden
        )
    )
    
    ren "#\%FolderToMove%" # 2>nul
    move #\# "%FolderName%" >nul 2>nul
    if errorlevel 1 goto ArchiveExtract
    
    if %HiddenFolder% == 1 %SystemRoot%\System32\attrib.exe +h "%FolderName%"
    rd #
    goto :EOF
    
    :ArchiveExtract
    rd /Q /S #
    "%WinRAR%" x -cfg- -ibck -y -- "%~1.%~2"
    goto :EOF
    
    :FolderExist
    echo Error: Folder exists "%FolderName%"
    ren # "%~1" 2>nul
    if not errorlevel 1 goto :EOF
    rd /Q /S #
    "%WinRAR%" x -ad -cfg- -ibck -y -- "%~1.%~2"
    goto :EOF
    

    It would definitely be better to write a long-path-aware console application in C, C++ or C# to replace subroutine MoveUpExtracted in the batch scripts above.

    On Windows 10 version 1607 (Anniversary Update) or later Windows versions the MAX_PATH limit of 260 characters (259 characters plus terminating null byte) can be disabled via a group policy or by adding a registry value.

    To understand the commands used and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.

    • attrib /?
    • call /?
    • dir /?
    • echo /?
    • endlocal /?
    • for /?
    • goto /?
    • if /?
    • move /?
    • pause /?
    • popd /?
    • pushd /?
    • rd /?
    • reg /?
    • reg query /?
    • rem /?
    • ren /?
    • set /?
    • setlocal /?

    Read also the Microsoft articles: