Why can the wmic data from a text file not be processed because of some weird character coding problem?


I've been working on a little script that determines which disk is usable in our specific system. A disk must have less than 1 TB of free space in order to be considered usable.

This is the complete code:

@echo off

chcp 65001>nul
setlocal enabledelayedexpansion

wmic logicaldisk get caption,freespace>c:\cmd\1\getDiskInfo.txt

for /f "tokens=1,2 eol=C" %%I in (C:\cmd\1\getDiskInfo.txt) do (
    set diskCaption=%%I
    set diskFreeSpace=%%J
    set captionFreeSpace=!diskCaption! !diskFreeSpace!

    for /f "tokens=1,2 delims= " %%X in ("!captionFreeSpace!") do (
        if [%%Y] neq [] set usedDisks=%%X %%Y

        for /f "tokens=1,2 delims= " %%A in ("!usedDisks!") do (
            set freeSpaceFirstChar=%%B
            set /a freeSpaceFirstChar=!freeSpaceFirstChar:~0,1!

            if !freeSpaceFirstChar! gtr 1 set usableDisk=%%A
        )
    )
)

echo %usableDisk%

pause

But the output I get for %usableDisk% is always ECHO is off., which means that %usableDisk% is not even defined. I've done a little investigation and from my understanding it's because of a character encoding problem. I've copied the content of getDiskInfo.txt to another .txt file, and the batch file managed to give me the right output with that text file created by me. The contents of getDiskInfo.txt and of the other text file were both:

Caption  FreeSpace      
A:                      
B:       1098552672256  
C:       40824201216    
D:                      
E:       1042498560000  
F:       40222941184    

With the file created by the script, the output was ECHO is off. With the text file created by me, the output was F:, which is the correct output because we can't use the system drive C:.

So I tried echo END OF FILE>>getDiskInfo.txt and it added the line 久⁄䙏䘠䱉൅ to the script created file, but the same command added END OF FILE to my text file. I'm completely lost on this one.

Do you have any suggestions or probable solutions?


Solution

    There are multiple problems to solve for this task, which is to get the drive letter of a (local) hard disk drive that is not the system drive and has less than one TiB (1 099 511 627 776 bytes) of free space.

    1. Character encoding of WMIC output

    WMIC always outputs data encoded in UTF-16 Little Endian with a byte order mark, abbreviated as UTF-16LE+BOM.

    So the data output

    Caption  FreeSpace      
    A:                      
    B:       1098552672256  
    C:       40824201216    
    D:                      
    E:       1042498560000  
    F:       40222941184    
    

    is written as the following byte stream, with the byte offset left of : and the ASCII representation right of ;:

    0000h: FF FE 43 00 61 00 70 00 74 00 69 00 6F 00 6E 00 ; ÿþC.a.p.t.i.o.n.
    0010h: 20 00 20 00 46 00 72 00 65 00 65 00 53 00 70 00 ;  . .F.r.e.e.S.p.
    0020h: 61 00 63 00 65 00 20 00 20 00 20 00 20 00 20 00 ; a.c.e. . . . . .
    0030h: 20 00 0D 00 0A 00 41 00 3A 00 20 00 20 00 20 00 ;  .....A.:. . . .
    0040h: 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 ;  . . . . . . . .
    0050h: 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 ;  . . . . . . . .
    0060h: 20 00 20 00 20 00 0D 00 0A 00 42 00 3A 00 20 00 ;  . . .....B.:. .
    0070h: 20 00 20 00 20 00 20 00 20 00 20 00 31 00 30 00 ;  . . . . . .1.0.
    0080h: 39 00 38 00 35 00 35 00 32 00 36 00 37 00 32 00 ; 9.8.5.5.2.6.7.2.
    0090h: 32 00 35 00 36 00 20 00 20 00 0D 00 0A 00 43 00 ; 2.5.6. . .....C.
    00a0h: 3A 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 ; :. . . . . . . .
    00b0h: 34 00 30 00 38 00 32 00 34 00 32 00 30 00 31 00 ; 4.0.8.2.4.2.0.1.
    00c0h: 32 00 31 00 36 00 20 00 20 00 20 00 20 00 0D 00 ; 2.1.6. . . . ...
    00d0h: 0A 00 44 00 3A 00 20 00 20 00 20 00 20 00 20 00 ; ..D.:. . . . . .
    00e0h: 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 ;  . . . . . . . .
    00f0h: 20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 ;  . . . . . . . .
    0100h: 20 00 0D 00 0A 00 45 00 3A 00 20 00 20 00 20 00 ;  .....E.:. . . .
    0110h: 20 00 20 00 20 00 20 00 31 00 30 00 34 00 32 00 ;  . . . .1.0.4.2.
    0120h: 34 00 39 00 38 00 35 00 36 00 30 00 30 00 30 00 ; 4.9.8.5.6.0.0.0.
    0130h: 30 00 20 00 20 00 0D 00 0A 00 46 00 3A 00 20 00 ; 0. . .....F.:. .
    0140h: 20 00 20 00 20 00 20 00 20 00 20 00 34 00 30 00 ;  . . . . . .4.0.
    0150h: 32 00 32 00 32 00 39 00 34 00 31 00 31 00 38 00 ; 2.2.2.9.4.1.1.8.
    0160h: 34 00 20 00 20 00 20 00 20 00 0D 00 0A 00       ; 4. . . . .....
    

    But the Windows command processor expects a one-byte-per-character encoding using the active code page, which is displayed on running the command chcp in a command prompt window. The code page depends on the country configured for the account used to run the command process processing the batch file.

    The command line chcp 65001>nul, which changes the code page to the Unicode encoding UTF-8, is of no help here because the file is encoded in UTF-16LE and not in UTF-8. This also explains the line 久⁄䙏䘠䱉൅: ECHO appended END OF FILE with one byte per character, and a text editor reading the file as UTF-16LE interprets each pair of those bytes as one character.

    Processing UTF-16LE encoded output directly with FOR causes trouble, as documented multiple times on Stack Overflow; see for example How to correct variable overwriting misbehavior when parsing output?

    A solution would be to redirect the output of WMIC into a temporary file and output this file with the command TYPE inside a FOR /F loop. FOR starts a command process in the background with %ComSpec% /c to run TYPE and captures everything written to handle STDOUT (standard output). The command process executing the batch file then processes this ASCII output line by line and finally deletes the temporary file.

    @echo off
    setlocal EnableExtensions DisableDelayedExpansion
    %SystemRoot%\System32\wbem\wmic.exe LOGICALDISK GET Caption,FreeSpace >"%TEMP%\%~n0.tmp"
    if not exist "%TEMP%\%~n0.tmp" goto EndBatch
    
    for /F "skip=1 tokens=1,2" %%I in ('type "%TEMP%\%~n0.tmp"') do echo %%I %%J
    del "%TEMP%\%~n0.tmp"
    
    :EndBatch
    endlocal
    

    In this case FOR processes the ASCII byte stream:

    000h: 43 61 70 74 69 6F 6E 20 20 46 72 65 65 53 70 61 ; Caption  FreeSpa
    010h: 63 65 20 20 20 20 20 20 0D 0A 41 3A 20 20 20 20 ; ce      ..A:    
    020h: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 ;                 
    030h: 20 20 0D 0A 42 3A 20 20 20 20 20 20 20 31 30 39 ;   ..B:       109
    040h: 38 35 35 32 36 37 32 32 35 36 20 20 0D 0A 43 3A ; 8552672256  ..C:
    050h: 20 20 20 20 20 20 20 34 30 38 32 34 32 30 31 32 ;        408242012
    060h: 31 36 20 20 20 20 0D 0A 44 3A 20 20 20 20 20 20 ; 16    ..D:      
    070h: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 ;                 
    080h: 0D 0A 45 3A 20 20 20 20 20 20 20 31 30 34 32 34 ; ..E:       10424
    090h: 39 38 35 36 30 30 30 30 20 20 0D 0A 46 3A 20 20 ; 98560000  ..F:  
    0a0h: 20 20 20 20 20 34 30 32 32 32 39 34 31 31 38 34 ;      40222941184
    0b0h: 20 20 20 20 0D 0A                               ;     ..
    

    But in general it is better to avoid a temporary file, because there is never a guarantee that the temporary file can be created at all during the execution of the batch file.
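
    A minimal sketch of reading the WMIC output without a temporary file could look as follows. It assumes the commonly described behavior that FOR /F leaves a stray carriage return at the end of each line captured from WMIC and that passing the line through a second FOR /F removes it. The loop variables are chosen freely for illustration.

    ```batch
    @echo off
    setlocal EnableExtensions DisableDelayedExpansion
    rem Sketch only: run WMIC directly inside FOR /F, no temporary file needed.
    rem The outer loop skips the header line and captures each data line as a whole.
    rem The inner loop parses the line again (assumption: this drops the stray
    rem carriage return left over from the UTF-16LE conversion) and splits it
    rem into drive letter and free space.
    for /F "skip=1 delims=" %%I in ('""%SystemRoot%\System32\wbem\wmic.exe" LOGICALDISK GET Caption,FreeSpace 2>nul"') do (
        for /F "tokens=1,2" %%J in ("%%I") do echo %%J %%K
    )
    endlocal
    ```

    The WMIC command line is enclosed in an additional pair of double quotes, so that the comma and the redirection 2>nul are passed unmodified to the command process started in the background.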

    2. System drive is not always C:

    Windows is installed by default on drive C:, and so the system drive is usually C:. But Windows can also be installed on a different drive, in which case the system drive is not C:. Any code depending on default data instead of using the appropriate data is not well written.

    The predefined Windows environment variable SystemDrive holds the drive letter and the colon of the drive on which the active Windows is installed. The environment variable SystemRoot contains the path to the Windows directory, which contains the directory System32 with all executables from the Windows Commands list which are not internal commands of cmd.exe.

    All those system environment variables can be seen with their values by opening a command prompt window and running set system. Running just set outputs all environment variables with their values defined for the current user account.
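
    As a short illustration, the following lines only display the two variables and check whether the system drive differs from the default. The messages are made up for this example.

    ```batch
    @echo off
    rem SystemDrive holds the drive letter with colon, SystemRoot the Windows directory.
    echo System drive:      %SystemDrive%
    echo Windows directory: %SystemRoot%
    rem Case-insensitive string comparison against the default system drive.
    if /I not "%SystemDrive%" == "C:" echo Windows is not installed on drive C:.
    ```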

    3. Integer value range is limited to 32-bit signed integer

    The Windows command processor cmd.exe always uses 32-bit signed integers for evaluating an arithmetic expression with set /A and for comparing integer values with the command IF using the operators EQU, NEQ, LSS, LEQ, GTR, GEQ.

    Therefore the integer value range is −2147483648 to 2147483647. So the maximum is one byte less than 2 GiB. The value 1099511627776 requires a 64-bit integer value range, which is not supported by cmd.exe.
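
    One possible workaround, shown here only as a sketch and not used in the solution below, is to compare such large values as strings of equal length. The variable names FreeSpace, Limit, PadFree and PadLimit and the example values are chosen for illustration. The approach assumes non-negative integers with at most 20 digits.

    ```batch
    @echo off
    setlocal EnableExtensions DisableDelayedExpansion
    rem Example values beyond the 32-bit signed integer range.
    set "FreeSpace=40222941184"
    set "Limit=1099511627776"
    rem Left-pad both values with zeros and keep the last 20 characters, so that
    rem both strings have the same length and sort like the numbers they represent.
    set "PadFree=00000000000000000000%FreeSpace%"
    set "PadLimit=00000000000000000000%Limit%"
    set "PadFree=%PadFree:~-20%"
    set "PadLimit=%PadLimit:~-20%"
    rem The double quotes make the integer conversion fail, so IF compares strings.
    if "%PadFree%" LSS "%PadLimit%" (echo %FreeSpace% is less than %Limit%) else echo %FreeSpace% is not less than %Limit%
    endlocal
    ```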

    BTW: if [%%Y] neq [] is never a good comparison. The characters [ and ] have no special meaning for the Windows command processor. With neq, cmd.exe first tries to convert the left string into a 32-bit signed integer value, which fails because [ is an invalid character for an integer value. It therefore runs a string comparison next, and the condition is true if the string comparison does not return 0, i.e. the compared strings are not equal. Better would be if not "%%Y" == "", which runs a string comparison directly and more safely. See Symbol equivalent to NEQ, LSS, GTR, etc. in Windows batch files for details on how the command IF executes a string comparison.

    4. Solution to get drives with less than one TiB free space

    It is a good idea to read the documentation of a class before accessing its properties with the Windows Management Instrumentation Command-line utility. Here this is the Win32_LogicalDisk class.

    In addition to FreeSpace of type uint64 and DeviceID of type string (used instead of Caption), the property DriveType of type uint32 is perhaps also useful. With a where clause on wmic execution, it can be used to filter out drives of the wrong type, in addition to drives with too much free space and the system drive.

    @echo off
    setlocal EnableExtensions DisableDelayedExpansion
    set "UsableDrive="
    for /F "skip=1 tokens=1,2" %%I in ('""%SystemRoot%\System32\wbem\wmic.exe" LOGICALDISK where (DriveType=3 and FreeSpace^<1099511627776 and DeviceID!='%SystemDrive%') GET DeviceID,FreeSpace 2>nul"') do (
        echo Drive %%I has %%J free bytes.
        if not defined UsableDrive set "UsableDrive=%%I"
    )
    if defined UsableDrive echo Selected drive %UsableDrive%
    endlocal
    

    It is important to know here that FOR starts one more command process in the background with %ComSpec% /c and the command line specified within ' appended as additional arguments. For that reason the command line with WMIC must fulfill the requirements of the Windows command processor described by the help output on running cmd /? in a command prompt window, as the command line is parsed three times in total.

    The first parsing is done by the cmd.exe instance processing the batch file before executing the command FOR.

    The second parsing is done by the cmd.exe instance started in the background by the cmd.exe instance processing the batch file. With Windows installed to C:\Windows, it is started with the following command line:

    C:\Windows\System32\cmd.exe /c ""C:\WINDOWS\System32\wbem\wmic.exe" LOGICALDISK where (DriveType=3 and FreeSpace^<1099511627776 and DeviceID!='C:') GET DeviceID,FreeSpace 2>nul"
    

    The third parsing is done by the background command process before executing wmic.exe. The operator < in the where clause must be interpreted as a literal character and not as a redirection operator. That is the reason why < is escaped with ^, so that wmic.exe is finally run with:

    "C:\Windows\System32\wbem\wmic.exe" LOGICALDISK where (DriveType=3 and FreeSpace<1099511627776 and DeviceID!='C:') GET DeviceID,FreeSpace
    

    With DriveType=3, WMIC filters out all network drives, floppy disk drives, CD and DVD drives and other removable drives, RAM disks, etc. Hard disks connected to the computer using an external USB port or an eSATA port are not filtered out because those drives also have the value 3 for the drive type. Windows cannot determine if a hard disk is mounted inside the casing of the computer or outside. So a local hard disk is any hard disk connected to the computer, internal or external.
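
    To check which drive type Windows reports for each drive on a particular machine, the drive types can be listed first. This is only a small inspection aid. The values in the comments are the ones documented for the DriveType property of the Win32_LogicalDisk class.

    ```batch
    @echo off
    rem List every logical drive with its numeric drive type.
    rem Documented DriveType values of Win32_LogicalDisk:
    rem 0 = Unknown, 1 = No Root Directory, 2 = Removable Disk, 3 = Local Disk,
    rem 4 = Network Drive, 5 = Compact Disc, 6 = RAM Disk
    %SystemRoot%\System32\wbem\wmic.exe LOGICALDISK GET DeviceID,DriveType
    ```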

    The system drive is filtered out with the condition DeviceID!='%SystemDrive%'.

    The condition FreeSpace<1099511627776 results in ignoring all drives with 1 TiB or more free space.

    So the list is already reduced to those drives which fulfill all three conditions.

    To understand the commands used and how they work, open a command prompt window, execute there the following commands, and read the displayed help pages for each command, entirely and carefully.

    • cmd /?
    • del /?
    • echo /?
    • endlocal /?
    • for /?
    • goto /?
    • if /?
    • set /?
    • setlocal /?
    • type /?
    • wmic /?
    • wmic logicaldisk /?
    • wmic logicaldisk get /?

    See also the Microsoft article about Using command redirection operators for an explanation of > and 2>nul.