Tags: windows, batch-file, cmd, filesize, variable-expansion

What exactly is “%~zI” expanded to for directories in FOR loops?


From FOR /?:

In addition, substitution of FOR variable references has been enhanced.
You can now use the following optional syntax:

    %~I         - expands %I removing any surrounding quotes (")
    %~fI        - expands %I to a fully qualified path name
    %~dI        - expands %I to a drive letter only
    %~pI        - expands %I to a path only
    %~nI        - expands %I to a file name only
    %~xI        - expands %I to a file extension only
    %~sI        - expanded path contains short names only
    %~aI        - expands %I to file attributes of file
    %~tI        - expands %I to date/time of file
    %~zI        - expands %I to size of file
    %~$PATH:I   - searches the directories listed in the PATH
                   environment variable and expands %I to the
                   fully qualified name of the first one found.
                   If the environment variable name is not
                   defined or the file is not found by the
                   search, then this modifier expands to the
                   empty string

I've run a Windows batch script that does @echo %~aI %~fI ^<%~zI byte^(s^)^> in a FOR loop looping through directories (the path of each of which gets stored in %I) and got this output:

d--hs------ J:\$RECYCLE.BIN  <0 byte(s)>
d---------- J:\Multimedia  <4096 byte(s)>
dr--------- J:\-C-\……\Desktop  <12288 byte(s)>
dr--------- J:\-C-\……\Documents  <28672 byte(s)>
dr--------- J:\-C-\……\Downloads  <81920 byte(s)>

The “sizes” of the directories above have nothing to do with the sizes of the files in them. What exactly does the “size” reported by %~zI mean? If %I were a normal file, it would be the file's size. But what if %I is a directory? I can't quite grasp it. Is the value really meaningless?


Solution

  • That's the space consumed by the directory's own entries (its index), not by the files it contains.

    A directory is actually a special file that lists the files and directories it contains, so it has to store that list somewhere, along with the necessary metadata. Some file systems allocate normal clusters and store this metadata in that data area.

    NTFS does the same for large folders. However, NTFS can also keep small files, including small directory indexes, resident in the MFT entry itself. That is why some folders show up as zero bytes: they don't need separately allocated blocks for the directory metadata.

    The stream that holds this metadata is named $I30:

    “In the case of directories, there is no default data stream, but there is a default directory stream. Directories are the stream type $INDEX_ALLOCATION. The default stream name for the type $INDEX_ALLOCATION (a directory stream) is $I30”

    (Source: 5.1 NTFS Streams)

    You can check this by running fsutil file layout <directory_path> and looking at the $I30 streams. For example, here is the output from my PC; notice that the sizes reported by %~zI and by fsutil match. Folders with size 0 contain only a tiny $INDEX_ROOT stream, whereas the others also have an $INDEX_ALLOCATION stream whose size equals the %~zI output.

    PS C:\> cmd /c "for /d %I in (*) do @echo %~aI %~fI  ^<%~zI byte^(s^)^>"
    d---------- C:\ESD  <0 byte(s)>
    d---------- C:\Intel  <0 byte(s)>
    d---------- C:\PerfLogs  <0 byte(s)>
    dr--------- C:\Program Files  <8192 byte(s)>
    dr--------- C:\Program Files (x86)  <4096 byte(s)>
    dr--------- C:\Users  <4096 byte(s)>
    d---------- C:\Windows  <16384 byte(s)>
    d---------- C:\Windows.old  <4096 byte(s)>
    
    PS C:\> foreach ($f in ls -Attr Directory) {
    >>     $fileLayout = (fsutil file layout $f) -join "`0"
    >>     $result = (([regex]'\$I30.*?(?=Stream|$)').Matches($fileLayout)) -split "`0" | Select-String -Pattern '\$I30|  Size'
    >>     echo "================================ $f"; $result
    >> }
    ================================ ESD
    
    $I30:$INDEX_ROOT
        Size                : 48
    ================================ Intel
    $I30:$INDEX_ROOT
        Size                : 368
    ================================ PerfLogs
    $I30:$INDEX_ROOT
        Size                : 48
    ================================ Program Files
    $I30:$INDEX_ROOT
        Size                : 168
    $I30:$INDEX_ALLOCATION
        Size                : 8,192
    $I30:$BITMAP
        Size                : 8
    ================================ Program Files (x86)
    $I30:$INDEX_ROOT
        Size                : 56
    $I30:$INDEX_ALLOCATION
        Size                : 4,096
    $I30:$BITMAP
        Size                : 8
    ================================ Users
    $I30:$INDEX_ROOT
        Size                : 56
    $I30:$INDEX_ALLOCATION
        Size                : 4,096
    $I30:$BITMAP
        Size                : 8
    ================================ Windows
    $I30:$INDEX_ROOT
        Size                : 432
    $I30:$INDEX_ALLOCATION
        Size                : 16,384
    $I30:$BITMAP
        Size                : 8
    ================================ Windows.old
    $I30:$INDEX_ROOT
        Size                : 56
    $I30:$INDEX_ALLOCATION
        Size                : 4,096
    $I30:$BITMAP
        Size                : 8
    

    The same thing happens on *nix, where the size that ls -l displays for a directory is not the total size of the files inside it.

    In C++17, std::filesystem::directory_entry can be used to obtain information about directory entries.