I want code that does something similar to du. I tried to implement this with the stat() function. I have read that the st_blocks value reported by stat() is the actual number of blocks allocated on disk, that the block size is 512 bytes, and that st_blocks * 512 therefore gives the number of bytes allocated to the file. However, I ran into confusing behavior in a Cygwin environment. First I create an 8 KiB file using the dd command:
% dd if=/dev/urandom bs=4096 count=2 of=testfile
2+0 records in
2+0 records out
8192 bytes (8.2 kB, 8.0 KiB) copied, 0.00791222 s, 1.0 MB/s
Then I run the stat command on the file:
% stat testfile
File: testfile
Size: 8192 Blocks: 8 IO Block: 65536 regular file
Device: 7727c30h/124943408d Inode: 25614222880771065 Links: 1
Access: (0664/-rw-rw-r--) Uid: (197881/ crystal) Gid: ( 513/ None)
Access: 2018-05-01 15:11:50.760626400 +0800
Modify: 2018-05-01 15:11:50.761626500 +0800
Change: 2018-05-01 15:11:50.761626500 +0800
Birth: 2018-05-01 15:11:50.760626400 +0800
I don't think there is a 'hole' in the generated file. The file shows 8 blocks allocated, which implies a block size of 1 KiB rather than 512 B. If I call stat() from C code, st_blocks gives the same result.
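For reference, a minimal test program along these lines (just a sketch; the *512 multiplier is exactly the assumption I am unsure about) also reports 8 blocks for the file above:

/* Sketch: print st_size and st_blocks for a file, plus what st_blocks * 512
 * would mean if the 512-byte convention applied. */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    struct stat sb;

    if (argc < 2 || stat(argv[1], &sb) != 0) {
        perror("stat");
        return 1;
    }

    printf("st_size   : %lld bytes\n", (long long)sb.st_size);
    printf("st_blocks : %lld\n", (long long)sb.st_blocks);
    printf("st_blocks * 512 = %lld bytes\n", (long long)sb.st_blocks * 512);
    return 0;
}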
So far all the articles I have read say the block size is 512 B. Are there exceptions? If so, how can I get the actual block size? Or, how can I get the actual disk space occupied by a file?
The stat command-line tool has a %B format option, which displays the block size it is using. It appears that stat uses a 1024-byte block on Cygwin.
Also, it appears that the NTFS 4096-byte block size is what is actually being used under the hood, and stat is just presenting it in 1024-byte blocks:
$ dd if=/dev/urandom of=foo count=1 bs=4095
$ stat -c '%B %b' foo
1024 4
$ dd if=/dev/urandom of=foo count=1 bs=4097
$ stat -c '%B %b' foo
1024 8
There is a discussion of where the 512-byte and 1024-byte block sizes come from at https://unix.stackexchange.com/questions/28780/file-block-size-difference-between-stat-and-ls. Apparently it has to do with Linux kernel conventions versus GNU utility conventions.
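If you want to see the filesystem's own block size from C rather than from the stat tool, one option (not covered in the discussion above, and assuming Cygwin's statvfs() follows the POSIX description) is to query it with statvfs(); its f_frsize field should correspond to the underlying allocation unit, e.g. the 4096-byte size mentioned above on an NTFS volume:

/* Sketch (assumes POSIX statvfs() is available, as it is on Cygwin):
 * report the filesystem's block sizes for comparison with the 512- and
 * 1024-byte units used by st_blocks and stat(1). */
#include <stdio.h>
#include <sys/statvfs.h>

int main(int argc, char **argv)
{
    struct statvfs vfs;

    if (argc < 2 || statvfs(argv[1], &vfs) != 0) {
        perror("statvfs");
        return 1;
    }

    printf("f_bsize  (preferred I/O block size) : %lu\n", (unsigned long)vfs.f_bsize);
    printf("f_frsize (fundamental block size)   : %lu\n", (unsigned long)vfs.f_frsize);
    return 0;
}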