
dd: How to calculate optimal blocksize?


How do you calculate the optimal blocksize when running dd? I've researched it a bit, but I haven't found anything suggesting how this would be accomplished.

I am under the impression that a larger blocksize would result in a quicker dd... is this true?

I'm about to dd two identical 500 GB Hitachi HDDs that run at 7200 RPM, on a box running an Intel Core i3 with 4 GB of DDR3 1333 MHz RAM, so I'm trying to figure out what blocksize to use. (I'm going to be booting Ubuntu 10.10 x86 from a flash drive and running dd from that.)


Solution

  • The optimal block size depends on various factors, including the operating system (and its version), and the various hardware buses and disks involved. Several Unix-like systems (including Linux and at least some flavors of BSD) define the st_blksize member in struct stat, which gives what the kernel thinks is the optimal block size:

    #include <sys/stat.h>
    #include <stdio.h>
    
    int main(void)
    {
        struct stat stats;
    
        /* Ask the kernel for its preferred I/O block size on the root filesystem. */
        if (!stat("/", &stats))
        {
            /* st_blksize has type blksize_t, so cast it for a portable printf. */
            printf("%lu\n", (unsigned long) stats.st_blksize);
            return 0;
        }
        return 1;
    }
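
    Compiled and run, this prints the filesystem's idea of a good block size (4096 on typical Linux systems). For example, assuming the snippet is saved as blksize.c (a file name chosen here just for illustration):

    cc -o blksize blksize.c
    ./blksize        # typically prints 4096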
    

    The best way may be to experiment: copy a gigabyte with various block sizes and time each run. (Remember to clear the kernel's buffer caches before each run, so every pass reads from the disk rather than from RAM: sync; echo 3 > /proc/sys/vm/drop_caches.)
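
    Here is a minimal sketch of such an experiment, assuming you run it as root (drop_caches and raw device reads require it) and that /dev/sda is a placeholder you replace with the disk you actually want to test:

    #!/bin/sh
    # Read 1 GiB at several block sizes and report dd's measured throughput.
    # SRC is an assumption -- point it at the disk you actually want to test.
    SRC=/dev/sda
    
    for bs in 4096 65536 262144 1048576 4194304; do
        count=$((1073741824 / bs))                 # always move exactly 1 GiB
        sync; echo 3 > /proc/sys/vm/drop_caches    # start each pass with cold caches
        printf 'bs=%-8s ' "$bs"
        dd if="$SRC" of=/dev/null bs="$bs" count="$count" 2>&1 | tail -n 1
    done

    Reading into /dev/null isolates the read side; for a disk-to-disk copy you would ultimately want to repeat the test writing to the real target as well.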

    However, as a rule of thumb, I've found that a large enough block size lets dd do a good job, and the differences between, say, 64 KiB and 1 MiB are minor, compared to 4 KiB versus 64 KiB. (Though, admittedly, it's been a while since I did that. I use a mebibyte by default now, or just let dd pick the size.)
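
    For the disk-to-disk clone in the question, then, a 1 MiB block size is a reasonable starting point. A sketch, assuming the source is /dev/sda and the target is /dev/sdb (both device names are placeholders; double-check them before running, since dd will overwrite whatever it is pointed at):

    # Clone the whole source disk onto the target with 1 MiB blocks.
    dd if=/dev/sda of=/dev/sdb bs=1M
    
    # dd is silent while it works; on Linux you can ask it for progress
    # statistics from another terminal by sending it SIGUSR1:
    kill -USR1 "$(pidof dd)"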