Tags: linux, linux-kernel, kernel, block, scsi

Linux: writes are split into 512K chunks


I have a user-space application that generates big SCSI writes (details below). However, when I look at the SCSI commands that reach the SCSI target (i.e. the storage, connected over FC), something is splitting these writes into 512K chunks.

The application basically issues 1M-sized direct (O_DIRECT) writes straight to the device:

fd = open("/dev/sdab", ..|O_DIRECT);
write(fd, ..., 1024 * 1024);

This code causes two SCSI WRITEs to be sent, 512K each.

However, if I issue a direct SCSI command, bypassing the block layer, the write is not split. I issue the following command from the command line:

sg_dd bs=1M count=1 blk_sgio=1 if=/dev/urandom of=/dev/sdab oflag=direct

I can see one single 1M-sized SCSI WRITE.

The question is: what is splitting the write and, more importantly, is it configurable? The Linux block layer seems to be the culprit (because SG_IO doesn't pass through it), and 512K seems too arbitrary a number not to be some sort of configurable parameter.


Solution

  • The block layer is indeed to blame; the SCSI layer itself cares little about the request size. You should check, though, that the underlying layers can actually pass your request through, especially with direct I/O: the user buffer may be split into many small pages, which can require a scatter-gather list longer than the hardware, or even just the driver, supports (libata is/was somewhat limited).

    Look at the tunables under /sys/class/block/$DEV/queue. There are assorted files there; the one most likely to match what you need is max_sectors_kb, but you can just try things out and see what works for you. You may also need to tune the partitions' variables as well.
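
To make the checks above concrete, here is a minimal C sketch, not a definitive tool. It reads a numeric limit from the queue directory and works out how many requests a write of a given size would need. The helper names read_queue_limit, requests_needed and worst_case_kb are made up for illustration. The one-page-per-segment model in worst_case_kb is an assumption about the worst case for a direct I/O buffer, not something measured on your system.

```c
#include <stdio.h>

/* Hypothetical helper: read one numeric attribute from
 * /sys/class/block/<dev>/queue/<attr>, e.g. "max_sectors_kb".
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long read_queue_limit(const char *dev, const char *attr)
{
    char path[256];
    long value = -1;

    int n = snprintf(path, sizeof path,
                     "/sys/class/block/%s/queue/%s", dev, attr);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;                      /* path did not fit */

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;                      /* no such device or attribute */
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;                     /* not a plain number */
    fclose(f);
    return value;
}

/* Hypothetical helper: number of requests needed for a write of
 * write_bytes when one request may carry at most max_sectors_kb KiB
 * (ceiling division). Returns -1 on invalid input. */
long requests_needed(long write_bytes, long max_sectors_kb)
{
    long limit_bytes = max_sectors_kb * 1024;

    if (write_bytes <= 0 || limit_bytes <= 0)
        return -1;
    return (write_bytes + limit_bytes - 1) / limit_bytes;
}

/* Hypothetical helper: largest transfer in KiB that fits into
 * max_segments scatter-gather entries, ASSUMING every entry covers
 * exactly one page of page_bytes (no physically contiguous pages). */
long worst_case_kb(long max_segments, long page_bytes)
{
    return max_segments * page_bytes / 1024;
}
```

For the device in the question you would call read_queue_limit("sdab", "max_sectors_kb"), and likewise for max_hw_sectors_kb and max_segments. If max_sectors_kb reads back as 512, requests_needed(1024 * 1024, 512) gives 2, which matches the two WRITEs observed. To raise the limit, write a larger value into max_sectors_kb as root; it cannot exceed the read-only max_hw_sectors_kb.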