
Is there a way to arbitrarily chunk command line output into x amount of bytes per chunk?


Say I had a Linux command such as: shell_command > /dev/my_io

and said shell_command is unbuffered and expected to dump a large amount of data, while my_io is a file descriptor processing said data. What I want is something like:

shell_command | awk '{printf $0}' > /dev/my_io , where I force line-buffering since awk goes through line by line.

BUT, instead of using new-line delimited calls to my_io as a way of chunking up this massive data, I want to chunk this data into say 1000 byte chunks instead. Is there a way to do that using very simple commands such as awk {printf} or something of the sort (pretty much a generic busybox set of commands)?

For example, and for further clarity:

    shell_command => (4000 bytes)

    shell_command (cat file.txt) | fancy_command > command_b

    # command_b yields

    chunk 1 (1000 bytes)
    chunk 2 (1000 bytes)
    chunk 3 (1000 bytes)
    chunk 4 (1000 bytes)

Also, is there a way to force all output from stdout to be fully buffered?


Solution

  • instead of using new-line delimited calls to my_io as a way of chunking up this massive data, I want to chunk this data into say 1000 byte chunks instead. Is there a way to do that using very simple commands such as awk {printf}

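    There is stdbuf (part of GNU coreutils), which sets the size of the stdio output buffer of the command it runs. It only affects programs that use C stdio with default buffering; programs that set their own buffering or bypass stdio are not affected.

    stdbuf -o1000 command | stdbuf -o1000 awk '{printf "%s", $0}' > /dev/my_io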
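
As for the side question about forcing stdout to be fully buffered, the same stdbuf option can be wrapped in a small helper. This is only a sketch: `run_buffered` is a hypothetical name, and it assumes that falling back to running the command unchanged is acceptable when stdbuf is not installed (as may be the case on a BusyBox-only system).

```shell
# run_buffered SIZE COMMAND [ARGS...]
# Hypothetical helper: run COMMAND with a SIZE-byte stdout buffer via stdbuf
# when stdbuf is available; otherwise run COMMAND unchanged.
run_buffered() {
    rb_size=$1
    shift
    if command -v stdbuf >/dev/null 2>&1; then
        # -oSIZE requests a fully buffered stdout of SIZE bytes
        stdbuf -o"$rb_size" "$@"
    else
        "$@"
    fi
}
```

Usage would look like `run_buffered 1000 shell_command > /dev/my_io`. Whether the buffer size translates into 1000-byte writes still depends on the program using stdio for its output.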
    

    pretty much a generic busybox set of commands

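    You can read 1000 bytes at a time with dd in a loop, something along these lines:

    command | while a=$(dd bs=1 count=1000 2>/dev/null) && [ -n "$a" ]; do printf '%s' "$a"; done

    Here [ replaces the bash-specific [[, stderr is redirected instead of relying on status=none, and printf gets an explicit '%s' format so that % or \ in the data is not interpreted. Note that command substitution strips trailing newlines from each chunk and cannot hold NUL bytes, so this only suits data where that loss is acceptable.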
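
A variant of the dd loop that avoids the shell variable, and so keeps newlines and NUL bytes intact, could stage each chunk in a temporary file. This is a minimal sketch rather than a tested BusyBox recipe: `chunk_stream` and the `cs_` variables are hypothetical names, and it assumes `mktemp` is available and that a temporary file per stream is acceptable.

```shell
# chunk_stream SIZE HANDLER [ARGS...]
# Hypothetical helper: read stdin in chunks of at most SIZE bytes and run
# HANDLER once per chunk, with the chunk on HANDLER's stdin.
chunk_stream() {
    cs_size=$1
    shift
    cs_tmp=$(mktemp) || return 1
    while :; do
        # bs=1 is slow, but it guarantees that short reads from a pipe
        # still add up to exactly SIZE bytes (or fewer at end of input)
        dd bs=1 count="$cs_size" 2>/dev/null > "$cs_tmp"
        # an empty chunk means end of input
        [ -s "$cs_tmp" ] || break
        "$@" < "$cs_tmp" || break
    done
    rm -f "$cs_tmp"
}
```

For the case in the question this might be used as `shell_command | chunk_stream 1000 cat > /dev/my_io`, on the assumption that cat emits each small chunk in a single write. If the available dd supports separate ibs/obs operands, `shell_command | dd obs=1000 2>/dev/null > /dev/my_io` is a possible shorter route, since POSIX specifies that dd collects input into full-sized output blocks when bs= is not given; that has not been checked here against BusyBox.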