
Build a file with a large string


We need to build a large text file of around 1MB. We tried a shell script that uses echo in a do loop to create the file, but it took a long time to build.

I am looking to build a file containing a single line/record on Unix/Linux. It could be one large string of 1MB.

The content may look like this, but for a whole megabyte:

XXXXXXXXX............................................XXXX

If I build it character by character, it takes a long time.

I would like to extend this to 10MB, 20MB, ... 60MB later on.

Is a shell script the best option, or is there a faster option?


Solution

  • The Unix dd command is well suited to this kind of task.

    http://en.wikipedia.org/wiki/Dd_%28Unix%29

    You could write a small program that continuously prints your desired fill character (X in your example) to STDOUT without newlines. Pipe its output into dd and set the bs and count parameters so that bs × count is exactly the file size you want.

    You can then tweak the bs and count parameters to find the maximum throughput.

    EDIT: Example:

    yes X | awk '{ printf("%s", $0)}' | dd of=out.txt bs=1024 count=1024 2>/dev/null
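
To scale the same pipeline to the larger sizes mentioned in the question, only bs and count need to change, since the file size is bs × count. The following is a minimal sketch, assuming GNU dd: `bs=1M` and `iflag=fullblock` are GNU spellings, and the latter makes dd keep reading until each block is full, which guards against short reads from a pipe. The variable name and output file name are illustrative only.

```shell
#!/bin/sh
# Target size in MiB; 10 here, but 20 or 60 work the same way (illustrative variable).
size_mb=10

# File size = bs * count. With bs=1M (1048576 bytes), count is the size in MiB.
# iflag=fullblock (GNU dd) keeps reading until each 1 MiB block is full,
# which matters because a read from a pipe can return fewer bytes than bs.
yes X | awk '{ printf("%s", $0) }' \
  | dd of=out_large.txt bs=1M count="$size_mb" iflag=fullblock 2>/dev/null

# Print the resulting size in bytes.
wc -c < out_large.txt
```

Without `iflag=fullblock`, a partially filled read still counts toward `count`, so the file could come out smaller than intended.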
    

    You can see that it's quite fast:

    time yes X | awk '{ printf("%s", $0)}' | dd of=out.txt bs=1024 count=1024 
    1024+0 records in
    1024+0 records out
    1048576 bytes (1.0 MB) copied, 0.123118 s, 8.5 MB/s
    
    real    0m0.127s
    user    0m0.144s
    sys     0m0.004s
    

    Moving the time command to different stages of the pipeline suggests that dd keeps up with whatever it is given, and that the producer is the slow part. (Perhaps yes and awk aren't the best combination.)
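
If the producer is indeed the bottleneck, one alternative is to drop awk and let tr generate the fill character directly. This is a sketch rather than a benchmarked recommendation: it assumes a `head` that supports `-c` (the GNU and BSD versions do) and a readable `/dev/zero`. The variable and file names are illustrative.

```shell
#!/bin/sh
# Desired size in bytes: 1 MiB here (illustrative variable name).
size=1048576

# Take exactly $size zero bytes from /dev/zero, then turn every NUL into 'X'.
# head -c fixes the byte count, so no dd block arithmetic is needed.
head -c "$size" /dev/zero | tr '\0' 'X' > out_fast.txt

# Print the resulting size in bytes.
wc -c < out_fast.txt
```

Changing `size` to 10485760 or 62914560 gives the 10MB and 60MB variants; wrapping the pipeline in `time` lets you compare it against the yes/awk/dd version on your own system.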

    If you need to go faster than that, perhaps you'll need to consider other interfaces, for example mmap.