Build a file with a large string

We need to build a large text file, around 1 MB in size. We tried a shell script that uses echo in a loop to create the file, but it took a long time to build.

I want to build a file containing a single line (one record) on Unix/Linux. The line could be one large string, 1 MB in size.

The content may look like this, but for a whole megabyte:

XXXXXXXXX............................................XXXX

If I build it character by character, it takes a long time.

I would like to extend this to 10 MB, 20 MB, ... 60 MB later on.

Is a shell script the best option, or is there a faster option?

Solution

The Unix dd command is well suited to this: it copies a fixed number of fixed-size blocks from its input to its output.

http://en.wikipedia.org/wiki/Dd_%28Unix%29

You could write a small program that continuously prints your desired fill character (X in your example) to stdout without newlines. Pipe its output into dd and set the bs and count parameters so that bs × count equals exactly the file size you want.

You can then tweak the bs and count parameters to find the maximum throughput.
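As a rough sketch of how bs and count combine: with bs=1048576 (1 MiB), count=N yields an N MiB file, so the 10 MB to 60 MB cases only need a different count. The helper name make_fill_file and the file names below are made up for illustration, and iflag=fullblock is assumed to be available (it is a GNU dd option, not a POSIX one).

```shell
#!/bin/sh
# make_fill_file FILE SIZE_MIB [CHAR]   (hypothetical helper, not a standard tool)
# Writes SIZE_MIB * 1048576 copies of CHAR to FILE as one line with no newline.
make_fill_file() {
    file=$1
    size_mib=$2
    char=${3:-X}
    # yes prints CHAR followed by a newline forever; tr -d '\n' drops the newlines.
    # dd stops after count blocks of bs bytes, which fixes the final file size.
    # iflag=fullblock (GNU dd) makes dd keep reading until each block is full,
    # so short reads from the pipe do not produce a smaller file.
    yes "$char" | tr -d '\n' |
        dd of="$file" bs=1048576 count="$size_mib" iflag=fullblock 2>/dev/null
}

make_fill_file fill_1m.txt 1     # 1 MiB
make_fill_file fill_10m.txt 10   # 10 MiB
```

A larger bs means fewer read and write calls, which is the knob to experiment with when looking for the best throughput.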

EDIT: Example:

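yes X | awk '{ printf("%s", $0)}' | dd of=out.txt bs=1024 count=1024 iflag=fullblock 2>/dev/null

(iflag=fullblock is a GNU dd option. When dd reads from a pipe, a read can return fewer than bs bytes and still be counted as one block; with this flag dd keeps reading until each block is full, so the file comes out at exactly bs × count bytes.)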

You can see that it's quite fast:

time yes X | awk '{ printf("%s", $0)}' | dd of=out.txt bs=1024 count=1024 
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.123118 s, 8.5 MB/s

real    0m0.127s
user    0m0.144s
sys     0m0.004s

Moving the time command to different stages of the pipeline suggests that dd consumes data as fast as it arrives and that the producer is the bottleneck. (Perhaps yes and awk aren't the best combination.)

If you need to go faster than that, you may need to consider other interfaces, such as mmap.
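If the producer is the slow part, one alternative worth timing is to skip yes and awk entirely: let dd read NUL bytes from /dev/zero and have tr rewrite them into the fill character. This is a sketch, not a measured result; the file name out_tr.txt is a placeholder, and /dev/zero is assumed to exist, as it does on Linux and most Unix systems.

```shell
#!/bin/sh
# Target size in MiB; change to 10, 20 ... 60 for the larger files.
size_mib=1

# dd emits size_mib blocks of 1 MiB of NUL bytes; tr maps every NUL to X.
# The input is finite, so no stage has to be cut off by a closed pipe.
dd if=/dev/zero bs=1048576 count="$size_mib" 2>/dev/null | tr '\0' 'X' > out_tr.txt

# Print the resulting size in bytes as a quick check.
wc -c < out_tr.txt
```

Prefixing the pipeline with time, as in the example above, gives a direct comparison against the yes and awk version on your own machine.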