We need to build a large text file, around 1 MB in size. We tried a shell script that uses echo in a do loop to create the file, but it took a long time to build.
I am looking to build a file with a single line/record on Unix/Linux; that line could be a large string of about 1 MB.
The content may look like this, but for a whole megabyte:
XXXXXXXXX............................................XXXX
Building the file character by character takes a long time.
I would like to extend this to 10 MB, 20 MB, ... 60 MB later on.
Is a shell script the best option, or is there a faster option?
The Unix dd command (http://en.wikipedia.org/wiki/Dd_%28Unix%29) was made exactly for this purpose.
You could write a small program that continuously prints your desired fill character (X in your example) to STDOUT without newlines. Pipe the result of that into dd and specify the bs and count parameters so that you get exactly the right file size.
You can then tweak the bs and count parameters to find the maximum throughput.
EDIT: Example:
yes X | awk '{ printf("%s", $0)}' | dd of=out.txt bs=1024 count=1024 2>/dev/null
You can see that it's quite fast:
time yes X | awk '{ printf("%s", $0)}' | dd of=out.txt bs=1024 count=1024
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.123118 s, 8.5 MB/s
real 0m0.127s
user 0m0.144s
sys 0m0.004s
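Scaling to the 10 MB - 60 MB files mentioned in the question, or experimenting with larger blocks, only changes those two parameters. A rough sketch (the 1M size suffix and iflag=fullblock are GNU dd extensions; iflag=fullblock guards against short reads from the pipe, and the block sizes here are only illustrative):
# 1 MB written as a single large block
yes X | awk '{ printf("%s", $0) }' | dd of=out.txt bs=1M count=1 iflag=fullblock 2>/dev/null
# 10 MB using the same 1 KB blocks as the example above
yes X | awk '{ printf("%s", $0) }' | dd of=out.txt bs=1024 count=10240 2>/dev/null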
Moving time through the different parts of the pipeline indicates to me that dd is taking whatever you give it, but the producer is not very fast. (Perhaps yes and awk aren't the best combination.)
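If the producer is the bottleneck, one alternative is to drop yes and awk and generate the X characters by translating NUL bytes from /dev/zero with tr. This is only a sketch, assuming a GNU/Linux system where /dev/zero is available and head supports -c:
# Alternative producer: turn NUL bytes from /dev/zero into X
tr '\0' 'X' < /dev/zero | dd of=out.txt bs=1024 count=1024 2>/dev/null
# To time a producer on its own, cap it with head and discard the output
time yes X | awk '{ printf("%s", $0) }' | head -c 1048576 > /dev/null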
If you need to go faster than that, perhaps you'll need to consider other interfaces, for example mmap.