I am working on backing up my server data. Some folders hold around 600GB, and I need to tar such a folder into 6 files of 100GB each.
I googled it and got some ideas (similar topic#1, similar topic#2 and so on). We can achieve it with
tar cvzf - data/ | split --bytes=100GB - sda1.backup.tar.gz.
We can also untar it with
cat sda1.backup.tar.gz.* | tar xzvf -
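To check that the split and reassemble steps really round-trip, the two commands can be wrapped in small functions and tried on a throwaway directory first. This is only a sketch: the function names `split_backup` and `restore_backup` are made up for illustration, and it assumes GNU tar and GNU split.

```shell
#!/usr/bin/env bash
set -o pipefail

# Hypothetical helper: archive SRC_DIR into chunks PREFIX.aa, PREFIX.ab, ...
# CHUNK_SIZE uses split's syntax, e.g. 100GB or 1G.
split_backup() {
    local src_dir=$1 prefix=$2 chunk_size=$3
    # -C keeps only the last path component inside the archive.
    tar -C "$(dirname "$src_dir")" -czf - "$(basename "$src_dir")" \
        | split --bytes="$chunk_size" - "$prefix."
}

# Hypothetical helper: concatenate the chunks in order and extract into DEST_DIR.
restore_backup() {
    local prefix=$1 dest_dir=$2
    mkdir -p "$dest_dir"
    # The glob expands in sorted order (aa, ab, ac, ...), which is the order split wrote.
    cat "$prefix".* | tar -xzf - -C "$dest_dir"
}
```

Usage would look like `split_backup data sda1.backup.tar.gz 100GB` followed later by `restore_backup sda1.backup.tar.gz /some/restore/dir`; comparing the result with `diff -r` on a small sample is a cheap sanity check before trusting it with 600GB.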
My question is: is there any way to do this job in parallel (each tar as a separate process)? It takes a long time to complete.
Or is there any other way to do this?
EDIT
Experiment:
# date;tar czf - ../saravana | split --bytes=1073741824 - data_bkp.;date
Wed May 18 09:28:32 MDT 2016
tar: Removing leading `../' from member names
tar: ../saravana: file changed as we read it
Wed May 18 09:51:08 MDT 2016
Result
-rw-r--r-- 1 root root 1073741824 May 18 09:31 data_bkp.aa
-rw-r--r-- 1 root root 1073741824 May 18 09:34 data_bkp.ab
-rw-r--r-- 1 root root 1073741824 May 18 09:38 data_bkp.ac
-rw-r--r-- 1 root root 1073741824 May 18 09:41 data_bkp.ad
-rw-r--r-- 1 root root 1073741824 May 18 09:49 data_bkp.ae
-rw-r--r-- 1 root root 904246985 May 18 09:51 data_bkp.af
# du -h data*
1.1G data_bkp.aa
1.1G data_bkp.ab
1.1G data_bkp.ac
1.1G data_bkp.ad
1.1G data_bkp.ae
863M data_bkp.af
This took 22 minutes and 36 seconds to complete!
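The elapsed time is just the difference between the two `date` stamps (09:28:32 to 09:51:08). Instead of subtracting by hand, bash's built-in `SECONDS` counter can do the timing. A small sketch, where `format_elapsed` and `time_command` are hypothetical helper names:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print a number of seconds as "Mm Ss".
format_elapsed() {
    local total=$1
    printf '%dm %ds\n' "$(( total / 60 ))" "$(( total % 60 ))"
}

# Hypothetical helper: run a command, then report its duration on stderr.
time_command() {
    local start=$SECONDS
    "$@"
    local status=$?
    format_elapsed "$(( SECONDS - start ))" >&2
    return "$status"
}

# 09:28:32 -> 09:51:08 is 22*60 + 36 = 1356 seconds.
format_elapsed 1356    # prints "22m 36s"
```

Since `time_command` runs a single command, a whole pipeline has to be passed through a shell, for example `time_command bash -c 'tar czf - ../saravana | split --bytes=1073741824 - data_bkp.'`.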
I noticed that during the tar run only one of the four CPUs was fully loaded; the tar process was the only one using much CPU.
So I tried parallel compression. I found two parallel compression tools, PIGZ and PBZIP2; for me PIGZ works great.
On 22 GB of test files (mostly 10MB files, so high in count rather than in size), normal tar took 23~24 minutes, pbzip2 took about the same time (I did not research this much), and pigz took 8 minutes! So I chose pigz.
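Because pigz writes ordinary gzip output, it can simply replace the `z` step in the original tar-and-split pipeline, and the chunks still extract with `cat ... | tar xzvf -`. A rough sketch, assuming GNU tar and split; the helper names are invented here, and the fallback to plain gzip when pigz is not installed is my own addition:

```shell
#!/usr/bin/env bash
set -o pipefail

# Hypothetical helper: use pigz when it is installed, otherwise plain gzip.
pick_compressor() {
    if command -v pigz >/dev/null 2>&1; then
        echo pigz
    else
        echo gzip
    fi
}

# Hypothetical helper: tar SRC_DIR, compress it, and split the stream into
# chunks PREFIX.aa, PREFIX.ab, ... of CHUNK_SIZE each.
parallel_split_backup() {
    local src_dir=$1 prefix=$2 chunk_size=$3
    local compressor
    compressor=$(pick_compressor)
    # pigz uses all available cores by default; -c writes to stdout.
    tar -cf - "$src_dir" | "$compressor" -c | split --bytes="$chunk_size" - "$prefix."
}
```

For the original case this would be called as `parallel_split_backup data/ sda1.backup.tar.gz 100GB`. Only the compression step runs in parallel; tar itself still reads the files in a single process.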
Once I switched to pigz, all of my CPUs went to 95% to 100%, which slowed down other processes. After some googling I found a tool to limit the CPU usage: CPULIMIT.
Finally I used it like this:
$CPULIMIT_PATH -i -l $CPU_LIMIT_VALUE $TAR_PATH -I $PIGZ_PATH \
--ignore-failed-read -cf sda1.backup.tar.gz data/
-i : also limit all child processes (important, otherwise the CPU usage stays the same)
-l : CPU limit as a percentage
For the limit value I used
CPU_LIMIT_VALUE=$(echo "$(nproc)*45" | bc);
This gives 45% of every core, i.e. 90 for 2 cores, 180 for 4 cores, and so on.
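The same value can be computed with shell arithmetic alone, which avoids depending on `bc`. A minimal sketch; `cpu_limit_for` is a hypothetical helper name:

```shell
#!/usr/bin/env bash

# Hypothetical helper: total cpulimit percentage for CORES cores
# at PER_CORE percent each.
cpu_limit_for() {
    local cores=$1 per_core=$2
    echo "$(( cores * per_core ))"
}

cpu_limit_for 2 45    # prints 90
cpu_limit_for 4 45    # prints 180

# On the real machine, take the core count from nproc.
CPU_LIMIT_VALUE=$(cpu_limit_for "$(nproc)" 45)
```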