I have a very large .tar.gz file which I can't extract all at once because of a lack of disk space. I would like to extract half of its contents, process them, and then extract the remaining half.
The archive contains several subdirectories, which in turn contain files. When I extract a subdirectory, I need all its contents to be extracted with it.
What's the best way of doing this in bash? Does tar already allow this?
OK, so based on this answer, I can list all contents at the desired depth. In my case, the tar.gz file is structured as follows:
archive.tar.gz:
archive/
archive/a/
archive/a/file1
archive/a/file2
archive/a/file3
archive/b/
archive/b/file4
archive/b/file5
archive/c/
archive/c/file6
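The second-level directories can also be collected into a bash array before deciding which ones to extract. The sketch below is only an illustration and is not the linked answer's method: it builds a throwaway archive with the same layout in a temporary directory (empty files, made-up data), then filters the plain `tar -tzf` listing with `grep` for entries of the form `top/sub/`. It assumes tar prints directory entries with a trailing slash and that names contain no newlines.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway archive with the layout shown above (empty files, hypothetical data).
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p archive/a archive/b archive/c
touch archive/a/file1 archive/a/file2 archive/a/file3
touch archive/b/file4 archive/b/file5 archive/c/file6
tar -czf archive.tar.gz archive
rm -rf archive

# Keep only entries that look like "<top>/<sub>/", i.e. second-level directories.
# Reading line by line into an array keeps names with spaces intact.
subdirs=()
while IFS= read -r line; do
    subdirs+=("$line")
done < <(tar -tzf archive.tar.gz | grep -E '^[^/]+/[^/]+/$' | sort)

printf '%s\n' "${subdirs[@]}"
```

Unlike a `for` loop over command substitution, the array does not split entries on whitespace.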
So I want to loop over the subdirectories a, b, c and, for instance, extract the first two of them:
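parent_folder='archive/'
max_num=2
counter=0
mkdir -p "$parent_folder"
# List only entries up to two levels deep: archive/ and archive/<subdir>/
for subdir in $(tar --exclude="*/*/*" -tf archive.tar.gz); do
    if [ "$subdir" = "$parent_folder" ]; then
        echo 'not this one'
        continue
    fi
    if [ "$counter" -lt "$max_num" ]; then
        # Member paths already start with archive/, so extracting into the
        # current directory recreates archive/<subdir>/ with all its contents.
        tar -xzvf archive.tar.gz "$subdir"
        counter=$((counter + 1))
    fi
done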
Next, for the remaining files:
parent_folder='archive/'
max_num=2
counter=0
mkdir -p "$parent_folder"
for subdir in $(tar --exclude="*/*/*" -tf archive.tar.gz); do
    if [ "$subdir" = "$parent_folder" ]; then
        echo 'not this one'
        continue
    fi
    # Skip the first max_num subdirectories and extract the rest
    if [ "$counter" -ge "$max_num" ]; then
        tar -xzvf archive.tar.gz "$subdir"
    fi
    counter=$((counter + 1))
done
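Both halves could share one helper that takes a start offset and a count and passes the selected subdirectories to a single `tar` call. A .tar.gz has no index, so every `tar` invocation has to read through the compressed stream; one call per batch avoids re-reading the archive once per subdirectory. What follows is a rough sketch under stated assumptions: `extract_batch` and the sample archive are hypothetical names made up for illustration, and the listing assumes directory entries end in `/`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# extract_batch ARCHIVE START COUNT   (hypothetical helper, not part of tar)
# Extracts COUNT second-level directories, starting at 0-based index START.
extract_batch() {
    local archive=$1 start=$2 count=$3
    local subdirs=() line
    while IFS= read -r line; do
        subdirs+=("$line")
    done < <(tar -tzf "$archive" | grep -E '^[^/]+/[^/]+/$' | sort)

    # Slice the requested window out of the sorted list.
    local batch=("${subdirs[@]:start:count}")
    if [ "${#batch[@]}" -eq 0 ]; then
        echo "nothing to extract" >&2
        return 0
    fi
    # One tar call for the whole batch; each named directory is
    # extracted together with everything beneath it.
    tar -xzf "$archive" "${batch[@]}"
}

# Demo on a throwaway archive (made-up data).
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p archive/a archive/b archive/c
touch archive/a/file1 archive/b/file4 archive/c/file6
tar -czf archive.tar.gz archive
rm -rf archive

extract_batch archive.tar.gz 0 2   # first batch: archive/a/ and archive/b/
```

After the first batch has been processed and deleted, `extract_batch archive.tar.gz 2 2` would extract the remainder (here only archive/c/, since the slice simply stops at the end of the list).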