bash: extract only part of tar.gz archive

I have a very large .tar.gz file that I can't extract all at once because I don't have enough disk space. I would like to extract half of its contents, process them, and then extract the remaining half.

The archive contains several subdirectories, which in turn contain files. When I extract a subdirectory, I need all its contents to be extracted with it.

What's the best way of doing this in bash? Does tar already allow this?

Solution

OK, so based on this answer, I can list all contents at the desired depth. In my case, the tar.gz file is structured as follows:

archive.tar.gz:
archive/
archive/a/
archive/a/file1
archive/a/file2
archive/a/file3
archive/b/
archive/b/file4
archive/b/file5
archive/c/
archive/c/file6
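
To see what the depth-limited listing returns, it can be tried on a throwaway archive with the same layout. This is a minimal sketch, assuming GNU tar; the temporary directory and the variable names (demo_dir, listing) are made up for illustration:

```shell
# Build a small archive with the layout shown above (hypothetical demo data)
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/archive/a" "$demo_dir/archive/b" "$demo_dir/archive/c"
touch "$demo_dir/archive/a/file1" "$demo_dir/archive/b/file4" "$demo_dir/archive/c/file6"
tar -czf "$demo_dir/archive.tar.gz" -C "$demo_dir" archive

# List only entries at most two levels deep: the exclude pattern drops
# anything with two or more slashes inside its path, i.e. the files
listing=$(tar --exclude="*/*/*" -tf "$demo_dir/archive.tar.gz")
printf '%s\n' "$listing"
```

The listing should contain the top-level directory and its immediate subdirectories, but none of the files. The order of the entries follows the order in which they were stored in the archive.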

So I want to loop over the subdirectories a, b, and c and, for instance, extract the first two of them:

parent_folder='archive/'
max_num=2
counter=0
# List archive/ and its immediate subdirectories only
for subdir in $(tar --exclude="*/*/*" -tf archive.tar.gz); do
    if [ "$subdir" = "$parent_folder" ]; then
        # Skip the top-level directory itself
        continue
    fi
    if [ "$counter" -lt "$max_num" ]; then
        # Member paths already start with archive/, so tar recreates
        # archive/<subdir>/ under the current directory; no -C is needed
        tar -xzvf archive.tar.gz "$subdir"
        counter=$((counter + 1))
    fi
done

Next, for the remaining subdirectories:

parent_folder='archive/'
max_num=2
counter=0
for subdir in $(tar --exclude="*/*/*" -tf archive.tar.gz); do
    if [ "$subdir" = "$parent_folder" ]; then
        # Skip the top-level directory itself
        continue
    fi
    if [ "$counter" -ge "$max_num" ]; then
        # Extract every subdirectory after the first $max_num
        tar -xzvf archive.tar.gz "$subdir"
    fi
    counter=$((counter + 1))
done
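
Both loops call tar once per subdirectory, and each call has to read through the whole compressed archive again. A possible variation is to select a slice of the subdirectory list and hand it to a single tar call. The sketch below assumes GNU tar and an archive that has only directories at the second level; extract_range and its arguments are hypothetical names, not something tar provides:

```shell
# extract_range ARCHIVE START COUNT   (hypothetical helper)
# Extracts COUNT second-level subdirectories, skipping the first START.
# The body runs in a subshell, so its variables do not leak to the caller.
extract_range() (
    archive=$1
    start=$2
    count=$3
    # Keep only names of the form top/sub, with any trailing slash removed
    names=$(tar --exclude="*/*/*" -tf "$archive" \
        | sed 's:/$::' \
        | grep '^[^/]*/[^/]*$' \
        | tail -n +"$((start + 1))" \
        | head -n "$count")
    # Do nothing when the requested slice is empty
    if [ -n "$names" ]; then
        # -T - reads the member names to extract from standard input
        printf '%s\n' "$names" | tar -xzf "$archive" -T -
    fi
)
```

With the archive above, `extract_range archive.tar.gz 0 2` would extract the first two subdirectories in archive order, and `extract_range archive.tar.gz 2 2` the rest. Names are handled one per line, so subdirectories containing spaces are fine, but names containing newlines are not.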