
Excluding files from top level directory when extracting tar archives


I have a .tar.gz archive on my Linux machine that I want to extract. When I do tar -tf archive.tar.gz, I get the following output:

test1
a/ 
a/test1

What I'm trying to do is exclude the file test1 in the top-level directory of the archive. However, I haven't found a working solution...

What I've tried is tar -xvf archive.tar.gz --exclude=test1. However, this excludes every file with that name, so it also excludes a/test1, which I do not want to exclude. Using tar -xvf archive.tar.gz --exclude=./test1 doesn't work either (it excludes neither file). I suppose this is because the archive doesn't seem to have a 'top-level directory'. However, I do not have control over the structure of the archive. And since the --exclude option seems to use glob patterns, I also can't use an expression like --exclude=^test1. I'm not really sure what to do now, as I can't find a way to exclude only the file I want to exclude.
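
For reference, the behaviour described above can be reproduced with a small script along these lines. It is only a sketch: it assumes GNU tar, and the scratch directory and variable names (workdir, out_plain, out_dot) are made up for illustration.

```shell
#!/bin/sh
set -eu

# Scratch area; all names here are hypothetical, not from the real archive.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/a" "$workdir/out_plain" "$workdir/out_dot"
echo top    > "$workdir/src/test1"
echo nested > "$workdir/src/a/test1"

# Build an archive whose members are: test1, a/, a/test1
tar -czf "$workdir/archive.tar.gz" -C "$workdir/src" test1 a

# Attempt 1: the pattern is not anchored, so test1 and a/test1 are both skipped.
tar -xzf "$workdir/archive.tar.gz" --exclude=test1 -C "$workdir/out_plain"

# Attempt 2: no member name contains "./test1", so nothing is skipped.
tar -xzf "$workdir/archive.tar.gz" --exclude=./test1 -C "$workdir/out_dot"

# Show what each attempt extracted.
ls -R "$workdir/out_plain" "$workdir/out_dot"
```

The scratch directory is left in place so the results can be inspected; remove it with rm -r "$workdir" afterwards.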

Help is greatly appreciated.


Solution

  • I don't know if this is the best way to do it but here it is anyway.

    1. Exclude a file pattern (test1) and list all directories & files
    tar -tf archive.tar.gz --exclude test1
    
    2. Use this list as input for extracting
    tar -xf archive.tar.gz $(tar -tf archive.tar.gz --exclude test1)
    
    # or
    
    tar -tf archive.tar.gz --exclude test1 | xargs tar -xf archive.tar.gz
    

    For the following directory hierarchy in my own archive:

    test1
    b/b_2/
    c/
    b/b_1.txt
    c/test1
    a/test1
    a/
    b/b_1/
    b/
    

    I was able to use this approach to exclude just test1 from the top level:

    b/b_2/
    c/
    b/b_1.txt
    c/test1
    a/test1
    a/
    b/b_1/
    b/
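
If the listing step on your system also drops a/test1 (the GNU tar manual describes exclusion patterns as unanchored by default, which matches what the question reports), a one-step alternative may be GNU tar's --anchored option, which makes exclusion patterns match only from the start of a member name. A minimal sketch, assuming GNU tar; the scratch paths and variable names are placeholders rather than anything from the original archive:

```shell
#!/bin/sh
set -eu

# Scratch area; all names here are hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/src/a" "$workdir/out"
echo top    > "$workdir/src/test1"
echo nested > "$workdir/src/a/test1"

# Build an archive whose members are: test1, a/, a/test1
tar -czf "$workdir/archive.tar.gz" -C "$workdir/src" test1 a

# --anchored comes before --exclude so that the pattern must match from the
# start of the member name: "test1" is skipped, "a/test1" is kept.
tar -xzf "$workdir/archive.tar.gz" --anchored --exclude=test1 -C "$workdir/out"

# Show what was extracted.
ls -R "$workdir/out"
```

On the archive from the question this would correspond to tar -xvf archive.tar.gz --anchored --exclude=test1. Placing --anchored before --exclude is the safe order, since pattern-matching options are applied to the patterns that follow them.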