
GNU/Linux - proper use of the 'zip' command to flatten the subdirectories?


I'm looking for some help in properly using the zip command in GNU/Linux systems. Let's say I have this directory structure (the actual use case is more complex, but this is a simplified example):

Documents
└── foo1
    └── foo2
        ├── foo3a 
        │   └──foo4
        │      ├──file1.txt
        │      └──file2.py
        └── foo3b
            └──foo4
               ├──file3.txt
               └──file4.py

I want to make a zip file that only includes the txt files, ideally as part of a bash script run from the Documents directory. So, following the documentation for the zip command, I naively started with:

cd ~/Documents
zip -r foo . -i foo1/foo2/*/foo4/*.txt 

This does zip up all the txt files, but the resulting foo.zip file has the full directory structure (foo1 -> foo2 -> foo3a/foo3b, etc...).


I'd like to "flatten" it so that the foo.zip structure starts at foo2, even though the command is being run in ~/Documents. The examples in the documentation all use the . as the inpath variable. So I try changing it to where I want foo.zip to start:

zip -r foo ./foo1/foo2 -i foo1/foo2/*/foo4/*.txt

But foo.zip still has the full directory path of foo1 -> foo2, etc... Even if I run

zip -r foo ./foo1/foo2/foo3a -i foo1/foo2/*/foo4/*.txt

It only zips up file1.txt from foo3a, and the full directory structure remains. So the inpath variable only determines where the search for files starts, not where the directory structure inside the zip starts. I've tried various other versions of the zip command, but I can't get the outcome I'm looking for. Is there any way to use the zip command to produce a foo.zip of:

foo.zip
├── foo3a 
│   └──foo4
│      ├──file1.txt
│      └──file2.py
└── foo3b
    └──foo4
       ├──file3.txt
       └──file4.py

Or even possibly flattening the downstream folders (since foo4 is common to both) to something like:

foo.zip
├── foo3a
│   ├──file1.txt
│   └──file2.py
└── foo3b
    ├──file3.txt
    └──file4.py

How would one do this? I also tried using the -j option:

zip -rj foo . -i foo1/foo2/*/foo4/*.txt
>> zip warning: zip file empty

Solution

  • Does this do what you want?

    cd ~/Documents
    (cd foo1/foo2; zip -r - . -i '*.txt') >foo.zip
    
    • (...) creates a subshell where a cd won't affect the top-level shell.
    • Specifying an archive path of - causes the archive to be written to standard output.
    • The >foo.zip causes the archive data to be written to the file foo.zip in the current directory for the top-level shell (~/Documents).