
Extract exactly one file (any) from each 7zip archive, in bulk (Unix)


I have 1,500 7zip archives. Each archive contains 2 to 10 files, with no subdirectories.

Every file has the same extension, but the filenames vary.

I only want one file out of each archive, but I'd like to perform this in bulk. I do not care which file is taken out, as long as only one file is taken out. It can be the first file, the newest, the biggest, the smallest, it doesn't matter.

Here's an example:

aa.7z {blah 56.smc, blah 57.smc, 1 blah 58.smc}
ab.7z {xx.smc, xx 1.smc, xx_2.smc}
ac.7z {1.smc}

I want to run something equivalent to:

7z e *.7z # But somehow only extract one file

Thank you!


Solution

  • Ultimately my solution was to extract all files and run the following in the directory:

    for n in *; do echo "$n"; done > files.txt
    

    I then imported that list into Excel and split each filename at the character that separates the title from the qualifying data in the filename (for example: Some Title (V1) [X2].smc). Specifically, I used a bracket as the delimiter.

    Then I removed all duplicates, leaving only one edition of each title from each archive. Finally, I re-merged the columns (the bracket was lost during the split, so I wrote a function to add it back whenever the next column had content) and re-saved files.txt. After reviewing a few StackOverflow answers, I deleted files based on that input file (files.txt). A word of warning: spaces in filenames cause problems with rm and xargs, so I had to wrap the variable in quotes.
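
For anyone who would rather skip the spreadsheet step, the split-and-dedupe idea above could be sketched in the shell roughly as follows. This is only a sketch under assumptions of my own: the function names are hypothetical, the "title" is taken to be the filename minus its extension and minus everything from the first "(" or "[", and the first file seen for a title is the one kept.

```shell
#!/bin/sh
# Print every filename whose title has already been seen, i.e. the extra
# editions that could be deleted. Reads one filename per line on stdin.
# Assumption: title = name without extension, cut at the first "(" or "[".
list_extra_editions() {
    awk '{
        title = $0
        sub(/\.[^.]*$/, "", title)     # drop the extension, e.g. ".smc"
        sub(/ *[[(].*$/, "", title)    # drop " (V1) [X2]" style qualifiers
        if (seen[title]++) print $0    # not the first edition of this title
    }'
}

# Delete the files named on stdin, one per line. Quoting "$f" keeps
# filenames with spaces intact; "--" guards against names starting with "-".
delete_listed_files() {
    while IFS= read -r f; do
        rm -- "$f"
    done
}
```

Run in the directory of extracted files, something like `for n in *; do echo "$n"; done | list_extra_editions > files.txt` gives a list to review first, and `delete_listed_files < files.txt` then removes those files. Filenames containing newlines are not handled.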

    Ultimately this still didn't serve me well enough so I just used a different resource entirely.

    Posting this answer so others who find themselves in a similar predicament find an alternative resolution.
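
For completeness, here is a rough sketch of how the original goal (one arbitrary file per archive) might be approached directly. It is not what I ended up doing. It assumes the listing format that `7z l -slt` prints in common 7-Zip and p7zip builds, where the archive's own "Path = ..." line appears before a "----------" separator and each member's "Path = ..." line after it. The function names and the default output directory are made up for illustration.

```shell
#!/bin/sh
# Read `7z l -slt` output on stdin and print the path of the first member.
# Only "Path = " lines after the "----------" separator are members; the
# one before it is the archive itself.
first_member() {
    awk '/^----------/ { in_entries = 1; next }
         in_entries && /^Path = / { print substr($0, 8); exit }'
}

# Extract exactly one member from each .7z archive in the current directory
# into the directory given as $1 (default: ./extracted).
extract_one_each() {
    outdir=${1:-extracted}
    mkdir -p "$outdir"
    for archive in ./*.7z; do
        [ -e "$archive" ] || continue    # no archives: the glob stays literal
        member=$(7z l -slt "$archive" | first_member)
        [ -n "$member" ] || continue     # empty or unreadable archive
        # -y answers prompts with "yes"; "--" ends switch parsing so a
        # member name starting with "-" is not read as an option.
        7z e -y -o"$outdir" "$archive" -- "$member" >/dev/null
    done
}
```

Calling `extract_one_each out` from the directory holding the archives would leave one file per archive in `out`. Note that 7z treats `*` and `?` in the member name as wildcards, and that `-y` silently overwrites if two archives yield the same filename.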