tar, large-data

I want tar extract to stop after a certain file range


I have an enormous tar archive that I'm pulling large sections out of to process one at a time. I don't want to have to babysit it to keep it from filling the disk and crashing other applications.

I know I can start from an arbitrary file in the archive using --starting-file=, but there doesn't seem to be a matching --stopping-file= option.

It looks like I could write an inverted exclusion pattern to make it ignore all the files after that point, but it seems tar would still walk through every remaining entry (or at least every top-level folder) to check it against the pattern, consuming resources and preventing early termination.

Is there a better way to stop it from continuing after the section I want?


Solution

  • You can write out the full file list with tar -t -f my.tar > my.list

    You can then narrow the list by piping it through filters such as grep, or by editing the file by hand. For example, to extract all the files and folders under the path 'my/folder/path', run tar -t -f my.tar | grep "my/folder/path" > my.list

    Then extract only the listed files with tar -x -T my.list -f my.tar

    Thanks to Jens for the strategy suggestion.