
How to extract only specific files from a 7z archive


I have a 7z archive, downloaded from practicalsecurityanalytics.com, that contains malware and benign files totalling 117GB uncompressed. The compressed archive is 43.8GB, which is large, and I do not want to extract the whole archive at once.

Is there a way to extract only a few selected files? The selected files are not sequential, so I can't really rely on a GUI to select them individually.

File details

Metric             Value
Samples            201,549
Legitimate         86,812
Malicious          114,737
Compressed Size    43.8GB
Uncompressed Size  117GB

There is a CSV file called samples.csv that shows which files are malware and which are not, along with the entropy of each file.

The file is encrypted so it asks for a password every time I want to extract something.

I am working on Linux.


Solution

  • A quick way I extracted the specific files was to first put all the file names into a text file, like this:

    228161
    213960
    200290
    210832
    230546
    257545
    ....
    

    then prefix each file name with its path inside the archive, using any method (I used a Python script to do it quickly), and save the result to a file, here f1.txt:

    pe-machine-learning-dataset/samples/228161
    pe-machine-learning-dataset/samples/213960
    pe-machine-learning-dataset/samples/200290
    pe-machine-learning-dataset/samples/210832
    pe-machine-learning-dataset/samples/230546
    pe-machine-learning-dataset/samples/257545
    ....
    

    and then run:

    7z e foo.7z -o"path to save the files" $(cat f1.txt)
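
The steps above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the script actually used in the answer: the function names `write_list_file` and `build_7z_command` are hypothetical, and the path prefix is simply copied from the example listing. Instead of expanding the list with `$(cat f1.txt)`, the sketch hands the list to 7-Zip as a list file (`@f1.txt`), which should avoid shell argument-length limits when the list is long. The password can optionally be passed with `-p`; if it is omitted, 7z prompts for it.

```python
from pathlib import Path

# Assumption: prefix copied from the example paths in the answer above.
ARCHIVE_PREFIX = "pe-machine-learning-dataset/samples/"


def write_list_file(sample_ids, list_path, prefix=ARCHIVE_PREFIX):
    """Write one archive-internal path per line and return the paths.

    Blank entries are skipped and surrounding whitespace is stripped.
    """
    paths = []
    for sample_id in sample_ids:
        name = str(sample_id).strip()
        if name:
            paths.append(prefix + name)
    Path(list_path).write_text("\n".join(paths) + "\n", encoding="utf-8")
    return paths


def build_7z_command(archive, out_dir, list_path, password=None):
    """Build the argument list for a single `7z e` call.

    `e` extracts without recreating directories, `-o` sets the output
    directory (no space after the switch), and `@file` is a 7-Zip list file.
    """
    cmd = ["7z", "e", str(archive), f"-o{out_dir}", f"@{list_path}"]
    if password is not None:
        # Note: a password on the command line is visible to other
        # local users in the process list; omit it to be prompted instead.
        cmd.append(f"-p{password}")
    return cmd
```

To use it, call `write_list_file(ids, "f1.txt")` with the selected IDs and then pass the result of `build_7z_command("foo.7z", "extracted", "f1.txt")` to `subprocess.run(..., check=True)`. Because everything is extracted in one 7z invocation, the password is requested only once.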