linux grep unzip

How to grep the contents of a zipped, non-standard text file


On my Windows 10 PC, I have installed the Ubuntu app. There, I'd like to grep the contents of a group of zip files, but let's start with just one. My zip file contains two files: a crash dump and an error log (a text file). I'm particularly interested in the contents of that error log, so I'm looking for something like this:

<grep_inside> zipfile.zip "Access violation"

So far, this is my best result:

unzip -c zipfile.zip error.log

This shows the error log, but the output looks like a hex dump, which makes it impossible to run grep on it.

As proposed on various websites, I've also tried the following commands: vim, view, zcat, zless and zgrep. None of them work, each for a different reason.

Some further investigation

This question is not a duplicate of this post, as suggested. I believe the issue is caused by the encoding of the log file, as the following results of other basic Linux commands show (run after unzipping the error log):

emacs error.log
... caused an Access Violation (0xc0000005)

cat error.log
. . . c a u s e d   a n   A c c e s s   V i o l a t i o n   ( 0 x c 0 0 0 0 0 0 5 )

Apparently, error.log is not recognised as a plain text file:

file error.log
error.log : Little-endian UTF-16 Unicode text, with very long lines, with CRLF line terminators
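
The spaced-out output from cat is consistent with that encoding: in UTF-16LE, each ASCII character is stored as two bytes, the second of which is a NUL. A minimal sketch to see this, assuming iconv is available; the sample string is made up and not taken from the real error.log:

```shell
# Hypothetical sample text, not taken from the real error.log
sample='Access'

# Encode the sample as UTF-16LE, keep only the NUL bytes, and count them
nul_count=$(printf '%s' "$sample" | iconv -f UTF-8 -t UTF-16LE | tr -cd '\000' | wc -c)

# For plain ASCII input, there should be one NUL byte per character
echo "characters: ${#sample}, NUL bytes: $nul_count"
```

Those interleaved NUL bytes are what a plain grep "Access" trips over, since the letters are no longer adjacent in the byte stream.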

Solution

  • In this post on grepping non-standard text files, I found the answer:

    unzip -c zipfile.zip error.log | grep -a "A.c.c.e.s.s"
    

    Now I have something to start from.

    Thanks, everyone, for your cooperation.
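
The dotted pattern works because each "." matches one of the NUL bytes between the letters. A possible alternative, sketched here under the assumption that the log really is UTF-16LE (as the file output suggests) and that iconv is installed, is to convert the stream to UTF-8 first and then grep with a normal pattern. The function name utf16_grep is hypothetical, and the demo line is synthetic:

```shell
# utf16_grep PATTERN: read UTF-16LE text on stdin, print matching lines as UTF-8.
# The function name is made up for this sketch; UTF-16LE is an assumption
# based on the `file` output.
utf16_grep() {
    # Convert to UTF-8, drop the CR of each CRLF line ending, then search
    iconv -f UTF-16LE -t UTF-8 | tr -d '\r' | grep -- "$1"
}

# Self-contained demo on a synthetic line (not real log content)
demo=$(printf 'demo.exe caused an Access Violation (0xc0000005)\r\n' \
    | iconv -f UTF-8 -t UTF-16LE \
    | utf16_grep "Access Violation")
echo "$demo"
```

With the real archive, this would presumably be used as: unzip -p zipfile.zip error.log | utf16_grep "Access Violation". Here unzip -p is used rather than -c because it writes only the file data to stdout, without the extra lines naming the archive and file. For a group of zip files, the same pipeline could be wrapped in a for loop over *.zip.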