
How do you invert a grep -P command?


I have a bunch of output files that also contain the entire text of the script that produced them. Let's say each of these files, which are the output of "oops.sh", looks like this:

Hello world!
This script is called #!bash
scriptName=$(echo $0)
echo """Hello world!
This script is called $scriptName."""
# Wait, WTF just happened?
echo "Done.".
Done.

I'd like to use a grep or sed command to remove the text of the original script. The output should look like this:

Hello world!
This script is called .
Done.

I've successfully matched what I want to remove -- grep -Pazo "(?s)\#\!bash.*?Done\.\"" outputOops.txt -- but I can't seem to invert the results. (I don't actually know Perl, BTW, and regrettably I don't currently have time to learn it. I got this far based on some StackOverflow answers, but didn't understand other relevant answers well enough to use them.) Each of the following returns nothing --

grep -Pazov "(?s)\#\!bash.*?Done\.\"" outputOops.txt
grep -Pazo -v "(?s)\#\!bash.*?Done\.\"" outputOops.txt
grep -Pazo --invert-match "(?s)\#\!bash.*?Done\.\"" outputOops.txt
grep -Pazo "(?s)(?!)\!bash.*?Exiting\.\"" outputOops.txt
grep -Pazo "(?s)(?!)(\!bash.*?Exiting\.\")" outputOops.txt

-- except the last two lines, which both return -bash: !: event not found.

What am I doing wrong? Does negation just not work with the -P flag? If not, what can I do instead?


Solution

  • grep works on a line-by-line basis. Its output depends on whether the line matches the pattern (or, when -o is used, on which part of the line matched).

    Unfortunately, whether you want a given line printed depends on what other lines contain. grep tests each line independently, so it can't do that.


    But what if we change the definition of a line using -z? In a file without NUL characters, the whole file becomes a single line.
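A quick way to see the effect of -z is to count records with and without it. This is only a sketch: it assumes GNU grep, and sample.txt is a hypothetical file name that recreates the output shown in the question.

```shell
# Hypothetical sample file reproducing the output shown in the question.
# Single quotes keep bash from touching "!" and "$".
printf '%s\n' \
  'Hello world!' \
  'This script is called #!bash' \
  'scriptName=$(echo $0)' \
  'echo """Hello world!' \
  'This script is called $scriptName."""' \
  '# Wait, WTF just happened?' \
  'echo "Done.".' \
  'Done.' > sample.txt

# The empty pattern matches every record, so -c counts records.
lines_default=$(grep -c '' sample.txt)   # newline-terminated records
lines_nul=$(grep -zc '' sample.txt)      # NUL-terminated records

echo "default: $lines_default record(s)"
echo "with -z: $lines_nul record(s)"
```

On this sample, the first count should be 8 and the second 1: with -z, grep sees the entire file as one "line".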

    There are three things you can print with grep:

    • The lines that match
    • The lines that don't match (-v)
    • The part of each line that matched (-o)

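    Unfortunately, none of those suit your needs, since you want to print the two disjoint parts of the "line" that surround the match. -v selects or rejects whole lines, and -o prints only the text that matched, so neither can print what is left over. With the whole file as a single matching line, -v rejects it, which is why your inverted commands print nothing.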
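The limitation can be reproduced with a small experiment. This is a sketch, assuming GNU grep built with PCRE support (-P); sample.txt and the variable names are hypothetical, and the pattern is the one from the question written in single quotes so that bash leaves the "!" alone.

```shell
# Hypothetical sample file reproducing the output shown in the question.
printf '%s\n' \
  'Hello world!' \
  'This script is called #!bash' \
  'scriptName=$(echo $0)' \
  'echo """Hello world!' \
  'This script is called $scriptName."""' \
  '# Wait, WTF just happened?' \
  'echo "Done.".' \
  'Done.' > sample.txt

pattern='(?s)#!bash.*?Done\."'

# -o prints only the matched text, i.e. the part we want to get rid of.
# tr strips the NUL that grep -z appends to each output record.
matched=$(grep -Pazo "$pattern" sample.txt | tr -d '\0')

# With -z the whole file is one record. That record matches, so -v
# rejects it and nothing is printed (grep then exits with status 1).
inverted=$(grep -Pazv "$pattern" sample.txt | tr -d '\0' || true)

printf 'matched:\n%s\n' "$matched"
printf 'inverted: [%s]\n' "$inverted"
```

The first command prints the embedded script text, and the second prints nothing between the brackets. Neither produces the text around the match.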


    A simple Perl one-liner will do, however.

    perl -0777pe's/#!bash.*?Done\."//s'
    

    For ways to pass the file to the one-liner, see "Specifying file to process to Perl one-liner".