Search code examples
awksedgrep

Extract the phrase containing "FAILED" without a parenthesis following it


How to extract the phrase containing "FAILED" without a parenthesis ')' following it and also remove anything which has "Linux x86_64)" preceding it? e.g. in:

Chrome Headless 117.0.5938.132 (Linux x86_64): Executed 5 of 270[32m SUCCESS[39m (0 secs / 0.018 secs)
WARN: 'Spec 'zip combines one value of each' has no expectations.'
[1A[2KChrome Headless 117.0.5938.132 (Linux x86_64): Executed 6 of 270[32m SUCCESS[39m (0 secs / 0.021 secs)
[1A[2K[31mChrome Headless 117.0.5938.132 (Linux x86_64) interval should emit every second FAILED[39m
    Expected $[0].frame = 1007 to equal 1000.
    Expected $[1].frame = 2007 to equal 2000.
    Expected $[2].frame = 3007 to equal 3000.
    Expected $[3].frame = 4007 to equal 4000.
    Expected $[4].frame = 5007 to equal 5000.
    Expected $[5].frame = 5007 to equal 5000.
        at <Jasmine>
        at TestScheduler.assertDeepEqual (src/misc/test_scheduler.ts:4:20)
        at filter (node_modules/rxjs/dist/esm/internal/testing/TestScheduler.js:139:22)
        at Array.filter (<anonymous>)
        at TestScheduler.flush (node_modules/rxjs/dist/esm/internal/testing/TestScheduler.js:137:43)
Chrome Headless 117.0.5938.132 (Linux x86_64): Executed 7 of 270[31m (1 FAILED)[39m (0 secs / 0.027 secs)

Expected output:

interval should emit every second FAILED

I tried this, which worked. However, how to improve it? i.e. without piping the output to sed.

grep ' FAILED[^)]' in | sed -e 's#.*Linux x86_64)##' -e 's# ##'

Solution

  • If you have gnu-grep, and not literally taking Linux x86_64) into account but prevent matching the parenthesis at all, you could make chars other than ) before FAILED, and then assert no more parenthesis after it:

    grep -oP "\h*\K[^)]*FAILED(?=[^()]*$)" file
    

    The pattern matches:

    • \h*\K Match optional horizontal whitespace chars and the forget what is matched so far
    • [^)]* Match optional characters other than )
    • FAILED Match literally
    • (?=[^()]*$) Positive lookahead to assert no more parenthesis until the end of the string

    Output

    interval should emit every second FAILED