awk in pipeline to match patterns from multiple lines and print summary on trigger


I have a log file from an imaging session and need to extract parameters that are spread over several lines preceding a certain trigger line. Once the trigger is found, I want to print the collected data and start again.

Using grep, I extract the four lines before each trigger; the trigger is 'AutoFocus completed'. This is what I get and want to pipe into awk for parameter extraction and printing.

$ grep -B4 'AutoFocus completed' 20240831-195957-3.1.1.9001.10824-*.log
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T20:19:16.2135|INFO|FocuserVM.cs|MoveFocuserInternal|212|Moving Focuser to position 21611
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T20:19:20.4192|INFO|CameraVM.cs|Capture|737|Starting Exposure - Exposure Time: 5s; Filter: ; Gain: 100; Offset 50; Binning: 1x1;
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T20:19:27.9594|INFO|StarDetection.cs|Detect|244|Average HFR: 2.545578870303149, HFR σ: 0.3919148934340784, Detected Stars 609, Sensitivity High, ResizeFactor: 0.33
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T20:19:27.9713|INFO|FocuserMediator.cs|BroadcastSuccessfulAutoFocusRun|45|Autofocus notification received - Temperature 28.5
20240831-195957-3.1.1.9001.10824-202408.log:2024-08-31T20:19:27.9719|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus completed
--
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T20:55:18.0098|INFO|FocuserVM.cs|MoveFocuserInternal|212|Moving Focuser to position 21573
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T20:55:22.1916|INFO|CameraVM.cs|Capture|737|Starting Exposure - Exposure Time: 5s; Filter: ; Gain: 100; Offset 50; Binning: 1x1;
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T20:55:29.6817|INFO|StarDetection.cs|Detect|244|Average HFR: 2.4623768995206703, HFR σ: 0.37028813097450053, Detected Stars 832, Sensitivity High, ResizeFactor: 0.33
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T20:55:29.6834|INFO|FocuserMediator.cs|BroadcastSuccessfulAutoFocusRun|45|Autofocus notification received - Temperature 27.6
20240831-195957-3.1.1.9001.10824-202408.log:2024-08-31T20:55:29.6839|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus completed
--
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T22:01:56.2682|INFO|FocuserVM.cs|MoveFocuserInternal|212|Moving Focuser to position 21589
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T22:02:00.4053|INFO|CameraVM.cs|Capture|737|Starting Exposure - Exposure Time: 5s; Filter: ; Gain: 100; Offset 50; Binning: 1x1;
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T22:02:08.0464|INFO|StarDetection.cs|Detect|244|Average HFR: 2.337959785686696, HFR σ: 0.43475116635610034, Detected Stars 972, Sensitivity High, ResizeFactor: 0.33
20240831-195957-3.1.1.9001.10824-202408.log-2024-08-31T22:02:08.0476|INFO|FocuserMediator.cs|BroadcastSuccessfulAutoFocusRun|45|Autofocus notification received - Temperature 26
20240831-195957-3.1.1.9001.10824-202408.log:2024-08-31T22:02:08.0477|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus completed

I want to capture:

  • The number after 'Moving Focuser to position'
  • The number after 'Average HFR:'
  • The number after 'Detected Stars'

Once I encounter 'AutoFocus completed', I want to print the date and time encoded in the part of the line between .log and |INFO, followed by position, hfr and stars.

I tried the partial code below, but it fails to print the hfr and the date properly:

$ grep -B4 'AutoFocus completed' 20240831-195957-3.1.1.9001.10824-*.log \
| awk '/Moving Focuser to/ { pos= $5 }; /Average HFR/ { hfr=$3}; /AutoFocus completed/ { print $1,pos,hfr }'
 2.545578870303149,.1.9001.10824-202408.log:2024-08-31T20:19:27.9719|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21611
 2.4623768995206703,1.9001.10824-202408.log:2024-08-31T20:55:29.6839|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21573
 2.337959785686696,.1.9001.10824-202408.log:2024-08-31T22:02:08.0477|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21589
 2.360274405865274,.1.9001.10824-202408.log:2024-08-31T23:09:24.1184|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21558
 2.3450609716393003,1.9001.10824-202409.log:2024-09-01T00:16:57.9708|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21566
 2.361727006131884,.1.9001.10824-202409.log:2024-09-01T01:09:04.0561|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21568
 2.561488030855142,.1.9001.10824-202409.log:2024-09-01T02:12:22.2001|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21561
 2.3263163204824595,1.9001.10824-202409.log:2024-09-01T03:19:34.1076|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21610
 2.371915558447258,.1.9001.10824-202409.log:2024-09-01T04:26:36.0461|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus 21588
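
One possible reading of this output, assuming the log has Windows (CRLF) line endings, which the question does not state: awk splits on whitespace by default, so `$1` on the trigger line is everything up to the first space rather than the date, `$3` on the HFR line keeps its trailing comma, and `pos` (the last field on its line) ends in a carriage return that sends the cursor back to column 1, so `hfr` is drawn over the start of the line. The sketch below uses made-up, shortened sample lines and covers only `pos` and `hfr`; the names `sample` and `fixed` are illustrative.

```shell
# Hypothetical, shortened sample with CRLF line endings (an assumption about the real log).
sample() {
  printf '%s\r\n' \
    'x.log-2024-08-31T20:19:16.2135|INFO|FocuserVM.cs|MoveFocuserInternal|212|Moving Focuser to position 21611' \
    'x.log-2024-08-31T20:19:27.9594|INFO|StarDetection.cs|Detect|244|Average HFR: 2.5455, Detected Stars 609, Sensitivity High' \
    'x.log:2024-08-31T20:19:27.9719|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus completed'
}

# Same rules as the attempt above, with the carriage return and the comma removed.
fixed=$(sample | awk '
  { sub(/\r$/, "") }                                      # drop a trailing CR, if any
  /Moving Focuser to/   { pos = $5 }                      # fifth whitespace-separated field
  /Average HFR/         { hfr = $3; sub(/,$/, "", hfr) }  # strip the trailing comma
  /AutoFocus completed/ { print pos, hfr }
')
printf '%s\n' "$fixed"
```

If the real file does use CRLF, running it through `tr -d '\r'` before awk would be another way to get the same effect.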

Desired output:

2024-08-31T20:19:27.9719 21611  2.545578870303149  609
2024-08-31T20:55:29.6839 21573  2.4623768995206703 832
2024-08-31T22:02:08.0477 21589  2.337959785686696  972

Running awk in WSL.

awk --version
GNU Awk 5.1.0, API: 3.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)
Copyright (C) 1989, 1991-2020 Free Software Foundation.

Update to address Ed Morton's answer

The awk script works only in conjunction with the preceding grep. It seems some part of the pattern hooks only onto the file name prepended by grep; without grep it does not work. That is perfectly acceptable, though, and I will mark it as the accepted answer.

This works:

grep -B4 'AutoFocus completed' 20240831-195957-3.1.1.9001.10824-*.log | awk -f AF_stats.awk
2024-08-31T20:19:27.9719 21611 2.545578870303149 609
2024-08-31T20:55:29.6839 21573 2.4623768995206703 832
2024-08-31T22:02:08.0477 21589 2.337959785686696 972
2024-08-31T23:09:24.1184 21558 2.360274405865274 999
2024-09-01T00:16:57.9708 21566 2.3450609716393003 1076
2024-09-01T01:09:04.0561 21568 2.361727006131884 1067
2024-09-01T02:12:22.2001 21561 2.561488030855142 1017
2024-09-01T03:19:34.1076 21610 2.3263163204824595 1021
2024-09-01T04:26:36.0461 21588 2.371915558447258 1008

This does not work:

cat  20240831-195957-3.1.1.9001.10824-*.log | awk -f AF_stats.awk
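
A likely explanation, assuming AF_stats.awk is the script from the Solution: when grep is given more than one file, it prepends the file name to every line, followed by `:` on matching lines and `-` on context lines. The last rule of that script looks for `.log:` in front of the timestamp, so it can only match grep's output; a raw log line starts directly with the timestamp. A minimal sketch with one shortened line in both shapes (the file name `x-202408.log` is made up):

```shell
raw='2024-08-31T20:19:27.9719|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus completed'
prefixed="x-202408.log:$raw"   # hypothetical file name, in the shape grep would prepend it

# Count how many of the two lines the trigger pattern matches.
hits=$(printf '%s\n' "$raw" "$prefixed" |
  awk '/\.log:[^|]+.*AutoFocus completed/ { n++ } END { print n + 0 }')
echo "$hits"
```

Only the prefixed line is counted, which matches the behaviour described above.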

Solution

  • Using GNU awk for the 3rd arg to match():

    $ cat tst.awk
    match($0, /Moving Focuser to position ([0-9]+)/, a) {
        vals["focus"] = a[1]
    }
    match($0, /Average HFR: ([0-9.]+)/, a) {
        vals["hfr"] = a[1]
    }
    match($0, /Detected Stars ([0-9.]+)/, a) {
        vals["stars"] = a[1]
    }
    match($0, /\.log:([^|]+).*AutoFocus completed/, a) {
        print a[1], vals["focus"], vals["hfr"], vals["stars"]
        delete vals
    }
    

    $ awk -f tst.awk file
    2024-08-31T20:19:27.9719 21611 2.545578870303149 609
    2024-08-31T20:55:29.6839 21573 2.4623768995206703 832
    2024-08-31T22:02:08.0477 21589 2.337959785686696 972
    

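    I expect that will work on your original input file without you needing to run grep on it, but without seeing it that's just a guess. Note that the last rule relies on `.log:` appearing directly before the timestamp, which is the file-name prefix grep adds; if the raw lines start with the timestamp, that part of the regexp has to change.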
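
For input that may or may not carry grep's file-name prefix, here is a sketch of a variant that does not depend on it and uses only POSIX awk features (no third argument to `match()`). It is an illustration under assumptions, not the answer's script: `af_stats` and `num_after` are made-up names, the sample lines are shortened, and the timestamp is assumed to be the text before the first `|` once any `...log:` or `...log-` prefix is removed.

```shell
# af_stats is a hypothetical helper name; it reads log lines on stdin,
# either raw or with the file-name prefix that grep adds.
af_stats() {
  awk '
    # Return the run of digits and dots that follows label on the current line.
    function num_after(label,    i, rest) {
      i = index($0, label)
      if (i == 0) return ""
      rest = substr($0, i + length(label))
      return match(rest, /^[0-9.]+/) ? substr(rest, RSTART, RLENGTH) : ""
    }
    /Moving Focuser to position / { pos   = num_after("Moving Focuser to position ") }
    /Average HFR: /               { hfr   = num_after("Average HFR: ") }
    /Detected Stars /             { stars = num_after("Detected Stars ") }
    /AutoFocus completed/ {
      ts = $0
      sub(/[|].*/, "", ts)           # keep the text before the first "|"
      sub(/^.*\.log[-:]/, "", ts)    # drop a grep file-name prefix, if present
      print ts, pos, hfr, stars
      pos = hfr = stars = ""         # start collecting the next run
    }
  '
}

# Shortened sample lines in the raw (no grep) shape.
result=$(printf '%s\n' \
  '2024-08-31T20:19:16.2135|INFO|FocuserVM.cs|MoveFocuserInternal|212|Moving Focuser to position 21611' \
  '2024-08-31T20:19:27.9594|INFO|StarDetection.cs|Detect|244|Average HFR: 2.545578870303149, Detected Stars 609, Sensitivity High' \
  '2024-08-31T20:19:27.9719|INFO|AutoFocusVM.cs|StartAutoFocus|277|AutoFocus completed' | af_stats)
printf '%s\n' "$result"
```

Because each value is cut out by a digits-and-dots match, a trailing comma or carriage return is left behind automatically. On a full, unfiltered log the last position, HFR and star count seen before each trigger are the ones printed; whether those are the values of interest depends on how the application logs an autofocus run.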