I'm using sed to process a very long PeopleSoft trace file. The trace file is full of information, but I am interested in how long each step took to process, and the number of rows affected in the database by the step.
I have a fairly ugly bash script which is basically a one-line series of sed pipes. It works pretty quickly, so it is OK. Of course I would welcome any suggestions to clarify the code.
I must say that while it is ugly, it is clear (to me) what sequential steps it is going through. Often sed/awk one liners are completely unintelligible to a non-expert.
#!/bin/bash
sed '/./=' "$1" | sed '/./N; s/\n/ /' | sed '/--/!d' | sed '/-- B/d' | sed '/-- /d' | sed '/Instance/d' | sed '1,/Application Engine ended normally/!d' | awk '/Row/ {n = $0; getline; print $0 n; next } 1' > "$1".txt
Timings
Most lines contain timing information, in the format for HH:MM:SS. Many statements are running subsecond so the timing field often doesn't change. I'd like to produce a new field which is the offset of the previous line. Only the seconds need to be considered as most operations are subsecond anyway.
246 -- 14.54.43 .(TL_TIMEADMIN.MAIN.Step040) (PeopleCode)238 -- Row(s) affected: 1
247 -- 14.54.43 Program Skipping Step due to non-zero return code from PeopleCode at TL_TIMEADMIN.MAIN.Step040
249 -- 14.54.43 .(TL_TIMEADMIN.MAIN.Step050) (Call Section TL_TIMEADMIN.DISPATCH)
251 -- 14.54.45 ..(TL_TIMEADMIN.DISPATCH.Step02a) (PeopleCode)
253 -- 14.54.45 ..(TL_TIMEADMIN.DISPATCH.Step02a) (SQL)
266 -- 14.54.45 ..(TL_TIMEADMIN.DISPATCH.Step02b) (Call Section TL_TA000200.TA000200)258 -- Row(s) affected: 1
268 -- 14.54.46 ...(TL_TA000200.TA000200.Step001) (PeopleCode)
270 -- 14.54.46 ...(TL_TA000200.TA000200.Step001) (Call Section FUNCLIB_TLTA.STEPMSG)
I would like to see something like this:
246 -- 14.54.43 0 .(TL_TIMEADMIN.MAIN.Step040) (PeopleCode)238 -- Row(s) affected: 1
247 -- 14.54.43 0 Program Skipping Step due to non-zero return code from PeopleCode at TL_TIMEADMIN.MAIN.Step040
249 -- 14.54.43 0 .(TL_TIMEADMIN.MAIN.Step050) (Call Section TL_TIMEADMIN.DISPATCH)
251 -- 14.54.45 2 ..(TL_TIMEADMIN.DISPATCH.Step02a) (PeopleCode)
253 -- 14.54.45 0 ..(TL_TIMEADMIN.DISPATCH.Step02a) (SQL)
266 -- 14.54.45 0 ..(TL_TIMEADMIN.DISPATCH.Step02b) (Call Section TL_TA000200.TA000200)258 -- Row(s) affected: 1
268 -- 14.54.46 1 ...(TL_TA000200.TA000200.Step001) (PeopleCode)
270 -- 14.54.46 0 ...(TL_TA000200.TA000200.Step001) (Call Section FUNCLIB_TLTA.STEPMSG)
OK with an awk solution?
{
split($3, time, ".")
if (NR == 1) prev = time[3]
$3 = $3 " " time[3] - prev
prev = time[3]
print
}
output:
$ awk -f time.awk input
246 -- 14.54.43 0 .(TL_TIMEADMIN.MAIN.Step040) (PeopleCode)238 -- Row(s) affected: 1
247 -- 14.54.43 0 Program Skipping Step due to non-zero return code from PeopleCode at TL_TIMEADMIN.MAIN.Step040
249 -- 14.54.43 0 .(TL_TIMEADMIN.MAIN.Step050) (Call Section TL_TIMEADMIN.DISPATCH)
251 -- 14.54.45 2 ..(TL_TIMEADMIN.DISPATCH.Step02a) (PeopleCode)
253 -- 14.54.45 0 ..(TL_TIMEADMIN.DISPATCH.Step02a) (SQL)
266 -- 14.54.45 0 ..(TL_TIMEADMIN.DISPATCH.Step02b) (Call Section TL_TA000200.TA000200)258 -- Row(s) affected: 1
268 -- 14.54.46 1 ...(TL_TA000200.TA000200.Step001) (PeopleCode)
270 -- 14.54.46 0 ...(TL_TA000200.TA000200.Step001) (Call Section FUNCLIB_TLTA.STEPMSG)