I just posted a question about using grep on a multi-line shell variable ("grep multiline shell variable from output of executable file"), but I realized that what I need is slightly different.
What I am trying to do is this: I have a grep/awk result (I'll call it result1):
blahblah ID1 blahblah aaa
blahblah ID2 blahblah bbb
blahblah ID3 blahblah ccc
...
blahblah ID(m) blahblah mmm
blahblah ID(n) blahblah nnn
And I have another awk result from the output of a command (run | awk ~~~) (I'll call it result2):
ID1 (some sentence 1)
ID2 (some sentence 2)
ID3 (some sentence 3)
...
IDn (some sentence n)
I want to take the IDs (ID1 to IDn) and the last field (aaa to nnn) from result1, and append that last field to the matching line of result2. The output I want looks like this:
ID1 (sentence) aaa
ID2 (sentence) bbb
...
IDn (sentence) nnn
I somehow succeeded in getting
ID1 aaa
ID2 bbb
from result1, so I have only the IDs that also appear in result2. But I have no idea how to split these lines and attach each value to the matching line of result2 (ID1 with aaa, ID2 with bbb, and so on) so that I get
ID1 (sentence) aaa
ID2 (sentence) bbb
...
IDn (sentence) nnn
or something like this.
Also, ID1 to IDn may not always be in order.
Assumptions:
- result1 has space-separated columns, and the strings aaa ... nnn are in the last column.
- IDn in result1 consists of the literal string ID followed by digits.
- IDn in result2 is located in the first column.

Then would you please try:
awk '
    NR==FNR {
        if (match($0, /ID[0-9]+/)) {
            id = substr($0, RSTART, RLENGTH)
            a[id] = $NF
        }
        next
    }
    {
        print $0, a[$1]
    }
' result1 result2
- The NR==FNR { .. ; next } block is an idiom that is executed only for the file given as the first argument (result1 in this case).
- match($0, /ID[0-9]+/) returns true if a substring in the record matches the string ID followed by digits, setting the awk variables RSTART and RLENGTH to the starting position and the length of the match, respectively.
- substr($0, RSTART, RLENGTH) extracts the substring IDn, where n is the digits.
- a[id] = $NF associates the last field (e.g. aaa) with the id.
- The { print $0, a[$1] } block is executed for result2 only.

If result1 is the output of command1 .. and result2 is the output of command2 .., you can say:
awk '
(the same lines as above)
' <(command1 ..) <(command2 ..)
instead of specifying the filenames.
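
To check the script against the out-of-order case mentioned in the question, here is a minimal sketch with made-up sample data. The function name join_by_id, the temporary directory, and the sample lines are all hypothetical and only for illustration; the awk program itself is the same as above. It assumes every ID in result2 also appears in result1 and that result1 is not empty.

```shell
#!/bin/sh
# Hypothetical helper: append the last field of the matching result1 line
# to each line of result2, matching on the ID.
join_by_id() {
    awk '
        NR == FNR {                          # first file only (result1)
            if (match($0, /ID[0-9]+/)) {
                id = substr($0, RSTART, RLENGTH)
                a[id] = $NF                  # remember the last field for this ID
            }
            next
        }
        { print $0, a[$1] }                  # second file (result2)
    ' "$1" "$2"
}

tmpdir=$(mktemp -d)

# Made-up sample data; result1 is deliberately not in ID order.
printf '%s\n' \
    'blahblah ID2 blahblah bbb' \
    'blahblah ID3 blahblah ccc' \
    'blahblah ID1 blahblah aaa' > "$tmpdir/result1"
printf '%s\n' \
    'ID1 (some sentence 1)' \
    'ID2 (some sentence 2)' \
    'ID3 (some sentence 3)' > "$tmpdir/result2"

out=$(join_by_id "$tmpdir/result1" "$tmpdir/result2")
printf '%s\n' "$out"

rm -rf "$tmpdir"
```

Because the lookup goes through the array a, the order of lines in result1 does not matter; the output follows the order of result2. In bash, the same function should also accept process substitutions in place of the two filenames.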