
If a line starts with a specific string, print only the IP addresses contained in that line using awk, one per line


I have the following text output from a command (cmdagent -i) that I am parsing with awk:

Component: McAfee Agent  
AgentMode: 1  
Version: 5.0.6.491  
GUID: f0bcc8de-1aa6-00a4-01b9-00505af06706  
TenantId: N/A  
LogLocation: /var/McAfee/agent/logs  
InstallLocation: /opt/McAfee/agent  
CryptoMode: 0  
DataLocation: /var/McAfee/agent  
EpoServerList: 10.0.25.15|epo1|epo1.example.com|10.0.25.20|epo2|epo2.example.com 
EpoPortList: 443  
EpoServerLastUsed: 10.0.25.15  
LastASCTime: N/A  
LastPolicyUpdateTime: 0  
EpoVersion: 5.3.1  
Component: McAfee Agent

I would like to match on the line that starts with the string "EpoServerList", and print only the IP addresses contained within this line, using only 1 awk command.

If I use two awk commands, I can make it work, but I know it can be done with only one.

For example:

# ./cmdagent -i | awk '/^EpoServerList/' | awk -v RS='([0-9]+\\.){3}[0-9]+' 'RT{print RT}'

Which gives the following (desired) output:

10.0.25.15
10.0.25.20

I've tried the following so far:

# ./cmdagent -i | awk -v RS='([0-9]+\\.){3}[0-9]+' '$0 ~ /^EpoServerList/ RT{print RT}'

Which doesn't return any matches.

And

./cmdagent -i | awk -v RS='([0-9]+\\.){3}[0-9]+' '$0 ~ /^EpoServerList/; RT{print RT}'

Which returns the version number from an un-wanted line:

 5.0.6.491
 10.0.25.15
 10.0.25.20

It seems to ignore "/^EpoServerList/", which I am trying to use as the criterion to exclude the line containing the version string "5.0.6.491".

How can I match on EpoServerList and still use the record separator with the regular expression to match and print the IP addresses using just 1 awk statement?

This is GNU Awk 4.0.2 on RHEL 7 x86_64, using the bash shell.


Solution

  • First match lines, then iterate over fields matching the IPv4 pattern:

    ./cmdagent -i | awk -F '[|: ]' '/^EpoServerList: / { for (i = 1; i <= NF; i++) if ($i ~ /^([0-9]+\.){3}[0-9]+$/) print $i }'
    

    Splitting on "|", ":" and space turns the label into a field of its own, so it never matches the IP address pattern and is skipped.
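
    To see how the line is split, here is a minimal sketch that prints every field with its index. The helper name show_fields is hypothetical, and the sample input is copied from the question rather than read from cmdagent:

    ```shell
    # show_fields is a hypothetical helper, not part of the original answer.
    # It prints each field of the EpoServerList line with its index,
    # using the same field separator as the awk command above.
    show_fields() {
        awk -F '[|: ]' '/^EpoServerList: / { for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
    }

    # Sample lines taken from the question's output.
    printf '%s\n' 'Version: 5.0.6.491' \
        'EpoServerList: 10.0.25.15|epo1|epo1.example.com|10.0.25.20|epo2|epo2.example.com' |
        show_fields
    ```

    Field 1 is the label and field 2 is empty, because the colon and the space each count as a separator; the two addresses end up in fields 3 and 6.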

    Many here will argue that a single awk beats grep plus another tool, but I find the following combination easier to read at a glance:

    ./cmdagent -i | grep '^EpoServerList: ' | grep -oP '([0-9]+\.){3}[0-9]+'
    

    Note that -o and -P here are GNU extensions, which I'm using because you are on RHEL.
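
    If a single awk process without field splitting is preferred, another option is to keep the default record separator and pull the addresses out of the matched line with a match() loop. This is a sketch under assumptions, not the method above: extract_epo_ips is a hypothetical name, and the pattern is written out without the {3} interval so that it should also run on awk builds that lack interval support:

    ```shell
    # extract_epo_ips is a hypothetical helper name used for illustration.
    # For each line starting with "EpoServerList", repeatedly find the leftmost
    # dotted-quad match, print it, and continue with the remainder of the line.
    extract_epo_ips() {
        awk '/^EpoServerList/ {
            rest = $0
            while (match(rest, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)) {
                print substr(rest, RSTART, RLENGTH)
                rest = substr(rest, RSTART + RLENGTH)
            }
        }'
    }

    # Sample input taken from the question; in practice: ./cmdagent -i | extract_epo_ips
    printf '%s\n' 'Version: 5.0.6.491' \
        'EpoServerList: 10.0.25.15|epo1|epo1.example.com|10.0.25.20|epo2|epo2.example.com' \
        'EpoServerLastUsed: 10.0.25.15' |
        extract_epo_ips
    ```

    The pattern only checks the dotted-quad shape; it does not verify that each octet is in the range 0-255.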