I am trying to parse the output of the command
ip netns exec vpn_ns ipsec stroke statusall
(example pasted below).
The command prints several lines for each service. A service id has the form oof-#n-#i, where #n is the terminator and #i is the instance using that terminator, so
oof-2-1 is instance 1 on terminator server oof-2.
How do I declare a match that collects all the lines prefixed by the same id?
From the example I am trying to get to something like this dict:
results = {
    'connections':
    {
        'oof-1-1': [ 3 lines starting with oof-1-1 in section "Connections" ],
        'oof-1-2': [ 3 lines starting with oof-1-2 in section "Connections" ],
        'oof-2-1': [ 3 lines starting with oof-2-1 in section "Connections" ]
    },
    'sec_assocs':
    {
        'oof-1-1': [ 3 lines starting with oof-1-1 in section "Security Associations" ],
        'oof-1-2': [ 3 lines starting with oof-1-2 in section "Security Associations" ],
        'oof-2-1': [ 3 lines starting with oof-2-1 in section "Security Associations" ]
    }
}
Each id maps to a list of the lines that start with it.
This is the full output from the StrongSwan command.
sample = """
Status of IKE charon daemon (strongSwan 5.9.1, Linux 4.15.0-162-generic, x86_64):
uptime: 25 hours, since Mar 23 15:23:53 2022
worker threads: 11 of 16 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 10
loaded plugins: charon aesni
Listening IP addresses:
169.254.123.2
192.168.51.254
Connections:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-2: remote: [server] uses public key authentication
oof-1-2: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-2-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-2-1: remote: [server] uses public key authentication
oof-2-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
Security Associations:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-2: remote: [server] uses public key authentication
oof-1-2: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-2-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-2-1: remote: [server] uses public key authentication
oof-2-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
"""
And this is the sample that is used in the parsing solution:
sample = """
Connections:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-2: remote: [server] uses public key authentication
oof-1-2: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-2-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-2-1: remote: [server] uses public key authentication
oof-2-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
Security Associations:
oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-1: remote: [server] uses public key authentication
oof-1-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-1-2: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-1-2: remote: [server] uses public key authentication
oof-1-2: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart
oof-2-1: %any...10.1.0.242 IKEv2, dpddelay=30s
oof-2-1: remote: [server] uses public key authentication
oof-2-1: child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd
"""
The most direct way to handle this is to parse the individual lines first and then regroup the parsed data by prefix. Here is the BNF for the structure you are trying to parse:
group ::= label ':' line...
label ::= word...
line ::= prefix ':' rest_of_line
prefix ::= word '-' int '-' int
where word and int are just a Word of alphas or nums, and '...' indicates repetition.
This translates to pyparsing as:
import pyparsing as pp
COLON = pp.Suppress(":")
label = pp.Combine(
    pp.Word(pp.alphas)[1, ...], adjacent=False, joinString=" "
)
prefix = pp.Combine(
    pp.Word(pp.alphas) + "-" + pp.Word(pp.nums) + "-" + pp.Word(pp.nums)
)
post_prefix = COLON + pp.restOfLine
line = pp.Group(prefix("prefix") + post_prefix)
lines = pp.Group(line[...])
group = pp.Group(label("group_label") + COLON + lines("subgroups"))
Pyparsing can also generate a railroad diagram of this grammar for you; the create_diagram call in the full source below writes it to an HTML file.
This parses your text, but to regroup the lines by their prefixes, we can add a parse action that uses itertools.groupby:
def regroup_lines(t):
    from itertools import groupby
    from operator import itemgetter

    ret = pp.ParseResults([])
    parsed_lines = t[0]
    for prefix, subgroup in groupby(parsed_lines, key=itemgetter("prefix")):
        # each line in subgroup has the prefix and the rest of the line after the ':'
        # repackage the multiple lines into a single group that is labeled with
        # the common prefix, and contains the line contents
        ret.append(pp.ParseResults.from_dict(
            {
                'prefix': prefix,
                'lines': [line[1] for line in subgroup],
            }
        ))
    return ret
lines.add_parse_action(regroup_lines)
By using a parse action, the regrouping is done at parse time, so no additional post-parsing processing is needed.
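One thing to keep in mind is that itertools.groupby only merges adjacent items with the same key. That matches the sample, where the lines for each id are printed together, but it would split an id into several groups if its lines were ever interleaved with another id's. The following is a small stand-alone sketch of that behavior, using made-up (prefix, rest) tuples instead of the parsed lines; the helper names are only for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (prefix, rest) pairs standing in for the parsed lines.
parsed_lines = [
    ("oof-1-1", "first"),
    ("oof-1-1", "second"),
    ("oof-1-2", "third"),
    ("oof-1-1", "fourth"),  # same prefix again, but not adjacent
]


def group_consecutive(pairs):
    """Group adjacent pairs that share a prefix, the way groupby does."""
    return [
        (prefix, [rest for _, rest in run])
        for prefix, run in groupby(pairs, key=itemgetter(0))
    ]


def group_all(pairs):
    """Merge every pair with the same prefix, keeping first-seen order."""
    merged = {}
    for prefix, rest in pairs:
        merged.setdefault(prefix, []).append(rest)
    return merged


consecutive = group_consecutive(parsed_lines)
merged = group_all(parsed_lines)
print(consecutive)  # "oof-1-1" appears twice here
print(merged)       # "oof-1-1" appears once here
```

If the command output could ever interleave ids, a dict-based merge like group_all would be the safer choice inside the parse action.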
Now we can parse your sample and get the regrouped results:
results = group[...].parseString(sample)
Here is a short function to print out the parsed groups:
def print_groups(parsed):
    for group in parsed:
        print(group.group_label)
        for subgroup in group.subgroups:
            print(f"- {subgroup.prefix}")
            for line in subgroup.lines:
                print(f" {line!r}")
        print()
print_groups(results)
Which gives:
Connections
- oof-1-1
' %any...10.1.0.242 IKEv2, dpddelay=30s'
' remote: [server] uses public key authentication'
' child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart'
- oof-1-2
' %any...10.1.0.242 IKEv2, dpddelay=30s'
' remote: [server] uses public key authentication'
' child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart'
- oof-2-1
' %any...10.1.0.242 IKEv2, dpddelay=30s'
' remote: [server] uses public key authentication'
' child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd'
Security Associations
- oof-1-1
' %any...10.1.0.242 IKEv2, dpddelay=30s'
' remote: [server] uses public key authentication'
' child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart'
- oof-1-2
' %any...10.1.0.242 IKEv2, dpddelay=30s'
' remote: [server] uses public key authentication'
' child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart'
- oof-2-1
' %any...10.1.0.242 IKEv2, dpddelay=30s'
' remote: [server] uses public key authentication'
' child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restartd'
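To get from these grouped results to the nested dict asked for in the question, a short conversion along the following lines should work. This is a sketch, not part of pyparsing: to_dict and SECTION_KEYS are names made up here, and the SimpleNamespace objects are stand-ins that only mimic the attribute names (group_label, subgroups, prefix, lines) used by print_groups, so that the snippet runs on its own.

```python
from types import SimpleNamespace

# Hypothetical mapping from the section headings to the keys used in the question.
SECTION_KEYS = {
    "Connections": "connections",
    "Security Associations": "sec_assocs",
}


def to_dict(parsed):
    """Build {section_key: {prefix: [lines]}} from the grouped results."""
    out = {}
    for group in parsed:
        # Fall back to the raw label for any section not listed above.
        key = SECTION_KEYS.get(group.group_label, group.group_label)
        section = out.setdefault(key, {})
        for subgroup in group.subgroups:
            # The captured rest of each line keeps the blank after ':', so strip it.
            section.setdefault(subgroup.prefix, []).extend(
                line.strip() for line in subgroup.lines
            )
    return out


# Stand-in data shaped like the parsed results (shortened for the example).
demo = [
    SimpleNamespace(
        group_label="Connections",
        subgroups=[
            SimpleNamespace(
                prefix="oof-1-1",
                lines=[
                    " %any...10.1.0.242 IKEv2, dpddelay=30s",
                    " remote: [server] uses public key authentication",
                ],
            ),
        ],
    ),
    SimpleNamespace(
        group_label="Security Associations",
        subgroups=[
            SimpleNamespace(
                prefix="oof-2-1",
                lines=[" child: dynamic === 0.0.0.0/0 TUNNEL, dpdaction=restart"],
            ),
        ],
    ),
]

as_dict = to_dict(demo)
print(as_dict)
```

With the real parse, the call would be to_dict(results), since the function reads the results through the same attributes as print_groups.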
Here is the full source for the working example:
import pyparsing as pp
COLON = pp.Suppress(":")
label = pp.Combine(pp.Word(pp.alphas)[1, ...], adjacent=False, joinString=" ")
label.setName("label")
prefix = pp.Combine(pp.Word(pp.alphas) + "-" + pp.Word(pp.nums) + "-" + pp.Word(pp.nums))
prefix.setName("prefix")
post_prefix = COLON + pp.restOfLine
line = pp.Group(prefix("prefix") + post_prefix)
lines = pp.Group(line[...])
def regroup_lines(t):
    from itertools import groupby
    from operator import itemgetter

    ret = pp.ParseResults([])
    for prefix, subgroup in groupby(t[0], key=itemgetter("prefix")):
        ret.append(pp.ParseResults.from_dict(
            {
                'prefix': prefix,
                'lines': [line[1] for line in subgroup],
            }
        ))
    return ret
lines.add_parse_action(regroup_lines)
group = pp.Group(label("group_label") + COLON + lines("subgroups"))
pp.autoname_elements()
group.create_diagram("groupby_1.html", show_results_names=True)
results = group[...].parseString(sample)
def print_groups(parsed):
    for group in parsed:
        print(group.group_label)
        for subgroup in group.subgroups:
            print(f"- {subgroup.prefix}")
            for line in subgroup.lines:
                print(f" {line!r}")
        print()
print_groups(results)
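Note that the grammar only describes the "Connections" and "Security Associations" sections, which is why the trimmed sample is used. The full command output starts with the daemon status header, which the label expression does not match, so parsing it directly would most likely give you no groups at all. One way to trim the full output first is sketched below, assuming the "Connections:" heading always sits on a line of its own; trim_to_sections is a helper name invented for this example.

```python
import re


def trim_to_sections(text, first_heading="Connections:"):
    """Return the text starting at the line that holds only the heading."""
    pattern = rf"^[ \t]*{re.escape(first_heading)}[ \t]*$"
    match = re.search(pattern, text, flags=re.MULTILINE)
    if match is None:
        raise ValueError(f"heading {first_heading!r} not found")
    return text[match.start():]


# Shortened copy of the full output, for illustration only.
full_output = """\
Status of IKE charon daemon (strongSwan 5.9.1, Linux 4.15.0-162-generic, x86_64):
  uptime: 25 hours, since Mar 23 15:23:53 2022
Listening IP addresses:
  169.254.123.2
Connections:
  oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
Security Associations:
  oof-1-1: %any...10.1.0.242 IKEv2, dpddelay=30s
"""

trimmed = trim_to_sections(full_output)
print(trimmed)
```

The trimmed text can then be passed to group[...].parseString in place of the hand-edited sample.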