I'd like to extract hostnames and datetimes from a text file using Python. Below is the text; I need to extract the date after 'notAfter=' and the hostname after 'UnitId:' into a dictionary that maps each hostname to its datetime.
- Stdout: |
notAfter=Jun 2 10:15:03 2031 GMT
UnitId: octavia/1
- Stdout: |
notAfter=Jun 2 10:15:03 2031 GMT
UnitId: octavia/0
- Stdout: |
notAfter=Jun 2 10:15:03 2031 GMT
UnitId: octavia/2
A fairly simple regex will do it: notAfter=(.*)\n\s*UnitId: (.*)
The \s* allows for any (or no) indentation before 'UnitId:'.
import re
content = """- Stdout: |
notAfter=Jun 2 10:15:03 2031 GMT
UnitId: octavia/1
- Stdout: |
notAfter=Jun 2 10:15:03 2031 GMT
UnitId: octavia/0
- Stdout: |
notAfter=Jun 2 10:15:03 2031 GMT
UnitId: octavia/2"""
# Each match yields a (datetime, hostname) tuple from two consecutive lines.
results = [{'datetime': dt, 'hostname': host}
           for dt, host in re.findall(r"notAfter=(.*)\n\s*UnitId: (.*)", content)]
print(results)
# [{'datetime': 'Jun 2 10:15:03 2031 GMT', 'hostname': 'octavia/1'},
# {'datetime': 'Jun 2 10:15:03 2031 GMT', 'hostname': 'octavia/0'},
# {'datetime': 'Jun 2 10:15:03 2031 GMT', 'hostname': 'octavia/2'}]
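Since the question asks for a dictionary with the datetime attached to each hostname, the same regex can also feed a dict keyed by hostname. This is a minimal sketch, assuming every timestamp ends in the literal 'GMT' and uses English month abbreviations; the names `PAIR` and `expiry_by_host` are made up for illustration, and the sample text is a shortened copy of the one in the question.

```python
import re
from datetime import datetime, timezone

# Shortened sample mirroring the text in the question.
content = """- Stdout: |
notAfter=Jun 2 10:15:03 2031 GMT
UnitId: octavia/1
- Stdout: |
notAfter=Jun 2 10:15:03 2031 GMT
UnitId: octavia/0"""

# Same pattern as above; \s* tolerates any indentation before 'UnitId:'.
PAIR = re.compile(r"notAfter=(.*)\n\s*UnitId: (.*)")

def expiry_by_host(text):
    """Map each hostname to a timezone-aware datetime (assumes GMT)."""
    result = {}
    for dt_text, host in PAIR.findall(text):
        # 'GMT' is matched literally, then the UTC zone is attached explicitly.
        parsed = datetime.strptime(dt_text.strip(), "%b %d %H:%M:%S %Y GMT")
        result[host.strip()] = parsed.replace(tzinfo=timezone.utc)
    return result

expiries = expiry_by_host(content)
print(expiries["octavia/1"].isoformat())
# 2031-06-02T10:15:03+00:00
```

If two blocks share the same UnitId, the later one overwrites the earlier one in the dictionary, so the list of dicts above is the safer shape when duplicates are possible.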