Python: update an epoch value with an additional time value


I'm processing the data below in Python. The first four fields are separated by "|", and the fields from the fifth onward are separated by spaces.

VER:1|long=|lat=|device=D3052|eventid=31007311 status=Active time=1528496310749 priority=1 desitnationHost= group=cluster1
VER:1|long=|lat=|device=D3010|eventid=31007312 status=Active time=1528496310765 priority=1 desitnationHost= group=cluster1
VER:1|long=|lat=|device=D3094|eventid=31007313 status=Active time=1528496315380 priority=1 desitnationHost= group=cluster1
VER:1|long=|lat=|device=D3052|eventid=31007314 status=Active time=1528496317513 priority=1 desitnationHost= group=cluster1
VER:1|long=|lat=|device=D3010|eventid=31007315 status=Active time=1528496329604 priority=1 desitnationHost= group=cluster1 

The time field contains an epoch timestamp in milliseconds, and I need to advance this value by one year.

The data is spread across multiple text files in one directory, so each file has to be read and processed line by line.

My approach in Python:

#import required python library
import os
import re

#read a text file (later need to loop through multiple text files)
h = open('C:/directory/new_1.txt', 'r')
  
# Reading from the file 
content = h.readlines()
  
# Iterating through the content 
# Of the file 
for line in content:
    milli_second_in_year = 31536000000
    l = re.sub(r'time=(\d+)',r'\1d','milli_second_in_year')
    print(l)

With the approach above, I cannot add milli_second_in_year to the extracted time value.

I tried the changes below, but I still cannot get the expected output:

for line in content:
    m = re.search(r'time=(\d+)',line)    
    match = m.group(1)
    match = int(match)+31536000000
    print(match)

This gives me the desired time value, but I am unable to write it back to the file.

Expected output (updated time values):

VER:1|long=|lat=|device=D3052|eventid=31007311 status=Active time=1560032310749 priority=1 desitnationHost= group=cluster1
VER:1|long=|lat=|device=D3010|eventid=31007312 status=Active time=1560032310765 priority=1 desitnationHost= group=cluster1
VER:1|long=|lat=|device=D3094|eventid=31007313 status=Active time=1560032315380 priority=1 desitnationHost= group=cluster1
VER:1|long=|lat=|device=D3052|eventid=31007314 status=Active time=1560032317513 priority=1 desitnationHost= group=cluster1
VER:1|long=|lat=|device=D3010|eventid=31007315 status=Active time=1560032329604 priority=1 desitnationHost= group=cluster1

Solution

  • If I understood correctly what you want to do, you can do something like this:

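    import re

    milli_second_in_year = 31536000000  # 365 days in milliseconds
    with open('C:/directory/new_1.txt', 'r') as f:
        with open('C:/directory/new_1_adapted.txt', 'w') as fnew:
            for line in f:
                # extract the digits of the time field
                m = re.search(r'time=(\d+)', line)
                time_value = m.group(1)
                new_time_value = str(int(time_value) + milli_second_in_year)
                # replace only the time field, not other fields containing the same digits
                newline = line.replace('time=' + time_value, 'time=' + new_time_value, 1)
                fnew.write(newline)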
    

    A couple of things to note:

    • using a context manager to open files (with open...) ensures that the files are always closed correctly
    • there is no need to use readlines: you can iterate over the lines directly using the file handle
    • I'm not sure whether you want to overwrite the same file. If so, you have to either write to another file first, then delete the original and rename the new one, or collect the lines in a list and write them back after the file is closed (I added a version below)
    • your usage of re.sub is incorrect: its arguments are the pattern, the replacement, and then the string to search, but you passed the literal string 'milli_second_in_year' as the string to search instead of line. Look up the documentation if you want to use it (I didn't here)
    • I did not add any error handling: if a line has no time field, m is None and m.group(1) raises an AttributeError
    • lastly: I haven't tested this, so it may have bugs...
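
    If you do want to use re.sub, one possible approach is to pass a function as the replacement, so the captured digits can be converted to an integer and incremented. This is only a sketch: the names shift_time, bump, TIME_FIELD and MILLIS_PER_YEAR are made up for illustration, and the \b in the pattern is an assumption to avoid matching a longer key that merely ends in "time". Lines without a time field come back unchanged.

```python
import re

MILLIS_PER_YEAR = 31536000000  # 365 days in milliseconds, as in the question

# Matches the time field and captures its digits (hypothetical helper pattern).
TIME_FIELD = re.compile(r'\btime=(\d+)')


def shift_time(line, offset_ms=MILLIS_PER_YEAR):
    """Return line with the value of its time= field increased by offset_ms."""
    def bump(match):
        # match.group(1) holds the digits captured by (\d+)
        return 'time=' + str(int(match.group(1)) + offset_ms)
    # re.sub calls bump once per match and inserts its return value
    return TIME_FIELD.sub(bump, line)


sample = ('VER:1|long=|lat=|device=D3052|eventid=31007311 status=Active '
          'time=1528496310749 priority=1 desitnationHost= group=cluster1')
print(shift_time(sample))
```

    Because the replacement is computed from the match itself, other fields that happen to contain the same digits are left alone.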

    Here is a version that will overwrite the same file:

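    import re

    milli_second_in_year = 31536000000  # 365 days in milliseconds
    file_path = 'C:/directory/new_1.txt'
    new_lines = []
    # first pass: read the file and build the updated lines in memory
    with open(file_path, 'r') as f:
        for line in f:
            m = re.search(r'time=(\d+)', line)
            time_value = m.group(1)
            new_time_value = str(int(time_value) + milli_second_in_year)
            # replace only the time field, not other fields containing the same digits
            new_line = line.replace('time=' + time_value, 'time=' + new_time_value, 1)
            new_lines.append(new_line)

    # second pass: overwrite the original file with the updated lines
    with open(file_path, 'w') as f:
        f.writelines(new_lines)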
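
    The question also mentions several text files in one directory. A rough sketch of how that could look, combined with the "write to another file, then rename" option, is shown here. The function names (shift_line, shift_file, shift_directory) are hypothetical, and the sketch assumes the files end in .txt and that lines without a time field should be copied unchanged.

```python
import os
import re
import tempfile

MILLIS_PER_YEAR = 31536000000  # 365 days in milliseconds, as in the question


def shift_line(line, offset_ms=MILLIS_PER_YEAR):
    """Return line with its time= value increased by offset_ms."""
    m = re.search(r'\btime=(\d+)', line)
    if m is None:
        return line  # assumption: lines without a time field stay as they are
    new_value = str(int(m.group(1)) + offset_ms)
    # splice the new digits in at the exact position of the old ones
    return line[:m.start(1)] + new_value + line[m.end(1):]


def shift_file(path, offset_ms=MILLIS_PER_YEAR):
    """Rewrite one file via a temporary file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as dst, open(path, 'r') as src:
            for line in src:
                dst.write(shift_line(line, offset_ms))
        # swap the finished temporary file in place of the original
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)  # do not leave a half-written temp file behind
        raise


def shift_directory(directory, offset_ms=MILLIS_PER_YEAR):
    """Apply shift_file to every .txt file directly inside directory."""
    for name in sorted(os.listdir(directory)):
        if name.endswith('.txt'):
            shift_file(os.path.join(directory, name), offset_ms)
```

    With the path from the question, this would be called as shift_directory('C:/directory'). Writing to a temporary file first means the original is only replaced once the new content has been written completely.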