python, regex, parsing, python-re

How to isolate timestamp and expression from log file using regex


I have a log file which looks something like this:

04:26:24.664149 [PHY1 ] [I] [ 4198] PUCCH: cc=0; rnti=0x46, f=1a, n_pucch=12, dmrs_corr=0.995, snr=13.2 dB, corr=0.974, ack=1, ta=0.1 us

04:26:24.665067 [PHY0 ] [D] [ 4199] Worker 0 running

04:26:24.665166 [PHY0 ] [D] [ 4199] Sending to radio

04:26:24.666220 [PHY1 ] [I] [ 4200] PUCCH: cc=0; rnti=0x46, f=1, n_pucch=0, dmrs_corr=0.270, snr=-4.3 dB, corr=0.000, sr=no, ta=-9.0 us

04:26:24.666288 [PHY1 ] [D] [ 4200] Sending to radio

04:26:24.667305 [PHY0 ] [I] [ 4201] PUCCH: cc=0; rnti=0x46, f=2, n_pucch=0, dmrs_corr=0.989, snr=15.4 dB, corr=0.998, cqi=15 (cc=0), ta=0.2 us

04:26:24.667338 [MAC ] [D] [ 4201] ra_tbs=72/144, tbs_bytes=15, tbs=144, mcs=2

I want to isolate the lines that contain an snr={value} entry and also capture the timestamp of each such line. The two parts I want to extract are the timestamp at the start of the line and the snr=... field.

I have tried many different regular expressions to extract these two pieces of information from my log file (on the lines where they are present). Note that the snr value can be positive or negative, ranging from -999.9 dB to 999.9 dB. The timestamp is present on every line of the log file.

An example of my expected output would be: 04:26:24.664149 snr=13.2

Any help would be greatly appreciated!


Solution

  • Here is one approach using re.findall:

    import re

    inp = """04:26:24.664149 [PHY1   ] [I] [ 4198] PUCCH: cc=0; rnti=0x46, f=1a, n_pucch=12, dmrs_corr=0.995, snr=13.2 dB, corr=0.974, ack=1, ta=0.1 us
    04:26:24.665067 [PHY0   ] [D] [ 4199] Worker 0 running
    04:26:24.665166 [PHY0   ] [D] [ 4199] Sending to radio
    04:26:24.666220 [PHY1   ] [I] [ 4200] PUCCH: cc=0; rnti=0x46, f=1, n_pucch=0, dmrs_corr=0.270, snr=-4.3 dB, corr=0.000, sr=no, ta=-9.0 us
    04:26:24.666288 [PHY1   ] [D] [ 4200] Sending to radio"""

    # Group 1: the timestamp; group 2: the snr field (optional sign and decimals).
    # [^\r\n]* cannot cross a line break, so both groups come from the same line.
    matches = re.findall(r"(\d{2}:\d{2}:\d{2}\.\d{6})[^\r\n]*(snr=-?\d+(?:\.\d+)?)", inp)
    print(matches)
    

    This prints:

    [('04:26:24.664149', 'snr=13.2'), ('04:26:24.666220', 'snr=-4.3')]
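
If the log is large, it may be more convenient to process it one line at a time and format each hit in the "timestamp snr=value" shape asked for in the question. The following is only a sketch of that idea: the names LINE_RE, extract_snr and sample are made up for illustration, and the pattern assumes the timestamp always sits at the very start of the line.

```python
import re

# Timestamp anchored at the start of the line, then the first snr field
# on that line (optional minus sign, optional decimal part).
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}).*?\bsnr=(-?\d+(?:\.\d+)?)")


def extract_snr(lines):
    """Yield 'timestamp snr=value' for every line that has an snr field."""
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield f"{m.group(1)} snr={m.group(2)}"


# Hypothetical sample, copied from the log excerpt in the question.
sample = [
    "04:26:24.664149 [PHY1 ] [I] [ 4198] PUCCH: cc=0; rnti=0x46, f=1a, n_pucch=12, dmrs_corr=0.995, snr=13.2 dB, corr=0.974, ack=1, ta=0.1 us",
    "04:26:24.665067 [PHY0 ] [D] [ 4199] Worker 0 running",
    "04:26:24.666220 [PHY1 ] [I] [ 4200] PUCCH: cc=0; rnti=0x46, f=1, n_pucch=0, dmrs_corr=0.270, snr=-4.3 dB, corr=0.000, sr=no, ta=-9.0 us",
]

for row in extract_snr(sample):
    print(row)
```

Because extract_snr accepts any iterable of strings, an open file object can be passed in place of sample, so the whole log never has to be held in memory.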