python, for-loop, if-statement, lines, text-processing

If statements inside a for loop over lines in a file


I have a file data.txt:

./path1
 * WITH LDLDLDLDLDLD                 *
  KDKDKDKDKD
  LDLDLDLDLDLDLDLD
  LDFLFLFLFLFLFLF
['-2.6993']
['-2.6983']
['-2.4490']
  LSLSLSLSLSL
['-2.6993']
['-2.6983']
['-2.4490']
  KKKGKGKGKGKGKGKG
['-79.7549']
  LDLDLDLDLDLDLDLDL
['-126.6208']
['-93.9881']
  KDKDKDKDKDKDKDKD
['-156.9296']
['-135.3548']
  LDLDLDLDDLDDLDLDLD
['-178.3941']
['-162.8602']
['-42.7064']
  KDKDKDKDKDLDLDLDLDLD
['-193.3335']
['-181.9782']
['-68.6555']

./path2
 * WITH DLLDLDLDLDLLDLD                 *
  LDLDLDLDLDLDLD
  BEBEBEBEBEBEL
  LSLSLSLSLSLSL
['-2.6993']
['-2.6983']
['-2.4490']
  OSOSOSOSOSOSOSOS
['-2.6993']
['-2.6983']
['-2.4490']
  KDKDKDKDKDKDKDKDKD
['-156.9296']
['-135.3548']
  MDMDMDMDMDMDDMDM
['-178.3941']
['-162.8602']
['-42.7064']
  KFKFKFKFPKLDLDLD
['-193.3335']
['-181.9782']
['-105.4751']
['-96.2342']

From this file I would like to print each path, followed by the number of negative values listed under that path.

The following code is my attempt:

import re
import os
import numpy as np

f = open('data.txt', 'r')
All_aux = []

for line in f:
    if re.match(r"^\.", line):
        print line

    if re.match(r"^\[", line):
        target2 = line.translate(None, "[]'',")
        aux = target2.split()
        All_aux.append(aux)
        flat_list = [item for sublist in All_aux for item in sublist]

print 'len(negatives) = ' , len(flat_list)

But the information printed is the following:

./path1

./path2

len(negatives) =  32

Once the first if re.match(r"^\.", line): is matched, the path is printed, but the count of the 17 negative values under it is not. Instead, those values are kept and added to the 15 negative values found under the second path, so a single total of 32 is printed at the end.

I would like to obtain the following:

./path1

len(negatives) =  17

./path2

len(negatives) =  15

Is there a way to achieve this?


Solution

  • This is what I meant by the comment. I have also made a few other improvements, for example using a string method, which is generally simpler and more efficient than a regular expression.

    After a little thought and discussion with @tripleee, I have dispensed with flat_list, since all you were doing with it was taking its length.

    I have commented the code, but please ask if anything is unclear:

    # None of the imports are required
    # We only need a count
    negatives = 0
    
    # Previously you were not closing the file
    # This closes it automagically
    with open('data.txt', 'r') as f:
        for line in f:
            # No need for a regular expression here
            if line.startswith("./"):
                if negatives:
                    print 'len(negatives) = ' , negatives, '\n'
                    negatives = 0
                print line
    
            # Used `elif` since both `startswith` tests can't be true for the same line
            elif line.startswith("["):
                target2 = line.translate(None, "[]'',")
                # simplified
                negatives += len(target2.split())
    
    if negatives:
        print 'len(negatives) = ' , negatives
    

    This gives:

    ./path1
    
    len(negatives) =  17
    
    ./path2
    
    len(negatives) =  15
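
For Python 3, where `print` is a function and `str.translate` no longer accepts `None` as its first argument, the same idea could be sketched roughly as follows. The helper name `count_negatives_per_path` and the shortened `sample` lines are made up for illustration. Unlike the code above, this version also reports a path that has no values under it, and it returns the counts instead of printing them inside the loop.

```python
def count_negatives_per_path(lines):
    """Return a list of (path, count) pairs, in the order the paths appear."""
    # Python 3 replacement for line.translate(None, "[]'',"):
    # the third argument of str.maketrans lists characters to delete
    delete_chars = str.maketrans("", "", "[]',")

    results = []
    current_path = None
    count = 0
    for line in lines:
        if line.startswith("./"):
            # A new path begins: store the finished one and reset the counter
            if current_path is not None:
                results.append((current_path, count))
            current_path = line.strip()
            count = 0
        elif line.startswith("["):
            count += len(line.translate(delete_chars).split())

    # Store the last path, which has no following "./" line to trigger it
    if current_path is not None:
        results.append((current_path, count))
    return results


# Hypothetical, shortened stand-in for the lines of data.txt
sample = [
    "./path1\n",
    " * WITH LDLDLDLDLDLD                 *\n",
    "  KDKDKDKDKD\n",
    "['-2.6993']\n",
    "['-2.6983']\n",
    "  LSLSLSLSLSL\n",
    "['-2.4490']\n",
    "\n",
    "./path2\n",
    "  LDLDLDLDLDLDLD\n",
    "['-156.9296']\n",
]

for path, count in count_negatives_per_path(sample):
    print(path)
    print("len(negatives) = ", count)
```

To run it on the real file, pass the open file object instead of the list, for example `with open('data.txt') as f: results = count_negatives_per_path(f)`.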