python file-processing

Handle Commas While Reading Text?


My code reads every log file under the directory specified in rootDir and writes certain pieces of information from each log file to an outputFile.

The issue I'm having is that searchObj_Archive_date.group(), fullpath, zDiscsVar, zCopiesVar, and searchObj_Year_3or6.group() aren't being written to my output file for certain lines within the log files. This happens for only about 10% of the total outputted lines of text, so I'm confused about why it only happens some of the time. Instead of E:\filepath\text.txt | 5/23/2015 12:00 | C:\anotherFilePath\text.txt | 23 | 23 | 5Year, I get E:\filepath\text.txt | | | | |

Any insight as to why this error is occurring would be greatly appreciated. My code is below.

After doing some research, I found that the error happens whenever a line has a comma in it. Reading seems to stop at that comma and skip to the next line. Does anybody know a workaround for this?

An example of my input text that's giving me problems: 11/23/2015 12:34:58 Adding file D:\fp\fp1\fp2\text, text, text.txt

Normally these lines don't have commas, so does anyone know of a way to handle commas when reading in lines of text?
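To see how the patterns behave on a line like this, the sample above can be fed to the same three regexes in isolation. This is only a minimal sketch: sample is the example line from this post, and the variable names path_match, date_match, and year_match are made up for illustration.

```python
import re

# The problem line from the post, as a raw string so backslashes stay literal.
sample = r"11/23/2015 12:34:58 Adding file D:\fp\fp1\fp2\text, text, text.txt"

# Same patterns and flags as in the script.
path_match = re.search(r'[A-Z]:\\.+', sample, re.M | re.I)
date_match = re.search(r'^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}', sample, re.M | re.I)
year_match = re.search(r'\dyear', sample, re.M | re.I)

print(path_match.group())   # D:\fp\fp1\fp2\text, text, text.txt
print(date_match.group())   # 11/23/2015 12:34:58
print(year_match)           # None
```

On this one sample the path and date patterns match across the commas, while \dyear finds nothing, so calling .group() on that result would raise an AttributeError. The real log lines may differ from the sample, so this narrows down where to look rather than settling the cause.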

import os
import re

fo = open('outputFile', 'w')
fo.write("Col|Col|Col|Col|Col|Col \n")
# 1. walk the directory tree and find the log file in one of the folders
rootDir = "C:\\Users\\"
for path, dirs, files in os.walk(rootDir, topdown=False):
    for filename in files:
        fullpath = os.path.join(path, filename)
        if filename == "text.txt":
            # 2. open the file and read its whole content
            with open(fullpath, 'r') as fi2:
                fi2Content = fi2.read()
            zDiscs = re.search(r'(\sNumber of copies: (\d{1,2}))', fi2Content, re.M|re.I)
            if zDiscs:
                zDiscsVar = str(zDiscs.group(2))
            zCopies = re.search(r'(Number of Discs in Set: (\d{1,2}))', fi2Content, re.M|re.I)
            if zCopies:
                zCopiesVar = str(zCopies.group(2))
            # 3. parse the file line by line and use regex to find PATH
            with open(fullpath, 'r') as fi:
                for line in fi:
                    # 4. write path and info to the outgoing file
                    m = re.search(r'(Adding file(.*))', line)
                    if m:
                        searchObj_Adding_file = re.search(r'[A-Z]:\\.+', line, re.M|re.I)
                        searchObj_Archive_date = re.search(r'^\d{2}\/\d{2}\/\d{4}\s\d{2}:\d{2}:\d{2}', line, re.M|re.I)
                        searchObj_Year_3or6 = re.search(r'\dyear', line, re.M|re.I)
                        if searchObj_Adding_file:
                            fo.write(searchObj_Adding_file.group() + "|")
                            fo.write(searchObj_Archive_date.group() + "|")
                            fo.write(fullpath + "|")
                            fo.write(zDiscsVar + "|")
                            fo.write(zCopiesVar + "|")
                            fo.write(searchObj_Year_3or6.group() + '\n')
# 5. close the output file (the input files are closed by the with blocks)
fo.close()

Solution

  • I removed the commas before searching the line of text. To do this, I inserted lineWoCommas = line.replace(',', '') right after the if m: line.
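The fix can be sketched as a small helper applied to each line. This is an illustration, not the poster's exact code: parse_adding_line and group_or_empty are hypothetical names, and returning an empty string for a pattern that does not match is an added assumption so that a missing field cannot raise an error.

```python
import re

def group_or_empty(match):
    """Return the matched text, or '' when the pattern did not match."""
    return match.group() if match else ''

def parse_adding_line(line):
    """Hypothetical helper: extract (path, date, year) from an 'Adding file' line."""
    if not re.search(r'Adding file', line):
        return None
    # The step from the solution: strip commas before running the searches.
    lineWoCommas = line.replace(',', '')
    path = re.search(r'[A-Z]:\\.+', lineWoCommas, re.I)
    date = re.search(r'^\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}:\d{2}', lineWoCommas)
    year = re.search(r'\dyear', lineWoCommas, re.I)
    return group_or_empty(path), group_or_empty(date), group_or_empty(year)

sample = r"11/23/2015 12:34:58 Adding file D:\fp\fp1\fp2\text, text, text.txt"
print("|".join(parse_adding_line(sample)))
```

Note that stripping commas also changes the path that gets written, so the recorded path no longer matches the real file name on disk whenever that name contains a comma.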