Tags: python, split, io, readline, modulus

How to read then parse with split and write into a text file?


I'm struggling to get readline() and split() to work together the way I expected. I'm trying to use .split(')') to cut down some data from a text file and write part of that data to a new text file.

I have tried writing everything from the line. I have tried [cnt % 2] to get what I expected.

   line = fp.readline()
   fw = open('output.txt', "w+")
   cnt = 1
   while line:
       print("Line {}: {}".format(cnt, line.strip()))
       line = fp.readline()
       line = line.split(')')[0]
       fw.write(line + "\n")
       cnt += 1

Example from the text file I'm reading from:

WELD 190 Manufacturing I Introduction to MasterCAM (3) 1½ hours lecture - 4½ hours laboratory Note: Cross listed as DT 190/ENGR 190/IT 190 This course will introduce the students to MasterCAM and 2D and basic 3D modeling. Students will receive instructions and drawings of parts requiring 2- or 3-axis machining. Students will design, model, program, set-up and run their parts on various machines, including plasma cutters, water jet cutters and milling machines. WELD 197 Welding Technology Topics (.5 - 3)

I'm very far off from actually effectively scraping this data but I'm trying to get a start.

My goal is to extract only class name and number and remove descriptions.

Thanks as always!


Solution

  • To solve your immediate problem, assuming you are parsing one line at a time, move the second line = fp.readline() call to the end of the while loop. As written, the loop replaces the first line with the next one before parsing it, so parsing actually starts from the second line. It also means that when readline() returns an empty string at the end of the file, that empty string is still split and written out as an extra blank line.

    After the change it would look like this:

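       fp = open('input.txt')  # hypothetical file name; the question does not show how fp is opened
       fw = open('output.txt', "w+")
       line = fp.readline()  # read in the first line
       cnt = 1
       while line:
           print("Line {}: {}".format(cnt, line.strip()))
           line = line.split(')')[0]  # keep only the text before the first ')'
           fw.write(line + "\n")
           cnt += 1
           line = fp.readline()  # read in the next line after parsing is done
       fp.close()
       fw.close()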
    

    Output for your example input text:

    WELD 190 Manufacturing I Introduction to MasterCAM (3
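
Going one step further toward the stated goal (keeping only the class name and number), here is a minimal sketch that uses a regular expression instead of split(')'). It is not part of the original answer. The pattern is a guess based only on the single sample line in the question: an uppercase department code, a course number, a title with no periods or parentheses, then the units in parentheses. The names COURSE_RE, extract_courses and write_course_list are hypothetical, and a real catalog will likely need the pattern adjusted (for example, titles that contain a period would be missed).

```python
import re

# Heuristic pattern inferred from the one sample line in the question (an assumption):
#   department code, course number, title, then units in parentheses.
# The title may not contain '.', '(' or ')', which keeps a match from running
# through a description sentence into the next course entry.
COURSE_RE = re.compile(
    r"\b([A-Z]{2,5}) (\d{1,3}[A-Z]?) ([^().]+?) \(([\d.\s-]+)\)"
)


def extract_courses(text):
    """Return a list of (course code, title) pairs found in text."""
    return [
        (f"{dept} {number}", title.strip())
        for dept, number, title, _units in COURSE_RE.findall(text)
    ]


def write_course_list(in_path, out_path):
    """Read in_path line by line and write one 'CODE Title' line per course."""
    with open(in_path, encoding="utf-8") as fp, \
            open(out_path, "w", encoding="utf-8") as fw:
        for line in fp:  # iterating the file replaces the manual readline() loop
            for code, title in extract_courses(line):
                fw.write(f"{code} {title}\n")


# The sample text from the question.
sample = (
    "WELD 190 Manufacturing I Introduction to MasterCAM (3) "
    "1½ hours lecture - 4½ hours laboratory "
    "Note: Cross listed as DT 190/ENGR 190/IT 190 "
    "This course will introduce the students to MasterCAM and 2D and basic 3D modeling. "
    "WELD 197 Welding Technology Topics (.5 - 3)"
)

for code, title in extract_courses(sample):
    print(code, title)
# WELD 190 Manufacturing I Introduction to MasterCAM
# WELD 197 Welding Technology Topics
```

Opening both files in a with block closes them automatically, and iterating over fp directly avoids the misplaced readline() call altogether. The cross-listed codes (DT 190/ENGR 190/IT 190) are skipped here only because no parenthesized units follow them before the next period, so treat this as a starting point rather than a robust parser.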