
Trying to copy/paste text between Start and End points, transpose, and swap data points


I have some code that does a copy/paste from a large file into the parsed file that I need. Here is a working script.

with open('C:\\Users\\Excel\\Desktop\\test_in.txt') as infile, open('C:\\Users\\Excel\\Desktop\\test_out.txt', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
        elif line.strip() == "End":
            copy = False
        elif copy:
            outfile.write(line)

Now I am trying to figure out how to transpose each block of text and swap adjacent data points multiple times. Maybe this would require a data frame; I'm not really sure.

Here is a before image.

[image: before]

Here is an after image.

[image: after]

Here is my sample text.

file name
file type
file size
Start
        - data_type: STRING
          name: Operation
        - data_type: STRING
          name: SNL_Institution_Key
        - data_type: INTEGER
          name: SNL_Funding_Key
End
        - data_type: STRING
          name: Operation
        - data_type: STRING
          name: SNL_Institution_Key
        - data_type: INTEGER
          name: SNL_Funding_Key
Start
        - data_type: STRING
          name: SEDOL_NULL
        - data_type: STRING
          name: Ticker
        - data_type: DATETIME
          name: Date_of_Closing_Price
End 

It seems to me that this would be pretty hard to do in Python. If it's too difficult to do all of this, please let me know. Python may not be the right tool for the job. I don't know enough about Python to say for sure if this is the right approach or not. Thanks for your time.


Solution

  • Split each line at the colon, then join the two parts in reverse order. I added a few flags to reproduce the punctuation exactly as in your file, although for mid-sized data I usually just iterate with several regex or string replacements.

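    with open('C:\\Users\\Excel\\Desktop\\test_in.txt') as infile, open('C:\\Users\\Excel\\Desktop\\test_out.txt', 'w') as outfile:
        file_start = True   # no newline before the first block
        line_start = True   # no ", " before the first field of a block
        copy = False        # True only between "Start" and "End"
        # skip the three header lines (file name, file type, file size)
        next(infile)
        next(infile)
        next(infile)
        for line in infile:
            if line.strip() == "Start":
                if file_start:
                    file_start = False  # write nothing first time
                else:
                    outfile.write('\n')
                line_start = True  # starting new line in the output file
                copy = True
            elif line.strip() == "End":
                copy = False
            elif copy:
                if not line_start:
                    outfile.write(", ")

                line_start = False

                # drop surrounding whitespace (including the newline) and the leading "- "
                line = line.strip().lstrip("- ")
                s = line.split(": ")
                outfile.write(": ".join(s[::-1]))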
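The regex route mentioned above could look roughly like the sketch below. It is only an illustration, not the code from the answer: the function name `transpose_blocks`, the pattern `FIELD_RE`, and the shortened sample text are made up for this example, and it assumes every data line has the form `key: value` with an optional leading `- `. It works on a string rather than a file so it can be tried without touching the disk.

```python
import re

# Hypothetical pattern: optional leading "- ", then "key: value".
FIELD_RE = re.compile(r"^\s*(?:-\s*)?(?P<key>[^:]+):\s*(?P<value>.*?)\s*$")


def transpose_blocks(text):
    """Return one output line per Start/End block, with each 'key: value' swapped."""
    rows = []
    current = None  # swapped fields of the block being read; None outside a block
    for raw in text.splitlines():
        stripped = raw.strip()
        if stripped == "Start":
            current = []
        elif stripped == "End":
            if current is not None:
                rows.append(", ".join(current))
            current = None
        elif current is not None:
            match = FIELD_RE.match(raw)
            if match:  # lines that do not look like "key: value" are skipped
                current.append(f"{match['value']}: {match['key']}")
    # a block that is never closed by "End" is dropped
    return "\n".join(rows)


# Shortened, made-up sample in the same shape as the question's text.
sample = """file name
Start
        - data_type: STRING
          name: Operation
End
        - data_type: STRING
          name: Ignored
Start
        - data_type: DATETIME
          name: Date_of_Closing_Price
End
"""

print(transpose_blocks(sample))
# STRING: data_type, Operation: name
# DATETIME: data_type, Date_of_Closing_Price: name
```

To use it on the real file, read the whole input with `infile.read()`, pass the text to `transpose_blocks`, and write the returned string to the output file. Because it only collects lines while inside a block, the three header lines do not need to be skipped separately.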