I have some code that does a copy/paste from a large file into the parsed file that I need. Here is a working script.
with open('C:\\Users\\Excel\\Desktop\\test_in.txt') as infile, open('C:\\Users\\Excel\\Desktop\\test_out.txt', 'w') as outfile:
    copy = False
    for line in infile:
        if line.strip() == "Start":
            copy = True
        elif line.strip() == "End":
            copy = False
        elif copy:
            outfile.write(line)
Now I am trying to figure out how to transpose each block of text and swap adjacent data points multiple times. Maybe this would require a data frame; I'm not really sure.
Here is a before image.
Here is an after image.
Here is my sample text.
file name
file type
file size
Start
- data_type: STRING
name: Operation
- data_type: STRING
name: SNL_Institution_Key
- data_type: INTEGER
name: SNL_Funding_Key
End
- data_type: STRING
name: Operation
- data_type: STRING
name: SNL_Institution_Key
- data_type: INTEGER
name: SNL_Funding_Key
Start
- data_type: STRING
name: SEDOL_NULL
- data_type: STRING
name: Ticker
- data_type: DATETIME
name: Date_of_Closing_Price
End
It seems to me that this would be pretty hard to do in Python. If it's too difficult to do all of this, please let me know. Python may not be the right tool for the job. I don't know enough about Python to say for sure if this is the right approach or not. Thanks for your time.
Split each line on the colon, then join the pieces in a different order. I added a few flags to reproduce the punctuation exactly as in your file, though for mid-sized data I usually just iterate with a few regex or string replacements.
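with open('C:\\Users\\Excel\\Desktop\\test_in.txt') as infile, open('C:\\Users\\Excel\\Desktop\\test_out.txt', 'w') as outfile:
    file_start = True
    line_start = True
    # skip the three header lines (file name, file type, file size)
    next(infile)
    next(infile)
    next(infile)
    for line in infile:
        line = line.strip()  # drop the trailing newline and surrounding spaces
        if line == "Start":
            if file_start:
                file_start = False  # write nothing first time
            else:
                outfile.write('\n')
            line_start = True  # starting new line in the output file
        elif line != "End":
            if not line_start:
                outfile.write(", ")
            line_start = False
            line = line.strip(" -")  # remove the leading "- " list marker
            s = line.split(": ")
            outfile.write(": ".join(s[::-1]))  # swap the two sides of the colon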
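If you want to try the same idea without touching files, here is a minimal sketch that works on any iterable of lines. It is not the only way to do this: `transpose_blocks` is a name I made up, and I am assuming you only want the lines between `Start` and `End` (as in your own script) and that every data line contains a single `": "` separator.

```python
def transpose_blocks(lines):
    """Yield one comma-separated row per Start/End block, with each
    'key: value' pair swapped to 'value: key'."""
    fields = None  # None means we are currently outside a Start/End block
    for raw in lines:
        line = raw.strip()
        if line == "Start":
            fields = []  # begin collecting a new block
        elif line == "End":
            if fields is not None:
                yield ", ".join(fields)
            fields = None  # ignore everything until the next Start
        elif fields is not None and line:
            # drop the leading "- " marker, then split on the first ": "
            key, _, value = line.lstrip("- ").partition(": ")
            fields.append(f"{value}: {key}")


# Shortened version of the sample text from the question (assumed layout).
sample = """file name
file type
file size
Start
- data_type: STRING
name: Operation
- data_type: INTEGER
name: SNL_Funding_Key
End
- data_type: STRING
name: Operation
Start
- data_type: STRING
name: Ticker
End
"""

rows = list(transpose_blocks(sample.splitlines()))
for row in rows:
    print(row)
```

Because the function only needs an iterable of strings, you can pass it the open `infile` directly and write each row to `outfile` followed by a newline. The header lines do not need to be skipped by hand here, since anything before the first `Start` is ignored.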