I'm trying to transfer data from a text file to CSV. My text file contains lots of rows delimited by \n:
a: data
b: data2
$e = data3
number
a: data4
and so on.
I need a column for a, b, one for the rows starting with $, and a column for the data after the = or : sign.
Can someone help me with a starting point?:)
import pandas as pd

txt = """a: data
b: data2
$e = data3
number
a: data4"""
txt_lines = txt.split("\n")
df_dict = dict()
for line in txt_lines:
    if ":" in line or "$" in line:
        # first word is the column name; strip the special characters
        column = line.split()[0].replace("$", "").replace(":", "")
        # if the column exists get its row, otherwise start with an empty list
        row = df_dict.get(column, list())
        # treat the last word as the data and append it to the row
        row.append(line.split()[-1])
        # update dictionary
        df_dict.update({column: row})
    else:
        # no ":" or "$" in the line, so treat it as a number
        column = 'number'
        # if the column exists get its row, otherwise start with an empty list
        row = df_dict.get(column, list())
        # update row
        row.append(line)
        # update dictionary
        df_dict.update({column: row})
# wrap each list in a Series so columns of different lengths are padded with NaN
df = pd.DataFrame({k: pd.Series(v) for k, v in df_dict.items()})
The resulting df looks like this:
       a      b      e  number
0   data  data2  data3  number
1  data4    NaN    NaN     NaN
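Since the question asks for a CSV file, here is a minimal sketch of the same grouping idea that writes CSV using only the standard library. The function names parse_columns and columns_to_csv are made up for this example, and it assumes that every keyed line has the form "key: value" or "$key = value". Unlike the loop above, it keeps everything after the first ":" or "=" as the value rather than only the last word.

```python
import csv
import io
import re
from itertools import zip_longest


def parse_columns(text):
    """Group values by key; lines without ':' or '=' go to a 'number' column."""
    columns = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # split once on the first ':' or '='
        parts = re.split(r"[:=]", line, maxsplit=1)
        if len(parts) == 2:
            key = parts[0].strip().lstrip("$")  # drop the leading '$' marker
            value = parts[1].strip()
        else:
            key, value = "number", line
        columns.setdefault(key, []).append(value)
    return columns


def columns_to_csv(columns, out):
    """Write one CSV column per key, padding shorter columns with empty cells."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns.keys())
    writer.writerows(zip_longest(*columns.values(), fillvalue=""))


sample = "a: data\nb: data2\n$e = data3\nnumber\na: data4"
buffer = io.StringIO()
columns_to_csv(parse_columns(sample), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, pass a file object opened with open("out.csv", "w", newline="") instead of the StringIO buffer (the file name is only a placeholder). If you already have the DataFrame from above, df.to_csv("out.csv", index=False) should do the same job.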
People will be better able to help you if you read How to create a Minimal, Reproducible Example before posting a question.