I have a text file as follows.
Wednesday,01,September,2021
Wednesday,01,September,2021,00:15
cod,10,1=0,2=2
cod,18,1=27,2=18,3=20
Wednesday,01,September,2021
Wednesday,01,September,2021,00:30
cod,10,1=5,2=6
cod,18,1=20,2=19,3=15
I want to make separate CSV files as output in each of which the row is cod10 and cod18 respectively and the column labels are the keys (1=0 in cod,10 means the row name is cod10, the column name is 1 and the value is 0). First CSV file:
id 1 2
cod10 0 2
cod10 5 6
Second CSV file:
id 1 2 3
cod18 27 18 20
cod18 20 19 15
I would be thankful for any idea.
You could open and read the text file and save only the rows that start with "cod". For those rows, you split on "=" and save the result in a nested dictionary:
with open('file.txt') as f:
lines = f.read().split('\n')
out = {}
for line in lines:
first, id_num, *splits = line.split(',')
if first == 'cod':
for s in splits:
k, v = s.split('=')
out.setdefault(first + id_num, {}).setdefault(int(k), []).append(v)
Then this produces a dictionary that looks like:
{'cod10': {1: ['0', '5'], 2: ['2', '6']},
'cod18': {1: ['27', '20'], 2: ['18', '19'], 3: ['20', '15']}}
Now since you tagged this as pandas, you could use pandas to create a DataFrame object for each key-value pair of out
(produced above) and then save the values at each id as a csv as below. The main job here is to explode
the columns (since each cell houses a list):
for k, v in out.items():
df = pd.DataFrame.from_dict({k:v}, orient='index').rename_axis(index=['id'])
df = df.explode(df.columns.tolist()).reset_index()
df.to_csv(f'file_{k}.csv')
print(df, end='\n\n\n')
Output:
id 1 2
0 cod10 0 2
1 cod10 5 6
id 1 2 3
0 cod18 27 18 20
1 cod18 20 19 15