I have a text file with a structure like:
0 1.23
1 2.76
2 2.46
3 6.23
0 1.33
1 2.57
2 2.87
3 5.34
.
.
.
I would like to arrange a new file like:
0 1.23 1.33 ...
1 2.76 2.57 ...
2 2.46 2.87 ...
3 6.23 5.34 ...
I can do it in a very primitive way with:
# Number of data groups
numberofdatagroup = 5
# Number of data points in each group
data = 4

arr = [[0 for col in range(2*numberofdatagroup)] for row in range(data)]

f = open(file, 'r')
lines = f.readlines()
f.close()

a = 0
for i in range(0, numberofdatagroup, 1):
    b = 0
    for a in range(0, data, 1):
        fields = lines[a].split()
        arr[b][2*i] = fields[0]
        arr[b][2*i+1] = fields[1]
        b = b + 1
    a = a + 2

# writing to output file
f = open(output, 'w+')
stringline = ""
for i in range(0, data, 1):
    stringline = stringline + arr[i][0] + " " + arr[i][1] + " "
    for j in range(1, numberofdatagroup, 1):
        stringline = stringline + arr[i][2*j+1] + " "
    f.write(stringline + "\n")
    stringline = ""
f.close()
However, it does not always work: it is very sensitive to empty lines. Is there a cleaner way to do this?
Here is an example of how you could read the file into a pandas DataFrame, assuming the groups are separated by empty lines:
import pandas as pd

current, all_groups = [], []
with open('data.txt', 'r') as f_in:
    for line in map(str.strip, f_in):
        if line:
            # data line: keep [index, value] as strings
            current.append(line.split(maxsplit=1))
        elif current:
            # an empty line closes the current group; keep only the value column
            all_groups.append(pd.DataFrame(current)[1])
            current = []

# flush the last group if the file does not end with an empty line
if current:
    all_groups.append(pd.DataFrame(current)[1])

final_df = pd.concat(all_groups, axis=1)
final_df.columns = range(len(final_df.columns))
print(final_df)
Prints:
      0     1
0  1.23  1.33
1  2.76  2.57
2  2.46  2.87
3  6.23  5.34
EDIT: Without the pandas library:
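current, all_groups = [], []
with open("data.txt", "r") as f_in:
    for line in map(str.strip, f_in):
        if line:
            # data line: keep [index, value] as strings
            current.append(line.split(maxsplit=1))
        elif current:
            # an empty line closes the current group
            all_groups.append(current)
            current = []

# flush the last group if the file does not end with an empty line
if current:
    all_groups.append(current)

# zip(*all_groups) pairs up the n-th row of every group
for g in zip(*all_groups):
    print('{} {} {}'.format(g[0][0], g[0][1], ' '.join(v for _, v in g[1:])))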
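If the empty lines are not reliable group separators (missing in some places, doubled in others), a different approach is to ignore them completely and collect the values by the index in the first column. The following is only a minimal sketch of that idea, not a drop-in replacement: the function name `regroup` and the file names in the trailing comment are made up for illustration, and it assumes every data line has an index followed by a single value.

```python
def regroup(lines):
    """Collect values by their leading index, ignoring blank lines.

    `lines` is any iterable of strings (a list or an open file).
    Returns one output line per index, in order of first appearance.
    """
    rows = {}  # dicts keep insertion order in Python 3.7+
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            # blank, whitespace-only, or incomplete line: skip it
            continue
        key, value = fields[0], fields[1]
        rows.setdefault(key, []).append(value)
    return [" ".join([key] + values) for key, values in rows.items()]


# Sample input with irregular blank lines between and inside groups
sample = [
    "0 1.23\n", "1 2.76\n", "\n", "2 2.46\n", "3 6.23\n",
    "\n", "\n",
    "0 1.33\n", "1 2.57\n", "2 2.87\n", "3 5.34\n",
]
result = regroup(sample)
print("\n".join(result))

# To process real files (the file names are placeholders):
# with open("data.txt") as f_in, open("output.txt", "w") as f_out:
#     f_out.write("\n".join(regroup(f_in)) + "\n")
```

Because the rows are keyed on the index rather than on position, this does not need to know the number of groups or the group size in advance. The trade-off is that a group with a missing index silently produces a shorter row instead of an error.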