I have a .csv file that I want to save as a .txt file.
I save this file as .txt with the following rules for newlines, comments, etc.:
np.savetxt(r'/test/text.txt', df, newline=',\n', comments='',fmt='%f', header=''.join(f'{col}\t' for col in df.columns)[:-1])
The problem is that I need every line to end with "," except the first one. With the code above, however, the newline rule applies to all lines, including the header!
Is there a way to prevent this from happening?
Or do you know another way to create the desired text file?
Example:
Consider this as the original data:
df = pd.DataFrame({'NumberOfPages:float': {0: 96.0, 1: 96.0, 2: 144.0},
                   'bid:token': {0: 3, 1: 3, 2: 5}})
The output should look like this:
bid:token NumberOfPages:float
3.000000 96.000000,
3.000000 96.000000,
5.000000 144.000000,
But I get this:
bid:token NumberOfPages:float,
3.000000 96.000000,
3.000000 96.000000,
5.000000 144.000000,
*Note the "," symbol after float in the first line.
You could remove the character afterwards:
import pandas as pd
import numpy as np
df = pd.DataFrame({'NumberOfPages:float': {0: 96.0, 1: 96.0, 2: 144.0},
                   'bid:token': {0: 3, 1: 3, 2: 5}})
np.savetxt('test.txt', df, newline=',\n', comments='',fmt='%f', header=''.join(f'{col}\t' for col in df.columns)[:-1])
with open("test.txt", 'r+') as f:
    lines = f.readlines()
    lines[0] = lines[0].replace(',', '')  # Only modify header
    f.seek(0)
    f.writelines(lines)
    f.truncate()  # Content is now one character shorter; cut off the leftover byte
Output:
NumberOfPages:float bid:token
96.000000 3.000000,
96.000000 3.000000,
144.000000 5.000000,
Note that this could be slow for very large files, since f.readlines() reads every line of the file into memory. If it is acceptable to overwrite the comma with a space, you can use this instead, which does not load the complete file into memory:
import pandas as pd
import numpy as np
df = pd.DataFrame({'NumberOfPages:float': {0: 96.0, 1: 96.0, 2: 144.0},
                   'bid:token': {0: 3, 1: 3, 2: 5}})
np.savetxt('test.txt', df, newline=',\n', comments='',fmt='%f', header=''.join(f'{col}\t' for col in df.columns)[:-1])
with open("test.txt", 'r+') as f:
    header = f.readline()
    f.seek(0)
    f.write(f"{header[:-2]} ")  # Rewrite the header, replacing the trailing comma with a space
Output:
NumberOfPages:float bid:token <--- beware this space
96.000000 3.000000,
96.000000 3.000000,
144.000000 5.000000,
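As for "another way to create the file": a possible alternative, sketched here under the assumption that your NumPy version accepts an open text-mode file handle in np.savetxt (recent versions do), is to write the header line yourself and let np.savetxt write only the data rows. The header then never gets the ",\n" terminator, so nothing has to be patched afterwards. The helper name save_with_plain_header is made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'NumberOfPages:float': {0: 96.0, 1: 96.0, 2: 144.0},
                   'bid:token': {0: 3, 1: 3, 2: 5}})


def save_with_plain_header(frame, path):
    """Hypothetical helper: plain header line, then data rows ending in ','."""
    with open(path, 'w') as f:
        # Write the tab-separated header ourselves, with an ordinary newline.
        f.write('\t'.join(frame.columns) + '\n')
        # No header argument here, so the ',\n' terminator only hits data rows.
        np.savetxt(f, frame, fmt='%f', newline=',\n')


save_with_plain_header(df, 'test.txt')
```

This writes the file in a single pass, so it should also behave well for large data, and the header ends without a trailing comma or space.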